Building statistical models in Python: develop useful models for regression, classification, time series, and survival analysis

The ability to proficiently perform statistical modeling is a fundamental skill for data scientists and essential for businesses reliant on data insights. Building Statistical Models with Python is a comprehensive guide that will empower you to leverage mathematical and statistical principles in dat...

Bibliographic Details
Main Authors: Nguyen, Huy Hoang; Adams, Paul N. (Author); Miller, Stuart J. (Author)
Format: eBook
Language: English
Published: Birmingham, UK: Packt Publishing Ltd., 2023
Edition: 1st edition
Collection: O'Reilly - Collection details see MPG.ReNa
Table of Contents:
  • Includes bibliographical references and index
  • Cover
  • Copyright
  • Contributors
  • Table of Contents
  • Preface
  • Part 1: Introduction to Statistics
  • Chapter 1: Sampling and Generalization
  • Software and environment setup
  • Population versus sample
  • Population inference from samples
  • Randomized experiments
  • Observational study
  • Sampling strategies: random, systematic, stratified, and clustering
  • Probability sampling
  • Non-probability sampling
  • Summary
  • Chapter 2: Distributions of Data
  • Technical requirements
  • Understanding data types
  • Nominal data
  • Ordinal data
  • Interval data
  • Ratio data
  • Visualizing data types
  • Measuring and describing distributions
  • Measuring central tendency
  • Measuring variability
  • Measuring shape
  • The normal distribution and central limit theorem
  • The Central Limit Theorem
  • Bootstrapping
  • Confidence intervals
  • Standard error
  • Correlation coefficients (Pearson's correlation)
  • Permutations
  • Permutations and combinations
  • Permutation testing
  • Transformations
  • Summary
  • References
  • Chapter 3: Hypothesis Testing
  • The goal of hypothesis testing
  • Overview of a hypothesis test for the mean
  • Scope of inference
  • Hypothesis test steps
  • Type I and Type II errors
  • Type I errors
  • Type II errors
  • Basics of the z-test: the z-score, z-statistic, critical values, and p-values
  • The z-score and z-statistic
  • A z-test for means
  • z-test for proportions
  • Power analysis for a two-population pooled z-test
  • Summary
  • Chapter 4: Parametric Tests
  • Assumptions of parametric tests
  • Normally distributed population data
  • Equal population variance
  • T-test: a parametric hypothesis test
  • T-test for means
  • Two-sample t-test: pooled t-test
  • Two-sample t-test: Welch's t-test
  • Paired t-test
  • Tests with more than two groups and ANOVA
  • Multiple tests for significance
  • ANOVA
  • Pearson's correlation coefficient
  • Power analysis examples
  • Summary
  • References
  • Chapter 5: Non-Parametric Tests
  • When parametric test assumptions are violated
  • Permutation tests
  • The Rank-Sum test
  • The test statistic procedure
  • Normal approximation
  • Rank-Sum example
  • The Signed-Rank test
  • The Kruskal-Wallis test
  • Chi-square distribution
  • Chi-square goodness-of-fit
  • Chi-square test of independence
  • Chi-square goodness-of-fit test power analysis
  • Spearman's rank correlation coefficient
  • Summary
  • Part 2: Regression Models
  • Chapter 6: Simple Linear Regression
  • Simple linear regression using OLS
  • Coefficients of correlation and determination
  • Coefficients of correlation
  • Coefficients of determination
  • Required model assumptions
  • A linear relationship between the variables
  • Normality of the residuals
  • Homoscedasticity of the residuals
  • Sample independence
  • Testing for significance and validating models
  • Model validation
  • Summary
  • Chapter 7: Multiple Linear Regression
  • Multiple linear regression