Essential Statistics for Non-STEM Data Analysts: Get to Grips with the Statistics and Math Knowledge Needed to Enter the World of Data Science with Python

Put your data science knowledge to work with this practical guide to statistics. You'll understand the working mechanism of each method used and find out how data science algorithms function. This book will help you learn the statistical techniques required for key model building and functionin...

Bibliographic Details
Main Author: Li, Rongpeng
Format: eBook
Language: English
Published: Birmingham: Packt Publishing, Limited, 2020
Collection: O'Reilly - Collection details see MPG.ReNa
Table of Contents:
  • Cover
  • Title Page
  • Copyright and Credits
  • About Packt
  • Contributors
  • Table of Contents
  • Preface
  • Section 1: Getting Started with Statistics for Data Science
  • Chapter 1: Fundamentals of Data Collection, Cleaning, and Preprocessing
  • Technical requirements
  • Collecting data from various data sources
  • Reading data directly from files
  • Obtaining data from an API
  • Obtaining data from scratch
  • Data imputation
  • Preparing the dataset for imputation
  • Imputation with mean or median values
  • Imputation with the mode/most frequent value
  • Outlier removal
  • Data standardization: when and how
  • Examples involving the scikit-learn preprocessing module
  • Imputation
  • Standardization
  • Summary
  • Chapter 2: Essential Statistics for Data Assessment
  • Classifying numerical and categorical variables
  • Distinguishing between numerical and categorical variables
  • Understanding mean, median, and mode
  • Mean
  • Median
  • Mode
  • Learning about variance, standard deviation, quartiles, percentiles, and skewness
  • Variance
  • Standard deviation
  • Quartiles
  • Skewness
  • Knowing how to handle categorical variables and mixed data types
  • Frequencies and proportions
  • Transforming a continuous variable to a categorical one
  • Using bivariate and multivariate descriptive statistics
  • Covariance
  • Cross-tabulation
  • Summary
  • Chapter 3: Visualization with Statistical Graphs
  • Basic examples with the Python Matplotlib package
  • Elements of a statistical graph
  • Exploring important types of plotting in Matplotlib
  • Advanced visualization customization
  • Customizing the geometry
  • Customizing the aesthetics
  • Query-oriented statistical plotting
  • Example 1: preparing data to fit the plotting function API
  • Example 2: combining analysis with plain plotting
  • Presentation-ready plotting tips
  • Use styling
  • Font matters a lot
  • Summary
  • Section 2: Essentials of Statistical Analysis
  • Chapter 4: Sampling and Inferential Statistics
  • Understanding fundamental concepts in sampling techniques
  • Performing proper sampling under different scenarios
  • The dangers associated with non-probability sampling
  • Probability sampling: the safer approach
  • Understanding statistics associated with sampling
  • Sampling distribution of the sample mean
  • Standard error of the sample mean
  • The central limit theorem
  • Summary
  • Chapter 5: Common Probability Distributions
  • Understanding important concepts in probability
  • Events and sample space
  • The probability mass function and the probability density function
  • Subjective probability and empirical probability
  • Understanding common discrete probability distributions
  • Bernoulli distribution
  • Binomial distribution
  • Poisson distribution
  • Understanding the common continuous probability distribution
  • Uniform distribution
  • Exponential distribution
  • Normal distribution