Statistics

Mathematical statistics links probability models to data: design, inference, prediction, and decision under uncertainty.

Curriculum overview

Level 3 tabs below group topics by statistical goal. Level 4 pages give definitions, core results, and worked examples in the same style as our algebra deep dives.

Descriptive statistics

Summaries such as mean, median, variance, and quantiles compress samples without yet claiming population truth. Pair every numerical summary with a plot (histogram, ECDF, boxplot).

\[ \bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \qquad s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 \]
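As a minimal sketch, the two formulas above can be computed directly and cross-checked against Python's standard `statistics` module. The data values and the helper names `sample_mean` and `sample_variance` are illustrative assumptions, not part of the original text.

```python
import statistics

def sample_mean(xs):
    """Arithmetic mean: (1/n) * sum of x_i."""
    return sum(xs) / len(xs)

def sample_variance(xs):
    """Unbiased sample variance with the n - 1 denominator."""
    n = len(xs)
    if n < 2:
        raise ValueError("need at least two observations")
    m = sample_mean(xs)
    return sum((x - m) ** 2 for x in xs) / (n - 1)

# Illustrative data (made up for this example).
data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

mean = sample_mean(data)      # 40 / 8 = 5.0
var = sample_variance(data)   # 32 / 7

# statistics.variance uses the same n - 1 denominator.
print(mean, var, statistics.variance(data))
```

The \(n-1\) denominator matters mostly for small samples; `statistics.pvariance` would give the \(1/n\) version instead.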

Probability distributions

Parametric families connect generative stories to formulas. The normal model anchors asymptotic arguments; exponential families unify estimation through sufficient statistics.

\[ f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\Bigl(-\frac{(x-\mu)^2}{2\sigma^2}\Bigr) \]
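The density above translates almost symbol for symbol into code. The sketch below is one possible transcription, checked against `statistics.NormalDist` from the standard library; the function name `normal_pdf` and the test points are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu=0.0, sigma=1.0):
    """Normal density with mean mu and standard deviation sigma > 0."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

# Peak of the standard normal density: 1 / sqrt(2 * pi).
peak = normal_pdf(0.0)

# Cross-check against the standard library at an arbitrary point.
ours = normal_pdf(0.5, mu=1.0, sigma=2.0)
reference = NormalDist(mu=1.0, sigma=2.0).pdf(0.5)
print(peak, ours, reference)
```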

Hypothesis testing

Tests control error rates under explicit assumptions. Report effect sizes and intervals alongside p-values; understand that p-values are not posterior probabilities.

Regression analysis

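Linear models express the expected response as an affine function of covariates; OLS is the orthogonal projection of \(Y\) onto the column space of the design matrix \(X\). The closed form below requires \(X\) to have full column rank, so that \(X^\top X\) is invertible. Residual analysis validates modeling choices.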

\[ \hat{\beta} = (X^\top X)^{-1} X^\top Y \]
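To make the normal equations concrete, here is a small dependency-free sketch restricted to one covariate plus an intercept, so that \(X^\top X\) is a 2×2 matrix that can be inverted by hand. The data and the function name `ols_simple` are illustrative assumptions; a real analysis would more likely use a linear algebra library and a QR or SVD solver rather than an explicit inverse.

```python
def ols_simple(xs, ys):
    """Solve (X^T X) beta = X^T Y for the design matrix X = [1, x]."""
    n = len(xs)
    sx = sum(xs)
    sxx = sum(x * x for x in xs)
    sy = sum(ys)
    sxy = sum(x * y for x, y in zip(xs, ys))
    det = n * sxx - sx * sx          # determinant of X^T X
    if det == 0:
        raise ValueError("X does not have full column rank")
    intercept = (sxx * sy - sx * sxy) / det
    slope = (n * sxy - sx * sy) / det
    return intercept, slope

# Illustrative data, roughly y = 1 + 2x with noise.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.1, 2.9, 5.2, 7.1, 8.8]

b0, b1 = ols_simple(xs, ys)
residuals = [y - (b0 + b1 * x) for x, y in zip(xs, ys)]

# Projection property: residuals are orthogonal to both columns of X.
dot_ones = sum(residuals)
dot_x = sum(r * x for r, x in zip(residuals, xs))
print(b0, b1, dot_ones, dot_x)
```

The two dot products being numerically zero is the geometric statement that the fitted values are the projection of \(Y\) onto the column space of \(X\).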

Statistics topics

Probability (Stochastic models): Axioms, conditioning, Bayes' rule, and the law of total probability.

Normal distribution (Stochastic models): Gaussian law, standardization, and central limit heuristics.

Random variables (Stochastic models): Measurable maps, expectation, variance, and transformations.

Statistical distributions (Stochastic models): Binomial, Poisson, Gamma, Beta, χ², t, and F in context.

Bayesian statistics (Stochastic models): Priors, posteriors, conjugacy, and predictive checks.

Descriptive statistics (Data description): Location, scale, robust summaries, and exploratory graphics.

Sampling methods (Data description): SRS, stratified and cluster designs; inclusion probabilities.

Inferential statistics (Inference): Estimators, sampling distributions, standard errors, and efficiency.

Hypothesis testing (Inference): Null and alternative hypotheses, p-values, power, and likelihood tests.

Confidence intervals (Inference): Coverage, interpretation, Wald and bootstrap intervals.

Regression analysis (Modeling): Linear models, least squares, diagnostics, and extensions.

Correlation (Modeling): Pearson and Spearman measures; cautions about causality.

In Depth

Statistics is the science of collecting, analyzing, interpreting, and presenting data. It divides into descriptive statistics (summarizing data with measures of center, spread, and shape) and inferential statistics (drawing conclusions about populations from samples using probability theory).

The central objects of statistics are random variables and their distributions. A random variable \(X\) assigns a numerical value to each outcome of a random experiment. Its distribution is characterized by the probability mass function (discrete case) or probability density function (continuous case), and is summarized by quantities such as the mean \(\mu=E[X]\), the variance \(\sigma^2=E[(X-\mu)^2]\), and higher moments.
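For a discrete random variable, the mean and variance are finite sums over the probability mass function. The sketch below works this out exactly for a fair six-sided die, which is an example chosen here for illustration; `pmf`, `expectation`, and `variance` are hypothetical names.

```python
from fractions import Fraction

# Probability mass function of a fair six-sided die (illustrative choice).
pmf = {face: Fraction(1, 6) for face in range(1, 7)}

def expectation(pmf, g=lambda x: x):
    """E[g(X)] = sum over outcomes of g(x) * P(X = x)."""
    return sum(g(x) * p for x, p in pmf.items())

def variance(pmf):
    """Var(X) = E[(X - mu)^2]."""
    mu = expectation(pmf)
    return expectation(pmf, lambda x: (x - mu) ** 2)

mu = expectation(pmf)       # Fraction(7, 2)
sigma2 = variance(pmf)      # Fraction(35, 12)
print(mu, sigma2)
```

Using `Fraction` keeps the arithmetic exact, so the results can be compared with hand calculation.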

Estimation theory asks: given a sample \(X_1,\ldots,X_n\) drawn independently from a distribution with unknown parameter \(\theta\), how do we estimate \(\theta\)? Maximum likelihood estimation (MLE) chooses \(\hat\theta\) to maximize the likelihood \(L(\theta)=\prod_{i=1}^{n} f(X_i;\theta)\), or equivalently the log-likelihood \(\ell(\theta)=\sum_{i=1}^{n}\log f(X_i;\theta)\). Under regularity conditions, the MLE is consistent, asymptotically normal, and asymptotically efficient: its limiting variance attains the Cramér–Rao lower bound.
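As one worked instance, consider the exponential model \(f(x;\lambda)=\lambda e^{-\lambda x}\), which is an assumption introduced here for illustration. Its log-likelihood is \(n\log\lambda-\lambda\sum x_i\), and setting the derivative to zero gives \(\hat\lambda=n/\sum x_i\). The sketch below computes that closed form and checks numerically that nearby rates have lower log-likelihood; the data are made up.

```python
import math

def exp_log_likelihood(rate, xs):
    """Log-likelihood of an Exponential(rate) model for data xs."""
    if rate <= 0:
        raise ValueError("rate must be positive")
    return len(xs) * math.log(rate) - rate * sum(xs)

def exp_rate_mle(xs):
    """Closed-form maximizer: n / sum(x_i), the reciprocal of the sample mean."""
    return len(xs) / sum(xs)

# Illustrative waiting times (made up for this example).
xs = [0.5, 1.2, 0.3, 2.1, 0.9]

rate_hat = exp_rate_mle(xs)
ll_hat = exp_log_likelihood(rate_hat, xs)
ll_lower = exp_log_likelihood(0.9 * rate_hat, xs)
ll_higher = exp_log_likelihood(1.1 * rate_hat, xs)
print(rate_hat, ll_hat, ll_lower, ll_higher)
```

When no closed form exists, the same log-likelihood function would typically be handed to a numerical optimizer instead.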

Hypothesis testing formalizes decision-making under uncertainty. A null hypothesis \(H_0\) is tested against an alternative \(H_1\). The p-value is the probability, computed assuming \(H_0\), of a test statistic at least as extreme as the one observed; it is not the probability that \(H_0\) is true. The significance level \(\alpha\) (conventionally 0.05) is fixed before seeing the data, and rejecting \(H_0\) when the p-value is at most \(\alpha\) keeps the Type I error rate at or below \(\alpha\).
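A p-value can be computed exactly in simple discrete settings. The sketch below assumes a hypothetical experiment, 9 heads in 10 coin flips with \(H_0\): the coin is fair, and sums the binomial tail. The scenario and the name `binomial_upper_tail` are illustrative, not taken from the text above.

```python
from math import comb

def binomial_upper_tail(k, n, p=0.5):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))

n_flips, heads = 10, 9            # hypothetical observed data
alpha = 0.05                      # significance level, fixed in advance

# One-sided p-value: probability of 9 or more heads under a fair coin.
p_one_sided = binomial_upper_tail(heads, n_flips)   # 11 / 1024

# Doubling is a valid two-sided p-value here because Binomial(n, 0.5) is symmetric.
p_two_sided = min(1.0, 2 * p_one_sided)             # 22 / 1024

reject = p_two_sided <= alpha
print(p_one_sided, p_two_sided, reject)
```

Rejecting at \(\alpha=0.05\) here says only that such data would be unusual for a fair coin; it does not give the probability that the coin is fair.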

Modern statistics increasingly uses computational methods: bootstrap resampling estimates sampling distributions without parametric assumptions; Markov chain Monte Carlo (MCMC) samples from complex posterior distributions in Bayesian analysis; cross-validation estimates predictive performance. These methods extend classical statistics to high-dimensional and complex data settings.
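The bootstrap idea can be sketched in a few lines: resample the data with replacement, recompute the statistic each time, and read an interval off the sorted replicates. The version below is a basic percentile interval under illustrative assumptions (made-up data, 2000 resamples, fixed seed); the name `bootstrap_percentile_ci` is hypothetical, and refinements such as BCa intervals are not shown.

```python
import random
import statistics

def bootstrap_percentile_ci(data, stat, n_resamples=2000, level=0.95, seed=0):
    """Percentile bootstrap interval for stat(data)."""
    rng = random.Random(seed)      # fixed seed for reproducibility
    n = len(data)
    # Each replicate: draw n points with replacement, then recompute the statistic.
    reps = sorted(stat(rng.choices(data, k=n)) for _ in range(n_resamples))
    alpha = 1.0 - level
    lo_idx = int((alpha / 2) * n_resamples)
    hi_idx = int((1 - alpha / 2) * n_resamples) - 1
    return reps[lo_idx], reps[hi_idx]

# Illustrative measurements (made up for this example).
data = [12.1, 9.8, 11.4, 10.2, 13.5, 9.1, 10.9, 12.7, 11.0, 10.4]

lo, hi = bootstrap_percentile_ci(data, statistics.mean)
print(statistics.mean(data), (lo, hi))
```

Passing `statistics.median` or another function as `stat` reuses the same resampling loop, which is much of the method's appeal.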

Key Properties & Applications

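The law of large numbers (LLN) and central limit theorem (CLT) are the two pillars of classical statistics. The LLN guarantees that the sample mean of independent, identically distributed observations with finite mean converges to the population mean. The CLT explains why the normal distribution is ubiquitous: if the observations also have finite variance \(\sigma^2\), then \(\sqrt{n}(\bar{X}_n-\mu)/\sigma\) is approximately standard normal for large \(n\), whatever the shape of the original distribution. Heavy-tailed laws without finite variance, such as the Cauchy, are excluded.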
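A short simulation can illustrate the CLT. The sketch below draws sample means from a skewed Exponential(1) distribution (mean 1, standard deviation 1), standardizes them, and checks how often they fall within ±1.96, which would be about 95% for an exact standard normal. The sample size, replicate count, and seed are arbitrary choices made for illustration.

```python
import math
import random

def standardized_means(n, replicates, seed=0):
    """Standardized sample means of n Exponential(1) draws, repeated many times."""
    rng = random.Random(seed)
    mu, sigma = 1.0, 1.0           # mean and sd of Exponential(1)
    out = []
    for _ in range(replicates):
        xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
        out.append(math.sqrt(n) * (xbar - mu) / sigma)
    return out

z = standardized_means(n=50, replicates=5000)

# Fraction of standardized means inside the central 95% normal band.
coverage = sum(abs(v) < 1.96 for v in z) / len(z)

# The average of the standardized means should be near 0.
center = sum(z) / len(z)
print(coverage, center)
```

Repeating the experiment with a smaller `n` would show a visibly skewed distribution of means, a reminder that the approximation is asymptotic.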

Bayesian statistics treats parameters as random variables with prior distributions. Bayes' theorem updates the prior to a posterior given data: \(p(\theta\mid x)\propto p(x\mid\theta)\,p(\theta)\). Under the assumed model and prior, the posterior summarizes all information about \(\theta\) after observing \(x\). Bayesian methods naturally quantify uncertainty and incorporate prior knowledge.
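In conjugate models the posterior update reduces to simple arithmetic. A standard example, assumed here for illustration, is a Beta(\(a,b\)) prior on a success probability with binomial data: observing \(k\) successes in \(n\) trials gives a Beta(\(a+k,\,b+n-k\)) posterior. The function names and the numbers below are hypothetical.

```python
from fractions import Fraction

def beta_binomial_update(a, b, successes, trials):
    """Posterior Beta parameters after binomial data, given a Beta(a, b) prior."""
    if not 0 <= successes <= trials:
        raise ValueError("successes must lie between 0 and trials")
    return a + successes, b + (trials - successes)

def beta_mean(a, b):
    """Mean of a Beta(a, b) distribution with integer parameters: a / (a + b)."""
    return Fraction(a, a + b)

# Uniform prior Beta(1, 1); hypothetical data: 7 successes in 10 trials.
prior = (1, 1)
posterior = beta_binomial_update(*prior, successes=7, trials=10)   # (8, 4)

prior_mean = beta_mean(*prior)           # Fraction(1, 2)
posterior_mean = beta_mean(*posterior)   # Fraction(2, 3)
print(posterior, prior_mean, posterior_mean)
```

The posterior mean 2/3 sits between the prior mean 1/2 and the observed proportion 7/10, showing how the prior is weighted against the data.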

Nonparametric statistics makes fewer distributional assumptions. The Wilcoxon rank-sum test, Kruskal–Wallis test, and Spearman correlation use ranks rather than raw values. The Kolmogorov–Smirnov test compares empirical distribution functions. Rank-based methods are resistant to outliers and do not require normality, though they still rely on assumptions such as independent observations.
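To show what "using ranks" means in practice, the sketch below computes Spearman's correlation as the Pearson correlation of the ranks, with ties given their average rank. The data are an illustrative assumption: \(y=e^x\) is a perfectly monotone but nonlinear function of \(x\), so the rank correlation is 1 while the Pearson correlation is not. The helper names are hypothetical.

```python
import math

def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(xs, ys):
    """Pearson product-moment correlation."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def spearman(xs, ys):
    """Spearman correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))

xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [math.exp(x) for x in xs]    # monotone but strongly nonlinear

rho = spearman(xs, ys)
r = pearson(xs, ys)
tied = average_ranks([10, 20, 20, 30])   # [1.0, 2.5, 2.5, 4.0]
print(rho, r, tied)
```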