
Statistical Methods '25

This is the course material for the Statistical Methods for the Physical Sciences course, beginning in January 2025. To run and modify the code examples and complete the challenges and ‘intuition builders’, open a Jupyter notebook (with the magic command %matplotlib inline or %matplotlib notebook in the first cell, to enable inline or interactive plotting respectively), or use another Python interpreter/IDE of your choice.
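For example, a typical first cell might look like the sketch below. The particular imports are illustrative of the libraries used throughout the course rather than a fixed requirement, and you can swap %matplotlib inline for %matplotlib notebook if you prefer interactive plots:

```python
# Typical first cell of a notebook for this course:
# enable inline plotting and import the core libraries.
%matplotlib inline

import numpy as np               # numerical arrays
import scipy.stats as sps        # probability distributions and statistical tests
import matplotlib.pyplot as plt  # plotting
```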

Prerequisites

The commands in this lesson pertain to Python 3.
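If you are unsure which version your default interpreter runs, a minimal check is:

```python
import sys

# Print the interpreter version and confirm it is Python 3
print(sys.version)
assert sys.version_info.major == 3, "This course requires Python 3"
```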

Getting Started

To get started, follow the directions on the “Setup” page to download the data and, if needed, install an up-to-date Python distribution such as Anaconda (including a Jupyter notebook or another Python interpreter). For this course you will need NumPy, SciPy and Pandas in your Python installation (Anaconda includes all three).
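To confirm that these packages are available before starting, a quick check like the following should work (the version numbers printed will depend on your own installation):

```python
# Check that NumPy, SciPy and Pandas can be imported, and report their versions.
import numpy
import scipy
import pandas

for module in (numpy, scipy, pandas):
    print(f"{module.__name__} version: {module.__version__}")
```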

Schedule

Setup  Download files required for the lesson

00:00  1. Introducing probability and discrete random variables
       How do we describe discrete random variables and what are their common probability distributions?
       How do I calculate the means, variances and other statistical quantities for numbers drawn from probability distributions?

02:00  2. Continuous random variables and their probability distributions
       How are continuous probability distributions defined and described?
       What happens to the distributions of sums or means of random data?

04:00  3. Introducing significance tests and comparing means
       How do I use the sample mean to test whether a set of measurements is drawn from a given population, or whether two samples are drawn from the same population?

06:00  4. Multivariate data - correlation tests
       How do we determine if two measured variables are significantly correlated?

08:00  5. Maximum likelihood estimation and weighted least-squares model fitting
       What are the best estimators of the parameters of models used to explain data?
       How do we fit models to normally distributed data, to determine the best-fitting parameters and their errors?

10:00  6. Confidence intervals on MLEs, fitting binned Poisson event data and bootstrapping
       How do we calculate exact confidence intervals and regions for normally distributed model parameters?
       How should we fit models to binned univariate data, such as photon spectra?
       With minimal assumptions, can we use our data to estimate uncertainties in a variety of measurements obtained from it?

12:00  7. Likelihood ratio: model comparison and confidence intervals
       How do we compare hypotheses corresponding to models with different parameter values, to determine which is best?

14:00  8. Conditional probability and joint probability distributions
       How do we calculate with probabilities, taking account of whether some event is dependent on the occurrence of another?
       How do we define and describe the joint probability distributions of two or more random variables?

16:00  9. Bayes' Theorem
       What is Bayes’ theorem and how can we use it to answer scientific questions?

18:00  10. Confidence intervals, errors and bootstrapping
       How do we quantify the uncertainty in a parameter from its posterior distribution?

20:00  11. MCMC for model-fitting and error estimation
       How can we use MCMC methods to fit data and obtain MLEs and confidence intervals for models which may have many parameters, non-normal errors or complex posterior distributions?

22:00  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.