This lesson is being piloted (Beta version)

Statistical inference - a practical approach

This is the course material for the Statistical Methods for the Physical Sciences course, beginning in January 2022. To run and modify the code examples, and to do the challenges and ‘intuition builders’, you will need a Jupyter notebook or another Python interpreter/IDE of your choice. In a notebook, put the magic command %matplotlib inline (for inline plots) or %matplotlib notebook (for interactive plots) in the first cell.

Prerequisites

The commands in this lesson pertain to Python 3.

Getting Started

To get started, follow the directions on the “Setup” page to download the data and, if you need to, install an up-to-date Python distribution such as Anaconda (including a Jupyter notebook or another Python interpreter). For this course you will need NumPy, SciPy and pandas in your Python installation (Anaconda includes all three).
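One quick way to confirm that your installation is ready is to try importing the three packages and printing their versions. The snippet below is a minimal sketch, not part of the official setup instructions: the helper name check_environment is hypothetical, and it assumes the usual import names numpy, scipy and pandas.

```python
import importlib
import sys

# Import names of the packages listed in the setup instructions
# (assumed to be the standard lowercase module names).
REQUIRED = ["numpy", "scipy", "pandas"]


def check_environment(packages=REQUIRED):
    """Map each package name to its version string, or None if it cannot be imported.

    Hypothetical helper for illustration only.
    """
    found = {}
    for name in packages:
        try:
            module = importlib.import_module(name)
        except ImportError:
            found[name] = None
        else:
            # Some modules do not define __version__, so fall back to a placeholder.
            found[name] = getattr(module, "__version__", "unknown")
    return found


# The lesson's commands pertain to Python 3, so report the interpreter version too.
print("Python", sys.version.split()[0])
for name, version in check_environment().items():
    status = version if version is not None else "NOT INSTALLED"
    print(f"{name}: {status}")
```

If any package is reported as not installed, return to the “Setup” page before starting the first episode.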

Schedule

        Setup
            Download files required for the lesson
00:00   1. Introducing probability calculus and conditional probability
            How do we calculate with probabilities, taking account of whether some event is dependent on the occurrence of another?
01:20   2. Discrete random variables and their probability distributions
            How do we describe discrete random variables and what are their common probability distributions?
            How do I calculate the means, variances and other statistical quantities for numbers drawn from probability distributions?
03:20   3. Continuous random variables and their probability distributions
            How are continuous probability distributions defined and described?
            What happens to the distributions of sums or means of random data?
05:20   4. Joint probability distributions
            How do we define and describe the joint probability distributions of two or more random variables?
07:20   5. Bayes’ Theorem
            What is Bayes’ theorem and how can we use it to answer scientific questions?
09:20   6. Working with and plotting large multivariate data sets
            How can I easily read in, clean and plot multivariate data?
11:20   7. Introducing significance tests and comparing means
            How do I use the sample mean to test whether a set of measurements is drawn from a given population, or whether two samples are drawn from the same population?
13:20   8. Multivariate data - correlation tests and least-squares fitting
            How do we determine if two measured variables are significantly correlated?
            How do we carry out simple linear fits to data?
15:20   9. Confidence intervals, errors and bootstrapping
            How do we quantify the uncertainty in a parameter from its posterior distribution?
            With minimal assumptions, can we use our data to estimate uncertainties in a variety of measurements obtained from it?
17:20   10. Maximum likelihood estimation and weighted least-squares model fitting
            What are the best estimators of the parameters of models used to explain data?
            How do we fit models to normally distributed data, to determine the best-fitting parameters and their errors?
19:20   11. Confidence intervals on MLEs and fitting binned Poisson event data
            How do we calculate exact confidence intervals and regions for normally distributed model parameters?
            How should we fit models to binned univariate data, such as photon spectra?
21:20   12. Likelihood ratio: model comparison and confidence intervals
            How do we compare hypotheses corresponding to models with different parameter values, to determine which is best?
23:20   Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.