This lesson is being piloted (Beta version)

Statistical Methods for the Physical Sciences

This is the course material for Statistical Methods for the Physical Sciences. To run and modify the code examples, and to do the challenges and ‘intuition builders’, open a Jupyter notebook or another Python interpreter/IDE of your choice. In a notebook, put the magic command %matplotlib inline (for inline plotting) or %matplotlib notebook (for interactive plotting) in the first cell.
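As a rough illustration of the first notebook cell, the sketch below applies the plotting magic only when the code is running inside IPython/Jupyter. In a notebook you would normally just type %matplotlib inline on its own line. The helper name enable_inline_plots is hypothetical and only serves this example; it assumes matplotlib is installed when run in a notebook.

```python
# In a Jupyter notebook you would normally type the magic directly:
#     %matplotlib inline
# The guarded call below has the same effect inside IPython/Jupyter, and
# simply does nothing in a plain Python interpreter, where magics do not exist.

def enable_inline_plots(backend="inline"):
    """Apply the matplotlib magic if available; return True if it was applied.

    `enable_inline_plots` is a hypothetical helper name used for illustration.
    Pass backend="notebook" for the interactive variant mentioned in the text.
    """
    try:
        shell = get_ipython()  # only defined inside IPython/Jupyter
    except NameError:
        return False
    if shell is None:
        return False
    shell.run_line_magic("matplotlib", backend)
    return True


applied = enable_inline_plots()
print("Plotting magic applied:", applied)
```

In a plain interpreter or IDE the call returns False and plots open in separate windows as usual.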

Prerequisites

The commands in this lesson pertain to Python 3.

Getting Started

To get started, follow the directions on the “Setup” page to download the data and, if you need to, install an up-to-date Python distribution such as Anaconda (including a Jupyter notebook or another Python interpreter). For this course you will need NumPy, SciPy and pandas in your Python installation (Anaconda includes all three).
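One possible way to confirm that the three required packages are present is to query their installed versions. This is a minimal sketch using only the standard library (importlib.metadata, available from Python 3.8); the function name installed_versions and the REQUIRED tuple are illustrative choices, not part of the course material.

```python
from importlib import metadata

# Packages named in the text above; the tuple name is an illustrative choice.
REQUIRED = ("numpy", "scipy", "pandas")


def installed_versions(packages=REQUIRED):
    """Map each package name to its installed version string, or None if missing."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version if version is not None else 'NOT INSTALLED'}")
```

If any package is reported as not installed, revisit the “Setup” page before starting the first episode.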

Schedule

Setup
    Download files required for the lesson
00:00  1. Looking at some univariate data: summary statistics and histograms
    How do we visually present univariate data (relating to a single variable)?
    How can we quantify the data distribution in a simple way?
01:10  2. Introducing probability distributions
    How are probability distributions defined and described?
01:50  3. Random variables
    How do I calculate the means, variances and other statistical quantities for numbers drawn from probability distributions?
    How is the error on a sample mean calculated?
02:40  4. The Central Limit Theorem
    What happens to the distributions of sums or means of random data?
03:10  5. Significance tests: the z-test - comparing with a population of known mean and variance
    How do I compare a sample mean with an expected value, when I know the true variance that the data are sampled from?
03:50  6. Significance tests: the t-test - comparing means when population variance is unknown
    How do I compare a sample mean with an expected value, when I don’t know the true variance that the data are sampled from?
    How do I compare two sample means, when neither the true mean nor the true variance that the data are sampled from is known?
04:50  7. Discrete random variables and their probability distributions
    How do we describe discrete random variables and what are their common probability distributions?
05:40  8. Probability calculus and conditional probability
    How do we calculate with probabilities, taking account of whether some event is dependent on the occurrence of another?
06:30  9. Reading, working with and plotting multivariate data
    How can I easily read in, clean and plot multivariate data?
07:20  10. Correlation tests and least-squares fitting
    How do we determine if two measured variables are significantly correlated?
    How do we carry out simple linear fits to data?
08:10  11. Bayes' Theorem
    What is Bayes’ theorem and how can we use it to answer scientific questions?
09:10  12. Maximum likelihood estimation
    What is the best way to estimate parameters of hypotheses?
09:40  13. Fitting models to data
    How do we fit multi-parameter models to data?
10:40  Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.