Writing and running code in Jupyter Notebooks
|
Use the Jupyter Notebook for editing and running Python.
The Notebook has Command and Edit modes.
Use the keyboard and mouse to select and edit cells.
The Notebook will turn Markdown into pretty-printed documentation.
Markdown does most of what HTML does.
Keep track of the sequence in which you run cells, and use kernel operations such as Restart and Clear Output to maintain control.
|
Python Fundamentals
|
Basic data types in Python include integers, strings, and floating-point numbers.
Use variable = value to assign a value to a variable in order to record it in memory.
Variables are created on demand whenever a value is assigned to them.
Use print(something) to display the value of something.
|
Data Types and Type Conversion
|
Every value has a type.
Use the built-in function type to find the type of a value.
Types control what operations can be done on values.
Strings can be added and multiplied.
Strings have a length (but numbers don’t).
Strings can be elegantly built up from variables by using f-string formatting.
Numbers must be converted to strings, or vice versa, before they can be combined in an operation.
Integers and floats can be mixed freely in operations.
Variables only change value when something is assigned to them.
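The points above can be sketched in a few lines. The variable names and values here are illustrative only and do not come from the lesson data.

```python
# Every value has a type; type() reports it.
count = 3            # int
ratio = 0.5          # float
label = "sample"     # str
print(type(count), type(ratio), type(label))

# Strings can be added (concatenated) and multiplied (repeated).
joined = label + "_a"
repeated = label * 2

# Strings have a length; calling len() on a number raises TypeError.
name_length = len(label)

# Combining a string and a number needs an explicit conversion.
as_text = label + str(count)
as_number = int("10") + count

# Integers and floats mix freely; the result is a float.
mixed = count + ratio

# f-strings build a string from variables.
message = f"{label} {count} has ratio {ratio}"

# A variable only changes when something is assigned to it:
# reassigning count does not update doubled.
doubled = count * 2
count = 10
print(doubled)
```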
|
Libraries
|
Most of the power of a programming language is in its libraries.
A program must import a library module in order to use it.
Use help to learn about the contents of a library module.
Import specific items from a library to shorten programs.
Create an alias for a library when importing it to shorten programs.
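A minimal sketch of the three import styles, using the standard-library modules math and statistics as stand-ins for any library:

```python
import math                    # import the whole module
from math import sqrt          # import one specific item
import statistics as stats     # import under a shorter alias

circle_area = math.pi * 2.0 ** 2   # names are reached through the module
root = sqrt(16)                    # imported item is used directly
average = stats.mean([1, 2, 3, 4]) # alias replaces the full module name

# help(math) would print the documentation for the module's contents.
print(circle_area, root, average)
```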
|
Analyzing Patient Data
|
Import a library into a program using import libraryname.
Use the numpy library to work with arrays in Python.
The expression array.shape gives the shape of an array.
Use array[x, y] to select a single element from a 2D array.
Array indices start at 0, not 1.
Use low:high to specify a slice that includes the indices from low to high-1.
Use # some kind of explanation to add comments to programs.
Use numpy.mean(array), numpy.max(array), and numpy.min(array) to calculate simple statistics.
Use numpy.mean(array, axis=0) or numpy.mean(array, axis=1) to calculate statistics across the specified axis.
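As an illustration, the sketch below applies these operations to a small made-up 2D array rather than the lesson's patient data file.

```python
import numpy

# A 2 x 3 array: 2 rows, 3 columns (made-up values).
data = numpy.array([[1.0, 2.0, 3.0],
                    [4.0, 5.0, 6.0]])

shape = data.shape             # (rows, columns)
first = data[0, 0]             # indices start at 0
last = data[1, 2]              # row 1, column 2

# low:high includes low but stops before high.
block = data[0:2, 1:3]         # rows 0-1, columns 1-2

overall_mean = numpy.mean(data)
largest = numpy.max(data)
smallest = numpy.min(data)

# axis=0 averages down the rows (one result per column);
# axis=1 averages across the columns (one result per row).
column_means = numpy.mean(data, axis=0)
row_means = numpy.mean(data, axis=1)
print(shape, column_means, row_means)
```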
|
Visualizing Tabular Data
|
|
Repeating Actions with Loops
|
Use for variable in sequence to process the elements of a sequence one at a time.
The body of a for loop must be indented.
Use len(thing) to determine the length of something that contains other values.
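A short sketch of a for loop over a list and over a string; the readings are invented for the example.

```python
readings = [2.5, 3.0, 4.5]

# The indented body runs once for each element of the sequence.
total = 0.0
for value in readings:
    total = total + value

# len() gives the number of items in the list.
average = total / len(readings)

# A string is also a sequence, so a loop visits it one character at a time.
letters_seen = 0
for letter in "oxygen":
    letters_seen = letters_seen + 1

print(total, average, letters_seen)
```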
|
Storing Multiple Values in Lists
|
[value1, value2, value3, ...] creates a list.
Lists can contain any Python object, including lists (i.e., list of lists).
Lists are indexed and sliced with square brackets (e.g., list[0] and list[2:9]), in the same way as strings and arrays.
Lists are mutable (i.e., their values can be changed in place).
Strings are immutable (i.e., the characters in them cannot be changed).
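The difference between mutable lists and immutable strings can be sketched as follows (values are illustrative):

```python
odds = [1, 3, 5, 7]
first = odds[0]          # indexing
middle = odds[1:3]       # slicing gives a new list: [3, 5]

nested = [[1, 2], [3, 4]]   # a list of lists
inner = nested[1][0]        # first item of the second inner list

# Lists are mutable: an element can be replaced in place.
odds[0] = -1

# Assignment with = only adds another name for the same list.
alias = odds
alias.append(9)          # odds changes too

# list() makes an independent copy.
copied = list(odds)
copied[0] = 100          # odds is unaffected

# Strings are immutable: item assignment raises TypeError.
name = "Darwin"
try:
    name[0] = "d"
except TypeError:
    string_is_immutable = True
```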
|
Analyzing Data from Multiple Files
|
Use glob.glob(pattern) to create a list of files whose names match a pattern.
Use * in a pattern to match zero or more characters, and ? to match any single character.
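To show the two wildcards without relying on any particular data files, this sketch creates a few empty placeholder files in a temporary directory. The file names are hypothetical.

```python
import glob
import os
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # Create hypothetical files to match against.
    for name in ["data-01.csv", "data-02.csv", "data-10.csv", "notes.txt"]:
        with open(os.path.join(folder, name), "w") as handle:
            handle.write("placeholder\n")

    # * matches zero or more characters.
    csv_paths = glob.glob(os.path.join(folder, "*.csv"))

    # ? matches exactly one character.
    low_paths = glob.glob(os.path.join(folder, "data-0?.csv"))

    # glob returns paths in arbitrary order, so sort before use.
    all_csv = sorted(os.path.basename(p) for p in csv_paths)
    single_digit = sorted(os.path.basename(p) for p in low_paths)

print(all_csv)
print(single_digit)
```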
|
Beyond Lists - Tuples, Sets and Dictionaries
|
(value1, value2, value3, ...) - using parentheses - creates a tuple.
Tuples are iterables, like lists, and may be indexed and sliced in the same way.
Tuples are immutable (their values may not be changed in place) but the values themselves may be mutable (e.g. you can change the contents of a list that is given as a value).
zip() can be used to iterate through pairs or higher multiples of values in separate lists. The iterator produced can only be run through once unless converted to a list or tuple.
Sets contain the unique and unordered elements of an iterable, created using set(). They cannot be indexed or sliced.
Dictionaries contain key/value pairs, defined using {key1:value1, key2:value2, ...} or dict() with key/value pairs given as a list of tuples.
Dictionaries can be used to summarise and access information in a more intuitive way than a simple list of values.
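A compact sketch of the four container behaviours described above, with made-up names and values:

```python
# Tuples are indexed and sliced like lists but cannot be changed in place.
point = (1.5, 2.5, [10, 20])
x = point[0]
pair = point[0:2]
try:
    point[0] = 9.9
except TypeError:
    tuple_is_immutable = True

# A mutable value held inside a tuple can still be changed.
point[2].append(30)

# zip() pairs up items; the iterator is exhausted after one pass.
names = ["a", "b", "c"]
values = [1, 2, 3]
pairs = zip(names, values)
first_pass = list(pairs)
second_pass = list(pairs)      # nothing left to read

# set() keeps only the unique elements, in no particular order.
unique = set([3, 1, 3, 2, 1])

# Two ways to build the same dictionary.
lookup = dict(first_pass)
direct = {"a": 1, "b": 2, "c": 3}
value_for_b = lookup["b"]      # access by key rather than position
```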
|
Making Choices
|
Use if condition to start a conditional statement, elif condition to provide additional tests, and else to provide a default.
The bodies of the branches of conditional statements must be indented.
Use == to test for equality.
X and Y is only true if both X and Y are true.
X or Y is true if either X or Y, or both, are true.
Zero, the empty string, and the empty list are considered false; all other numbers, strings, and lists are considered true.
True and False represent truth values.
Conditions are tested once, in order.
while loops can be used to continue executing a loop, dependent on a conditional statement.
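A minimal sketch of these rules, using an arbitrary starting number:

```python
number = 0

# Conditions are tested in order; only the first true branch runs.
if number > 0:
    sign = "positive"
elif number == 0:          # == tests equality, = assigns
    sign = "zero"
else:
    sign = "negative"

both = (number == 0) and (sign == "zero")    # true only if both are true
either = (number > 0) or (sign == "zero")    # true if at least one is true

# Zero, the empty string and the empty list count as false.
truthiness = [bool(0), bool(""), bool([]), bool(7), bool("text"), bool([0])]

# A while loop repeats for as long as its condition holds.
countdown = 3
steps = []
while countdown > 0:
    steps.append(countdown)
    countdown = countdown - 1

print(sign, both, either, truthiness, steps)
```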
|
Creating Functions
|
Define a function using def function_name(parameter).
The body of a function must be indented.
Call a function using function_name(value).
Numbers are stored as integers or floating-point numbers.
Variables defined within a function can only be seen and used within the body of the function.
If a variable is not defined within the function in which it is used, Python looks for a definition made outside the function before the function call.
Use help(thing) to view help for something.
Put docstrings in functions to provide help for that function.
Specify default values for parameters when defining a function using name=value in the parameter list.
Parameters can be passed by matching based on name, by position, or by omitting them (in which case the default value is used).
Put code whose parameters change frequently in a function, then call it with different parameter values to customize its behavior.
The scope of a variable is the part of a program that can ‘see’ that variable.
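The sketch below defines a hypothetical function, rescale, to illustrate docstrings, default values and the different ways of passing parameters. It is not a function from the lesson.

```python
def rescale(values, low=0.0, high=1.0):
    """Linearly rescale a list of numbers to the range [low, high]."""
    # smallest and span are local: they exist only inside the function.
    # (Assumes the values are not all equal, otherwise span is zero.)
    smallest = min(values)
    span = max(values) - smallest
    return [low + (high - low) * (v - smallest) / span for v in values]

# Omitted parameters take their default values.
default_range = rescale([2, 4, 6])

# Parameters matched by position.
by_position = rescale([2, 4, 6], 10, 20)

# Parameters matched by name; low keeps its default.
by_name = rescale([2, 4, 6], high=100)

# help(rescale) would display the docstring.
print(default_range, by_position, by_name)
```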
|
Simple Input/Output
|
Use open with the write ('w'), read ('r') and append ('a') modes to write, read and append strings to files.
Separate and successive lines can be read in using the readline() method or a for loop.
Remember to close opened files after use, or use with to contain operations on a file to an indented block of code.
Data of any type must be written to a file as complete strings. String formatting can be used to separate different data values in the string using whitespace, commas or other separators.
String methods such as strip() and split() can be used to remove leading or trailing characters (such as newline characters) and split a string into discrete values according to the locations of the separators.
Data values that are read in as strings can be converted back to floating-point or integer values as required using e.g. the float() and int() functions.
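A round trip through a text file can be sketched as follows. The file name values.txt and the rows are invented, and a temporary directory is used so nothing is left behind.

```python
import os
import tempfile

rows = [(1, 0.5), (2, 1.25), (3, 2.0)]

with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "values.txt")

    # 'w' creates (or overwrites) the file; with closes it automatically.
    with open(path, "w") as outfile:
        for index, value in rows:
            outfile.write(f"{index},{value}\n")   # data written as strings

    # 'a' appends to the end of the existing file.
    with open(path, "a") as outfile:
        outfile.write("4,3.5\n")

    # 'r' reads: readline() returns one line, a for loop gives the rest.
    with open(path, "r") as infile:
        first_line = infile.readline()
        lines = [first_line]
        for line in infile:
            lines.append(line)

# strip() removes the trailing newline, split() separates the values,
# and int()/float() convert the strings back to numbers.
recovered = []
for line in lines:
    index_text, value_text = line.strip().split(",")
    recovered.append((int(index_text), float(value_text)))

print(recovered)
```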
|
Programming Style
|
|
Working with Numpy Arrays
|
Numpy arrays can be created from lists using numpy.array or via other numpy functions.
Like lists, numpy arrays are indexed in row-major order, with the last index read out fastest.
Numpy arrays can be edited and selected from using indexing and slicing, or have elements appended, inserted or deleted using numpy.append, numpy.insert or numpy.delete.
To copy a numpy array, use numpy.copy or an operation that returns a new array. Using = simply assigns another label to the same array, as for lists.
Use numpy.reshape and numpy.transpose (or .T) to reshape arrays, and numpy.ravel to flatten them to a single dimension. Various numpy stack functions can be used to combine arrays.
numpy.genfromtxt can read data into structured numpy arrays. Columns must be referred to using the field name given to that column when the data is read in.
Conditional statements can be used to select elements from arrays with the same shape, e.g. that correspond to the same data set.
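A short sketch of copying, reshaping and conditional selection on small made-up arrays:

```python
import numpy

grid = numpy.array([[1, 2, 3],
                    [4, 5, 6]])

label = grid                    # = gives a second name for the same array
independent = numpy.copy(grid)  # a separate copy

label[0, 0] = 99                # also changes grid
independent[1, 2] = -1          # grid is unaffected

flat = numpy.ravel(grid)                # flatten, last index varying fastest
reshaped = numpy.reshape(flat, (3, 2))  # same values, new shape
transposed = grid.T                     # rows and columns swapped

extended = numpy.append(flat, 7)        # returns a new, longer array

# A condition on one array can select elements from it, or from
# another array of the same shape (e.g. a matching data column).
weights = numpy.array([10, 20, 30, 40, 50, 60])
selected = flat[flat > 3]
matching = weights[flat > 3]

print(grid, flat, selected, matching)
```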
|
Array Calculations with Numpy
|
Numpy ufuncs operate element-wise (item by item) on an array.
Common mathematical operators applied to numpy arrays act as wrappers for fast array calculations.
Binary ufuncs operate on two arrays: if the arrays have different shapes which are compatible, the operation uses broadcasting rules.
Many operations and numerical methods (such as random number generation) can be carried out with numpy functions.
Arrays can be masked to allow unwanted elements (e.g. with nan values) to be ignored in array calculations using special masked array ufuncs.
Define your own functions that carry out complex array operations by combining different numpy functions.
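Element-wise operations, broadcasting and masking can be sketched with a few small arrays. The values are illustrative, and numpy.ma.masked_invalid is just one of several ways to build a mask.

```python
import numpy

row = numpy.array([1.0, 2.0, 3.0])         # shape (3,)
column = numpy.array([[10.0], [20.0]])     # shape (2, 1)

# Operators and ufuncs act item by item.
squared = row ** 2
roots = numpy.sqrt(numpy.array([1.0, 4.0, 9.0]))

# Compatible shapes (3,) and (2, 1) broadcast to a (2, 3) result.
table = row + column

# A nan value propagates through an ordinary mean...
readings = numpy.array([1.0, numpy.nan, 3.0])
plain_mean = numpy.mean(readings)

# ...but is ignored once the invalid element is masked.
masked = numpy.ma.masked_invalid(readings)
masked_mean = masked.mean()

print(table)
print(plain_mean, masked_mean)
```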
|
Numerical Methods with Scipy
|
Scipy sub-packages need to be loaded individually: using import scipy and then referring to the sub-package name is not sufficient. Instead use e.g. from scipy import fft.
Specific functions can also be loaded separately, such as from scipy.interpolate import interp1d.
For model fitting when errors are normally distributed you can use scipy.optimize.curve_fit. For more general function minimization use scipy.optimize.minimize.
Be careful with how Scipy’s Fast Fourier Transform results are ordered in the output arrays.
Always be careful to read the documentation for any Scipy sub-packages and functions to see how they work and what is assumed.
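As a hedged illustration of the import style, curve_fit and the FFT ordering issue, the sketch below fits a straight line to noise-free synthetic data and inspects the frequency order returned by fftfreq. The model function line and its parameter values are assumptions made for the example.

```python
import numpy
from scipy import fft                    # load the sub-package explicitly
from scipy.optimize import curve_fit     # or load one function

def line(x, slope, intercept):
    """Hypothetical model: a straight line."""
    return slope * x + intercept

# Synthetic, noise-free data generated from known parameters.
x = numpy.linspace(0.0, 10.0, 21)
y = line(x, 2.0, 1.0)

# curve_fit returns the best-fitting parameters and their covariance.
best, covariance = curve_fit(line, x, y)

# FFT output is ordered as zero frequency, then positive frequencies,
# then negative frequencies; fftshift puts them in ascending order.
freqs = fft.fftfreq(8, d=1.0)
ordered = fft.fftshift(freqs)

print(best)
print(freqs)
print(ordered)
```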
|
Introduction to Astropy
|
Astropy includes the core packages plus coordinated sub-packages and affiliated sub-packages (which need to be installed separately).
The astropy.units sub-package enables calculations to be carried out using self-consistent physical units.
astropy.constants provides physical constants that can be used in calculations across a whole range of physical units when combined with the units sub-package.
astropy.cosmology allows calculations of fundamental cosmological quantities such as the cosmological age or luminosity distance, for a specified cosmological model.
astropy.coordinates and astropy.time provide a number of functions that can be combined to determine when a given target object can best be observed from a given location.
|
Working with FITS Data
|
FITS files can be read in and explored using the astropy.io.fits sub-package. The fits.open function is used to open a data file.
FITS files consist of one or more Header Data Units (HDUs) which include a header and possibly data, in the form of a table or image. The structure can be accessed using the .info() method.
Headers contain sets of keyword/value pairs (like a dictionary) and optional comments, which describe the metadata for the data set, accessible using .header['KEYWORD'].
Tables and images can be accessed using the .data attribute: table data is returned as a structured array, while image data is returned as an n-dimensional array which may be plotted with e.g. matplotlib’s imshow function.
|