Glossary

Key Points

Writing and running code in Jupyter Notebooks	Use the Jupyter Notebook for editing and running Python. The Notebook has Command and Edit modes. Use the keyboard and mouse to select and edit cells. The Notebook will turn Markdown into pretty-printed documentation. Markdown does most of what HTML does. Keep track of the sequence in which you run cells, and use kernel operations such as Restart and Clear Output to maintain control.
Python Fundamentals	Basic data types in Python include integers, strings, and floating-point numbers. Use `variable = value` to assign a value to a variable in order to record it in memory. Variables are created on demand whenever a value is assigned to them. Use `print(something)` to display the value of `something`.
Data Types and Type Conversion	Every value has a type. Use the built-in function `type` to find the type of a value. Types control what operations can be done on values. Strings can be added and multiplied. Strings have a length (but numbers don’t). Strings can be elegantly built up from variables by using f-string formatting. Numbers must be converted to strings, or vice versa, before operating on them together. Integers and floats can be mixed freely in operations. Variables only change value when something is assigned to them.
Libraries	Most of the power of a programming language is in its libraries. A program must import a library module in order to use it. Use `help` to learn about the contents of a library module. Import specific items from a library to shorten programs. Create an alias for a library when importing it to shorten programs.
Analyzing Patient Data	Import a library into a program using `import libraryname`. Use the `numpy` library to work with arrays in Python. The expression `array.shape` gives the shape of an array. Use `array[x, y]` to select a single element from a 2D array. Array indices start at 0, not 1. Use `low:high` to specify a `slice` that includes the indices from `low` to `high-1`. Use `# some kind of explanation` to add comments to programs. Use `numpy.mean(array)`, `numpy.max(array)`, and `numpy.min(array)` to calculate simple statistics. Use `numpy.mean(array, axis=0)` or `numpy.mean(array, axis=1)` to calculate statistics across the specified axis.
Visualizing Tabular Data	Use the `pyplot` module from the `matplotlib` library for creating simple visualizations.
Repeating Actions with Loops	Use `for variable in sequence` to process the elements of a sequence one at a time. The body of a `for` loop must be indented. Use `len(thing)` to determine the length of something that contains other values.
Storing Multiple Values in Lists	`[value1, value2, value3, ...]` creates a list. Lists can contain any Python object, including lists (i.e., list of lists). Lists are indexed and sliced with square brackets (e.g., `list[0]` and `list[2:9]`), in the same way as strings and arrays. Lists are mutable (i.e., their values can be changed in place). Strings are immutable (i.e., the characters in them cannot be changed).
Analyzing Data from Multiple Files	Use `glob.glob(pattern)` to create a list of files whose names match a pattern. Use `*` in a pattern to match zero or more characters, and `?` to match any single character.
Beyond Lists - Tuples, Sets and Dictionaries	`(value1, value2, value3, ...)` - using parentheses - creates a tuple. Tuples are iterables, like lists, and may be indexed and sliced in the same way. Tuples are immutable (their values may not be changed in place) but the values themselves may be mutable (e.g. you can change the contents of a list that is given as a value). `zip()` can be used to iterate through pairs or higher multiples of values in separate lists. The iterator produced can only be run through once unless converted to a list or tuple. Sets contain the unique and unordered elements of an iterable, created using `set()`. They cannot be indexed or sliced. Dictionaries contain key/value pairs, defined using `{key1:value1, key2:value2, ....}` or `dict()` with key/value pairs given as a list of tuples. Dictionaries can be used to summarise and access information in a more intuitive way than a simple list of values.
Making Choices	Use `if condition` to start a conditional statement, `elif condition` to provide additional tests, and `else` to provide a default. The bodies of the branches of conditional statements must be indented. Use `==` to test for equality. `X and Y` is only true if both `X` and `Y` are true. `X or Y` is true if either `X` or `Y`, or both, are true. Zero, the empty string, and the empty list are considered false; all other numbers, strings, and lists are considered true. `True` and `False` represent truth values. Conditions are tested once, in order. `while` loops can be used to continue executing a loop, dependent on a conditional statement.
Creating Functions	Define a function using `def function_name(parameter)`. The body of a function must be indented. Call a function using `function_name(value)`. Numbers are stored as integers or floating-point numbers. Variables defined within a function can only be seen and used within the body of the function. If a variable is not defined within the function where it is used, Python looks for a definition made before the function call. Use `help(thing)` to view help for something. Put docstrings in functions to provide help for that function. Specify default values for parameters when defining a function using `name=value` in the parameter list. Parameters can be passed by matching based on name, by position, or by omitting them (in which case the default value is used). Put code whose parameters change frequently in a function, then call it with different parameter values to customize its behavior. The scope of a variable is the part of a program that can ‘see’ that variable.
Simple Input/Output	Use `open` with the write (`'w'`), read (`'r'`) and append (`'a'`) modes to write, read and append strings to files. Successive lines can be read in using the `readline()` method or a `for` loop. Remember to close opened files after use, or use `with` to confine operations on a file to an indented block of code, after which the file is closed automatically. Data of any type must be converted to strings before being written to a file. String formatting can be used to separate different data values in the string using white space, commas or other separators. String methods such as `strip()` and `split()` can be used to remove leading or trailing characters (such as newline characters) and split a string into discrete values according to the locations of the separators. Data values that are read in as strings can be converted back to floating-point or integer values as required using e.g. the `float()` and `int()` functions.
Programming Style	Follow standard Python style in your code. Use docstrings to provide built-in help.
Working with Numpy Arrays	Numpy arrays can be created from lists using `numpy.array` or via other numpy functions. Like lists, numpy arrays are indexed in row-major order, with the last index read out fastest. Numpy arrays can be edited and selected from using indexing and slicing, or have elements appended, inserted or deleted using `numpy.append`, `numpy.insert` or `numpy.delete`. To make an independent copy of a numpy array, use `numpy.copy` (or an operation that returns a new array); `=` simply assigns another label to the same array, as for lists. Use `numpy.reshape`, `numpy.transpose` (or `.T`) to reshape arrays, and `numpy.ravel` to flatten them to a single dimension. Various `numpy` `stack` functions can be used to combine arrays. `numpy.genfromtxt` can read data into structured numpy arrays. Columns must be referred to using the field name given to that column when the data is read in. Conditional statements can be used to select elements from arrays with the same shape, e.g. that correspond to the same data set.
Array Calculations with Numpy	Numpy ufuncs operate element-wise (item by item) on an array. Common mathematical operators applied to numpy arrays act as wrappers for fast array calculations. Binary ufuncs operate on two arrays: if the arrays have different shapes which are compatible, the operation uses broadcasting rules. Many operations and numerical methods (such as random number generation) can be carried out with numpy functions. Arrays can be masked to allow unwanted elements (e.g. with `nan` values) to be ignored in array calculations using special masked array ufuncs. Define your own functions that carry out complex array operations by combining different numpy functions.
Numerical Methods with Scipy	Scipy sub-packages need to be individually loaded - `import scipy` and then referring to the package name is not sufficient. Instead use, e.g. `from scipy import fft`. Specific functions can also be loaded separately such as `from scipy.interpolate import interp1d`. For model fitting when errors are normally distributed you can use `scipy.optimize.curve_fit`. For more general function minimization use `scipy.optimize.minimize`. Be careful with how Scipy’s Fast Fourier Transform results are ordered in the output arrays. Always be careful to read the documentation for any Scipy sub-packages and functions to see how they work and what is assumed.
Introduction to Astropy	Astropy includes the core packages plus coordinated sub-packages and affiliated sub-packages (which need to be installed separately). The `astropy.units` sub-package enables calculations to be carried out using self-consistent physical units. `astropy.constants` provides physical constants that can be expressed in a whole range of physical units when combined with the `units` sub-package. `astropy.cosmology` allows calculations of fundamental cosmological quantities such as the cosmological age or luminosity distance, for a specified cosmological model. `astropy.coordinates` and `astropy.time` provide a number of functions that can be combined to determine when a given target object can best be observed from a given location.
Working with FITS Data	FITS files can be read in and explored using the `astropy.io.fits` sub-package. The `open` function is used to open a datafile. FITS files consist of one or more Header Data Units (HDUs) which include a header and possibly data, in the form of a table or image. The structure can be accessed using the `.info()` method. Headers contain sets of keyword/value pairs (like a dictionary) and optional comments, which describe the metadata for the data set, accessible using `.header['KEYWORD']`. Tables and images can be accessed using the `.data` attribute, which returns table data as a structured array and image data as an n-dimensional array, which may be plotted with e.g. matplotlib’s `imshow` function.
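
As a minimal sketch of the Simple Input/Output points above, the following writes two columns of numbers to a text file and reads them back as floats. The file name `example_data.txt`, the helper names `write_columns` and `read_columns`, and the sample values are illustrative assumptions and are not taken from the lesson.

```python
import os
import tempfile


def write_columns(path, xs, ys):
    """Write paired values to a text file, one comma-separated pair per line."""
    with open(path, "w") as f:  # 'w' mode creates or overwrites the file
        for x, y in zip(xs, ys):
            # Values are formatted into a string before being written
            f.write(f"{x},{y}\n")


def read_columns(path):
    """Read the pairs back, converting each string to a float."""
    xs, ys = [], []
    with open(path, "r") as f:  # the file is closed when the block ends
        for line in f:
            # strip() removes the trailing newline; split() separates the values
            x_str, y_str = line.strip().split(",")
            xs.append(float(x_str))
            ys.append(float(y_str))
    return xs, ys


# Hypothetical sample data and file location, for illustration only
path = os.path.join(tempfile.gettempdir(), "example_data.txt")
write_columns(path, [1, 2, 3], [0.5, 1.5, 2.5])
xs, ys = read_columns(path)
print(xs, ys)
os.remove(path)
```

Using `with` for both reading and writing means neither function needs an explicit `close()` call.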

additive color model: A way to represent colors as the sum of contributions from primary colors such as red, green, and blue.
argument: A value given to a function or program when it runs. The term is often used interchangeably (and inconsistently) with parameter.
assertion: An expression which is supposed to be true at a particular point in a program. Programmers typically put assertions in their code to check for errors; if the assertion fails (i.e., if the expression evaluates as false), the program halts and produces an error message. See also: invariant, precondition, postcondition.
assign: To give a value a name by associating a variable with it.
body: (of a function) The statements that are executed when a function runs.
call stack: A data structure inside a running program that keeps track of active function calls.
case-insensitive: Treating text as if upper and lower case characters of the same letter were the same. See also: case-sensitive.
case-sensitive: Treating text as if upper and lower case characters of the same letter are different. See also: case-insensitive.
comment: A remark in a program that is intended to help human readers understand what is going on, but is ignored by the computer. Comments in Python, R, and the Unix shell start with a # character and run to the end of the line; comments in SQL start with --, and other languages have other conventions.
compose: To apply one function to the result of another, such as f(g(x)).
conditional statement: A statement in a program that might or might not be executed depending on whether a test is true or false.
comma-separated values: (CSV) A common textual representation for tables in which the values in each row are separated by commas.
default value: A value to use for a parameter if nothing is specified explicitly.
defensive programming: The practice of writing programs that check their own operation to catch errors as early as possible.
delimiter: A character or characters used to separate individual values, such as the commas between columns in a CSV file.
docstring: Short for “documentation string”, this refers to textual documentation embedded in Python programs. Unlike comments, docstrings are preserved in the running program and can be examined in interactive sessions.
documentation: Human-language text written to explain what software does, how it works, or how to use it.
dotted notation: A two-part notation used in many programming languages in which thing.component refers to the component belonging to thing.
empty string: A character string containing no characters, often thought of as the “zero” of text.
encapsulation: The practice of hiding something’s implementation details so that the rest of a program can worry about what it does rather than how it does it.
floating-point number: A number containing a fractional part and an exponent. See also: integer.
for loop: A loop that is executed once for each value in some kind of set, list, or range. See also: while loop.
function: A named group of instructions that is executed when the function’s name is used in the code. Occurrence of a function name in the code is a function call. Functions may process input arguments and return the result back. Functions may also be used for logically grouping together pieces of code. In such cases, they don’t need to return any meaningful value and can be written without the return statement completely. Such functions return a special value None, which is a way of saying “nothing” in Python.
function call: A use of a function in another piece of software.
immutable: Unchangeable. The value of immutable data cannot be altered after it has been created. See also: mutable.
import: To load a library into a program.
in-place operator: An operator such as += that provides a shorthand notation for the common case in which the variable being assigned to is also an operand on the right hand side of the assignment. For example, the statement x += 3 means the same thing as x = x + 3.
index: A subscript that specifies the location of a single value in a collection, such as a single pixel in an image.
inner loop: A loop that is inside another loop. See also: outer loop.
integer: A whole number, such as -12343. See also: floating-point number.
invariant: An expression whose value doesn’t change during the execution of a program, typically used in an assertion. See also: precondition, postcondition.
library: A family of code units (functions, classes, variables) that implement a set of related tasks.
loop variable: The variable that keeps track of the progress of the loop.
member: A variable contained within an object.
method: A function which is tied to a particular object. Each of an object’s methods typically implements one of the things it can do, or one of the questions it can answer.
mutable: Changeable. The value of mutable data can be altered after it has been created. See also: immutable.
notebook: Interactive computational environment accessed via your web browser, in which you can write and execute Python code and combine it with explanatory text, mathematics and visualizations. Examples are IPython or Jupyter notebooks.
object: A collection of conceptually related variables (members) and functions using those variables (methods).
outer loop: A loop that contains another loop. See also: inner loop.
parameter: A variable named in the function’s declaration that is used to hold a value passed into the call. The term is often used interchangeably (and inconsistently) with argument.
pipe: A connection from the output of one program to the input of another. When two or more programs are connected in this way, they are called a “pipeline”.
postcondition: A condition that a function (or other block of code) guarantees is true once it has finished running. Postconditions are often represented using assertions.
precondition: A condition that must be true in order for a function (or other block of code) to run correctly.
regression: The re-introduction of a bug that was once fixed.
return statement: A statement that causes a function to stop executing and return a value to its caller immediately.
RGB: An additive model that represents colors as combinations of red, green, and blue. Each color’s value is typically in the range 0..255 (i.e., a one-byte integer).
sequence: A collection of information that is presented in a specific order. For example, in Python, a string is a sequence of characters, while a list is a sequence of values of any type.
shape: An array’s dimensions, represented as a vector. For example, a 5×3 array’s shape is (5,3).
silent failure: Failing without producing any warning messages. Silent failures are hard to detect and debug.
slice: A regular subsequence of a larger sequence, such as the first five elements or every second element.
stack frame: A data structure that provides storage for a function’s local variables. Each time a function is called, a new stack frame is created and put on the top of the call stack. When the function returns, the stack frame is discarded.
standard input: A process’s default input stream. In interactive command-line applications, it is typically connected to the keyboard; in a pipe, it receives data from the standard output of the preceding process.
standard output: A process’s default output stream. In interactive command-line applications, data sent to standard output is displayed on the screen; in a pipe, it is passed to the standard input of the next process.
string: Short for “character string”, a sequence of zero or more characters.
syntax: The rules that define how code must be written for a computer to understand.
syntax error: A programming error that occurs when statements are in an order or contain characters not expected by the programming language.
test oracle: A program, device, data set, or human being against which the results of a test can be compared.
test-driven development: The practice of writing unit tests before writing the code they test.
traceback: The sequence of function calls that led to an error.
tuple: An immutable sequence of values.
type: The classification of something in a program (for example, the contents of a variable) as a kind of number (e.g. floating-point, integer), string, or something else.
type of error: Indicates the nature of an error in a program. For example, in Python, an IOError refers to problems with file input/output. See also: syntax error.
variable: A value that has a name associated with it.
while loop: A loop that keeps executing as long as some condition is true. See also: for loop.
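
Several of the terms above (docstring, default value, precondition, postcondition, assertion, defensive programming) can be seen together in one short function. This is only an illustrative sketch: the function name `normalize` and its parameters are hypothetical and are not part of the lesson.

```python
def normalize(values, scale=1.0):
    """Rescale values so that the largest one equals scale.

    The parameter `scale` has a default value of 1.0.
    """
    # Preconditions: these must hold for the function to run correctly
    assert len(values) > 0, "values must not be empty"
    assert max(values) > 0, "largest value must be positive"
    assert scale > 0, "scale must be positive"

    largest = max(values)
    result = [scale * v / largest for v in values]

    # Postcondition: guaranteed to be true once the function has finished
    assert abs(max(result) - scale) <= 1e-9 * scale
    return result


print(normalize([1, 2, 4]))            # scale omitted, so the default is used
print(normalize([1, 2, 4], scale=10))  # argument matched by name
print(normalize.__doc__)               # the docstring is kept at run time
```

If a precondition fails, for example when `normalize([])` is called, the assertion halts the program with an error message instead of failing silently.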

Programming for Astronomy and Astrophysics: Glossary
