A quick tour of scientific computing with Python

This quick tour introduces the most fundamental concepts of using Python for scientific computing. By the end, you will have seen all of the fundamentals: how to read data from a file, the rudiments of programming, how to output data, and how to generate simple plots. This is the starting point for all that follows. Once you have finished this tour you will be ready to solve a huge number of problems that require a computer. Let's get started.

Preparations

Start by creating a folder or directory on your computer where you can store all of your program files. In the examples below I created a folder called myPython in my Documents directory.

You can start a programming session in several different ways. One is to use an integrated development environment (IDE) like Spyder. If you are using an IDE that uses IPython you can skip ahead to the section below called What is IPython?.

You can also run an interactive Python session from the command line. Access the command line by running the Terminal application on macOS systems or the Command Prompt program on a Windows computer.

Now use the cd (change directory) command to change to the myPython directory. This command is slightly different on macOS and Windows computers. Commands for both systems are shown below.¹

On macOS Computers

~$ cd Documents/myPython/
myPython$

On Windows Computers

C:\Users\yourusername>cd \Users\yourusername\Documents\myPython
C:\Users\yourusername\Documents\myPython>

The primary difference between macOS and Windows systems is that Windows uses back-slashes '\' to separate directory names on the command line. Unix based operating systems like macOS use a forward-slash '/'.

Running Python

We can run Python two ways. We can either write a program (sometimes called a script) and then tell Python to execute it or we can run Python interactively and enter commands to be interpreted one at a time. Let's start by running an interactive Python session. To start an interactive session, type python at the system command line. You should see something like this:

~$ python
Python 3.5.1 |Anaconda 4.0.0 (x86_64)| (default, Dec  7 2015, 11:24:55) 
[GCC 4.2.1 (Apple Inc. build 5577)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>>

You can now type Python commands at the >>> prompt. As soon as you press return after entering a statement, Python executes the statement and displays the result. (If you get the message command not found when you try to run python it probably means that location of the python command isn't specified in your PATH variable. (For more information about setting your PATH see Tutorial Eight in the UNIX Tutorial for Beginners.)

Here is what you get when you use the Python print() command:

>>> print('Hello World!')
Hello World!
>>> x = 2*2
>>> print(x)
4
>>> x
4
>>>

The x = 2*2 command multiplies the two numbers and assigns the result to x. The * is the Python multiplication symbol; +, -, and / are the addition, subtraction, and division operators. Typing x by itself displays the value of x. Run Python and try typing in a few commands. Don't be afraid to experiment. What does the ** operator do? When you are done using Python you can exit it and move back to the system command line by holding the control key down and pressing d (^d) or typing exit().

Of course, the print() command isn't very useful here. It just prints out what you just typed, but you can also save commands in a text file for execution later. Try this by opening your favorite text editor and typing a few Python commands. For example, you could type:

print('Hello world!')
x = 2*2
print('2 * 2 is')
print(x)

Save your text file as hello.py in your Documents/myPython folder. The .py extension on the file name is the standard extension for all Python code. Run your program by typing python hello.py at the system command-line prompt. You have just written your first script. (Don't forget to use the Unix cd command to move to the directory where the program is stored before trying to run it.) I got the following output for the program above.

~$ python hello.py
Hello world!
2 * 2 is
4
~$

This is the standard way to run Python interactively and execute Python programs, but there is a better way using IPython.

What is IPython?

IPython is is essentially an enhanced interactive version of the Python command line. Run IPython by typing ipython on the system command line.

You can run your newly created Python program from inside IPython by typing run YourProgramName at the IPython prompt. For example:

~$ ipython
Python 3.5.4 |Anaconda custom (64-bit)| (default, Nov  3 2017, 13:44:35) 
Type 'copyright', 'credits' or 'license' for more information
IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: run hello
Hello world!
2 * 2 is
4

In [2]:

IPython has lots of useful features. One of the most useful features is tab completion. Tab completion means if you to type the first few characters of any command then hit tab, IPython will compete typing the command for you. If the letter's you've typed aren't complete, it will give you a list of alternatives.

In [2]: pr<tab>
print     property  

In [2]: pri<tab>nt('hello')
hello

IPython helps you edit commands as well. You can press the up-arrow and down-arrow keys to scroll through previous commands. The left-arrow and right-arrow keys allow you to move the cursor back and forth so you can edit the commands. Run IPython and try out these features. IPython's help command is great for learning how any Python command works. You'll learn more about using help on the Python Programming Essentials page. These features of IPython are so useful that we will use IPython for all the examples on this site. But, remember, IPython is just a more enhanced front end for Python.

Python as a scientific calculator

One handy way to use Python is as scientific calculator. Below are some examples of simple arithmetic. Try some operations of your own. Numbers can be expressed in scientific notation, for example 9.11 × 10^-31 = 9.11e-31.

In [1]: x = 1.5e-2

In [2]: y = 2.6

In [3]: x*y
Out[3]: 0.039

In [4]: x**y
Out[4]: 1.8106615259996067e-05

In [5]: x/y
Out[5]: 0.0057692307692307687

Python Libraries

The examples above are quite simple and you'd probably rather use your calculator, but Python has a huge community of developers who have created thousands of specialized libraries that you can import into your own programs. The most useful libraries for scientific computing are listed in the table below.

NumPy	NumPy is the fundamental library for scientific computing with Python. It contains code for dealing with numerical arrays, special functions, and linear algebra.
Matplotlib	Matplotlib is a 2D plotting library.
SciPy	SciPy is a library of specialized scientific computing tools including numerical integration, solving of ODEs, fourier transforms, and much more. Essentially all of its libraries require NumPy to also be loaded.

A Python library is simply a collection of Python modules. A module is a file containing Python definitions and statements. You'll learn more about modules on the Python Programming Essentials page.

You import a module with the import command. For example, import numpy will import the NumPy library and allow you to compute sin(x) by typing numpy.sin(x). Prepending the numpy. to every command becomes cumbersome. You will probably find it more convenient to use the alternative command import numpy as np to import the module. Importing in this way allows you to prepend np. to each command in the library rather than having to prepend numpy.. For example, to call sin(x) function you would type np.sin(x).

You can simplify the call even more by using from numpy import *. The '*' is the wildcard symbol * and imports all of the library's functions and definitions into your session. You could then call sin(x) as sin(x), but this can be dangerous if you are importing more than one library. This is because both of the libraries might have functions or constants with the same name, but different definitions. Two functions having the same name is called a name collision. You can avoid name collisions by using the import numpy as np syntax to import libraries. The only exception is when we use the PyLab library alone (see below). If only one library is imported it is safe to use the wildcard symbol.

The SciPy constants module

The scipy.constants module imports a variety of constants and conversion factors—the most useful ones are shown in the tables below. Go to the SciPy documentation to see a full list of available definitions.

Mathematical constants
`pi`	π
`golden`	Golden ratio

Physical constants
`c`	speed of light in vacuum
`mu_0`	the magnetic constant μ₀
`epsilon_0`	the electric constant (vacuum permittivity), ε₀
`h`	the Planck constant `h`
`hbar`	`h`/2π
`G`	Newtonian constant of gravitation
`g`	standard acceleration of gravity
`e`	elementary charge
`R`	molar gas constant
`alpha`	fine-structure constant
`N_A`	Avogadro constant
`k`	Boltzmann constant
`sigma`	Stefan-Boltzmann constant σ
`Wien`	Wien displacement law constant
`Rydberg`	Rydberg constant
`m_e`	electron mass
`m_p`	proton mass
`m_n`	neutron mass

Let's use the scipy.constants module to solve a simple problem. Suppose you need to calculate the magnitude of the force between an electron and a proton when they are 1 nm away from one another. The magnitude of the force between two charged particles is given by the Coulomb force law, $$F = \frac{1}{4 \pi \epsilon_0} \frac{q_1 q_2}{r^2}, $$ where in this case, r = 1 nm, and $q_1 = q_2 = e$. It is trivial to calculate this using Python and the SciPy Constants library. Import the scipy.constants module and then enter the following commands.

In [7]: import scipy.constants as sc

In [8]: r = 1e-9

In [10]: F = sc.e**2/(4 * sc.pi * sc.epsilon_0 * r**2)

In [11]: F
Out[11]: 2.3070775130706552e-10

The first command sets the distance r in meters. The second command computes the force. The permittivity of free space, ε₀, is sc.epsilon_0, sc.e is the fundamental electrical charge, and π is sc.pi. The scipy.constants module uses SI units unless otherwise noted in the documentation. You'll have to keep track of the units yourself.

The PyLab libarary

Using libraries allows you to take advantage of all the code other programmers have written. The scipy.constants module is an example. Someone had to type in all of those constants the first time, but they have saved you the effort. You can just import their work.

One very useful library that imports almost all of the modules you will need is called PyLab. PyLab actually imports modules from NumPy, Matplotlib, and SciPy. It simulates the computation environment of MATLAB. The PyLab library contains a large number of programs that allow you to do everything from make plots to solving differential equations. If you are using the PyLab library and you are sure there won't be any name collisions with other libraries it is generally safe to import PyLab by using the wildcard symbol. If you import PyLab in this way, you can run any PyLab program by simply typing the name of the program. As an example, lets create two arrays of numbers and then plot the result. (You don't need to type the comments after the # symbol.²)

In [1]: from pylab import *    # import the PyLab package

In [2]: x=linspace(-4.,4.,100) # create the x array

In [3]: y=x**2                 # create the y array

In [4]: plot(x,y)              # plot y versus x
Out[4]: [<matplotlib.lines.Line2D object at 0x37a11f0>]

The resulting plot should look like the one below. If the plot doesn't appear after you have issued the plot(x,y) command, type show() to make the plot visible.³

The first command imports all of the PyLab commands and constants. The next line creates an array of 100 equally-spaced numbers that range for -4 to 4 and assigns the array to the variable x. The y=x**2 command creates the y array by squaring each of the elements in the x array. The plot command plots the result in a separate window.

Exercise 1

Let's exercise your newfound knowledge. Write a program to create a plot of the magnitude of the force between two electrons for separation distances from 0.1 nm to 0.5 nm. Here are a few extra commands you might find handy for making a professional looking plot:

`title('plot title')`	puts a title above your plot
`xlabel('x-axis label')`	labels the x-axis
`ylabel('y-axis label')`	labels the y-axis

You can download a solution to this exercise from the downloads page.

Programs and functions

Before we move on to writing truly useful programs we have learn how to write a Python function. As far as we are concerned, a function is just code that takes some input as arguments and returns some output. An example is the sine function. The Python version of sine takes a number as its argument and returns the sine of that angle. (It assumes the argument is in radians not degrees.)

In [12]: from pylab import *

In [13]: y=sin(pi/4)

In [14]: print(y)
0.707106781187

Let's write a function that computes sin(x)/x. Use a text editor and type in the following code. Save the code in a file called funcmodule.py.

from pylab import * 
def myfunc(x):
    return sin(x)/x

The indentation in front of the return statement is essential. The indentation tells Python that the indented code is to be executed as part of the function. If you forget to indent, then Python thinks this function has no content. The best convention is to indent by four spaces. Most text editors will allow you to set a <tab> to be translated to four spaces.

In order to make your function available in Python you have to import your module (yes, you have just written your first Python module). Let's test the function.

In [1]: from funcmodule import myfunc

In [2]: x=3.5

In [3]: myfunc(x)
Out[3]: -0.10022377933989138

In [4]: sin(x)/x
Out[4]: -0.10022377933989138

It seems to work!

Exercise 2

Write a program using myfunc() to plot sin(x)/x from -5π to 5π. Don't forget to label the axes and title the plot.

You can download a solution to this exercise from the downloads page.

Putting it all together

Often scientific programming involves inputing data, processing the data, and displaying the result. Let's do a simple example to illustrate the process. Suppose we just landed on a distant planet and wanted to determine the planet's acceleration due to gravity. We decide to do this by dropping a ball from several different heights and measuring the time it takes for the ball to hit the ground. Let $d$ be the distance the ball falls. This distance is related to the free-fall time $t$ by the equation $$d = \frac{1}{2} g t^2,$$ where $g$ is the acceleration due to gravity. If we plot d versus t we should get second-degree polynomial. The factor in front of the $t^2$ term should be equal to $g/2$.

Suppose we drop the ball nine times and get the data shown in the table below. (The data is available in the file FreeFallData.txt on the downloads page.)

time (s)	distance (m)
1.727737	10.000000
2.518588	21.250000
3.114722	32.500000
3.613821	43.750000
4.051902	55.000000
4.447035	66.250000
4.809816	77.500000
5.147090	88.750000
5.463584	100.000000

Let's write a program to read the data into two arrays, plot it, and then determine $g$ by fitting a second-degree polynomial. Here's an outline of the program.

Outline of program findg.py

Read the data from the file
Plot the data and label the graph
Fit a polynomial to the data
Plot the fit on the same graph
Print out the value of g from the fit

An outline like the one above is a good first step in writing any program. Notice that the program below is broken in to sections with each section corresponding to a line in the outline. The program introduces some new commands that are described below.

"""
findg.py
Program to determine the accelaration due to gravity from distance 
versus time data
"""
from pylab import *

## Read the data from the file
t,d = loadtxt('FreeFallData.txt', unpack=True)

## Plot the data and label the graph
plot(t,d,'bo')
xlabel('time (s)')
ylabel('distance (m)')

## Fit a polynomial to the data
(a2,a1,a0) = polyfit(t,d,2)

## Plot the fit
d_plot = a2*t**2 + a1*t + a0
plot(t,d_plot,'r-')

## Show the graph
show()

## Print out the value of g from the fit 
g = 2.*a2
print("the acceleration due to gravity is:")
print(g)

The text in the first lines enclosed by the triple quotes (""") is called the docstring. You can type anything you want in the docstring. It's just used to document the code and isn't executed by the Python interpreter. After you've importing your modules, typing help('program_name') at the Python prompt will print the docstring as well as some other information. The line after the docstring is the import command for loading the PyLab package. The next section consists of a single line that calls the loadtxt() function. (Remember any line that starts with a '#' is just a comment and isn't executed.) The function loadtxt() reads the file named FreeFallData.txt and creates two arrays, t and d. The t array contains the time data that was stored in the first column of FreeFallData.txt, the d array contains the distance data from the second column. The unpack=True argument is needed so that loadtxt() puts the columns of data into the arrays properly. You can check to see if the arrays were loaded properly by using the len(). The command len(t) will return the number of elements in the t array. If it returns 0 you will know loadtxt() didn't load the array correctly.

The next section of code contains the plotting commands. The first two arguments of the plot(x,y,'bo') command are just the arrays to be plotted on the x and y axes respectively. The third argument 'bo' specifies the format for the plot. In this case 'bo' specifies the points on the plot should be displayed as blue circles. The format argument 'r-' would make a solid red line from point to point. You can learn more about plotting on on the matplotlib plot documentation page.

The function polyfit(x,y,deg) fits the data to a polynomial of degree deg. In this case, polyfit() finds the values a₂, a₁, and a₀ so that the function y(x) = a₂ x² + a₁ x + a₀ gives the best fit to the data. The two lines of code after the call to polyfit() plot the fit as a solid red line on the same graph. The show() command causes the plots to appear. You need this command because, by default, the plotting commands don't draw anything until show() function is called.⁴

The value of the acceleration due to gravity g = 2a₂. The last section of the code prints the result for g.

Download the code and the data file from the downloads page. Make sure you can run the code. Once you are able to run the program and you understand how it works, try Exercise 3.

Exercise 3

The goal of this exercise is to verify Kepler's third law using the orbital data of the Galilean moons of Jupiter. Kepler's third law is $$P^2 = \frac{4 \pi^2}{G (M+m)}a^3, $$ where P is the period, a is the semimajor axis, M is the mass of Jupiter in this case and m is the mass of the moon (m << M). Its not too hard to show that this equation can be rewritten as $$\log(P) = \log\left(\frac{2 \pi}{\sqrt{GM}}\right) + \frac{3}{2}\log(a). $$ If we plot log(P) versus log(a) for the four Galilean moons we should get a straight line with a slope of 3/2. The intercept will allow us to calculate the mass of Jupiter. The table below gives the semimajor axes and periods for the Galilean moons. Verify that the slope of a log-log plot of these data is 3/2 and determine the mass of Jupiter. (The Numpy function log10() calculates the log base 10. The sqrt() function calculates the square root.)

Moon	Io	Europa	Ganymede	Callisto
Period (days)	1.769	3.552	7.155	16.689
Semimajor axis (m)	421.6×10⁶	670.9×10⁶	1070.4×10⁶	1882.7×10⁶

Command	Module	Description
`python`	NA	Command entered at the system command line to start Python
`ipython`	NA	Command entered at the system command line to start IPython
`print('some text')`	built-in	Python print command
* + = /	built-in	Multiplication, addition, subtraction, and division
**	built-in	`x**y` is `x` to the power `y`
`exit()`	built-in	Exits Python
`from modulename import *`	built-in	Imports all of the code in `modulename` into an interactive Python session
`import modulename as mn`	built-in	Imports all of the code in `modulename` into an interactive Python session. Attributes, functions, and methods in `modulename` are referenced using `mn.` notation. For example to reference the function `func()`, in your code use `mn.func()`.
`linspace(start,stop,num=n)`	`pylab (numpy)`	Creates an evenly spaced array of `n` elements from `start` to `stop`
`x,y,z=loadtxt(filename,usecols=(0,2,5),unpack=True)`	`pylab (numpy)`	Loads data from a text file. `filename` is the name of the text file. `usecols` specifies which columns to read, with 0 being the first. `unpack=True` transposes the returned array, so that the data may be unpacked using `x,y,z = loadtxt(...)`
`len(x)`	built-in	Returns the length or number of elements in the array `x`.
`plot(x,y,'bo')`	`pylab (matplotlib.pyplot)`	Plot `x` and `y` using blue circle markers (`bo`)
`xlabel('axis label')`	`pylab (matplotlib.pyplot)`	Add a label to the x axis of a plot
`ylabel('axis label')`	`pylab (matplotlib.pyplot)`	Add a label to the y axis of a plot
`title('plot title')`	`pylab (matplotlib.pyplot)`	Add a title to a plot
`show()`	`pylab (matplotlib.pyplot)`	Displays all plots in figure windows in non-interactive mode.
`fitpar=polyfit(x,y,deg)`	`pylab (numpy)`	Fit a polynomial of degree `deg` (i.e. if `deg = 2`, `y`(`x`) = `a`₂ `x`² + `a`₁ `x` + `a`₀, then `fitpar` = `a`₂, `a`₁, `a`₀)
`sin(x)`	`pylab (numpy)`	Returns the sine of `x` (`x` in radians) `x` can be a single number or an array
`log10(x)`	`pylab (numpy)`	Returns the base 10 log of `x` `x` can be a single number or an array
`sqrt(x)`	`pylab (numpy)`	Returns the square root of `x` `x` can be a single number or an array

^{1. [The command line prompt might look a little different on your computer, but it almost always ends with a $ on a macOS computer and a > on a Windows computer.]↩}

^{2. [The text after a # is treated as a comment by Python and not executed. The comments here are included to explain what each command does. You don't need to type them in when you try the example.]↩}

^{3. [If you start Ipython with the command ipython --matplotlib, you won't need to type the show() command and the plot will appear in a separate window.]↩}

^{4. [If you are entering commands interactively in an Ipython window started with the ipython --matplotlib command, Ipython is smart enough to plot things as you enter the commands. For example, if you call the plot() function, a figure window appears and presents the plot. If you then use the xlabel() to label the x-axis the label appears on the plot. You never need to use the show() command while working interactively in Ipython if you started it with the command ipython --matplotlib.]↩}