NumPy Basics

Unidata Python Workshop


Questions

  1. What are arrays?
  2. How can arrays be manipulated effectively in Python?

Objectives

  1. Create an array of ‘data’.
  2. Perform basic calculations on this data using Python math functions.
  3. Slice and index the array.

NumPy is the fundamental package for scientific computing with Python. It contains, among other things:

  • a powerful N-dimensional array object
  • sophisticated (broadcasting) functions
  • useful linear algebra, Fourier transform, and random number capabilities

The NumPy array object is the common interface for working with typed arrays of data across a wide variety of scientific Python packages. NumPy also features a C-API, which enables interfacing existing Fortran/C/C++ libraries with Python and NumPy.

Create an array of 'data'

The NumPy array represents a contiguous block of memory, holding entries of a given type (and hence fixed size). The entries are laid out in memory according to the shape, or list of dimension sizes.

In [1]:
# Convention for import to get shortened namespace
import numpy as np
In [2]:
# Create a simple array from a list of integers
a = np.array([1, 2, 3])
a
Out[2]:
array([1, 2, 3])
In [3]:
# See how many dimensions the array has
a.ndim
Out[3]:
1
In [4]:
# Print out the shape attribute
a.shape
Out[4]:
(3,)
In [5]:
# Print out the data type attribute
a.dtype
Out[5]:
dtype('int64')
In [6]:
# This time use a nested list of floats
a = np.array([[1., 2., 3., 4., 5.]])
a
Out[6]:
array([[1., 2., 3., 4., 5.]])
In [7]:
# See how many dimensions the array has
a.ndim
Out[7]:
2
In [8]:
# Print out the shape attribute
a.shape
Out[8]:
(1, 5)
In [9]:
# Print out the data type attribute
a.dtype
Out[9]:
dtype('float64')
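
As a small sketch of what "a contiguous block of memory, holding entries of a given type" means in practice, the attributes below report how much memory each entry and the whole array occupy. The array name arr and its values are made up for illustration.

```python
import numpy as np

# A 2-D array of 64-bit floats: 2 rows, 3 columns (illustrative values)
arr = np.array([[1., 2., 3.], [4., 5., 6.]])

print(arr.dtype)     # type shared by every entry
print(arr.itemsize)  # bytes used by one entry
print(arr.size)      # total number of entries (product of the shape)
print(arr.nbytes)    # total bytes: size * itemsize

# Every entry has the same type, so mixing ints and floats
# converts the whole array to floats.
mixed = np.array([1, 2.5])
print(mixed.dtype)
```

Because every entry has the same fixed size, NumPy can locate any element from the shape alone, which is part of why array operations are fast.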

Poll

Please go to http://www.PollEv.com/johnleeman205 to take a quick poll.

NumPy also provides helper functions for generating arrays of regularly spaced data, which saves typing the values by hand.

  • arange(start, stop, step) creates values in the half-open interval [start, stop), spaced step apart.
  • linspace(start, stop, num) creates num evenly spaced values over the closed interval [start, stop].

arange

In [10]:
a = np.arange(5)
print(a)
[0 1 2 3 4]
In [11]:
a = np.arange(3, 11)
print(a)
[ 3  4  5  6  7  8  9 10]
In [12]:
a = np.arange(1, 10, 2)
print(a)
[1 3 5 7 9]

linspace

In [13]:
b = np.linspace(5, 15, 5)
print(b)
[ 5.   7.5 10.  12.5 15. ]
In [14]:
b = np.linspace(2.5, 10.25, 11)
print(b)
[ 2.5    3.275  4.05   4.825  5.6    6.375  7.15   7.925  8.7    9.475
 10.25 ]
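
To make the difference between the two helpers concrete, here is a short side-by-side sketch. The variable names are illustrative; endpoint is an existing keyword argument of np.linspace.

```python
import numpy as np

# arange: you choose the step; the stop value is excluded
by_step = np.arange(0, 1, 0.25)

# linspace: you choose the number of points; the stop value is included
by_count = np.linspace(0, 1, 5)

# endpoint=False makes linspace exclude the stop value as well
half_open = np.linspace(0, 1, 4, endpoint=False)

print(by_step)
print(by_count)
print(half_open)
```

A reasonable rule of thumb: reach for arange when the spacing matters, and for linspace when the number of points (or hitting the end value exactly) matters.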

Perform basic calculations with Python

Basic math

In core Python, that is without NumPy, creating sequences of values and adding them together requires writing a lot of manual loops, just like one would do in C/C++:

In [15]:
a = range(5, 10)
b = [3 + i * 1.5/4 for i in range(5)]
In [16]:
result = []
for x, y in zip(a, b):
    result.append(x + y)
print(result)
[8.0, 9.375, 10.75, 12.125, 13.5]

That is very verbose and not very intuitive. Using NumPy this becomes:

In [17]:
a = np.arange(5, 10)
b = np.linspace(3, 4.5, 5)
In [18]:
a + b
Out[18]:
array([ 8.   ,  9.375, 10.75 , 12.125, 13.5  ])

The four basic arithmetic operations (+, -, *, /) all work the same way: each performs an element-by-element calculation on the two arrays. The arrays must have the same shape, or shapes that are compatible under NumPy's broadcasting rules.

In [19]:
a * b
Out[19]:
array([15.  , 20.25, 26.25, 33.  , 40.5 ])
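
The sketch below rounds out the element-wise operations using the same a and b as above. It also shows the simplest case of broadcasting (a scalar applied to every element) and what happens when two arrays have incompatible shapes. The array c is made up for illustration.

```python
import numpy as np

a = np.arange(5, 10)          # shape (5,)
b = np.linspace(3, 4.5, 5)    # shape (5,)

print(a - b)   # element-by-element subtraction
print(a / b)   # element-by-element division
print(a * 2)   # the scalar is applied to every element

# Arrays with incompatible shapes cannot be combined
c = np.arange(3)              # shape (3,)
try:
    a + c
except ValueError as err:
    print("Shapes do not match:", err)
```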

Constants

NumPy provides us access to some useful constants as well. Remember that you should never type these in manually! Other libraries, such as SciPy and MetPy, have their own sets of more domain-specific constants.

In [20]:
np.pi
Out[20]:
3.141592653589793
In [21]:
np.e
Out[21]:
2.718281828459045
In [22]:
# This makes working with radians effortless!
t = np.arange(0, 2 * np.pi + np.pi / 4, np.pi / 4)
t
Out[22]:
array([0.        , 0.78539816, 1.57079633, 2.35619449, 3.14159265,
       3.92699082, 4.71238898, 5.49778714, 6.28318531])

Array math functions

NumPy also has math functions that operate on whole arrays. Like the arithmetic operators, they greatly simplify and speed up calculations. Be sure to check out the listing of mathematical functions in the NumPy documentation.

In [23]:
# Calculate the sine function
sin_t = np.sin(t)
print(sin_t)
[ 0.00000000e+00  7.07106781e-01  1.00000000e+00  7.07106781e-01
  1.22464680e-16 -7.07106781e-01 -1.00000000e+00 -7.07106781e-01
 -2.44929360e-16]
In [24]:
# Round to three decimal places
print(np.round(sin_t, 3))
[ 0.     0.707  1.     0.707  0.    -0.707 -1.    -0.707 -0.   ]
In [25]:
# Calculate the cosine function
cos_t = np.cos(t)
print(cos_t)
[ 1.00000000e+00  7.07106781e-01  6.12323400e-17 -7.07106781e-01
 -1.00000000e+00 -7.07106781e-01 -1.83697020e-16  7.07106781e-01
  1.00000000e+00]
In [26]:
# Convert radians to degrees
degrees = np.rad2deg(t)
print(degrees)
[  0.  45.  90. 135. 180. 225. 270. 315. 360.]
In [27]:
# Integrate the sine function with the trapezoidal rule
# (np.trapz is deprecated as of NumPy 2.0; newer code should use np.trapezoid)
sine_integral = np.trapz(sin_t, t)
print(np.round(sine_integral, 3))
-0.0
In [28]:
# Sum the values of the cosine
cos_sum = np.sum(cos_t)
print(cos_sum)
0.9999999999999996
In [29]:
# Calculate the cumulative sum of the cosine
cos_csum = np.cumsum(cos_t)
print(cos_csum)
[ 1.00000000e+00  1.70710678e+00  1.70710678e+00  1.00000000e+00
  0.00000000e+00 -7.07106781e-01 -7.07106781e-01 -5.55111512e-16
  1.00000000e+00]
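
To see roughly what a trapezoidal-rule integration computes, here is a minimal sketch that builds the sum by hand. The sample grid (101 points over [0, π]) is an assumption chosen for illustration, and the expressions t[1:] and t[:-1] use the slicing syntax covered in the next section.

```python
import numpy as np

# Sample the sine function over half a period (illustrative grid)
t = np.linspace(0, np.pi, 101)
y = np.sin(t)

# Each trapezoid: width between neighboring samples
# times the average of the two neighboring heights
widths = t[1:] - t[:-1]
heights = (y[1:] + y[:-1]) / 2
integral = np.sum(widths * heights)

print(integral)  # close to the exact value, 2
```

Integrating over only half a period gives a nonzero answer, which makes it easier to check the result than the full-period integral above, where the positive and negative halves cancel.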

Index and slice arrays

Indexing is how we pull individual data items out of an array. Slicing extends this process to pulling out a regular set of the items.

In [30]:
# Create an array for testing
a = np.arange(12).reshape(3, 4)
In [31]:
a
Out[31]:
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Indexing in Python is 0-based, so the command below looks for the 2nd item along the first dimension (row) and the 3rd along the second dimension (column).

In [32]:
a[1, 2]
Out[32]:
6

We can also index on just one dimension, which returns the whole row:

In [33]:
a[2]
Out[33]:
array([ 8,  9, 10, 11])

Negative indices are also allowed, which permit indexing relative to the end of the array.

In [34]:
a[0, -1]
Out[34]:
3
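
As a quick sketch of how negative indices relate to positive ones, the example below picks out the same element both ways, and shows that an index past the end of a dimension raises an error rather than wrapping around.

```python
import numpy as np

a = np.arange(12).reshape(3, 4)

# Negative indices count back from the end of each dimension
last = a[-1, -1]
same = a[a.shape[0] - 1, a.shape[1] - 1]
print(last, same)

# An index beyond the size of a dimension raises an IndexError
try:
    a[3, 0]
except IndexError as err:
    print("Out of bounds:", err)
```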

Slicing syntax is written as start:stop[:step], where all numbers are optional.

  • defaults:
    • start = 0
    • stop = len(dim)
    • step = 1
  • The second colon is also optional if no step is used.

Note that stop is one past the last item selected; one can think of the slice as the half-open interval [start, stop).

In [35]:
# Get the 2nd and 3rd rows
a[1:3]
Out[35]:
array([[ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])
In [36]:
# All rows and 3rd column
a[:, 2]
Out[36]:
array([ 2,  6, 10])
In [37]:
# ... can be used to replace one or more full slices
a[..., 2]
Out[37]:
array([ 2,  6, 10])
In [38]:
# Slice every other row
a[::2]
Out[38]:
array([[ 0,  1,  2,  3],
       [ 8,  9, 10, 11]])
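
The slicing defaults are perhaps easiest to see on a one-dimensional array. This is a small sketch with a made-up array x; the comments state which part of start:stop:step each line leaves out.

```python
import numpy as np

x = np.arange(10)

print(x[2:7])    # start and stop given; stop is excluded
print(x[:4])     # start omitted, defaults to 0
print(x[6:])     # stop omitted, defaults to the length of the dimension
print(x[1:8:3])  # every third item from index 1 up to (not including) 8
print(x[-3:])    # negative start: the last three items
```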

EXERCISE:
  • The code below calculates a two-point average using a Python list and loop. Convert it to obtain the same results using NumPy slicing.
  • Bonus points: Can you extend the NumPy version to do a 3 point (running) average?
In [39]:
data = [1, 3, 5, 7, 9, 11]
out = []

# Look carefully at the loop. Think carefully about the sequence of values
# that data[i] takes--is there some way to get those values as a numpy slice?
# What about for data[i + 1]?
for i in range(len(data) - 1):
    out.append((data[i] + data[i + 1]) / 2)

print(out)
[2.0, 4.0, 6.0, 8.0, 10.0]
In [40]:
# YOUR CODE GOES HERE
SOLUTION
In [41]:
# %load solutions/slice.py

# Cell content replaced by load magic replacement.
data = np.array([1, 3, 5, 7, 9, 11])
out = (data[:-1] + data[1:]) / 2
print(out)
[ 2.  4.  6.  8. 10.]
In [42]:
# YOUR BONUS CODE GOES HERE
SOLUTION
In [43]:
# %load solutions/slice_bonus.py

# Cell content replaced by load magic replacement.
data = np.array([1, 3, 5, 7, 9, 11])
out = (data[2:] + data[1:-1] + data[:-2]) / 3
print(out)
[3. 5. 7. 9.]
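
The two solutions above follow the same pattern, so one possible generalization is an n-point running average. The helper running_mean below is hypothetical (it is not part of NumPy or of the workshop solutions); it simply adds up n shifted slices. np.convolve is an existing NumPy function that gives the same result for this data.

```python
import numpy as np

def running_mean(values, n):
    """Hypothetical helper: n-point running average built from n shifted slices."""
    values = np.asarray(values, dtype=float)
    count = len(values) - n + 1  # number of complete windows
    if n < 1 or count < 1:
        raise ValueError("n must be between 1 and len(values)")
    total = np.zeros(count)
    for offset in range(n):
        # Slice starting at each offset within the window
        total += values[offset:offset + count]
    return total / n

data = np.array([1, 3, 5, 7, 9, 11])
print(running_mean(data, 2))
print(running_mean(data, 3))

# Same 3-point average using convolution with equal weights
print(np.convolve(data, np.ones(3) / 3, mode="valid"))
```

The loop here runs over the window size n, not over the data, so the per-element work is still done by NumPy.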
EXERCISE:
  • Given the array of data below, calculate the total of each of the columns (i.e., add the three rows together):
In [44]:
data = np.arange(12).reshape(3, 4)

# YOUR CODE GOES HERE
# total = ?
SOLUTION
In [45]:
# %load solutions/sum_row.py

# Cell content replaced by load magic replacement.
print(data[0] + data[1] + data[2])

# Or we can use numpy's sum and use the "axis" argument
print(np.sum(data, axis=0))
[12 15 18 21]
[12 15 18 21]
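
The axis argument can be confusing at first, so here is a short sketch contrasting the options on the same data. The axis you name is the one that gets collapsed by the sum.

```python
import numpy as np

data = np.arange(12).reshape(3, 4)

col_totals = np.sum(data, axis=0)  # collapse the rows: one total per column
row_totals = np.sum(data, axis=1)  # collapse the columns: one total per row
grand_total = np.sum(data)         # no axis: total of every element

print(col_totals)
print(row_totals)
print(grand_total)

# Arrays also offer sum as a method, which is equivalent
print(data.sum(axis=0))
```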

Resources

The goal of this tutorial is to provide an overview of the use of the NumPy library. It tries to hit all of the important parts, but it is by no means comprehensive. For more information, try looking at the:
