Tutorial Numpy


Hit shift+enter in each cell to execute the Python code therein.

Line 1 imports Numpy as np, which is the recommended convention. The next line shows the Numpy version. For instance, mine is 1.18.5.

import numpy as np # recommended convention
np.__version__
'1.18.5'

The next line creates an array of 32-bit floating point numbers. The itemsize property shows the number of bytes per item.

x = np.array([1, 1, 1], dtype=np.float32)
print(x)
[1. 1. 1.]
print(x.itemsize) # number of bytes per item
4
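As a small illustration of how itemsize relates to the data type, the following sketch compares a 32-bit and a 64-bit array. The variable names x32 and x64 are just for this example; nbytes is the total size of the data, which is the number of elements times itemsize.

```python
import numpy as np

x32 = np.ones(3, dtype=np.float32)  # 32 bits = 4 bytes per item
x64 = np.ones(3, dtype=np.float64)  # 64 bits = 8 bytes per item

# nbytes is the total size of the data: number of items times itemsize
print(x32.itemsize, x32.nbytes)
print(x64.itemsize, x64.nbytes)
```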

This computes the sine of the input array of all ones, using Numpy’s unary function, np.sin. There is another sine function in Python’s standard math module, but it only accepts one number at a time, so applying it to an array requires explicit looping (i.e. using a for loop) over each of the elements. The Numpy version is faster because that looping happens inside the np.sin function itself.

print(np.sin(np.array([1, 1, 1], dtype=np.float32)))
[0.84147096 0.84147096 0.84147096]
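To make the comparison concrete, here is a minimal sketch that computes the same sines both ways. The names values, looped, and vectorized are made up for this example, and no timing is claimed here; the point is only that both approaches give the same numbers while np.sin needs no visible loop.

```python
import math
import numpy as np

values = np.linspace(0.0, 1.0, 5)

# Explicit Python loop: math.sin handles one number per call
looped = np.array([math.sin(v) for v in values])

# One call: the loop over the elements happens inside np.sin
vectorized = np.sin(values)

print(np.allclose(looped, vectorized))
```

In a notebook you could prefix each version with the %timeit magic to measure the difference on a larger array.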

Numpy arrays can have different shapes and number of dimensions.

x = np.array([[1,2,3], [4,5,6]])
print(x.shape)
(2, 3)
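The sketch below, with made-up variable names v, m, and t, shows the same six numbers arranged in one, two, and three dimensions using reshape. The ndim property gives the number of dimensions and size gives the total number of elements.

```python
import numpy as np

v = np.arange(6)        # one-dimensional, shape (6,)
m = v.reshape(2, 3)     # two-dimensional, shape (2, 3)
t = v.reshape(1, 2, 3)  # three-dimensional, shape (1, 2, 3)

for arr in (v, m, t):
    print(arr.ndim, arr.shape, arr.size)
```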

Numpy slicing rules extend Python’s natural slicing syntax. Note the colon : character selects all elements in the corresponding row or column.

x=np.array([ [1,2,3],[4,5,6] ])
print(x[:,0]) # 0th column
[1 4]
print(x[:,1]) # 1st column
[2 5]
print(x[0,:]) # 0th row
[1 2 3]
print(x[1,:]) # 1st row
[4 5 6]

Numpy slicing can select sections of an array as shown.

x = np.array([[1,2,3], [4,5,6]])
print(x)
[[1 2 3]
 [4 5 6]]
print(x[:,1:]) # all rows, 1st thru last column
[[2 3]
 [5 6]]
print(x[:,::2]) # all rows, every other column
[[1 3]
 [4 6]]

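In contrast with MATLAB, Numpy uses pass-by-reference semantics, so slicing creates views into the existing array without implicit copying. This is particularly helpful with very large arrays because copying can be slow. Indexing with a list of integers is the exception: it always returns a copy, as the following example shows.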

x = np.ones((3,3))
print(x)
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
print(x[:,[0,1,2,2]]) # notice duplicated last column
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]
y = x[:,[0,1,2,2]] # same as above, but assign it
x[0,0] = 999     # change element in x
print(x)         # changed 
[[999.   1.   1.]
 [  1.   1.   1.]
 [  1.   1.   1.]]

Because we made a copy, changing the individual elements of x does not affect y.

print(y)                         # not changed!
[[1. 1. 1. 1.]
 [1. 1. 1. 1.]
 [1. 1. 1. 1.]]
x = np.ones((3,3))
y = x[:2,:2] # upper left piece
x[0,0] = 999 # change value
print(x) 
[[999.   1.   1.]
 [  1.   1.   1.]
 [  1.   1.   1.]]

As a consequence of the pass-by-reference semantics, Numpy views point at the same memory as their parents, so changing an element in x updates the corresponding element in y. This is because a view is just a window into the same memory.

print(y)
[[999.   1.]
 [  1.   1.]]
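If you are unsure whether two arrays overlap in memory, np.shares_memory can check it directly. This is a minimal sketch; the names view and copy are chosen just for this example.

```python
import numpy as np

x = np.ones((3, 3))
view = x[:2, :2]           # slicing gives a view
copy = x[:, [0, 1, 2, 2]]  # indexing by integer list gives a copy

print(np.shares_memory(x, view))
print(np.shares_memory(x, copy))

view[1, 1] = -1            # writing through the view also changes x
print(x[1, 1])
```

Note that the sharing works in both directions: changing the view changes the parent, just as changing the parent changes the view.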

Indexing with a list of integers creates a copy, as we saw before. In the following, y is a copy, not a view, because it was created using indexing, whereas z was created using slicing. Thus, even though y and z start with the same entries, only z is affected by changes to x.

x = np.arange(5) # create array
print(x)
y = x[[0,1,2]] # index by integer list
print(y)
[0 1 2 3 4]
[0 1 2]
z = x[:3]      # slice
print(z)       # note y and z have the same entries
x[0] = 999     # change element of x
print(x)
[0 1 2]
[999   1   2   3   4]
print(y)            # note y is unaffected,
[0 1 2]
print(z)            # but z is (it's a view).
[999   1   2]

Numpy arrays have a built-in flags.owndata property that can help keep track of views until you get the hang of them.

print(x.flags.owndata)
True
print(y.flags.owndata)
True
print(z.flags.owndata)   # as a view, z does not own the data!
False
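Besides flags.owndata, a view also keeps a reference to its parent in the base attribute. The following sketch repeats the arrays from above and adds a hypothetical array w to show that calling copy() on a view gives an independent array that owns its data.

```python
import numpy as np

x = np.arange(5)
z = x[:3]            # slice, so z is a view of x

print(z.base is x)   # the view remembers its parent

w = z.copy()         # explicit copy of the view
print(w.flags.owndata)

x[0] = 999
print(z[0], w[0])    # z follows x, w does not
```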

With Numpy arrays, the * operator performs elementwise multiplication, not row-column (matrix) multiplication. For matrix multiplication, use the @ operator instead.

import numpy as np
A = np.array([[1,2,3],[4,5,6],[7,8,9]])
x = np.array([[1],[0],[0]])
print(A@x)        # matrix multiplication
[[1]
 [4]
 [7]]
A = np.ones((3,3))
print(type(A))    # array not matrix
x = np.ones((3,1)) # array not matrix
print(A*x)        # not row-column multiplication!
<class 'numpy.ndarray'>
[[1. 1. 1.]
 [1. 1. 1.]
 [1. 1. 1.]]
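Because the arrays of ones above hide the difference, here is a sketch that applies both operators to the same non-trivial inputs. The names elementwise and matrix are made up for this example.

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
x = np.array([[1], [0], [0]])

# *: x has shape (3, 1), so row i of A is multiplied by x[i]
elementwise = A * x
print(elementwise)

# @: row-column multiplication, result has shape (3, 1)
matrix = A @ x
print(matrix)
```

The elementwise product keeps the (3, 3) shape of A, whereas the matrix product picks out the first column of A as a (3, 1) array.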

Numpy Broadcasting

X, Y = np.meshgrid(np.arange(2), np.arange(2)) # meshgrid creates 2-dimensional grids
print(X)
[[0 1]
 [0 1]]
print(Y)
[[0 0]
 [1 1]]

Because the two arrays have compatible shapes, they can be added together element-wise.

print(X+Y)
[[0 1]
 [1 2]]
x = np.array([0, 1])
y = np.array([0, 1])
print(x)
[0 1]
print(y)
[0 1]

Using Numpy broadcasting, we can skip creating compatible arrays using meshgrid and instead accomplish the same thing automatically by using the None singleton to inject an additional compatible dimension.

print(x + y[:,None]) # add broadcast dimension
[[0 1]
 [1 2]]
print(X+Y)
[[0 1]
 [1 2]]
x = np.array([0, 1])
y = np.array([0, 1, 2])
X, Y = np.meshgrid(x, y)
print(X)
[[0 1]
 [0 1]
 [0 1]]
print(Y)
[[0 0]
 [1 1]
 [2 2]]
print(X+Y)
[[0 1]
 [1 2]
 [2 3]]
print(x+y[:, None]) # same as w/ meshgrid
[[0 1]
 [1 2]
 [2 3]]

In this example, the array shapes are different, so the addition of x and y is not possible without Numpy broadcasting. The last line shows that broadcasting generates the same output as using the compatible array generated by meshgrid.
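To see why this works, it helps to look at the shapes involved. The sketch below uses a hypothetical name col for the reshaped y, and np.broadcast to report the shape that Numpy would produce without actually doing the addition.

```python
import numpy as np

x = np.array([0, 1])     # shape (2,)
y = np.array([0, 1, 2])  # shape (3,)

col = y[:, None]         # None adds a dimension of length 1: shape (3, 1)
print(col.shape)

# (2,) against (3, 1) broadcasts to (3, 2)
print(np.broadcast(x, col).shape)

# Same result as building the full grids with meshgrid
X, Y = np.meshgrid(x, y)
print(np.array_equal(x + col, X + Y))
```

You may also see np.newaxis in other code; it is simply another name for None.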

Numpy broadcasting also works in multiple dimensions. We start here with three one-dimensional arrays and create a three-dimensional output using broadcasting. The x+y[:, None] part creates a conforming two-dimensional array as before, and due to the left-to-right evaluation order, this two-dimensional intermediate product is broadcast against the z variable, whose two None dimensions create a three-dimensional output array.

x = np.array([0, 1])
y = np.array([0, 1, 2])
z = np.array([0, 1, 2, 3])
print(x+y[:, None]+z[:, None, None])
[[[0 1]
  [1 2]
  [2 3]]

 [[1 2]
  [2 3]
  [3 4]]

 [[2 3]
  [3 4]
  [4 5]]

 [[3 4]
  [4 5]
  [5 6]]]
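The same computation can be split into its two steps to follow the shapes. This is only a sketch; step1 and step2 are names invented for this example.

```python
import numpy as np

x = np.array([0, 1])        # shape (2,)
y = np.array([0, 1, 2])     # shape (3,)
z = np.array([0, 1, 2, 3])  # shape (4,)

step1 = x + y[:, None]            # (2,) with (3, 1) gives (3, 2)
step2 = step1 + z[:, None, None]  # (3, 2) with (4, 1, 1) gives (4, 3, 2)
print(step1.shape, step2.shape)

# Each entry is the sum of one element from each input:
# step2[k, j, i] == z[k] + y[j] + x[i]
print(step2[3, 2, 1], z[3] + y[2] + x[1])
```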

Matplotlib

The first line imports the Matplotlib module following the recommended naming convention. The next line plots a range of numbers. The last line, plt.show(), forces the plot to render; it is commented out here because this is not necessary in the IPython Notebook.

import matplotlib.pyplot as plt
plt.plot(range(10))
#plt.show() 
[<matplotlib.lines.Line2D at 0x7f529be39cc0>]
[Figure: line plot of range(10)]
z = np.linspace(-np.pi, np.pi, 201)      # create 201 points between -pi to pi
plt.plot(z, np.sin(z))
[<matplotlib.lines.Line2D at 0x7f5299dd8400>]
[Figure: plot of np.sin(z) for z between -pi and pi]
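Outside the notebook, a plot is often written to an image instead of shown on screen. The following is a minimal sketch of the sine plot under that assumption: the "Agg" backend line is only there so that the script runs without a display (leave it out in a notebook), the axis labels and title are my own additions, and the image is written to an in-memory buffer rather than to a file.

```python
import io

import matplotlib
matplotlib.use("Agg")  # assumption: running as a script without a display
import matplotlib.pyplot as plt
import numpy as np

z = np.linspace(-np.pi, np.pi, 201)  # 201 points between -pi and pi

fig, ax = plt.subplots()
ax.plot(z, np.sin(z))
ax.set_xlabel("z")
ax.set_ylabel("sin(z)")
ax.set_title("Sine on [-pi, pi]")

buf = io.BytesIO()
fig.savefig(buf, format="png")       # render the figure as PNG bytes
print(buf.getbuffer().nbytes > 0)
```

To write a file instead, pass a filename of your choice to fig.savefig.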