NumPy is a high-performance package for numerical computing in Python. It works well with vector and matrix operations. In this brief post we look at the origins of NumPy, ndarrays and their benefits, use cases and limitations, as well as some useful NumPy functions.

Overview and Historical Context

NumPy has its roots in Travis Oliphant’s PhD thesis circa 1998 (Oliphant, 2006). NumPy is a core part of the SciPy ecosystem, and the SciPy library itself is built on top of it. According to Guide to NumPy, he wrote code in C and then wrapped it in Python as the need for each piece of functionality arose.

NumPy is typically used to perform numerical computations with vectors and matrices. The star data structure is its ndarray. In particular, it shines in comparison to Python lists where large volumes of data are being operated on.


The speed and efficiency of NumPy arrays can be attributed to two things: the way the arrays are laid out and typed in memory, and broadcasting and vectorization, which replace explicit for loops.

Memory Usage and Structure

A major reason that NumPy is so fast is that its Python wrapper functions call compiled C code. Unlike Python lists, ndarrays have the following characteristics:

1. ndarrays are held contiguously in memory. This means that iterating over the array is faster, because the data sits in one block instead of in scattered locations that have to be reached through the pointers a Python list stores.

2. ndarrays hold a single data type. Since every element in the array has the same data type, no per-element type checking is needed. This is in contrast to Python lists, which can contain different data types and so have to keep type information for every element, which also increases memory usage.

3. ndarrays have data types that can be set to take up less memory. If the full range of the default type will not be used, a smaller type can be chosen instead. For example, the usual default of int64 can be changed to int8 so each element in the array takes 1 byte instead of 8 bytes. This can be done by passing a dtype parameter when instantiating the NumPy array: int8_arr = np.array([1,2,3], dtype=np.int8).
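To make the three points above concrete, here is a minimal sketch that inspects an array's element size, total size and memory layout. The variable names are my own, and the dtypes are set explicitly because the default integer type can differ between platforms.

```python
import numpy as np

values = [1, 2, 3]

# Set the dtypes explicitly so the result does not depend on the platform default.
int64_arr = np.array(values, dtype=np.int64)
int8_arr = np.array(values, dtype=np.int8)

# itemsize is the number of bytes per element, nbytes the total for the data buffer.
print(int64_arr.itemsize, int64_arr.nbytes)  # 8 bytes per element, 24 in total
print(int8_arr.itemsize, int8_arr.nbytes)    # 1 byte per element, 3 in total

# A freshly created array is stored as one contiguous block.
print(int8_arr.flags["C_CONTIGUOUS"])        # True

# The trade-off: int8 can only hold values from -128 to 127.
print(np.iinfo(np.int8).min, np.iinfo(np.int8).max)
```

The smaller type only pays off if every value fits in its range, so it is worth checking the limits with np.iinfo before converting.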

No more Loops: Vectorization and Broadcasting


With broadcasting, the smaller operand is stretched as needed so that the operation can be applied element-wise. The smaller operand can be a single number or another vector or matrix.

Say we initialize a random array of integers with values from 1 to 9 (the upper bound of 10 is exclusive) and shape (2,3).

arr = np.random.randint(1,10, size=(2,3))
array([[9, 1, 6],
       [5, 4, 3]])

Then we multiply that array by 2. As we can see, each element was multiplied by 2 though we did not multiply by a 2 x 3 matrix of 2’s.

arr * 2
array([[18,  2, 12],
       [10,  8,  6]])

Next, let’s multiply arr by a 1 x 3 vector. We can see that each row in arr is multiplied by the 1 x 3 vector. Thus, the 1 x 3 vector is effectively expanded into a 2 x 3 matrix.

arr * [1,2,3]
array([[ 9,  2, 18],
       [ 5,  8,  9]])

Similarly, now multiply arr by a 2 x 1 vector. Each element in the first row of arr gets multiplied by 1 and each element in the second row gets multiplied by 2. Effectively, the 2 x 1 vector is expanded horizontally.

arr * [[1],[2]]
array([[ 9,  1,  6],
       [10,  8,  6]])
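As a rough sketch of what happens in the two cases above, np.broadcast_to can show the expanded operand explicitly. The names row and col are my own. Note that NumPy does not really copy the data: the expanded array is a read-only view of the original.

```python
import numpy as np

# The same values as the random array shown above, fixed so the example is repeatable.
arr = np.array([[9, 1, 6],
                [5, 4, 3]])      # shape (2, 3)

row = np.array([1, 2, 3])        # shape (3,)
col = np.array([[1], [2]])       # shape (2, 1)

# np.broadcast_to returns the array that broadcasting behaves as if it had built.
row_expanded = np.broadcast_to(row, arr.shape)   # every row is [1, 2, 3]
col_expanded = np.broadcast_to(col, arr.shape)   # rows are [1, 1, 1] and [2, 2, 2]
print(row_expanded)
print(col_expanded)

# Multiplying by the small operand gives the same result as the expanded one.
print(np.array_equal(arr * row, arr * row_expanded))  # True
print(np.array_equal(arr * col, arr * col_expanded))  # True
```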

Vectorization is another technique to remove for loops from the code: the operation is applied to whole arrays at once instead of one element at a time. It can also make the code up to 300x faster when NumPy arrays are used with the NumPy dot() function (Ng, 2018)! This is a huge performance improvement when you are processing a large data set.
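A quick way to check the speed difference on your own machine is to time both versions. This is only a sketch: the array size and seed are arbitrary choices of mine, and the measured ratio will vary with hardware and array size, so no particular speedup should be read into it.

```python
import time
import numpy as np

rng = np.random.default_rng(0)   # fixed seed, chosen arbitrarily
n = 200_000                      # arbitrary size, large enough to see a gap
w = rng.random(n)
x = rng.random(n)

# Explicit for loop: multiply element by element and accumulate the sum.
start = time.perf_counter()
loop_total = 0.0
for i in range(n):
    loop_total += w[i] * x[i]
loop_seconds = time.perf_counter() - start

# Vectorized: a single call to np.dot does the same work in compiled code.
start = time.perf_counter()
dot_total = np.dot(w, x)
dot_seconds = time.perf_counter() - start

print(f"for loop: {loop_seconds:.4f}s, np.dot: {dot_seconds:.6f}s")
```

Both versions compute the same number up to floating point rounding. Only the time taken differs.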

A typical example is linear regression, where WᵀX + b occurs. Instead of multiplying each element of W by the corresponding element of X and adding up the products one at a time inside a for loop, vectorization does this in one step as a matrix operation. Finally, you can add b at the end.

In [1]: W = np.random.randint(1,10, size=5)

In [2]: W
Out[2]: array([9, 8, 9, 4, 7])

In [3]: X = np.random.randint(2,9, size=5)

In [4]: W.T
Out[4]: array([9, 8, 9, 4, 7])

In [5]: W.shape
Out[5]: (5,)

In [6]: W.reshape(5,1)

In [7]: W.T
Out[7]: array([9, 8, 9, 4, 7])

In [8]: X.reshape(5,1)

In [9]: np.dot(W.T, X)
Out[9]: 206
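For a version you can run end to end, here is a small sketch of the same calculation with the bias added. W uses the values from the session above, while the values of X and b are made up for illustration. Since reshape returns a new array instead of changing the original, the result is assigned back.

```python
import numpy as np

W = np.array([9, 8, 9, 4, 7]).reshape(5, 1)   # column vector, shape (5, 1)
X = np.array([3, 5, 2, 8, 6]).reshape(5, 1)   # made-up inputs, shape (5, 1)
b = 4                                         # made-up bias

# For loop version: accumulate the products one at a time, then add b.
loop_result = 0
for i in range(W.shape[0]):
    loop_result += W[i, 0] * X[i, 0]
loop_result += b

# Vectorized version: (1, 5) dot (5, 1) gives a (1, 1) matrix, then b is broadcast.
vec_result = np.dot(W.T, X) + b

print(loop_result)       # 163
print(vec_result)        # [[163]]
print(vec_result.shape)  # (1, 1)
```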

Use Cases

There are 3 main ways that I engage with NumPy on a daily basis:

1. Pandas
I often use Pandas for data analysis. Pandas is a Python library built on top of NumPy. It can be used efficiently for exploring, preprocessing and manipulating large data sets, since its DataFrame object is built on NumPy arrays and comes with all the benefits they carry.

2. ML Algorithms From Scratch
As demonstrated above, I often use NumPy to do ML algorithm calculations from scratch instead of using implementations from libraries like Scikit-Learn, Spark or TensorFlow. I have also implemented convolutional operations with NumPy.

3. Functions and Statistical Operations
NumPy has functions that make certain calculations over NumPy matrices and vectors very easy. Some of these include sum(), mean(), std() and var(). They are similar to the functions you might find in a spreadsheet application.
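Here is a short sketch of these functions on a small made-up matrix. The axis argument chooses whether the calculation runs down the columns or across the rows, and ddof switches between the population and sample versions of the variance and standard deviation.

```python
import numpy as np

data = np.array([[1, 2, 3],
                 [4, 5, 6]])      # made-up 2 x 3 matrix

total = data.sum()                # 21, over every element
overall_mean = data.mean()        # 3.5
column_sums = data.sum(axis=0)    # one sum per column: 5, 7, 9
row_means = data.mean(axis=1)     # one mean per row: 2.0, 5.0

# By default var() and std() divide by N (population); ddof=1 divides by N - 1 (sample).
population_var = data.var()
sample_std = data.std(ddof=1)

print(total, overall_mean)
print(column_sums, row_means)
print(population_var, sample_std)
```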

Limitations and Drawbacks

  • You can get subtle bugs if you do not know how to use it properly. For example, when multiplying a 3x4 matrix by a 3x1 column vector with *, you might expect an error because the shapes do not line up for matrix multiplication. Instead, broadcasting silently returns a 3x4 matrix.

  • Rank 1 vectors (e.g. vectors with shape (5,)) will not behave as you might expect. They don’t behave consistently as either column or row vectors, and transposing a rank 1 vector just gives the same vector back. This is why the vectors above were reshaped using: W.reshape(5,1).

  • It won’t work on certain platforms. Platforms which do not allow C code to run cannot invoke the C-API, so NumPy will not be able to run there.
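The first two pitfalls above can be reproduced with a short sketch. The arrays here are made up for illustration, and the rank 1 vector reuses the values of W from earlier.

```python
import numpy as np

A = np.arange(12).reshape(3, 4)     # 3x4 matrix
v = np.array([[1], [2], [3]])       # 3x1 column vector

# Element-wise multiplication broadcasts v across the columns, with no error.
elementwise = A * v
print(elementwise.shape)            # (3, 4)

# Matrix multiplication checks the inner dimensions and raises instead.
try:
    A @ v
    matmul_failed = False
except ValueError:
    matmul_failed = True
print(matmul_failed)                # True

# Transposing a rank 1 array leaves it unchanged.
rank1 = np.array([9, 8, 9, 4, 7])
print(rank1.T.shape)                # (5,)

# reshape returns a new array, so keep the result to get a real column vector.
column = rank1.reshape(5, 1)
print(column.T.shape)                    # (1, 5)
print(np.dot(column, column.T).shape)    # (5, 5), an outer product
print(np.dot(rank1, rank1))              # 291, a plain scalar
```

Checking .shape before an operation is a cheap way to catch both problems early.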