A Quick Introduction to Python Numpy for Beginners

by Biraj (@biraj21), May 18th, 2022




NumPy is a Python library that is mainly used to work with arrays. An array is a collection of items that are stored next to each other in memory. For now, just think of them as Python lists.


NumPy is written in Python and C. The calculations in NumPy are done by the parts written in C, which makes them much faster than equivalent code written in plain Python.


Installation

Make sure Python and pip are installed on your computer. Then open the command prompt or terminal and run the following command:


pip install numpy
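To confirm that the installation worked, a quick check like the one below can help. It is only a sketch, and it assumes that pip installed NumPy into the same Python interpreter that runs the script.

```python
import numpy as np

# If this import succeeds, NumPy is available to this interpreter.
version = np.__version__
print("NumPy version:", version)
```

If the import fails with ModuleNotFoundError, pip most likely installed the package for a different Python installation.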


Creating Arrays using NumPy

You can create a NumPy array by using the numpy module's array() function as shown below:


import numpy as np

arr = np.array([3, 5, 7, 9])
print(type(arr))


The output will look like this:


<class 'numpy.ndarray'>


We just created a NumPy array from a Python list. The type of our arr variable is numpy.ndarray. Here, ndarray stands for N-dimensional array.
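Although the array was built from a list, it does not behave exactly like one. The short sketch below (the variable names are only illustrative) shows one visible difference: the + operator concatenates lists but adds arrays element by element.

```python
import numpy as np

nums = [3, 5, 7, 9]
arr = np.array(nums)

# "+" joins two lists end to end...
list_result = nums + nums
# ...but adds two arrays element by element.
array_result = arr + arr

print(list_result)   # [3, 5, 7, 9, 3, 5, 7, 9]
print(array_result)  # [ 6 10 14 18]
```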


Dimensions or Axes in NumPy

In NumPy, dimensions are called axes (the plural of axis). I like to think of an axis as a line along which items can be stored.


A simple list or a 1-dimensional array can be visualized as:


[Figure: Axis for 1D Array]


We will now look at the following:

  1. Scalars (0D Arrays)
  2. Vectors (1D Arrays)
  3. Matrices (2D Arrays)
  4. 3D Arrays
  5. 4D Arrays


Scalars (0D Arrays)

A scalar is just a single value.


import numpy as np

s = np.array(21)
print("Number of axes:", s.ndim)
print("Shape:", s.shape)


Output:
Number of axes: 0
Shape: ()


Here we have used 2 attributes of a NumPy array:

  • ndim: The number of dimensions (or axes) in an array. It is 0 here because a single value by itself does not have any dimensions.
  • shape: A tuple that contains the number of values along each axis of an array. Since a scalar has 0 axes, it is an empty tuple.
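The two attributes are tied together: ndim is always the length of the shape tuple. The sketch below checks this for the scalar above. It also uses item(), a method not covered in this post, to get the wrapped value back as a plain Python number.

```python
import numpy as np

s = np.array(21)

# ndim always equals the number of entries in the shape tuple.
print(s.ndim == len(s.shape))  # True

# item() returns the single value of a 0D array as a plain Python object.
value = s.item()
print(value)  # 21
```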


Vectors (1D Arrays)

A vector is a collection of values.


import numpy as np

vec = np.array([-1, 2, 7, 9, 2])
print("Number of axes:", vec.ndim)
print("Shape:", vec.shape)


Output:
Number of axes: 1
Shape: (5,)


vec.shape[0] gives us the number of values in our vector, which is 5 here.
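For a 1D array there are several equivalent ways to get that count. The sketch below compares shape[0] with the built-in len() and with the size attribute (size is not used elsewhere in this post; it is the total number of elements).

```python
import numpy as np

vec = np.array([-1, 2, 7, 9, 2])

# For a 1D array, all three report the same number of values.
print(vec.shape[0], len(vec), vec.size)  # 5 5 5
```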


Matrices (2D Arrays)

A matrix is a collection of vectors.


import numpy as np

mat = np.array([
    [1, 2, 3],
    [5, 6, 7]
])

print("Number of axes:", mat.ndim)
print("Shape:", mat.shape)


Output:
Number of axes: 2
Shape: (2, 3)


Here we created a 2x3 matrix (2D array) using a list of lists. Since a matrix has 2 axes, the mat.shape tuple contains two values: the first is the number of rows and the second is the number of columns.


[Figure: Matrix]


Each item (row) in a 2D array is a vector (1D array).
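This can be checked directly. The sketch below reuses the same matrix, pulls out one row, and inspects its ndim and shape.

```python
import numpy as np

mat = np.array([
    [1, 2, 3],
    [5, 6, 7]
])

# Indexing with a single index returns a row, which is itself a 1D array.
first_row = mat[0]
print(first_row, first_row.ndim, first_row.shape)  # [1 2 3] 1 (3,)

# Iterating over a matrix yields its rows one by one.
for row in mat:
    print(row)
```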


3D Arrays

A 3D array is a collection of matrices.


import numpy as np

t = np.array([
    [[1, 3, 9],
     [7, -6, 2]],

    [[2, 3, 5],
     [0, -2, -2]],

    [[9, 6, 2],
     [-7, -3, -12]],

    [[2, 4, 5],
     [-1, 9, 8]]
])

print("Number of axes:", t.ndim)
print("Shape:", t.shape)


Output:
Number of axes: 3
Shape: (4, 2, 3)


Here we created a 3D array by using a list of 4 lists, each of which contains 2 lists of 3 values. That is why the shape is (4, 2, 3).


[Figure: 3D Array]


Each item in a 3D array is a matrix (2D array). Note that the last matrix in the array is the front-most in the image.
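As a quick check, the sketch below takes the last item of the same 3D array and confirms that it is a 2x3 matrix.

```python
import numpy as np

t = np.array([
    [[1, 3, 9],
     [7, -6, 2]],

    [[2, 3, 5],
     [0, -2, -2]],

    [[9, 6, 2],
     [-7, -3, -12]],

    [[2, 4, 5],
     [-1, 9, 8]]
])

# One index removes the first axis, leaving a 2D array (a matrix).
last = t[-1]
print(last.ndim, last.shape)  # 2 (2, 3)
print(last)
```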

4D Arrays

[Figure: 4D Array]


After looking at the above examples, we can see a pattern. An n-dimensional array is a collection of (n-1)-dimensional arrays, for n > 0. I hope that you now have a better idea of how to visualize multidimensional arrays.
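The pattern can be seen in code as well. The sketch below builds a small 4D array from made-up values (they are not the ones in the image) and repeatedly takes the first item, which drops one axis each time.

```python
import numpy as np

# Hypothetical data: 2 3D arrays, each holding 2 matrices of shape 2x3.
arr4 = np.array([
    [[[1, 2, 3], [4, 5, 6]],
     [[7, 8, 9], [10, 11, 12]]],

    [[[13, 14, 15], [16, 17, 18]],
     [[19, 20, 21], [22, 23, 24]]]
])

print("Number of axes:", arr4.ndim)  # Number of axes: 4
print("Shape:", arr4.shape)          # Shape: (2, 2, 2, 3)

# Taking the first item removes one axis: 4D -> 3D -> 2D -> 1D -> scalar.
item = arr4
while item.ndim > 0:
    item = item[0]
    print(item.ndim, item.shape)
```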




Accessing Array Elements

Just like Python lists, indexes in NumPy arrays start at 0.


import numpy as np

vec = np.array([-3, 4, 6, 9, 8, 3])
print("vec - 4th value:", vec[3])

vec[3] = 19
print("vec - 4th value (changed):", vec[3])

mat = np.array([
    [2, 4, 6, 8],
    [10, 12, 14, 16]
])
print("mat - 1st row:", mat[0])
print("mat - 2nd row's 1st value:", mat[1, 0])
print("mat - last row's last value:", mat[-1, -1])


Output:
vec - 4th value: 9
vec - 4th value (changed): 19
mat - 1st row: [2 4 6 8]
mat - 2nd row's 1st value: 10
mat - last row's last value: 16


NumPy arrays also support slicing:


# continuing the above code

print("vec - 2nd to 4th:", vec[1:4])
print("mat - 1st row's 1st to 3rd values:", mat[0, 0:3])
print("mat - 2nd column:", mat[:, 1])


Output:
vec - 2nd to 4th: [4 6 9]
mat - 1st row's 1st to 3rd values: [2 4 6]
mat - 2nd column: [ 4 12]

In the last example, [:, 1] says "get the 2nd value from all rows". Hence, we get the 2nd column of the matrix as the output.
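One detail worth seeing in code is how the shape of the result depends on whether an integer or a slice is used for an axis. This is a small sketch on the same matrix; the variable names are only illustrative.

```python
import numpy as np

mat = np.array([
    [2, 4, 6, 8],
    [10, 12, 14, 16]
])

# An integer index removes that axis, while a slice keeps it.
col_1d = mat[:, 1]    # 2nd column as a 1D array
col_2d = mat[:, 1:2]  # 2nd column as a 2x1 matrix
print(col_1d.shape, col_2d.shape)  # (2,) (2, 1)

# Leaving out a bound means "from the start" or "to the end".
tail = mat[1, 2:]
print(tail)  # [14 16]
```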


Example: Indexing in a 4D Array

[Figure: Indexing in 4D Array]


Let's say we want to access the circled value. It is located in the 2nd 3D array's last matrix's 2nd row's 2nd column. That's a lot, so take your time.


Here's how to access it (remember that indexes start at 0, so the 2nd 3D array is at index 1):

arr[1, -1, 1, 1]
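To try this kind of lookup without the image, here is a sketch on a hypothetical 4D array. The array is built with np.arange() and reshape(), which are not covered in this post, and its values and shape are made up for illustration. It shows that a comma-separated index picks one position per axis and gives the same result as indexing one axis at a time.

```python
import numpy as np

# Hypothetical 4D array of shape (3, 2, 3, 4) holding the numbers 0 to 71.
arr = np.arange(72).reshape(3, 2, 3, 4)

# 2nd 3D array -> last matrix -> 2nd row -> 2nd column
a = arr[1, -1, 1, 1]

# The same lookup, one axis at a time.
b = arr[1][-1][1][1]

print(a, b)  # 41 41
```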


Python vs. NumPy

At the beginning of the post, I said that calculations in NumPy are extremely fast compared to normal Python code. Let's see the difference.


We will create two lists, each holding the 10 million numbers from 0 to 9,999,999, add them element-wise, and measure the time it takes. Then we will convert both lists to NumPy arrays and do the same.


import numpy as np
import time

l1 = list(range(10000000))
l2 = list(range(10000000))
total = []  # not named "sum", which would shadow the built-in function

# Element-wise addition with a plain Python loop
then = time.time()
for i in range(len(l1)):
    total.append(l1[i] + l2[i])

print(f"With just Python: {time.time() - then:.2f}s")

arr1 = np.array(l1)
arr2 = np.array(l2)

# Element-wise addition with NumPy
then = time.time()
total = arr1 + arr2
print(f"With NumPy: {time.time() - then:.2f}s")


Output:
With just Python: 2.30s
With NumPy: 0.14s


In this case, NumPy was about 16x faster than raw Python. The exact numbers will vary from machine to machine.
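A single timed run can be noisy. As an alternative sketch, the same comparison can be written with the standard library's timeit module, taking the best of a few runs. The function names and the smaller size of 1 million are choices made here to keep the run short; they are not part of the measurement above.

```python
import timeit

import numpy as np

n = 1_000_000  # smaller than the 10 million above, to keep the run short
l1 = list(range(n))
l2 = list(range(n))
arr1 = np.array(l1)
arr2 = np.array(l2)


def add_lists():
    # Element-wise addition in plain Python
    return [a + b for a, b in zip(l1, l2)]


def add_arrays():
    # Element-wise addition in NumPy
    return arr1 + arr2


# Run each function 3 times and keep the fastest time.
py_time = min(timeit.repeat(add_lists, number=1, repeat=3))
np_time = min(timeit.repeat(add_arrays, number=1, repeat=3))
print(f"With just Python: {py_time:.3f}s")
print(f"With NumPy: {np_time:.3f}s")
```

Taking the minimum of several runs reduces the effect of other programs running on the machine at the same time.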