# Numpy With Python For Data Science

### @harunurrashid (Harun-Ur-Rashid)

I’m Harun-Ur-Rashid. I'm a self-taught Data Scientist.

## NumPy is the fundamental package for scientific computing with Python.

In Part 1 of the Data Science With Python series, we looked at the basic built-in functions for numerical computing in Python. In this part, we will take a look at the NumPy library.

NumPy is the fundamental package for scientific computing with Python. It contains among other things:

• a powerful N-dimensional array object
• tools for integrating C/C++ and Fortran code
• useful linear algebra, Fourier transform, and random number capabilities

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data-types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.

Great, let’s see how to use the Numpy library for basic array manipulation.

### The Numpy library

First, we need to import numpy in Python.

`import numpy as np`

Let’s create a numpy array.

`np.array([4,5,6])`

Output : array([4, 5, 6])

Now, let’s create a multi-dimensional array.

`mul = np.array([[4,5,6],[7,8,9],[10,11,12]])`

`mul`

Output : array([[ 4, 5, 6],
[ 7, 8, 9],
[10, 11, 12]])

Check the shape (rows and columns of the array).

`mul.shape`

Output : (3, 3)

Create an array of evenly spaced values from 1 up to (but not including) 60, with a step of 2.

`dif = np.arange(1,60,2)`

`dif`

Output : array([ 1, 3, 5, 7, 9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33,
35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59])

Reshape the above array into a desired shape.

`dif.reshape(10,3)`

Output : array([[ 1, 3, 5],
[ 7, 9, 11],
[13, 15, 17],
[19, 21, 23],
[25, 27, 29],
[31, 33, 35],
[37, 39, 41],
[43, 45, 47],
[49, 51, 53],
[55, 57, 59]])

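Generate 40 evenly spaced values over the interval from 1 to 8, with both endpoints included. (Take a minute here to understand the difference between ‘linspace’ and ‘arange’: ‘linspace’ takes the number of points, while ‘arange’ takes the step size.)

`gen = np.linspace(1,8,40)`

`gen`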

Output: array([1. , 1.17948718, 1.35897436, 1.53846154, 1.71794872,
1.8974359 , 2.07692308, 2.25641026, 2.43589744, 2.61538462,
2.79487179, 2.97435897, 3.15384615, 3.33333333, 3.51282051,
3.69230769, 3.87179487, 4.05128205, 4.23076923, 4.41025641,
4.58974359, 4.76923077, 4.94871795, 5.12820513, 5.30769231,
5.48717949, 5.66666667, 5.84615385, 6.02564103, 6.20512821,
6.38461538, 6.56410256, 6.74358974, 6.92307692, 7.1025641 ,
7.28205128, 7.46153846, 7.64102564, 7.82051282, 8. ])
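To make the difference concrete, here is a small sketch comparing the two functions on the same interval. The variable names (`by_step`, `by_count`, `points`, `step`) are made up for this illustration.

```python
import numpy as np

# arange: you choose the step; the stop value is excluded.
by_step = np.arange(1, 8, 1)
print(by_step)    # [1 2 3 4 5 6 7]

# linspace: you choose how many points; the stop value is included by default.
by_count = np.linspace(1, 8, 8)
print(by_count)   # [1. 2. 3. 4. 5. 6. 7. 8.]

# linspace can also report the spacing it used: (8 - 1) / (40 - 1).
points, step = np.linspace(1, 8, 40, retstep=True)
print(len(points), step)
```

As a rule of thumb, `arange` is handy when the step matters and `linspace` when the number of points matters.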

Now, change the shape of the array in place. (The ‘resize’ method modifies the array itself, whereas ‘reshape’ returns a reshaped array and leaves the original unchanged.)

`gen.resize(10,4)`

`gen`

Output: array([[1. , 1.17948718, 1.35897436, 1.53846154],
[1.71794872, 1.8974359 , 2.07692308, 2.25641026],
[2.43589744, 2.61538462, 2.79487179, 2.97435897],
[3.15384615, 3.33333333, 3.51282051, 3.69230769],
[3.87179487, 4.05128205, 4.23076923, 4.41025641],
[4.58974359, 4.76923077, 4.94871795, 5.12820513],
[5.30769231, 5.48717949, 5.66666667, 5.84615385],
[6.02564103, 6.20512821, 6.38461538, 6.56410256],
[6.74358974, 6.92307692, 7.1025641 , 7.28205128],
[7.46153846, 7.64102564, 7.82051282, 8. ]])
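The contrast between the two methods can be checked with a quick sketch like the one below. The arrays `original` and `in_place` are hypothetical examples, not part of the notebook.

```python
import numpy as np

original = np.arange(6)

# reshape returns a new array object; `original` keeps its shape.
reshaped = original.reshape(2, 3)
print(original.shape)   # (6,)
print(reshaped.shape)   # (2, 3)

# resize returns None and changes the array it is called on.
# A fresh array is used so the two cases stay independent.
in_place = np.arange(6)
result = in_place.resize(2, 3)
print(result)           # None
print(in_place.shape)   # (2, 3)
```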

Create an array with all elements as ones.

`onarr = np.ones((4,4))`

`onarr`

Output: array([[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.],
[1., 1., 1., 1.]])

Create an array filled with zeros.

`zearr = np.zeros((4,4))`

`zearr`

Output: array([[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.],
[0., 0., 0., 0.]])

Create an identity matrix (ones on the diagonal, zeros everywhere else).

`dm = np.eye(3)`

`dm`

Output: array([[1., 0., 0.],
[0., 1., 0.],
[0., 0., 1.]])

Extract only diagonal values from an array.

`np.diag(dm)`

Output: array([1., 1., 1.])
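One detail worth knowing: `np.diag` works in both directions. Given a 2-D array it extracts the diagonal, and given a 1-D sequence it builds a diagonal matrix. A minimal sketch, with made-up values:

```python
import numpy as np

# 1-D input: build a 3x3 matrix with these values on the diagonal.
built = np.diag([1, 2, 3])
print(built.shape)      # (3, 3)

# 2-D input: pull the diagonal back out.
extracted = np.diag(built)
print(extracted)        # [1 2 3]
```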

Create an array from a repeated list. (Here `*` is plain Python list repetition, applied before the array is built.)

`relist = np.array([1,2,3]*7)`

`relist`

Output: array([1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3])

Now, repeat each element of an array n times using the repeat function.

`np.repeat([1,2,3],3)`

Output : array([1, 1, 1, 2, 2, 2, 3, 3, 3])
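If you already have a NumPy array rather than a Python list, `np.tile` gives the "whole sequence repeated" behaviour, while `np.repeat` repeats element by element. A short sketch (the `values` array is just an example):

```python
import numpy as np

values = np.array([1, 2, 3])

tiled = np.tile(values, 2)        # the whole sequence, twice
repeated = np.repeat(values, 2)   # each element, twice

print(tiled)      # [1 2 3 1 2 3]
print(repeated)   # [1 1 2 2 3 3]
```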

Generate two arrays of desired shape filled with random values between 0 and 1.

`relist = np.random.rand(2,3)`

`print(relist)`

`de = np.random.rand(2,3)`

`print(de)`

Output :

[[0.55523672 0.46815197 0.67590369]
[0.5331193 0.62780236 0.45044916]]

[[0.26215572 0.07380256 0.06592746]
[0.89782279 0.95603968 0.82052478]]

Stack the two arrays created above vertically.

`st = np.vstack([de,relist])`

`st`

Output :

array([[0.26215572, 0.07380256, 0.06592746],
[0.89782279, 0.95603968, 0.82052478],
[0.55523672, 0.46815197, 0.67590369],
[0.5331193 , 0.62780236, 0.45044916]])

Now, let’s stack them horizontally.

`sh = np.hstack([de,relist])`

`sh`

Output :

array([[0.26215572, 0.07380256, 0.06592746, 0.55523672, 0.46815197,
0.67590369],
[0.89782279, 0.95603968, 0.82052478, 0.5331193 , 0.62780236,
0.45044916]])
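A quick way to keep the two straight is to look at the resulting shapes. The sketch below uses two small placeholder arrays (`top` and `bottom`) and also shows that, for 2-D inputs, the same results can be obtained with `np.concatenate` and an explicit axis.

```python
import numpy as np

top = np.zeros((2, 3))
bottom = np.ones((2, 3))

stacked_v = np.vstack([top, bottom])   # rows are added
stacked_h = np.hstack([top, bottom])   # columns are added

print(stacked_v.shape)   # (4, 3)
print(stacked_h.shape)   # (2, 6)

# For 2-D arrays these match concatenate along axis 0 and axis 1.
print(np.array_equal(stacked_v, np.concatenate([top, bottom], axis=0)))   # True
print(np.array_equal(stacked_h, np.concatenate([top, bottom], axis=1)))   # True
```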

Great, now let’s perform some array operations. First, let’s create two random arrays.

`r1 = np.random.rand(2,2)`

`r2 = np.random.rand(2,2)`

`print(r1)`

`print(r2)`

Output :

[[ 0.02430146 0.14448542]
[ 0.54428337 0.40332494]]

[[ 0.77574886 0.08747577]
[ 0.51484157 0.92319888]]

Element-wise addition.

`r3 = r1 + r2`

`r3`

Output : array([[0.80005032, 0.23196119],
[1.05912494, 1.32652382]])

Element-wise subtraction.

`r4 = r1 - r2`

`r4`

Output : array([[-0.75144739, 0.05700965],
[ 0.02944179, -0.51987394]])

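Let’s raise each element to the power of 3.

`r5 = r1**3`

`r5`

Output (from a different random draw of r1, so these values are not the cubes of the r1 printed above) : array([[0.65228631, 0.24993365],
[0.97976155, 0.71554632]])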

Now, instead of an element-wise operation, let’s compute the dot product (for 2-D arrays, the matrix product) of the two arrays r1 and r2.

`r6 = r1.dot(r2)`

`r6`

Output : array([[ 0.09323893, 0.13551456],
[ 0.62987564, 0.41996073]])
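Because random values make the arithmetic hard to follow by eye, here is the same comparison on two small integer matrices (`a` and `b` are made up for this sketch). It also shows the `@` operator, which for 2-D arrays gives the same result as `dot`.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

# Element-wise: each entry is multiplied by the entry in the same position.
elementwise = a * b
print(elementwise.tolist())   # [[5, 12], [21, 32]]

# Matrix product: rows of `a` combined with columns of `b`.
matrix = a.dot(b)
print(matrix.tolist())        # [[19, 22], [43, 50]]

# The @ operator is equivalent to dot for 2-D arrays.
print(np.array_equal(matrix, a @ b))   # True
```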

Let’s create a new array and transpose it.

`sh = np.array([[1,2],[3,4]])`

`sh`

Output :

array([[1, 2],
[3, 4]])

`sh.T`

Output :

array([[1, 3],
[2, 4]])

Now, check the datatype of elements in the array.

`sh.dtype`

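Output : dtype('int32') (the default integer type depends on the platform and NumPy version; on many systems you will see 'int64' instead)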

Change the datatype of the array.

`rs = sh.astype('f')`

`rs.dtype`

Output : dtype('float32')
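Two things are easy to miss about `astype`: it returns a new array instead of changing the original, and converting floats to integers drops the fractional part rather than rounding. A small sketch with made-up arrays (`ints`, `prices`):

```python
import numpy as np

ints = np.array([[1, 2], [3, 4]])

# astype returns a converted copy; `ints` keeps its integer dtype.
floats = ints.astype(np.float32)
print(floats.dtype)          # float32
print(ints.dtype.kind)       # i  (still an integer array)

# Float to int truncates toward zero; it does not round.
prices = np.array([1.9, -1.9, 2.5])
truncated = prices.astype(int)
print(truncated)             # [ 1 -1  2]
```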

Now, let’s look at some mathematical functions in an array, starting with sum of an array.

`c = np.array([1,2,3,4,5])`

`c.sum()`

Output : 15

Maximum of the elements of an array.

`c.max()`

Output : 5

Mean of the elements of the array

`c.mean()`

Output : 3.0

Now, let’s retrieve the index of the maximum value of the array.

`c.argmax()`

Output : 4

And the index of the minimum value.

`c.argmin()`

Output : 0
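These functions also work on multi-dimensional arrays, where an `axis` argument controls the direction of the reduction. The sketch below uses a small example array called `grid` (not from the notebook) to show this.

```python
import numpy as np

grid = np.array([[1, 2, 3],
                 [4, 5, 6]])

total = grid.sum()              # over every element
col_sums = grid.sum(axis=0)     # collapse the rows: one value per column
row_sums = grid.sum(axis=1)     # collapse the columns: one value per row

print(total)       # 21
print(col_sums)    # [5 7 9]
print(row_sums)    # [ 6 15]

# Without an axis, argmax returns an index into the flattened array.
flat_index = grid.argmax()
row, col = np.unravel_index(flat_index, grid.shape)
print(int(flat_index), int(row), int(col))   # 5 1 2
```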

Create an array consisting of square of first ten whole numbers.

`dim = np.arange(10)**2`

`dim`

Output : array([ 0, 1, 4, 9, 16, 25, 36, 49, 64, 81], dtype=int32)

Access values in the above array using index

`dim[2]`

Output : 4

`dim[1:5]`

Output : array([ 1, 4, 9, 16], dtype=int32)

Use negative indices to count from the end of the array.

`dim[-1:]`

Output : array([81], dtype=int32)

Now, access certain elements of the array based on a step size.

`dim[1:10:2] #dim[start:stop:stepsize]`

Output : array([ 1, 9, 25, 49, 81], dtype=int32)
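A few more slicing patterns that tend to come up, sketched on the same kind of array (named `squares` here to keep the example self-contained):

```python
import numpy as np

squares = np.arange(10) ** 2    # 0, 1, 4, ..., 81

last = squares[-1]              # a single element, counted from the end
last_three = squares[-3:]       # the final three elements
backwards = squares[::-1]       # a negative step walks the array in reverse
every_third = squares[::3]      # start and stop can be left out

print(last)                     # 81
print(last_three)               # [49 64 81]
print(backwards[:3])            # [81 64 49]
print(every_third)              # [ 0  9 36 81]
```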

Create a multidimensional array

`en = np.arange(36)`

`en.resize(6,6)`

`en`

Output : array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 21, 22, 23],
[24, 25, 26, 27, 28, 29],
[30, 31, 32, 33, 34, 35]])

Access the second row and third column

`en[1,2]`

Output : 8

Access the 2nd row and columns 3 to 6. Note that the numbering of the rows and columns starts at 0, and that the stop index of a slice is excluded.

`en[1, 2:6]`

Output : array([ 8, 9, 10, 11])

Select the first two rows and every column except the last one.

`en[:2,:-1]`

Output : array([[ 0, 1, 2, 3, 4],
[ 6, 7, 8, 9, 10]])

Select values from array greater than 20.

`en[en>20]`

Output : array([21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35])

Set every element that is greater than 20 to 20.

`en[en>20] = 20`

`en`

Output : array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 20, 20, 20],
[20, 20, 20, 20, 20, 20],
[20, 20, 20, 20, 20, 20]])
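The assignment above changes the array itself. If you would rather keep the original and get a capped copy, `np.where` or `np.minimum` are two possible alternatives. This is only a sketch, using a separate example array named `grid`:

```python
import numpy as np

grid = np.arange(36).reshape(6, 6)

# where(condition, value_if_true, value_if_false) builds a new array.
capped = np.where(grid > 20, 20, grid)

# minimum compares element-wise against 20 and keeps the smaller value.
also_capped = np.minimum(grid, 20)

print(capped.max())                          # 20
print(np.array_equal(capped, also_capped))   # True
print(grid.max())                            # 35 (the original is untouched)
```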

To copy an array into another variable, always use the copy function. Plain assignment only creates a second name for the same array.

`fun = en.copy()`

`fun`

Output : array([[ 0, 1, 2, 3, 4, 5],
[ 6, 7, 8, 9, 10, 11],
[12, 13, 14, 15, 16, 17],
[18, 19, 20, 20, 20, 20],
[20, 20, 20, 20, 20, 20],
[20, 20, 20, 20, 20, 20]])
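Here is a small sketch of why this matters. It compares plain assignment, a slice, and an explicit copy; the names `source`, `alias`, `view` and `independent` are made up for the example.

```python
import numpy as np

source = np.arange(5)           # [0 1 2 3 4]

alias = source                  # same array object under a second name
view = source[1:4]              # a slice shares memory with the original
independent = source.copy()     # separate data

source[2] = 99                  # modify the original

print(alias[2])                 # 99 (follows the change)
print(view[1])                  # 99 (follows the change)
print(independent[2])           # 2  (unaffected)
```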

Create an array of random integers from 1 to 9 (the upper bound, 10, is excluded). Specify the array to be of shape 4*4.

`gom = np.random.randint(1,10,(4,4))`

`gom`

Output : array([[9, 7, 1, 4],
[1, 4, 3, 6],
[2, 5, 5, 1],
[2, 2, 9, 9]])
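Random output changes on every run, so your numbers will differ from the ones shown. If you need repeatable results, one option is a seeded generator. The sketch below assumes NumPy 1.17 or later, where `np.random.default_rng` is available; the seed value 42 is arbitrary.

```python
import numpy as np

# A generator created with a fixed seed produces the same sequence each time.
rng = np.random.default_rng(seed=42)
first = rng.integers(1, 10, size=(4, 4))     # integers from 1 to 9

# A second generator with the same seed reproduces the same array.
rng_again = np.random.default_rng(seed=42)
second = rng_again.integers(1, 10, size=(4, 4))

print(first.shape)                       # (4, 4)
print(np.array_equal(first, second))     # True
```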

Great, we have looked at creating, accessing and manipulating arrays in Numpy. In the next part of the series, we will be looking at a library which is built on the Numpy library — Pandas. Pandas is a library which makes data manipulation and analysis much easier in Python. It offers data structures and operations for numerical tables and time series.

#### Resources :

Connect on LinkedIn, and check out GitHub (below) for the complete notebook.

Let me know what you think about this, and if you enjoyed reading, click on the clap 👏 button.

Thanks to everyone.
