NumPy is the fundamental package for scientific computing with Python.

In Part 1 of the Data Science with Python series, we looked at the basic built-in functions for numerical computing in Python. In this part, we will take a look at the NumPy library.

NumPy contains, among other things:

- a powerful N-dimensional array object
- sophisticated (broadcasting) functions
- tools for integrating C/C++ and Fortran code
- useful linear algebra, Fourier transform, and random number capabilities

Besides its obvious scientific uses, NumPy can also be used as an efficient multi-dimensional container of generic data. Arbitrary data types can be defined. This allows NumPy to seamlessly and speedily integrate with a wide variety of databases.

Great, let's see how to use the NumPy library for basic array manipulation.

The NumPy library

First, we need to import NumPy in Python.

    import numpy as np

Let's create a NumPy array.

    np.array([4,5,6])

Output:

    array([4, 5, 6])

Now, let's create a multi-dimensional array.

    mul = np.array([[5,4,6],[7,8,9],[10,11,12]])
    mul

Output:

    array([[ 5,  4,  6],
           [ 7,  8,  9],
           [10, 11, 12]])

Check the shape (rows and columns) of the array.

    mul.shape

Output:

    (3, 3)

Create an evenly spaced array starting at 1 and stopping before 60, with a step of 2.

    dif = np.arange(1,60,2)
    dif

Output:

    array([ 1,  3,  5,  7,  9, 11, 13, 15, 17, 19, 21, 23, 25, 27, 29, 31, 33,
           35, 37, 39, 41, 43, 45, 47, 49, 51, 53, 55, 57, 59])

Reshape the above array into a desired shape. The new shape must hold the same number of elements; here 30 values become 10 rows of 3.

    dif.reshape(10,3)

Output:

    array([[ 1,  3,  5],
           [ 7,  9, 11],
           [13, 15, 17],
           [19, 21, 23],
           [25, 27, 29],
           [31, 33, 35],
           [37, 39, 41],
           [43, 45, 47],
           [49, 51, 53],
           [55, 57, 59]])

Generate 40 evenly spaced values over the interval from 1 to 8. (Take a minute here to understand the difference between 'linspace' and 'arange': 'arange' takes a step size and excludes the stop value, while 'linspace' takes the number of points and includes both endpoints by default.)

    gen = np.linspace(1,8,40)
    gen

Output:

    array([1.        , 1.17948718, 1.35897436, 1.53846154, 1.71794872,
           1.8974359 , 2.07692308, 2.25641026, 2.43589744, 2.61538462,
           2.79487179, 2.97435897, 3.15384615, 3.33333333, 3.51282051,
           3.69230769, 3.87179487, 4.05128205, 4.23076923, 4.41025641,
           4.58974359, 4.76923077, 4.94871795, 5.12820513, 5.30769231,
           5.48717949, 5.66666667, 5.84615385, 6.02564103, 6.20512821,
           6.38461538, 6.56410256, 6.74358974, 6.92307692, 7.1025641 ,
           7.28205128, 7.46153846, 7.64102564, 7.82051282, 8.        ])

Now, change the shape of the array in place. The 'resize' method modifies the array itself, unlike 'reshape', which returns a reshaped array and leaves the shape of the original unchanged.

    gen.resize(10,4)
    gen

Output:

    array([[1.        , 1.17948718, 1.35897436, 1.53846154],
           [1.71794872, 1.8974359 , 2.07692308, 2.25641026],
           [2.43589744, 2.61538462, 2.79487179, 2.97435897],
           [3.15384615, 3.33333333, 3.51282051, 3.69230769],
           [3.87179487, 4.05128205, 4.23076923, 4.41025641],
           [4.58974359, 4.76923077, 4.94871795, 5.12820513],
           [5.30769231, 5.48717949, 5.66666667, 5.84615385],
           [6.02564103, 6.20512821, 6.38461538, 6.56410256],
           [6.74358974, 6.92307692, 7.1025641 , 7.28205128],
           [7.46153846, 7.64102564, 7.82051282, 8.        ]])

Create an array with all elements set to one.

    onarr = np.ones((4,4))
    onarr

Output:

    array([[1., 1., 1., 1.],
           [1., 1., 1., 1.],
           [1., 1., 1., 1.],
           [1., 1., 1., 1.]])

Create an array filled with zeros.

    zearr = np.zeros((4,4))
    zearr

Output:

    array([[0., 0., 0., 0.],
           [0., 0., 0., 0.],
           [0., 0., 0., 0.],
           [0., 0., 0., 0.]])

Create an identity matrix, that is, a diagonal matrix with all diagonal values equal to 1.

    dm = np.eye(3)
    dm

Output:

    array([[1., 0., 0.],
           [0., 1., 0.],
           [0., 0., 1.]])

Extract only the diagonal values from an array.

    np.diag(dm)

Output:

    array([1., 1., 1.])

Create an array consisting of a repeating list. The multiplication here happens on the Python list, before it is converted to an array.

    relist = np.array([1,2,3]*7)
    relist

Output:

    array([1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3, 1, 2, 3])

Now, repeat each element of the array n times using the repeat function.
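As a minimal sketch of the difference between repeating a whole sequence and repeating each element: the names `values`, `whole_by_list`, `whole_by_tile`, and `each_element` are illustrative, and `np.tile` is not used elsewhere in this article; it is included here only for comparison.

```python
import numpy as np

values = [1, 2, 3]

# Repeat the whole sequence: list multiplication and np.tile agree.
whole_by_list = np.array(values * 2)
whole_by_tile = np.tile(values, 2)

# Repeat each element in turn before moving on to the next one.
each_element = np.repeat(values, 2)

print(whole_by_list)  # [1 2 3 1 2 3]
print(whole_by_tile)  # [1 2 3 1 2 3]
print(each_element)   # [1 1 2 2 3 3]
```

The call used in this walkthrough is: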
    np.repeat([1,2,3],3)

Output:

    array([1, 1, 1, 2, 2, 2, 3, 3, 3])

Generate two arrays of the desired shape filled with random values drawn uniformly from [0, 1). Your values will differ on every run.

    relist = np.random.rand(2,3)
    print(relist)
    de = np.random.rand(2,3)
    print(de)

Output:

    [[0.55523672 0.46815197 0.67590369]
     [0.5331193  0.62780236 0.45044916]]
    [[0.26215572 0.07380256 0.06592746]
     [0.89782279 0.95603968 0.82052478]]

Stack the two arrays created above vertically.

    st = np.vstack([de,relist])
    st

Output:

    array([[0.26215572, 0.07380256, 0.06592746],
           [0.89782279, 0.95603968, 0.82052478],
           [0.55523672, 0.46815197, 0.67590369],
           [0.5331193 , 0.62780236, 0.45044916]])

Now, let's stack them horizontally.

    sh = np.hstack([de,relist])
    sh

Output:

    array([[0.26215572, 0.07380256, 0.06592746, 0.55523672, 0.46815197, 0.67590369],
           [0.89782279, 0.95603968, 0.82052478, 0.5331193 , 0.62780236, 0.45044916]])

Great, now let's perform some array operations. First, let's create two random arrays.

    r1 = np.random.rand(2,2)
    r2 = np.random.rand(2,2)
    print(r1)
    print(r2)

Output:

    [[ 0.02430146  0.14448542]
     [ 0.54428337  0.40332494]]
    [[ 0.77574886  0.08747577]
     [ 0.51484157  0.92319888]]

Let's do element-wise addition.

    r3 = r1 + r2
    r3

Output:

    array([[0.80005032, 0.23196119],
           [1.05912494, 1.32652382]])

Element-wise subtraction.

    r4 = r1 - r2
    r4

Output:

    array([[-0.75144739,  0.05700965],
           [ 0.02944179, -0.51987394]])

Let's raise each element to the power of 3.

    r5 = r1**3
    r5

Output (rounded to four significant figures):

    array([[1.435e-05, 3.016e-03],
           [1.612e-01, 6.561e-02]])

Now, instead of an element-wise operation, let's perform a dot product of the two arrays r1 and r2. For two-dimensional arrays this is matrix multiplication.

    r6 = r1.dot(r2)
    r6

Output:

    array([[ 0.09323893,  0.13551456],
           [ 0.62987564,  0.41996073]])

Let's create a new array and transpose it.

    sh = np.array([[1,2],[3,4]])
    sh

Output:

    array([[1, 2],
           [3, 4]])

    sh.T

Output:

    array([[1, 3],
           [2, 4]])

Now, check the data type of the elements in the array.

    sh.dtype

Output (the default integer type is platform-dependent, so you may see int64 instead):

    dtype('int32')

Change the data type of the array.
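Before the conversion itself, here is a small sketch of two behaviours worth knowing. It assumes hypothetical arrays named `ints`, `floats`, and `truncated`, none of which appear in the article: `astype` returns a new array rather than changing the original, and converting floats to integers truncates instead of rounding.

```python
import numpy as np

ints = np.array([[1, 2], [3, 4]])

# astype returns a new array; the original keeps its dtype.
floats = ints.astype(np.float32)
print(floats.dtype)                # float32
print(ints.dtype == floats.dtype)  # False

# Converting floats to integers truncates toward zero, it does not round.
truncated = np.array([1.9, -1.9]).astype(np.int64)
print(truncated)                   # [ 1 -1]
```

For the array created above, the conversion looks like this: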
    rs = sh.astype('f')
    rs.dtype

Output:

    dtype('float32')

Now, let's look at some mathematical functions on an array, starting with the sum of an array.

    c = np.array([1,2,3,4,5])
    c.sum()

Output:

    15

Maximum of the elements of an array.

    c.max()

Output:

    5

Mean of the elements of the array.

    c.mean()

Output:

    3.0

Now, let's retrieve the index of the maximum value of the array, and then the index of the minimum value.

    c.argmax()

Output:

    4

    c.argmin()

Output:

    0

Create an array consisting of the squares of the first ten whole numbers.

    dim = np.arange(10)**2
    dim

Output:

    array([ 0,  1,  4,  9, 16, 25, 36, 49, 64, 81], dtype=int32)

Access values in the above array using an index or a slice.

    dim[2]

Output:

    4

    dim[1:5]

Output:

    array([ 1,  4,  9, 16], dtype=int32)

Use negative indices to count from the end of the array.

    dim[-1:]

Output:

    array([81], dtype=int32)

Now, access certain elements of the array based on a step size.

    dim[1:10:2]   # dim[start:stop:stepsize]

Output:

    array([ 1,  9, 25, 49, 81], dtype=int32)

Create a multidimensional array.

    en = np.arange(36)
    en.resize(6,6)
    en

Output:

    array([[ 0,  1,  2,  3,  4,  5],
           [ 6,  7,  8,  9, 10, 11],
           [12, 13, 14, 15, 16, 17],
           [18, 19, 20, 21, 22, 23],
           [24, 25, 26, 27, 28, 29],
           [30, 31, 32, 33, 34, 35]])

Access the element in the second row and third column.

    en[1,2]

Output:

    8

Access the 2nd row and columns 3 to 6. Note that the numbering of the rows and columns starts at 0 and that the stop index of a slice is excluded, so 2:6 selects indices 2, 3, 4 and 5.

    en[1, 2:6]

Output:

    array([ 8,  9, 10, 11])

Select the first two rows and all columns except the last one.

    en[:2,:-1]

Output:

    array([[ 0,  1,  2,  3,  4],
           [ 6,  7,  8,  9, 10]])

Select the values from the array that are greater than 20.

    en[en>20]

Output:

    array([21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35])

Set every element whose value is greater than 20 to 20.

    en[en>20] = 20
    en

Output:

    array([[ 0,  1,  2,  3,  4,  5],
           [ 6,  7,  8,  9, 10, 11],
           [12, 13, 14, 15, 16, 17],
           [18, 19, 20, 20, 20, 20],
           [20, 20, 20, 20, 20, 20],
           [20, 20, 20, 20, 20, 20]])

To copy an array onto another variable, always use the copy function.
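The reason is that plain assignment and basic slicing do not duplicate the data. A minimal sketch, using hypothetical names (`original`, `alias`, `view`, `independent`) that are not part of the article's notebook:

```python
import numpy as np

original = np.arange(6)

alias = original               # same array object, no data copied
view = original[:3]            # basic slice: a view sharing the same memory
independent = original.copy()  # separate data buffer

# Modify the original and see which names observe the change.
original[0] = 99

print(alias[0])        # 99
print(view[0])         # 99
print(independent[0])  # 0

print(np.shares_memory(original, view))         # True
print(np.shares_memory(original, independent))  # False
```

Applied to the array from the previous step: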
    fun = en.copy()
    fun

Output:

    array([[ 0,  1,  2,  3,  4,  5],
           [ 6,  7,  8,  9, 10, 11],
           [12, 13, 14, 15, 16, 17],
           [18, 19, 20, 20, 20, 20],
           [20, 20, 20, 20, 20, 20],
           [20, 20, 20, 20, 20, 20]])

Create a 4×4 array of random integers from 1 up to, but not including, 10 (the upper bound of 'randint' is exclusive).

    gom = np.random.randint(1,10,(4,4))
    gom

Output:

    array([[9, 7, 1, 4],
           [1, 4, 3, 6],
           [2, 5, 5, 1],
           [2, 2, 9, 9]])

Great, we have looked at creating, accessing and manipulating arrays in NumPy.

In the next part of the series, we will be looking at a library that is built on NumPy: Pandas. Pandas is a library which makes data manipulation and analysis much easier in Python. It offers data structures and operations for numerical tables and time series.

Resources:

- NumPy documentation
- Applied Data Science with Python Specialization

Connect on LinkedIn and check out GitHub (below) for the complete notebook.

harunshimanto/Python-The-Dangerous-Tool-For-ML-Data-Science (github.com): Learn data science and machine learning with Python.

You can tell me what you think about this. If you enjoyed this article, click on the clap 👏 button.

Thanks to everyone.