Learn how to stack NumPy arrays in the first part of this two-part series.
NumPy is one of the most important libraries for data science, and it provides most of the functions needed to work with numerical data. Mastering the ins and outs of this library is therefore well worth the effort.
This is part 1 of 2. In this article, we are going to see how to stack NumPy arrays, which lets you join two or more arrays along the axis you specify.
This article assumes basic knowledge of working with NumPy. If you are new to the library, read this article to get familiar with the basics first.
NumPy’s stack function joins a sequence of arrays along a new axis and returns a single array with one more dimension than the inputs. The key requirement to keep in mind is that all input arrays must have exactly the same shape.
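To make the shape requirement concrete, here is a minimal sketch. The arrays x, y and z are illustrative names chosen for this example, not part of the examples that follow. It assumes that np.stack rejects inputs of different shapes by raising a ValueError.

```python
import numpy as np

# x, y and z are hypothetical arrays used only for this illustration.
x = np.array([1, 2, 3])   # shape (3,)
y = np.array([4, 5, 6])   # shape (3,)
z = np.array([7, 8])      # shape (2,), does not match x and y

# Arrays with matching shapes stack fine, and the result gains one dimension.
stacked = np.stack([x, y])
print(stacked.shape)      # (2, 3)

# Arrays with different shapes are rejected.
try:
    np.stack([x, z])
    mismatch_rejected = False
except ValueError as err:
    mismatch_rejected = True
    print("np.stack failed:", err)
```

If your arrays have different lengths, they need to be padded or trimmed to a common shape before stacking.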
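The main parameters of np.stack are:
arrays: the sequence of arrays to join; all of them must have the same shape.
axis: the position of the new axis in the result (optional, default 0).
out: an existing array to write the result into (optional).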
With axis=0, which is also the default value, 1D arrays are stacked on top of each other, so each input becomes a row of the result. With axis=1, they are stacked side by side, so each input becomes a column.
Let’s understand this better with some examples:
>>> import numpy as np
>>> a = np.array([1,2,3])
>>> b = np.array([4,5,6])
# First, stack them on top of each other, which is the default behavior.
>>> np.stack([a,b])
array([[1, 2, 3],
       [4, 5, 6]])
# Now pass axis=0 explicitly, which gives the same output as above.
# Even when relying on the default behavior, it is clearer
# to state the value we want to use.
>>> np.stack([a,b], axis=0)
array([[1, 2, 3],
       [4, 5, 6]])
# Now use axis=1, which stacks them side by side.
>>> np.stack([a,b], axis=1)
array([[1, 4],
       [2, 5],
       [3, 6]])
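One way to keep the two cases apart is to look at the shape of the result. The sketch below reuses a and b from the example above; the names rows and cols are illustrative. It also shows axis=-1, which refers to the last axis of the result.

```python
import numpy as np

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])

rows = np.stack([a, b], axis=0)   # each input becomes a row
cols = np.stack([a, b], axis=1)   # each input becomes a column

print(rows.shape)   # (2, 3)
print(cols.shape)   # (3, 2)

# For 1D inputs, stacking along axis=1 is the transpose of stacking along axis=0.
print(np.array_equal(cols, rows.T))   # True

# axis=-1 means the last axis of the result, which is axis 1 here.
last = np.stack([a, b], axis=-1)
print(np.array_equal(last, cols))     # True
```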
The examples above use 1D arrays. Let’s also look at a couple of examples with 2D arrays, where the result is a 3D array.
>>> a = np.array([[1,2,3], [4,5,6]])
>>> b = np.array([[7,8,9],[10,11,12]])
# Stacking 2D arrays along axis=0 places a and b one after the other.
>>> np.stack([a,b], axis=0)
array([[[ 1,  2,  3],
        [ 4,  5,  6]],

       [[ 7,  8,  9],
        [10, 11, 12]]])
# Stacking 2D arrays along axis=1 pairs up the matching rows of a and b.
>>> np.stack([a,b], axis=1)
array([[[ 1,  2,  3],
        [ 7,  8,  9]],

       [[ 4,  5,  6],
        [10, 11, 12]]])
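With 2D inputs the printed output gets harder to read, so it can help to check shapes and index along the new axis instead. The following is a small sketch using the same a and b; the names s0, s1 and s2 are illustrative, and axis=2 is included only to show the third possible position of the new axis.

```python
import numpy as np

a = np.array([[1, 2, 3], [4, 5, 6]])        # shape (2, 3)
b = np.array([[7, 8, 9], [10, 11, 12]])     # shape (2, 3)

s0 = np.stack([a, b], axis=0)   # new axis is first
s1 = np.stack([a, b], axis=1)   # new axis is in the middle
s2 = np.stack([a, b], axis=2)   # new axis is last

print(s0.shape)   # (2, 2, 3)
print(s1.shape)   # (2, 2, 3)
print(s2.shape)   # (2, 3, 2)

# Indexing along the new axis recovers the original arrays.
print(np.array_equal(s0[0], a))         # True
print(np.array_equal(s1[:, 1], b))      # True
print(np.array_equal(s2[:, :, 0], a))   # True
```

In every case the result has three dimensions: np.stack always adds one axis, and the axis argument only decides where that axis goes.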
The NumPy hstack function takes a sequence of arrays with the same number of rows and joins them horizontally, i.e. column-wise. The arrays do not need to have the same number of columns; they are joined without any issue as long as the row counts match.
>>> a = np.array([[1,1], [1,1]])
>>> b = np.array([[2,2,2,2], [2,2,2,2]])
>>> np.hstack([a,b])
array([[1, 1, 2, 2, 2, 2],
       [1, 1, 2, 2, 2, 2]])
Here, hstack took array b and joined it horizontally to array a.
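A few properties of hstack are worth checking directly. The sketch below reuses a and b from the example above and adds a hypothetical array c with a different number of rows. The comparison with np.concatenate is not part of the examples in this article; it is included because, for 2D inputs, hstack gives the same result as concatenating along axis=1.

```python
import numpy as np

a = np.array([[1, 1], [1, 1]])               # shape (2, 2)
b = np.array([[2, 2, 2, 2], [2, 2, 2, 2]])   # shape (2, 4)
c = np.array([[3, 3, 3]])                    # shape (1, 3), hypothetical mismatched array

joined = np.hstack([a, b])
print(joined.shape)   # (2, 6): no new dimension, only more columns

# For 2D inputs this matches concatenating along axis=1.
same_as_concat = np.array_equal(joined, np.concatenate([a, b], axis=1))
print(same_as_concat)   # True

# Arrays with different row counts cannot be joined horizontally.
try:
    np.hstack([a, c])
    hstack_rejected = False
except ValueError:
    hstack_rejected = True
print(hstack_rejected)   # True

# 1D inputs are simply joined end to end.
flat = np.hstack([np.array([1, 2, 3]), np.array([4, 5])])
print(flat)   # [1 2 3 4 5]
```

Unlike np.stack, hstack does not add a dimension: 2D inputs give a 2D result.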
The NumPy vstack function takes a sequence of arrays with the same number of columns and joins them vertically, i.e. row-wise. As with hstack, the other dimension is free: the arrays can have different numbers of rows and they will stack just as well.
>>> a = np.array([[1,1,1], [1,1,1]])
>>> b = np.array([[2,2,2], [2,2,2], [2,2,2], [2,2,2]])
>>> np.vstack([a,b])
array([[1, 1, 1],
       [1, 1, 1],
       [2, 2, 2],
       [2, 2, 2],
       [2, 2, 2],
       [2, 2, 2]])
Here, vstack took array b and stacked it vertically below array a.
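The same checks can be sketched for vstack. The code reuses a and b from the example above; r1, r2 and the two-column array are hypothetical names and values added for illustration, and np.concatenate appears only for comparison.

```python
import numpy as np

a = np.array([[1, 1, 1], [1, 1, 1]])                          # shape (2, 3)
b = np.array([[2, 2, 2], [2, 2, 2], [2, 2, 2], [2, 2, 2]])    # shape (4, 3)

joined = np.vstack([a, b])
print(joined.shape)   # (6, 3): no new dimension, only more rows

# For 2D inputs this matches concatenating along axis=0.
same_as_concat = np.array_equal(joined, np.concatenate([a, b], axis=0))
print(same_as_concat)   # True

# 1D inputs are treated as rows, so the result is 2D.
r1 = np.array([1, 2, 3])   # hypothetical 1D arrays
r2 = np.array([4, 5, 6])
rows = np.vstack([r1, r2])
print(rows.shape)     # (2, 3)

# Arrays with different column counts cannot be joined vertically.
try:
    np.vstack([a, np.array([[9, 9]])])   # 2 columns instead of 3
    vstack_rejected = False
except ValueError:
    vstack_rejected = True
print(vstack_rejected)   # True
```

For 1D arrays of equal length, np.vstack([r1, r2]) produces the same array as np.stack([r1, r2], axis=0).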
Both functions do what their names suggest: hstack joins arrays horizontally (side by side), whereas vstack joins arrays vertically (on top of each other).
That’s it for this part. In the next part, I will explain how to split NumPy arrays effectively.