Joining Arrays in NumPy for Beginners
Getting Started with NumPy Arrays
NumPy is a powerful scientific computing library in Python. It’s popular among developers for its efficient handling of multi-dimensional arrays, a fundamental data structure for performing numerical computations.
There are several advanced array operations in NumPy, like joining, splitting, sorting, and more. These operations are used to carry out data analysis and scientific computing tasks such as statistical data filtering, numerical simulations, and data modeling. Moreover, these advanced NumPy array operations offer a significant performance boost. By leveraging vectorization, broadcasting, and an efficient memory layout, NumPy performs computations on large datasets much faster than equivalent pure-Python loops.
In this guide, we’ll discuss in detail the advantages of advanced NumPy array operations like np.concatenate(), np.stack(), np.hstack(), np.vstack(), and np.dstack(), as well as learn how to join NumPy arrays using these functions.
Advantages of Advanced NumPy Array Operations
Vectorization
Vectorization is a technique in NumPy where an operation is applied to an entire array in a single expression rather than to each element in an explicit Python loop. The element-by-element work still happens, but inside NumPy’s compiled code, leading to significant performance improvements.
Here is an example of vectorization in NumPy:
import numpy as np

# Creating an array
arr = np.array([1, 2, 3])

# Squaring each element in the array simultaneously using vectorization
res = arr ** 2
print(res)
Here, vectorization lets NumPy square all the elements with one expression instead of a Python loop that handles them individually.
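To make the contrast with an explicit loop concrete, the following sketch computes the same squares both ways. The names squares_loop and squares_vectorized are illustrative and do not come from the example above.

```python
import numpy as np

arr = np.array([1, 2, 3])

# Explicit Python loop: one element at a time
squares_loop = np.array([x ** 2 for x in arr])

# Vectorized: a single expression over the whole array
squares_vectorized = arr ** 2

print(squares_loop)        # [1 4 9]
print(squares_vectorized)  # [1 4 9]
print(np.array_equal(squares_loop, squares_vectorized))  # True
```

Both versions give the same values. The vectorized form is shorter and, on large arrays, is typically much faster.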
Broadcasting
Broadcasting allows NumPy to perform element-wise operations on arrays of different shapes. Following a fixed set of rules, the smaller array is treated as if it were expanded to match the shape of the larger one, without actually copying its data. This simplifies code and, because it avoids explicit Python loops, makes numerical calculations much faster.
Here is an example of broadcasting in NumPy:
import numpy as np

# Creating a 2x2 array
arr = np.array([[1, 2], [3, 4]])

# Adding a scalar (2) to the array
# The dimensions of this scalar will be increased to 2x2 to match the array dimensions
res = arr + 2
print(res)
Here, the scalar (2) is broadcast, i.e., treated as a 2x2 array in which every element equals 2, to match the dimensions of the original array. The result is the same as adding that expanded array to the original array.
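Broadcasting is not limited to scalars. As a small sketch, the example below adds a 1D array to a 2D array; the names matrix and row are illustrative only.

```python
import numpy as np

matrix = np.array([[1, 2, 3],
                   [4, 5, 6]])   # shape (2, 3)
row = np.array([10, 20, 30])     # shape (3,)

# row is treated as shape (1, 3) and stretched across both rows of matrix
res = matrix + row
print(res)
# [[11 22 33]
#  [14 25 36]]
print(res.shape)  # (2, 3)
```

This works because the trailing dimensions of the two arrays match (both are 3). A 1D array of length 2 would not broadcast against this matrix.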
Memory Efficiency
Memory efficiency is another key advantage of NumPy arrays. Unlike Python lists, which can store elements of different data types, NumPy arrays store all the elements of the same data type in contiguous blocks of memory, allowing for more efficient memory access and reduced overhead.
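The fixed element type can be inspected directly. The sketch below sets the dtype explicitly (an assumption made here so the sizes do not depend on the platform) and reads a few standard array attributes.

```python
import numpy as np

# dtype is set explicitly so the sizes below do not depend on the platform
arr = np.array([1, 2, 3, 4], dtype=np.int64)

print(arr.dtype)                  # int64
print(arr.itemsize)               # 8 (bytes per element)
print(arr.nbytes)                 # 32 (4 elements * 8 bytes)
print(arr.flags["C_CONTIGUOUS"])  # True

# Mixed input does not produce a mixed array: values share one common dtype
mixed = np.array([1, 2.5])
print(mixed.dtype)                # float64
```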
These advantages make NumPy the preferred choice for performing advanced array operations. Next, let’s learn how to perform joining, an advanced array operation in NumPy.
How to Join Two NumPy Arrays?
In NumPy, array joining is the process of combining the contents of multiple arrays into a single array.
There are two main functions for joining arrays in NumPy:
np.concatenate()
np.stack()
Let’s discuss them one-by-one.
Using the np.concatenate() Function
The np.concatenate() function joins NumPy arrays along an existing axis, meaning the result has the same number of dimensions as the inputs. For 2D arrays, that means either adding rows (joining vertically) or adding columns (joining horizontally). To understand how it works, let’s see the following example:
import numpy as np

# Creating two NumPy arrays
arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])

# Joining the arrays
arr3 = np.concatenate((arr1, arr2), axis=0)
print(arr3)
In the above code, we first created two NumPy arrays using the np.array() function. Then, we passed those arrays in a tuple as the first argument and the axis for joining as the second argument to np.concatenate().
Here, the axis parameter selects the existing dimension along which the arrays are joined. For 2D arrays, axis=0 joins along the first dimension, so the rows of the second array are placed below the rows of the first (a vertical join). With axis=1, the arrays are joined along the second dimension, so the second array’s columns are placed to the right of the first array’s (a horizontal join). If axis is not provided, the default value 0 is used.
However, in the above code, the input arrays are 1D, so they can only be joined along their single dimension (axis=0). If we set axis to 1, we get an error because axis=1 refers to a second dimension, which 1D arrays don’t have.
Let’s check out the output for the above code:
[12 23 34 45 56 67]
Next, let’s set the axis parameter to 1 in the np.concatenate() function, this time with 2D input arrays, and observe the output:
import numpy as np

# Creating two arrays
arr1 = np.array([[11, 12], [13, 14]])
arr2 = np.array([[15, 16], [17, 18]])

# Joining the NumPy arrays
arr3 = np.concatenate((arr1, arr2), axis=1)
print(arr3)
The output is the following:
[[11 12 15 16]
 [13 14 17 18]]
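The effect of axis is easiest to see in the shapes of the results. The sketch below uses illustrative names (a, b, x, y) and also shows the failure for 1D inputs. It catches ValueError on the assumption that the axis error NumPy raises is a subclass of it.

```python
import numpy as np

a = np.array([[11, 12], [13, 14]])   # shape (2, 2)
b = np.array([[15, 16], [17, 18]])   # shape (2, 2)

# axis=0 appends rows; axis=1 appends columns
rows_joined = np.concatenate((a, b), axis=0)
cols_joined = np.concatenate((a, b), axis=1)
print(rows_joined.shape)  # (4, 2)
print(cols_joined.shape)  # (2, 4)

# 1D arrays have only axis 0, so asking for axis=1 fails
x = np.array([12, 23, 34])
y = np.array([45, 56, 67])
try:
    np.concatenate((x, y), axis=1)
    failed = False
except ValueError as err:  # assumption: NumPy's AxisError subclasses ValueError
    failed = True
    print("axis=1 rejected:", err)
```

Only the dimension being joined changes size. All other dimensions of the inputs must already match.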
Using the np.stack() Function
Another function for joining NumPy arrays is np.stack(). It works much like np.concatenate(), with one key difference: np.stack() joins arrays along a new axis, whereas np.concatenate() joins them along an existing axis.
Here’s how the np.stack() function works:
import numpy as np

# Creating two NumPy arrays
arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])

# Joining the arrays
arr3 = np.stack((arr1, arr2), axis=0)
print(arr3)
With 1D inputs such as these, when axis is set to 0 in np.stack(), the existing arrays become the rows of the new array. When axis is set to 1, the existing arrays become its columns. In both cases, np.stack() creates a new array with one more dimension than the input arrays, and the axis value determines where that new dimension is inserted. All input arrays must have the same shape.
Here is the output:
[[12 23 34]
 [45 56 67]]
Below is another example with axis set to 1:
import numpy as np

# Creating two NumPy arrays
arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])

# Joining the arrays
arr3 = np.stack((arr1, arr2), axis=1)
print(arr3)
The output is the following:
[[12 45]
 [23 56]
 [34 67]]
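Comparing result shapes makes the difference between the two functions explicit. This is a minimal sketch using the same 1D arrays. The flag same_shape_required is a hypothetical name introduced only for illustration.

```python
import numpy as np

arr1 = np.array([12, 23, 34])   # shape (3,)
arr2 = np.array([45, 56, 67])   # shape (3,)

print(np.concatenate((arr1, arr2)).shape)     # (6,)   still 1D
print(np.stack((arr1, arr2), axis=0).shape)   # (2, 3) new leading axis
print(np.stack((arr1, arr2), axis=1).shape)   # (3, 2) new trailing axis

# np.stack needs every input to have the same shape
try:
    np.stack((arr1, np.array([1, 2])))
    same_shape_required = False
except ValueError as err:
    same_shape_required = True
    print("stack rejected:", err)
```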
Using Stacking Functions
Besides np.concatenate() and np.stack(), NumPy provides stacking functions such as np.hstack(), np.vstack(), and np.dstack(), which can also be used to join NumPy arrays.
Let’s start with the np.hstack() function, which joins NumPy arrays horizontally, i.e., column-wise. For 1D inputs, this is the same as joining them end to end:
import numpy as np

# Creating two NumPy arrays
arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])

# Joining the arrays
arr3 = np.hstack((arr1, arr2))
print(arr3)
Here is the output:
[12 23 34 45 56 67]
The np.vstack() function joins NumPy arrays vertically, i.e., row-wise. 1D inputs are first treated as single rows:
import numpy as np

# Creating two NumPy arrays
arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])

# Joining the arrays
arr3 = np.vstack((arr1, arr2))
print(arr3)
The output is the following:
[[12 23 34]
 [45 56 67]]
Lastly, the np.dstack() function joins NumPy arrays along the depth, i.e., the third axis. 1D inputs of shape (N,) are first treated as arrays of shape (1, N, 1):
import numpy as np

# Creating two NumPy arrays
arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])

# Joining the arrays
arr3 = np.dstack((arr1, arr2))
print(arr3)
Here is the output:
[[[12 45]
  [23 56]
  [34 67]]]
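The three stacking functions are easier to tell apart with 2D inputs. The following sketch, with illustrative arrays a and b, checks each one against an equivalent np.concatenate() or np.stack() call. These equivalences are shown for 2D inputs only.

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])   # shape (2, 2)
b = np.array([[5, 6], [7, 8]])   # shape (2, 2)

h = np.hstack((a, b))   # side by side
v = np.vstack((a, b))   # one on top of the other
d = np.dstack((a, b))   # paired along a new third axis

print(h.shape, v.shape, d.shape)  # (2, 4) (4, 2) (2, 2, 2)

# For 2D inputs, each helper matches a concatenate/stack call
print(np.array_equal(h, np.concatenate((a, b), axis=1)))  # True
print(np.array_equal(v, np.concatenate((a, b), axis=0)))  # True
print(np.array_equal(d, np.stack((a, b), axis=2)))        # True
```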
Concept Review and Next Steps
In this article, we have navigated through a range of topics, including:
- What NumPy arrays are
- The advantages of advanced NumPy array operations
- How to perform array joining in NumPy
Advanced NumPy array operations like joining are crucial in data workflows as they help us solve a diverse range of challenging problems seamlessly. By mastering these operations, we’ll be better equipped to tackle complex data analysis and scientific computing tasks in an efficient manner.
If you want to learn about splitting arrays, check out this article on Codecademy.
Author
'The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.'