Joining Arrays in NumPy for Beginners

Learn how to join NumPy arrays using functions like np.concatenate(), np.stack(), np.hstack(), np.vstack(), and np.dstack(). A beginner-friendly guide with examples.

Getting Started with NumPy Arrays

NumPy is a powerful scientific computing library in Python. It’s popular among developers for its efficient handling of multi-dimensional arrays, a fundamental data structure for performing numerical computations.

There are several advanced array operations in NumPy, such as joining, splitting, and sorting. They support data analysis and scientific computing tasks such as statistical data filtering, numerical simulations, and data modeling. Because NumPy relies on vectorization, broadcasting, and an efficient memory layout, these operations typically run much faster than equivalent Python loops over lists.

In this guide, we’ll discuss what makes advanced NumPy array operations efficient and then learn how to join NumPy arrays using np.concatenate(), np.stack(), np.hstack(), np.vstack(), and np.dstack().

Advantages of Advanced NumPy Array Operations

Vectorization

Vectorization is a technique in NumPy where an operation is expressed on an entire array rather than on each element individually. Instead of a Python loop iterating over each element, NumPy runs the loop in optimized, compiled code, leading to significant performance improvements.

Here is an example of vectorization in NumPy:

import numpy as np
# Creating an array
arr = np.array([1, 2, 3])
# Squaring every element with a single vectorized expression
res = arr ** 2
print(res)

Here, vectorization lets NumPy square all the elements with a single expression instead of an explicit Python loop over each element.
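
To make the contrast concrete, here is a minimal sketch that computes the same squares twice, once with an explicit loop and once with the vectorized expression. The variable names looped and vectorized are illustrative only and are not part of NumPy:

```python
import numpy as np

arr = np.array([1, 2, 3])

# Explicit loop: each element is squared one at a time by the Python interpreter
looped = np.array([x ** 2 for x in arr])

# Vectorized form: one expression, with the loop running inside NumPy's compiled code
vectorized = arr ** 2

print(looped)      # [1 4 9]
print(vectorized)  # [1 4 9]
print(np.array_equal(looped, vectorized))  # True
```

Both approaches give the same result; the vectorized version is shorter and, on large arrays, usually much faster.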

Broadcasting

Broadcasting allows NumPy to perform element-wise operations on arrays of different shapes. Following a fixed set of rules, NumPy treats the smaller array as if it were expanded to match the shape of the larger one, which simplifies code. Since it avoids explicit Python loops, which are slow, broadcasting also makes numerical calculations much faster.

Here is an example of broadcasting in NumPy:

import numpy as np
# Creating a 2x2 array
arr = np.array([[1, 2], [3, 4]])
# Adding a scalar (2) to the array
# The scalar is treated as if it were a 2x2 array to match the array dimensions
res = arr + 2
print(res)

Here, the scalar (2) is broadcast, that is, treated as a 2x2 array with all elements equal to 2, to match the dimensions of the original array. The two are then added element by element to get the result. NumPy does this without actually building the expanded array in memory.
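
Broadcasting is not limited to scalars. The following sketch, which assumes a 1D array named row (an illustrative name), adds it to every row of a 2D array and uses np.broadcast_to() to show the shape the smaller array is conceptually stretched to:

```python
import numpy as np

arr = np.array([[1, 2], [3, 4]])  # shape (2, 2)
row = np.array([10, 20])          # shape (2,)

# The 1D array is treated as shape (1, 2) and repeated for each row of arr
res = arr + row
print(res)
# [[11 22]
#  [13 24]]

# np.broadcast_to shows what the smaller array looks like once expanded
expanded = np.broadcast_to(row, arr.shape)
print(expanded)
# [[10 20]
#  [10 20]]
```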

Memory Efficiency

Memory efficiency is another key advantage of NumPy arrays. Unlike Python lists, which can store elements of different data types, a NumPy array requires all of its elements to share a single data type and typically stores them in one contiguous block of memory, allowing for more efficient memory access and reduced overhead.
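
These properties can be inspected directly on an array. The sketch below explicitly requests the int64 data type so that the sizes are the same on every platform; the list named mixed is only an illustration of what a Python list allows:

```python
import numpy as np

arr = np.array([1, 2, 3], dtype=np.int64)

print(arr.dtype)                  # int64 (one data type for every element)
print(arr.itemsize)               # 8 (bytes per element)
print(arr.nbytes)                 # 24 (3 elements x 8 bytes)
print(arr.flags['C_CONTIGUOUS'])  # True (stored as one contiguous block)

# A Python list, by contrast, can mix element types freely
mixed = [1, "two", 3.0]
print([type(x).__name__ for x in mixed])  # ['int', 'str', 'float']
```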

These advantages make NumPy the preferred choice for performing advanced array operations. Next, let’s learn how to perform joining, an advanced array operation in NumPy.

How to Join Two NumPy Arrays?

In NumPy, array joining is the process of combining the contents of two or more arrays into a single array.

The two main functions for joining arrays in NumPy are:

  • np.concatenate()
  • np.stack()

Let’s discuss them one-by-one.

Using the np.concatenate() Function

The np.concatenate() function joins two or more NumPy arrays along an existing axis. For a 2D array, axis 0 runs down the rows (vertically) and axis 1 runs across the columns (horizontally). To understand how it works, let’s see the following example:

import numpy as np
# Creating two NumPy arrays
arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])
# Joining the arrays
arr3 = np.concatenate((arr1, arr2), axis=0)
print(arr3)

In the above code, we first created two NumPy arrays using the np.array() function. Then, we passed those arrays in a tuple as the first argument and the axis for joining as the second argument to the np.concatenate() function.

Here, the axis parameter selects the dimension along which the arrays are joined. For 2D arrays, axis=0 places the second array below the first (vertically, so the result has more rows), and axis=1 places it to the right of the first (horizontally, so the result has more columns). The arrays must have the same shape in every dimension except the one being joined. If axis is not provided, the default value 0 is used.

However, in the above code, the input arrays are 1D, so they have only one dimension (axis=0) and can only be joined end to end. If we set axis to 1, NumPy raises an error because axis=1 refers to a second dimension, which doesn’t exist for 1D arrays.

Let’s check out the output for the above code:

[12 23 34 45 56 67]
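
The failure for 1D inputs with axis=1 can be reproduced with a short sketch. It catches ValueError and IndexError because the AxisError that NumPy raises is a subclass of both; the failed flag is an illustrative name used only to record the outcome:

```python
import numpy as np

arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])

failed = False
try:
    # 1D arrays have only axis 0, so asking for axis 1 is invalid
    np.concatenate((arr1, arr2), axis=1)
except (ValueError, IndexError) as err:
    # NumPy's AxisError inherits from both ValueError and IndexError
    failed = True
    print(type(err).__name__, "-", err)

print(failed)  # True
```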

Next, let’s set the axis parameter to 1 in the np.concatenate() function, this time with two 2D arrays, and observe the output:

import numpy as np
# Creating two arrays
arr1 = np.array([[11, 12], [13, 14]])
arr2 = np.array([[15, 16], [17, 18]])
# Joining the NumPy arrays
arr3 = np.concatenate((arr1, arr2), axis=1)
print(arr3)

The output is the following:

[[11 12 15 16]
 [13 14 17 18]]
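
For comparison, here is a brief sketch that joins the same two 2D arrays along both axes and prints the resulting shapes. The names rows_joined and cols_joined are illustrative:

```python
import numpy as np

arr1 = np.array([[11, 12], [13, 14]])
arr2 = np.array([[15, 16], [17, 18]])

# axis=0: arr2 is placed below arr1, so the row count grows
rows_joined = np.concatenate((arr1, arr2), axis=0)
# axis=1: arr2 is placed to the right of arr1, so the column count grows
cols_joined = np.concatenate((arr1, arr2), axis=1)

print(rows_joined)
# [[11 12]
#  [13 14]
#  [15 16]
#  [17 18]]
print(rows_joined.shape)  # (4, 2)
print(cols_joined.shape)  # (2, 4)
```

In both cases the result is still 2D; only the length of the joined axis changes.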

Using the np.stack() Function

Another function for joining NumPy arrays is np.stack(). It is similar to np.concatenate(), with one key difference: np.stack() joins arrays along a new axis, whereas np.concatenate() joins them along an existing axis. Because of this, all input arrays passed to np.stack() must have the same shape.

Here’s how the np.stack() function works:

import numpy as np
# Creating two NumPy arrays
arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])
# Joining the arrays
arr3 = np.stack((arr1, arr2), axis=0)
print(arr3)

Unlike np.concatenate(), np.stack() creates a new array with one more dimension than the input arrays, and the axis parameter sets the position of that new dimension. For the 1D inputs used here, when axis is set to 0, the input arrays become the rows of the new 2D array. When axis is set to 1, they become its columns.

Here is the output:

[[12 23 34]
 [45 56 67]]

Below is another example with axis set to 1:

import numpy as np
# Creating two NumPy arrays
arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])
# Joining the arrays
arr3 = np.stack((arr1, arr2), axis=1)
print(arr3)

The output is the following:

[[12 45]
 [23 56]
 [34 67]]
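
The difference between joining along an existing axis and a new axis shows up most clearly in the shapes. This sketch compares them and also checks, under the illustrative name manual, that stacking along axis 0 matches adding a leading axis to each input and then concatenating:

```python
import numpy as np

arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])

# concatenate keeps the number of dimensions; stack adds one
print(np.concatenate((arr1, arr2)).shape)    # (6,)
print(np.stack((arr1, arr2), axis=0).shape)  # (2, 3)
print(np.stack((arr1, arr2), axis=1).shape)  # (3, 2)

# Give each input a new leading axis, then join along that axis
manual = np.concatenate((arr1[np.newaxis, :], arr2[np.newaxis, :]), axis=0)
print(np.array_equal(manual, np.stack((arr1, arr2), axis=0)))  # True
```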

Using Stacking Functions

Besides the np.concatenate() and np.stack() functions, there are some stacking functions as well, such as np.hstack(), np.vstack(), and np.dstack(), which can also be used to join two NumPy arrays.

Let’s start with the np.hstack() function, which joins NumPy arrays horizontally (column-wise). For 1D arrays, this simply joins them end to end:

import numpy as np
# Creating two NumPy arrays
arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])
# Joining the arrays
arr3 = np.hstack((arr1, arr2))
print(arr3)

Here is the output:

[12 23 34 45 56 67]

The np.vstack() function joins NumPy arrays vertically (row-wise). Each 1D input is treated as a single row:

import numpy as np
# Creating two NumPy arrays
arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])
# Joining the arrays
arr3 = np.vstack((arr1, arr2))
print(arr3)

The output is the following:

[[12 23 34]
 [45 56 67]]

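Lastly, the np.dstack() function joins NumPy arrays depth-wise, that is, along the third axis. Each 1D input of length N is first reshaped to (1, N, 1), so the result here has shape (1, 3, 2):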

import numpy as np
# Creating two NumPy arrays
arr1 = np.array([12, 23, 34])
arr2 = np.array([45, 56, 67])
# Joining the arrays
arr3 = np.dstack((arr1, arr2))
print(arr3)

Here is the output:

[[[12 45]
  [23 56]
  [34 67]]]
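
The three stacking functions are easier to tell apart with 2D inputs. The sketch below uses two small illustrative arrays, a and b, prints the resulting shapes, and checks each result against the equivalent np.concatenate() or np.stack() call for 2D inputs:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
b = np.array([[5, 6], [7, 8]])

h = np.hstack((a, b))  # side by side
v = np.vstack((a, b))  # one on top of the other
d = np.dstack((a, b))  # paired along a new third axis

print(h.shape)  # (2, 4)
print(v.shape)  # (4, 2)
print(d.shape)  # (2, 2, 2)

# For 2D inputs, each function matches a call with an explicit axis
print(np.array_equal(h, np.concatenate((a, b), axis=1)))  # True
print(np.array_equal(v, np.concatenate((a, b), axis=0)))  # True
print(np.array_equal(d, np.stack((a, b), axis=2)))        # True
```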

Concept Review and Next Steps

In this article, we have navigated through a range of topics, including:

  • What NumPy arrays are
  • The advantages of advanced NumPy array operations
  • How to perform array joining in NumPy

Advanced NumPy array operations like joining are crucial in data workflows as they help us solve a diverse range of challenging problems seamlessly. By mastering these operations, we’ll be better equipped to tackle complex data analysis and scientific computing tasks in an efficient manner.

If you want to learn about splitting arrays, check out this article on Codecademy.

Author

Codecademy Team
