Sorting and Unary Operations in NumPy

Explore sorting and unary operations in NumPy arrays with examples for single and multi-dimensional data.

What is sorting in NumPy?

Sorting in NumPy involves arranging elements in an array in a particular order, either ascending or descending. This helps organize data for more efficient analysis and processing.

To sort in NumPy, we can use the numpy.sort() function. By default, it returns a sorted copy of the array in ascending order. The function has no descending option, but the same result can be achieved by sorting in ascending order and then reversing the result. We can also choose the sorting algorithm, sort structured arrays by named fields, and specify the axis when working with multi-dimensional arrays.

How to sort arrays in NumPy?

The numpy.sort() function allows us to sort arrays by specifying parameters such as the axis to sort along, the sorting algorithm (kind), and, for structured arrays, the fields to sort by (order). The syntax of the function is as follows:

numpy.sort(a, axis=-1, kind=None, order=None)
  • a: The input array that is to be sorted.
  • axis: The axis along which to sort. In a 2D array, axis=0 sorts the values down each column, while axis=1 sorts the values across each row. The default value is -1, which sorts along the last axis: the elements of each row in a 2D array, or the innermost dimension in a 3D array. Passing axis=None flattens the array before sorting.
  • kind: The sorting algorithm to use. Options are ‘quicksort’, ‘mergesort’, ‘heapsort’, and ‘stable’. The default is ‘quicksort’.
  • order: For structured arrays only, the field(s) to sort by. You can pass one field name as a string or several as a list of strings. Fields that are not listed are still used, in dtype order, to break ties.
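As a minimal sketch of the kind and order parameters, the following sorts a small structured array by its fields and a plain array with a stable algorithm. The records and the field names "name" and "age" are made up for illustration.

```python
import numpy as np

# Hypothetical records; the field names "name" and "age" are illustrative only
people = np.array(
    [("Bob", 30), ("Charlie", 25), ("Alice", 25)],
    dtype=[("name", "U10"), ("age", "i4")],
)

# order: sort by age, breaking ties by name
by_age_then_name = np.sort(people, order=["age", "name"])
print(by_age_then_name)

# kind: request a stable algorithm so equal elements keep their relative order
values = np.array([3, 1, 2, 1])
stable_sorted = np.sort(values, kind="stable")
print(stable_sorted)
```

The order parameter only applies when the array has named fields; for plain numeric arrays it should be left at its default.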

Sorting a single-dimensional array

To sort a 1D array (similar to a Python list), we can use the numpy.sort() function. By default, this function sorts the array in ascending order.

import numpy as np
array = np.array([12, 3, 7, 5, 9])
sorted_array = np.sort(array) # Sorts in ascending order
print(sorted_array)

Output:

[ 3  5  7  9 12]

If the array needs to be sorted in descending order, the result can be reversed like so:

import numpy as np
array = np.array([3, 9, 4, 1, 5])
sorted_array = np.sort(array)[::-1] # Sorts in ascending order and reverses for descending
print(sorted_array)

Output:

[9 5 4 3 1]
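The reversal trick can be written in more than one way. The sketch below shows two options for a 1D array and one way to apply the same idea along the rows of a 2D array. The negation approach assumes signed numeric data.

```python
import numpy as np

array = np.array([3, 9, 4, 1, 5])

# Option 1: sort ascending, then reverse the result
desc_reversed = np.sort(array)[::-1]

# Option 2: negate, sort ascending, negate back (signed numeric data only)
desc_negated = -np.sort(-array)

# For a 2D array, reverse along the same axis that was sorted
array_2d = np.array([[9, 4, 2], [3, 8, 5]])
rows_desc = np.sort(array_2d, axis=1)[:, ::-1]

print(desc_reversed)
print(desc_negated)
print(rows_desc)
```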

Sorting a multi-dimensional array

When working with multi-dimensional arrays (such as 2D arrays), sorting can be done along specific axes - either rows or columns. As we saw in the syntax, the numpy.sort() function includes an axis parameter to control which dimension to sort.

Sorting rows in a 2D array

To sort the elements of each row in a 2D array, the axis parameter is set to 1 in the numpy.sort() function.

Example:

import numpy as np
array_2d = np.array([[9, 4, 2], [3, 8, 5]])
sorted_rows = np.sort(array_2d, axis=1)
print(sorted_rows)

Output:

[[2 4 9]
 [3 5 8]]

This sorts each row individually in ascending order.

Sorting columns in a 2D array

To sort the elements of each column in a 2D array, we need to change the axis parameter to 0. This sorts the values within each column, from top to bottom.

import numpy as np
array_2d = np.array([[9, 4, 2], [3, 8, 5]])
sorted_columns = np.sort(array_2d, axis=0)
print(sorted_columns)

Output:

[[3 4 2]
 [9 8 5]]

In this case, each column is sorted independently, so values can move between rows but always stay within their own column.
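Two further axis settings are worth a quick look: axis=None, which flattens the array before sorting, and the default axis=-1 on a 3D array. The arrays below are arbitrary examples.

```python
import numpy as np

array_2d = np.array([[9, 4, 2], [3, 8, 5]])

# axis=None flattens the array before sorting, giving a 1D result
flat_sorted = np.sort(array_2d, axis=None)
print(flat_sorted)

# The default axis=-1 sorts the innermost dimension of a 3D array
array_3d = np.array([[[3, 1, 2], [9, 7, 8]],
                     [[6, 5, 4], [0, 2, 1]]])
inner_sorted = np.sort(array_3d)
print(inner_sorted)
```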

Now that we have covered basic sorting, let’s explore advanced techniques for multi-criteria sorting and handling structured arrays in NumPy.

Advanced sorting in NumPy

Advanced sorting in NumPy provides efficient and flexible ways to organize data, especially for complex datasets. Let’s explore these advanced techniques.

Sorting arrays in-place in NumPy

In-place sorting modifies the original array directly, saving memory by avoiding the creation of a new sorted array. We can achieve this using the numpy.ndarray.sort() method.

In contrast, methods like numpy.sort() create a new sorted array, leaving the original unchanged and consuming extra memory. In-place sorting is more efficient for large datasets, as it reduces memory usage and enhances performance.

Example:

import numpy as np
array = np.array([5, 2, 9, 1, 5])
array.sort() # Sorts the array in-place
print(array)

Output:

[1 2 5 5 9]

Because no sorted copy is allocated, this method suits large arrays where memory is tight. Note that array.sort() returns None, so its result should not be assigned to a variable.
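The difference between the two approaches can be checked directly. This short sketch compares numpy.sort(), which returns a new array, with the ndarray.sort() method, which changes the array it is called on.

```python
import numpy as np

original = np.array([5, 2, 9, 1, 5])

# np.sort returns a new array and leaves the input untouched
sorted_copy = np.sort(original)
before = original.copy()  # snapshot showing the input is still unsorted
print(before)
print(sorted_copy)

# ndarray.sort sorts in place and returns None
result = original.sort()
print(result)
print(original)
```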

Sorting an array using numpy.argsort()

The numpy.argsort() function returns the indices of the elements in an array that would sort the array in ascending order. This allows us to reorder the data based on these indices without changing the original array.

import numpy as np
array = np.array([5, 2, 9, 1, 5])
sorted_indices = np.argsort(array)
print(sorted_indices)

Output:

[3 1 0 4 2]

We can use these indices to rank the elements in the original array from smallest to largest or to rearrange related arrays so that they match the sorted order of the original array.
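Both uses can be sketched with two parallel arrays. The names and scores here are hypothetical, and kind="stable" is passed so that ties, if any, keep their original order.

```python
import numpy as np

# Hypothetical parallel arrays: scores[i] belongs to names[i]
names = np.array(["Ana", "Ben", "Cleo", "Dev"])
scores = np.array([72, 95, 60, 88])

# Indices that would sort scores in ascending order
order = np.argsort(scores, kind="stable")

print(scores[order])  # scores from lowest to highest
print(names[order])   # names rearranged to match

# Applying argsort to the sort order gives the 0-based rank of each element
ranks = np.argsort(order)
print(ranks)
```

Here ranks[i] is the position that scores[i] would take in the sorted array, so the lowest score has rank 0.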

Sorting an array using numpy.lexsort()

To sort by multiple criteria, we can use numpy.lexsort(). It takes a sequence of keys, such as the fields of a structured array, and returns the indices that sort the data. A structured array is like a table where each column can hold a different type of data. The keys are listed from least to most significant: the last key is the primary sort key, and the earlier keys decide how to order items that are the same.

import numpy as np
# Creating a structured array with names and ages
data = np.array([("Alice", 25), ("Bob", 30), ("Charlie", 25)],
                dtype=[("name", "U10"), ("age", "i4")])
# The last key is the primary one, so this sorts by age first, then by name
sorted_indices = np.lexsort((data["name"], data["age"]))
# Rearranging the original data based on the sorted indices
sorted_data = data[sorted_indices]
# Printing the sorted structured array
print(sorted_data)

Output:

[('Alice', 25) ('Charlie', 25) ('Bob', 30)]

In this code, we create a structured array with names and ages. Because "age" is the last key passed to numpy.lexsort(), the array is sorted first by age and then by name, so Alice and Charlie (both 25) come before Bob (30). The U10 specifies that the “name” field can hold up to 10 Unicode characters, while i4 indicates that the “age” field is a 4-byte integer. The sorted indices are used to rearrange the original data, resulting in a new array organized according to the specified criteria.
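numpy.lexsort() also works on plain arrays, and its last key is always the primary one. The sketch below uses made-up names and ages, and negates the ages so that the primary key sorts in descending order.

```python
import numpy as np

# Hypothetical parallel arrays of names and ages
names = np.array(["Alice", "Bob", "Charlie", "Dana"])
ages = np.array([25, 30, 25, 30])

# Keys go from least to most significant: the LAST key is primary.
# Negating ages sorts the primary key in descending order.
idx = np.lexsort((names, -ages))

print(names[idx])
print(ages[idx])
```

The oldest people come first, and people of the same age are listed alphabetically.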

What are unary operations in NumPy?

Unary operations in NumPy take a single array as input and apply a calculation, such as a square root or an absolute value, to each of its elements. They enable quick and efficient data transformations.

Applying unary operations in NumPy

Performing mathematical operations on arrays

NumPy provides a variety of unary mathematical operations. Below are some common examples, including square roots, absolute values, squares, and rounding:

Example:

import numpy as np
array = np.array([-1, 0, 1, 4, 9])
# Performing unary operations
square_root = np.sqrt(array[array >= 0])
absolute_values = np.abs(array)
squared = np.square(array)
rounded = np.round(array)
ceiling = np.ceil(array)
floor = np.floor(array)
# Printing the results
print("Original Array:", array)
print("Square Root (non-negative):", square_root)
print("Absolute Values:", absolute_values)
print("Squared Values:", squared)
print("Rounded Values:", rounded)
print("Ceiling Values:", ceiling)
print("Floor Values:", floor)

Output:

Original Array: [-1 0 1 4 9]
Square Root (non-negative): [0. 1. 2. 3.]
Absolute Values: [1 0 1 4 9]
Squared Values: [ 1 0 1 16 81]
Rounded Values: [-1 0 1 4 9]
Ceiling Values: [-1. 0. 1. 4. 9.]
Floor Values: [-1. 0. 1. 4. 9.]

These unary operations allow us to perform essential mathematical transformations on the data efficiently, enabling a wide range of analytical tasks in data analysis and scientific computing.
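Rounding functions are easier to tell apart on fractional values. The following sketch uses an arbitrary float array to show how np.round(), np.ceil(), np.floor(), and np.trunc() differ.

```python
import numpy as np

values = np.array([-1.5, -0.2, 0.5, 1.5, 2.7])

rounded = np.round(values)   # halves round to the nearest even value
ceiling = np.ceil(values)    # smallest integer >= each value
floor = np.floor(values)     # largest integer <= each value
truncated = np.trunc(values) # drops the fractional part

print("Rounded:", rounded)
print("Ceiling:", ceiling)
print("Floor:", floor)
print("Truncated:", truncated)
```

Note that np.round() rounds 0.5 to 0 and 1.5 to 2, because values exactly halfway between two integers go to the even one.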

Performing trigonometric functions

Trigonometric functions compute the sine, cosine, and tangent of an angle, which is useful for various scientific and engineering applications. The common trigonometric functions in NumPy are np.sin(), np.cos(), and np.tan(), all of which expect angles in radians.

Example:

import numpy as np
# Creating a NumPy array of angles in radians
angles = np.array([0, np.pi/4, np.pi/2, np.pi, 3*np.pi/2])
sine_values = np.sin(angles)
cosine_values = np.cos(angles)
tangent_values = np.tan(angles)
print("Angles (radians):", angles)
print("Sine Values:", sine_values)
print("Cosine Values:", cosine_values)
print("Tangent Values:", tangent_values)

Output:

Angles (radians): [0.         0.78539816 1.57079633 3.14159265 4.71238898] 
Sine Values: [ 0.00000000e+00  7.07106781e-01  1.00000000e+00  1.22464680e-16 
-1.00000000e+00] 
Cosine Values: [ 1.00000000e+00  7.07106781e-01  6.12323400e-17 -1.00000000e+00 
-1.83697020e-16] 
Tangent Values: [ 0.00000000e+00  1.00000000e+00  1.63312394e+16 -1.22464680e-16 5.44374645e+15] 
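Values such as 1.22464680e-16 in this output are floating-point approximations of zero, and the very large tangent values stand in for points where the tangent is undefined. A minimal sketch of handling this, assuming the angles start out in degrees, converts them with np.deg2rad() and compares results using a tolerance:

```python
import numpy as np

# Angles given in degrees, converted to radians before use
degrees = np.array([0, 90, 180])
radians = np.deg2rad(degrees)

sine_values = np.sin(radians)
print(sine_values)  # the sine of 180 degrees prints as a tiny number, not 0

# Compare against the exact values with a tolerance instead of ==
matches = np.isclose(sine_values, [0, 1, 0])
print(matches)
```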

Using exponential and logarithmic functions on arrays

Exponential functions in NumPy

The exponential function is fundamental in various fields, including finance, physics, and biology. In NumPy, we can calculate the exponential of all elements in an array using the np.exp() function.

Example:

import numpy as np
array = np.array([0, 1, 2, 3, 4])
# Calculating the exponential of each element
exponential_values = np.exp(array)
print("Original Array:", array)
print("Exponential Values:", exponential_values)

Output:

Original Array: [0 1 2 3 4]
Exponential Values: [ 1. 2.71828183 7.3890561 20.08553692 54.59815003]
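As one illustration of the finance use case, np.exp() can compute continuously compounded growth for several time points at once. The principal, rate, and years below are hypothetical inputs chosen only for this sketch.

```python
import numpy as np

# Hypothetical inputs: principal of 1000, 5% annual rate, 0 to 3 years
principal = 1000.0
rate = 0.05
years = np.array([0, 1, 2, 3])

# Continuous compounding: value = principal * e^(rate * years)
values = principal * np.exp(rate * years)
print(values)
```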

Logarithmic functions in NumPy

Logarithmic functions compute values that help scale large numbers and analyze exponential growth. In NumPy, the np.log() function applies logarithmic transformations to arrays, calculating the natural logarithm (base e), while np.log10() is used for base 10 logarithms.

Example:

import numpy as np
array = np.array([1, 10, 100, 1000])
# Calculating the natural logarithm of each element
logarithmic_values = np.log(array)
print("Original Array:", array)
print("Logarithmic Values:", logarithmic_values)

Output:

Original Array: [ 1 10 100 1000]
Logarithmic Values: [0. 2.30258509 4.60517019 6.90775528]
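The other bases work the same way. This sketch applies np.log10() and np.log2() to small arrays and checks that np.exp() undoes np.log() up to floating-point rounding.

```python
import numpy as np

array = np.array([1, 10, 100, 1000])

# Base 10 and base 2 logarithms
log10_values = np.log10(array)
log2_values = np.log2(np.array([1, 2, 4, 8]))
print("Base 10:", log10_values)
print("Base 2:", log2_values)

# np.exp and np.log are inverses, up to floating-point rounding
round_trip_ok = np.allclose(np.exp(np.log(array)), array)
print(round_trip_ok)
```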

Why are sorting and unary operations efficient in NumPy?

Why is sorting faster in NumPy than Python’s built-in functions?

For large arrays of numeric data, NumPy’s sorting is typically much faster than Python’s built-in sorted() and list.sort(). Here’s why:

  • Optimized Algorithms: NumPy’s default ‘quicksort’ is an introsort, a quicksort variant that switches to heapsort to avoid worst-case behavior, while the ‘stable’ option uses radix sort or Timsort depending on the data type. Because every element has the same type, these routines compare raw machine values rather than Python objects.
  • Contiguous Memory Allocation: NumPy arrays store their values in a contiguous block of memory, which makes access cache-friendly. A Python list instead stores references to separate objects that can be scattered across memory.
  • Low-Level Implementation: NumPy is implemented in C, allowing for lower-level optimizations that enhance performance, especially for large datasets.
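One way to check these points on your own machine is a rough timing comparison. The sketch below uses a hypothetical workload of 200,000 random floats; the timings depend on hardware and library versions, so no particular numbers are claimed.

```python
import random
import timeit

import numpy as np

# Hypothetical workload: 200,000 random floats in a list and in an array
random.seed(0)
py_list = [random.random() for _ in range(200_000)]
np_array = np.array(py_list)

# Time a few runs of each and keep the best one
list_time = min(timeit.repeat(lambda: sorted(py_list), number=1, repeat=3))
numpy_time = min(timeit.repeat(lambda: np.sort(np_array), number=1, repeat=3))

print(f"sorted(list):   {list_time:.4f} s")
print(f"np.sort(array): {numpy_time:.4f} s")
```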

What makes unary operations in NumPy more efficient?

NumPy’s unary operations are built to carry out calculations on entire arrays at once instead of on individual elements. This approach offers several advantages:

  • Vectorization: Instead of looping over elements in Python, NumPy applies the operation to the whole array in a single call to compiled code. This removes per-element interpreter overhead and speeds up calculations.
  • Memory Efficiency: NumPy arrays store values in a compact, fixed-size format. A unary operation returns a new array by default, but its out parameter lets it write into an existing array and avoid an extra allocation.
  • Parallel Processing: Many of NumPy’s element-wise loops use SIMD instructions, which process several values per CPU instruction on modern processors.
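The sketch below contrasts an element-by-element Python loop with a single vectorized call, and shows the out parameter that unary functions such as np.sqrt() accept for writing into an existing array. The input values are arbitrary.

```python
import math

import numpy as np

values = np.linspace(0.0, 10.0, 5)

# Element-by-element loop in Python
looped = np.array([math.sqrt(v) for v in values])

# Vectorized: one call handles the whole array
vectorized = np.sqrt(values)
same_result = np.allclose(looped, vectorized)
print(same_result)

# Unary functions return a new array by default; out= reuses an existing one
buffer = np.empty_like(values)
np.sqrt(values, out=buffer)
print(buffer)
```

Both approaches give the same numbers; the vectorized call simply avoids running the loop in the Python interpreter.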

The efficiency of sorting and unary operations in NumPy makes it an essential tool for data manipulation and analysis.

Wrapping up

In this article, we explored NumPy’s powerful sorting and unary operations features, highlighting its significance in efficient data manipulation. NumPy is an essential library for numerical computing in Python, offering robust tools for sorting arrays and performing fast element-wise calculations. Its superior performance compared to Python’s built-in functions makes it the go-to choice for handling large datasets, making your data analysis tasks smoother and more efficient.

To take your skills to the next level, explore additional NumPy functionalities like Ufuncs in NumPy.

Author

Codecademy Team

