Sorting and Unary Operations in NumPy
What is sorting in NumPy?
Sorting in NumPy involves arranging elements in an array in a particular order, either ascending or descending. This helps organize data for more efficient analysis and processing.
To sort in NumPy, we can use the numpy.sort() function. By default, it sorts an array in ascending order. The function has no option for descending order, but the same result can be achieved by sorting in ascending order and then reversing the result. We can also choose the axis to sort along when working with multi-dimensional arrays, or name the fields to compare when sorting structured arrays.
How to sort arrays in NumPy?
The numpy.sort() function allows us to sort arrays by specifying parameters such as the axis to sort along (axis), the sorting algorithm (kind), and the fields to compare in a structured array (order). The syntax of the function is as follows:
numpy.sort(arr, axis=-1, kind=None, order=None)
- arr: The input array that is to be sorted.
- axis: Determines the axis along which the sorting will occur. In a 2D array, axis=0 sorts the values within each column, while axis=1 sorts the values within each row. The default value is -1, which sorts along the last axis, meaning the most nested level of the array. In a 2D array, that means each row is sorted, while in a 3D array it is the innermost dimension. Passing axis=None flattens the array before sorting.
- kind: The sorting algorithm to use. Options are 'quicksort', 'mergesort', 'heapsort', and 'stable'. The default is 'quicksort'.
- order: Specifies the field(s) to sort by when the array is a structured array with named fields. You can specify one or multiple field names as either a string or a list of strings.
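The kind and order parameters are easiest to see on small inputs. The following is a minimal sketch; the record names and scores are made up for illustration and are not from the article.

```python
import numpy as np

# Hypothetical records (name, score); the values are illustrative only.
records = np.array(
    [("Dana", 82), ("Amir", 95), ("Caro", 82)],
    dtype=[("name", "U10"), ("score", "i4")],
)

# order= names the fields to compare: "score" first, then "name" for ties.
by_score = np.sort(records, order=["score", "name"])
print(by_score["name"])  # ['Caro' 'Dana' 'Amir']

# kind="stable" requests a sort that keeps equal elements in their
# original relative order.
values = np.array([3, 1, 2, 1])
stable_sorted = np.sort(values, kind="stable")
print(stable_sorted)  # [1 1 2 3]
```

Here the two records with a score of 82 are ordered by name because "name" is listed second in order.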
Sorting a single-dimensional array
To sort a 1D array (similar to a Python list), we can use the numpy.sort() function. By default, this function sorts the array in ascending order.
import numpy as np

array = np.array([12, 3, 7, 5, 9])
sorted_array = np.sort(array)  # Sorts in ascending order
print(sorted_array)
Output:
[ 3  5  7  9 12]
If the array needs to be sorted in descending order, the result can be reversed like so:
import numpy as np

array = np.array([3, 9, 4, 1, 5])
sorted_array = np.sort(array)[::-1]  # Sorts in ascending order, then reverses for descending
print(sorted_array)
Output:
[9 5 4 3 1]
Sorting a multi-dimensional array
When working with multi-dimensional arrays (such as 2D arrays), sorting can be done along specific axes, either rows or columns. As we saw in the syntax, the numpy.sort() function includes an axis parameter to control which dimension to sort.
Sorting rows in a 2D array
To sort the elements of each row in a 2D array, the axis parameter is set to 1 in the numpy.sort() function.
Example:
import numpy as np

array_2d = np.array([[9, 4, 2], [3, 8, 5]])
sorted_rows = np.sort(array_2d, axis=1)
print(sorted_rows)
Output:
[[2 4 9]
 [3 5 8]]
This sorts each row individually in ascending order.
Sorting columns in a 2D array
To sort the elements of each column in a 2D array, we need to change the axis parameter to 0. This will sort column-wise across rows.
import numpy as np

array_2d = np.array([[9, 4, 2], [3, 8, 5]])
sorted_columns = np.sort(array_2d, axis=0)
print(sorted_columns)
Output:
[[3 4 2]
 [9 8 5]]
In this case, each column is sorted independently of the others, so values that started in the same row do not necessarily stay together.
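Two related cases follow from the axis parameter: sorting all values together, and sorting each row in descending order. The sketch below reuses the 2D array from the examples above and relies only on np.sort() and slicing.

```python
import numpy as np

array_2d = np.array([[9, 4, 2], [3, 8, 5]])

# axis=None flattens the array first, then sorts all values together.
flat_sorted = np.sort(array_2d, axis=None)
print(flat_sorted)  # [2 3 4 5 8 9]

# Descending order within each row: sort ascending along axis 1,
# then reverse the last axis with a slice.
rows_desc = np.sort(array_2d, axis=1)[:, ::-1]
print(rows_desc)
# [[9 4 2]
#  [8 5 3]]
```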
Now that we have covered basic sorting, let’s explore advanced techniques for multi-criteria sorting and handling structured arrays in NumPy.
Advanced sorting in NumPy
Advanced sorting in NumPy provides efficient and flexible ways to organize data, especially for complex datasets. Let’s explore these advanced techniques.
Sorting arrays in-place in NumPy
In-place sorting modifies the original array directly, avoiding the allocation of a new sorted array. We can achieve this using the numpy.ndarray.sort() method, which returns None rather than the sorted array.
In contrast, numpy.sort() returns a sorted copy and leaves the original unchanged, which requires additional memory. For large datasets, in-place sorting lowers peak memory usage.
Example:
import numpy as np

array = np.array([5, 2, 9, 1, 5])
array.sort()  # Sorts the array in-place
print(array)
Output:
[1 2 5 5 9]
Because the data is rearranged inside the existing array, no second copy is created. The trade-off is that the original order is lost.
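The difference between the two approaches can be checked directly. This short sketch contrasts np.sort(), which returns a copy, with ndarray.sort(), which changes the array and returns None.

```python
import numpy as np

original = np.array([5, 2, 9, 1, 5])

# np.sort() returns a new sorted array; the input is untouched.
copy_sorted = np.sort(original)
untouched = original.copy()
print(original)  # [5 2 9 1 5]

# ndarray.sort() sorts in place and returns None.
result = original.sort()
print(result)    # None
print(original)  # [1 2 5 5 9]
```

A common mistake is writing array = array.sort(), which replaces the array with None.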
Sorting an array using numpy.argsort()
The numpy.argsort() function returns the indices of the elements in an array that would sort the array in ascending order. This allows us to reorder the data based on these indices without changing the original array.
import numpy as np

array = np.array([5, 2, 9, 1, 5])
sorted_indices = np.argsort(array)
print(sorted_indices)
Output:
[3 1 0 4 2]
We can use these indices to rank the elements in the original array from smallest to largest or to rearrange related arrays so that they match the sorted order of the original array.
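As a sketch of reordering related arrays, the example below sorts a hypothetical names array by a parallel scores array. Both arrays and their values are assumptions made for illustration.

```python
import numpy as np

# Hypothetical parallel arrays: names[i] belongs to scores[i].
names = np.array(["Ana", "Ben", "Cy", "Dee"])
scores = np.array([88, 72, 95, 60])

# Indices that would sort the scores in ascending order.
order = np.argsort(scores)

# Index both arrays with the same indices to keep them aligned.
print(names[order])   # ['Dee' 'Ben' 'Ana' 'Cy']
print(scores[order])  # [60 72 88 95]

# Reversing the indices gives the highest score first.
top_first = names[order[::-1]]
print(top_first)      # ['Cy' 'Ana' 'Ben' 'Dee']
```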
Sorting an array using numpy.lexsort()
When sorting by multiple criteria, we can use numpy.lexsort(). It is often used with structured arrays. A structured array is like a table where each column (field) can hold a different type of data. numpy.lexsort() takes a sequence of keys and returns the indices that sort the data. The last key in the sequence is the primary sort key, and each earlier key decides how to order items that are equal on the keys that follow it.
import numpy as np

# Creating a structured array with names and ages
data = np.array(
    [("Alice", 25), ("Bob", 30), ("Charlie", 25)],
    dtype=[("name", "U10"), ("age", "i4")],
)

# The last key is the primary key: sort by age first, then by name
sorted_indices = np.lexsort((data["name"], data["age"]))

# Rearranging the original data based on the sorted indices
sorted_data = data[sorted_indices]

# Printing the sorted structured array
print(sorted_data)
Output:
[('Alice', 25) ('Charlie', 25) ('Bob', 30)]
In this code, we create a structured array with names and ages. Because data["age"] is passed last, numpy.lexsort() sorts first by age and then by name, so Alice and Charlie (both 25) come before Bob (30). The U10 specifies that the "name" field can hold up to 10 Unicode characters, while i4 indicates that the "age" field is a 4-byte integer. The sorted indices are used to rearrange the original data, resulting in a new array organized according to the specified criteria.
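One detail of numpy.lexsort() worth verifying is the order of the keys: the last key passed is the primary one. The sketch below shows this with two plain arrays whose values are invented for illustration.

```python
import numpy as np

# Hypothetical parallel arrays; the values are illustrative only.
ages = np.array([30, 25, 25, 30])
names = np.array(["Bob", "Zed", "Amy", "Al"])

# Last key (ages) is primary; names only break ties between equal ages.
by_age_then_name = np.lexsort((names, ages))
print(names[by_age_then_name])  # ['Amy' 'Zed' 'Al' 'Bob']

# Swapping the keys makes names primary instead.
by_name_then_age = np.lexsort((ages, names))
print(names[by_name_then_age])  # ['Al' 'Amy' 'Bob' 'Zed']
```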
What are unary operations in NumPy?
Unary operations in NumPy are fundamental tools that perform element-wise calculations on arrays, enabling quick and efficient data transformations.
Applying unary operations in NumPy
Performing mathematical operations on arrays
NumPy provides a variety of unary mathematical operations. Some common examples are:
- Square Root (np.sqrt()): Computes the square root of each element in the array.
- Absolute Value (np.abs()): Returns the absolute value of each element.
- Square (np.square()): Squares each element in the array.
- Round (np.round()): Rounds each element to the nearest integer by default. Values exactly halfway between two integers are rounded to the nearest even integer.
- Ceiling (np.ceil()): Rounds each element up to the nearest integer.
- Floor (np.floor()): Rounds each element down to the nearest integer.
Example:
import numpy as np

array = np.array([-1, 0, 1, 4, 9])

# Performing unary operations
square_root = np.sqrt(array[array >= 0])
absolute_values = np.abs(array)
squared = np.square(array)
rounded = np.round(array)
ceiling = np.ceil(array)
floor = np.floor(array)

# Printing the results
print("Original Array:", array)
print("Square Root (non-negative):", square_root)
print("Absolute Values:", absolute_values)
print("Squared Values:", squared)
print("Rounded Values:", rounded)
print("Ceiling Values:", ceiling)
print("Floor Values:", floor)
Output:
Original Array: [-1  0  1  4  9]
Square Root (non-negative): [0. 1. 2. 3.]
Absolute Values: [1 0 1 4 9]
Squared Values: [ 1  0  1 16 81]
Rounded Values: [-1  0  1  4  9]
Ceiling Values: [-1.  0.  1.  4.  9.]
Floor Values: [-1.  0.  1.  4.  9.]
These unary operations allow us to perform essential mathematical transformations on the data efficiently, enabling a wide range of analytical tasks in data analysis and scientific computing.
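The rounding functions are easier to tell apart on non-integer input. The following sketch uses made-up float values, and also shows what np.sqrt() returns for a negative number when the filter from the example above is omitted.

```python
import numpy as np

values = np.array([-2.5, -1.2, 0.5, 1.5, 2.7])

# np.round() sends exact halves to the nearest even integer.
rounded = np.round(values)
print(rounded)   # [-2. -1.  0.  2.  3.]

# np.ceil() always rounds up, np.floor() always rounds down.
ceiling = np.ceil(values)
floor = np.floor(values)
print(ceiling)   # [-2. -1.  1.  2.  3.]
print(floor)     # [-3. -2.  0.  1.  2.]

# The square root of a negative real number is nan; errstate silences
# the RuntimeWarning that NumPy would otherwise emit.
with np.errstate(invalid="ignore"):
    roots = np.sqrt(np.array([-1.0, 4.0]))
print(roots)     # [nan  2.]
```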
Performing trigonometric functions
Trigonometric functions compute the sine, cosine, and tangent of a value, which is useful for various scientific and engineering applications. Common trigonometric functions in NumPy include:
- Sine (np.sin()): Computes the sine of each angle in radians.
- Cosine (np.cos()): Calculates the cosine of each angle in radians.
- Tangent (np.tan()): Returns the tangent of each angle in radians.
Example:
import numpy as np

# Creating a NumPy array of angles in radians
angles = np.array([0, np.pi/4, np.pi/2, np.pi, 3*np.pi/2])

sine_values = np.sin(angles)
cosine_values = np.cos(angles)
tangent_values = np.tan(angles)

print("Angles (radians):", angles)
print("Sine Values:", sine_values)
print("Cosine Values:", cosine_values)
print("Tangent Values:", tangent_values)
Output:
Angles (radians): [0. 0.78539816 1.57079633 3.14159265 4.71238898]
Sine Values: [ 0.00000000e+00 7.07106781e-01 1.00000000e+00 1.22464680e-16
-1.00000000e+00]
Cosine Values: [ 1.00000000e+00 7.07106781e-01 6.12323400e-17 -1.00000000e+00
-1.83697020e-16]
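Tangent Values: [ 0.00000000e+00 1.00000000e+00 1.63312394e+16 -1.22464680e-16 5.44374645e+15]
Values such as 1.22464680e-16 and 6.12323400e-17 are floating-point approximations of zero, and the very large tangents at π/2 and 3π/2 stand in for values that are mathematically undefined. This happens because π cannot be represented exactly as a floating-point number.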
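When angles are given in degrees, they need to be converted to radians first. The sketch below uses np.deg2rad() for the conversion and np.isclose() to compare results with a tolerance, which avoids tripping over tiny floating-point residues.

```python
import numpy as np

degrees = np.array([0, 30, 90, 180])

# Convert degrees to radians before calling the trigonometric functions.
radians = np.deg2rad(degrees)
sines = np.sin(radians)

# Compare against the exact values with a tolerance instead of ==.
expected = np.array([0.0, 0.5, 1.0, 0.0])
matches = np.isclose(sines, expected)
print(matches)  # [ True  True  True  True]
```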
Using exponential and logarithmic functions on arrays
Exponential functions in NumPy
The exponential function is fundamental in various fields, including finance, physics, and biology. In NumPy, we can calculate the exponential of all elements in an array using the np.exp() function.
Example:
import numpy as np

array = np.array([0, 1, 2, 3, 4])

# Calculating the exponential of each element
exponential_values = np.exp(array)

print("Original Array:", array)
print("Exponential Values:", exponential_values)
Output:
Original Array: [0 1 2 3 4]
Exponential Values: [ 1.          2.71828183  7.3890561  20.08553692 54.59815003]
Logarithmic functions in NumPy
Logarithmic functions compute values that help scale large numbers and analyze exponential growth. In NumPy, the np.log() function calculates the natural logarithm (base e) of each element, while np.log10() is used for base 10 logarithms.
Example:
import numpy as np

array = np.array([1, 10, 100, 1000])

# Calculating the natural logarithm of each element
logarithmic_values = np.log(array)

print("Original Array:", array)
print("Logarithmic Values:", logarithmic_values)
Output:
Original Array: [   1   10  100 1000]
Logarithmic Values: [0.         2.30258509 4.60517019 6.90775528]
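The base 10 variant mentioned above, along with np.log2(), can be sketched the same way. The example also checks that np.log() undoes np.exp() up to floating-point tolerance; the input values are chosen for illustration.

```python
import numpy as np

array = np.array([1, 10, 100, 1000])

# Base 10 logarithm: powers of ten map to 0, 1, 2, 3.
log10_values = np.log10(array)
print(log10_values)  # [0. 1. 2. 3.]

# Base 2 logarithm of powers of two.
log2_values = np.log2(np.array([1, 2, 8]))
print(log2_values)   # [0. 1. 3.]

# np.log() is the inverse of np.exp(), within floating-point tolerance.
x = np.array([0.5, 1.0, 2.0])
round_trip_ok = np.allclose(np.log(np.exp(x)), x)
print(round_trip_ok)  # True
```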
Why are sorting and unary operations efficient in NumPy?
Why is sorting faster in NumPy than Python’s built-in functions?
NumPy's sorting routines are optimized for arrays of numbers, which typically makes them faster than Python's built-in sorting for large numeric datasets. Here's why:
- Optimized Algorithms: NumPy's sorting routines, such as its quicksort and its stable sort (which may use Timsort or radix sort depending on the data type), are tuned for speed on large arrays.
- Contiguous Memory Allocation: NumPy arrays store their values in a contiguous memory block, enabling quicker access and manipulation of data compared to Python lists, which store references to objects that may be scattered in memory.
- Low-Level Implementation: NumPy is implemented in C, allowing for lower-level optimizations that enhance performance, especially for large datasets.
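The points above can be checked with a rough timing sketch. The array size and repeat count below are arbitrary assumptions, and the measured times depend on the machine and NumPy version, so no expected output is shown.

```python
import random
import timeit

import numpy as np

# Same random data as a Python list and as a NumPy array.
random.seed(0)
data_list = [random.random() for _ in range(100_000)]
data_array = np.array(data_list)

# Time five runs of each sort; both calls return sorted copies.
list_time = timeit.timeit(lambda: sorted(data_list), number=5)
array_time = timeit.timeit(lambda: np.sort(data_array), number=5)

print(f"sorted(list):   {list_time:.4f} s")
print(f"np.sort(array): {array_time:.4f} s")
```

Treat the numbers as a rough comparison only; small arrays or arrays of Python objects may not show the same gap.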
What makes unary operations in NumPy more efficient?
NumPy's unary operations are built to carry out calculations on entire arrays with a single call instead of an explicit Python loop over individual elements. This approach offers several advantages:
- Vectorization: The loop over the elements runs in compiled code rather than in the Python interpreter, which removes per-element interpreter overhead and speeds up calculations.
- Memory Efficiency: NumPy arrays store values of a single fixed-size type in one buffer. By default a unary operation returns a new array, but the out parameter lets it write into an existing array and avoid an extra allocation.
- Hardware Acceleration: Many of NumPy's element-wise loops can use SIMD instructions on modern CPUs, processing several values per instruction.
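As a small sketch of these points, the example below computes square roots with an explicit Python loop, with a single vectorized call, and with the out parameter so that no new array is allocated. The input values are chosen for illustration.

```python
import math

import numpy as np

values = np.array([1.0, 4.0, 9.0, 16.0])

# Explicit Python loop: one interpreter-level call per element.
looped = [math.sqrt(v) for v in values]

# Vectorized call: one NumPy call, which allocates a new result array.
vectorized = np.sqrt(values)

# out= writes the result into an existing array instead of a new one.
np.sqrt(values, out=values)
print(values)  # [1. 2. 3. 4.]
```

Writing into the input array overwrites the original data, so this only fits cases where the unsquared values are no longer needed.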
The efficiency of sorting and unary operations in NumPy makes it an essential tool for data manipulation and analysis.
Wrapping up
In this article, we explored NumPy’s powerful sorting and unary operations features, highlighting its significance in efficient data manipulation. NumPy is an essential library for numerical computing in Python, offering robust tools for sorting arrays and performing fast element-wise calculations. Its superior performance compared to Python’s built-in functions makes it the go-to choice for handling large datasets, making your data analysis tasks smoother and more efficient.
To take your skills to the next level, explore additional NumPy functionalities like Ufuncs in NumPy.
Author
'The Codecademy Team, composed of experienced educators and tech experts, is dedicated to making tech skills accessible to all. We empower learners worldwide with expert-reviewed content that develops and enhances the technical skills needed to advance and succeed in their careers.'