Data-level parallelism is an approach to computer processing that aims to increase data throughput by operating on multiple elements of data simultaneously.
There are many motivations for data-level parallelism, including:
- Researching faster computer systems
- Multimedia applications
- Big data applications
Single Instruction Multiple Data
Single Instruction Multiple Data (SIMD) is a class of data-level parallel architecture in which a single instruction operates on multiple elements of data.
Examples of SIMD architectures are:
- Vector processors
- SIMD Extensions
- Graphical Processing Units
Vector processing is an SIMD architecture that operates on vectors of data to achieve higher data throughput. Because a vector processor can store and process multiple elements of data at once, it gains the following advantages:
- Less instruction overhead: With one instruction for multiple elements of data, there are fewer instructions to fetch and decode than on a scalar processor working on the same amount of data.
- Overlapping memory accesses: Accessing large chunks of memory to be processed can create its own pipeline, where one set of data is processed while the next set is being retrieved from memory.
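The instruction overhead point can be made concrete with a small software model. The sketch below is only an illustration, not a description of any real processor: the function names, the vector length of 4, and the assumption that each element-wise add costs four instructions (two loads, one add, one store, with loop control ignored) are all hypothetical choices made for the example.

```python
def scalar_add(a, b):
    """Scalar model: every element needs its own load, load, add, store."""
    out, issued = [], 0
    for x, y in zip(a, b):
        out.append(x + y)
        issued += 4  # assumed cost: load x, load y, add, store
    return out, issued


def vector_add(a, b, vector_length=4):
    """Vector model: the same four instructions cover a whole chunk of elements."""
    out, issued = [], 0
    for start in range(0, len(a), vector_length):
        va = a[start:start + vector_length]  # one vector load
        vb = b[start:start + vector_length]  # one vector load
        out.extend(x + y for x, y in zip(va, vb))  # one vector add, one vector store
        issued += 4  # same assumed cost, paid once per chunk
    return out, issued


a = list(range(8))
b = list(range(8))
scalar_result, scalar_issued = scalar_add(a, b)
vector_result, vector_issued = vector_add(a, b)
print("results match:", scalar_result == vector_result)
print("scalar instructions:", scalar_issued)
print("vector instructions:", vector_issued)
```

Under these assumptions, both functions produce the same data, but the vector version issues one quarter as many instructions because each instruction covers four elements.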
While pipelining is a form of instruction-level parallelism, vector processors can still use it. Vector processor designs support the classic instruction cycle: Fetch, Decode, Execute, Memory Access, and Write-Back.
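To see why overlapping the stages helps, the following sketch simulates an idealized five-stage pipeline. It assumes every stage takes exactly one cycle and that there are no stalls, which real designs do not guarantee; the function names are hypothetical and exist only for this illustration.

```python
STAGES = ["Fetch", "Decode", "Execute", "Memory Access", "Write-Back"]


def unpipelined_cycles(num_items, num_stages=len(STAGES)):
    """Each item passes through every stage before the next item starts."""
    return num_items * num_stages


def pipelined_cycles(num_items, num_stages=len(STAGES)):
    """Step an ideal pipeline cycle by cycle and count cycles until all items finish."""
    pipeline = [None] * num_stages  # pipeline[i] holds the item currently in stage i
    next_item = 0
    completed = 0
    cycles = 0
    while completed < num_items:
        # A new item enters the first stage if any remain; all others advance one stage.
        entering = next_item if next_item < num_items else None
        if entering is not None:
            next_item += 1
        pipeline = [entering] + pipeline[:-1]
        cycles += 1
        # Whatever occupies the last stage finishes at the end of this cycle.
        if pipeline[-1] is not None:
            completed += 1
    return cycles


for n in (1, 8, 64):
    print(n, "items:", unpipelined_cycles(n), "cycles unpipelined,",
          pipelined_cycles(n), "cycles pipelined")
```

In this idealized model the pipelined count works out to the number of stages plus the number of items minus one, so the benefit grows with the amount of data, which is the situation vector processors are built for.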
Vector Processor Architecture
A vector processor’s architecture includes elements that allow for the processing of vectors of data with a single instruction. These include:
- Vector registers: electronic storage banks with longer bit lengths that hold multiple elements of data at a time.
- Control circuitry that ensures the next element of data to be processed is ready when the previous element is complete.
- Lanes: paths that data is sent through to be processed. A single architecture can have multiple lanes to operate on multiple elements of data simultaneously.
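The relationship between a vector's length and the number of lanes can be sketched with a simple model. This is a rough illustration under stated assumptions: a Python list stands in for a register that holds several elements, each lane is assumed to finish one element per cycle, and the name `execute_on_lanes` is hypothetical.

```python
def execute_on_lanes(reg_a, reg_b, num_lanes):
    """Add two modeled registers element-wise; return (result, cycles used)."""
    if len(reg_a) != len(reg_b):
        raise ValueError("registers must hold the same number of elements")
    if num_lanes < 1:
        raise ValueError("at least one lane is required")
    result = [0] * len(reg_a)
    cycles = 0
    for start in range(0, len(reg_a), num_lanes):
        # In one cycle, each lane handles one element of the current group.
        for lane in range(num_lanes):
            i = start + lane
            if i < len(reg_a):  # the last group may not fill every lane
                result[i] = reg_a[i] + reg_b[i]
        cycles += 1
    return result, cycles


reg_a = [1, 2, 3, 4, 5, 6, 7, 8]
reg_b = [10, 20, 30, 40, 50, 60, 70, 80]
for lanes in (1, 2, 4):
    result, cycles = execute_on_lanes(reg_a, reg_b, lanes)
    print(lanes, "lane(s):", cycles, "cycles ->", result)
```

The result is identical in every case; only the modeled cycle count changes, falling as more lanes are available to work on elements at the same time.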
SIMD extensions are SIMD capabilities added to scalar processors. They allow the processor to operate on vectors of data while retaining its original scalar functionality.
Computer processor manufacturers like Intel, AMD, and Arm have implemented SIMD extensions in their commercial architectures to better handle data-intensive applications such as video games, multimedia, and machine learning algorithms.
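The idea of treating one wide register as several independent elements can be imitated in software with a technique often called SIMD within a register. The sketch below is not how any vendor's extension is implemented; it only packs four 8-bit values into one 32-bit integer and adds all four with a handful of ordinary integer operations. The names `pack`, `unpack`, and `packed_add` are hypothetical.

```python
LANES = 4
LOW7 = 0x7F7F7F7F  # lower seven bits of every byte
HIGH = 0x80808080  # top bit of every byte


def pack(values):
    """Pack four 8-bit values into one 32-bit word (first value in the lowest byte)."""
    if len(values) != LANES or any(not 0 <= v <= 0xFF for v in values):
        raise ValueError("expected four values in the range 0..255")
    word = 0
    for lane, value in enumerate(values):
        word |= value << (8 * lane)
    return word


def unpack(word):
    """Split a 32-bit word back into its four 8-bit values."""
    return [(word >> (8 * lane)) & 0xFF for lane in range(LANES)]


def packed_add(a, b):
    """Add four bytes at once; each byte wraps around modulo 256 on overflow."""
    # Adding only the low seven bits keeps carries from crossing into the next byte.
    low = (a & LOW7) + (b & LOW7)
    # The top bit of each byte is then fixed up separately with XOR.
    return low ^ ((a ^ b) & HIGH)


x = pack([1, 2, 3, 4])
y = pack([10, 20, 30, 40])
print(unpack(packed_add(x, y)))
```

Ordinary scalar arithmetic on the same integers still works as before, which mirrors how an extended processor keeps its scalar behavior alongside the packed operations.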
Graphical Processing Units
Graphical Processing Units (GPUs) are a type of SIMD architecture and are specifically designed to process graphical data.
GPUs are generally paired with central processing units (CPUs), where the GPU handles graphical data and the CPU handles other application data. More recently, GPUs have also been used to process other types of application data that resemble graphical data, such as data sets for machine learning applications.
Single Instruction Multiple Thread
Single Instruction Multiple Thread (SIMT) is an SIMD architecture used by GPUs. It uses simple units of execution called threads, each of which applies the given instruction to its own element of data. Threads are kept simpler than most functional components so that a GPU can run a large number of them to process the large amount of data it receives.
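The SIMT idea can be sketched as a small software model in which every thread runs the same function and uses its own thread index to pick its data. This is only an illustration: real GPUs execute the threads of a group in parallel hardware, whereas this model runs them one after another, and the group size of 4 and the names `simt_launch` and `scale_add_kernel` are hypothetical.

```python
def simt_launch(kernel, num_threads, group_size, *args):
    """Run `kernel` once per thread index, one group of threads at a time."""
    for group_start in range(0, num_threads, group_size):
        group_end = min(group_start + group_size, num_threads)
        # All threads in a group conceptually execute the same instruction together;
        # this model simply iterates over them in order.
        for thread_id in range(group_start, group_end):
            kernel(thread_id, *args)


def scale_add_kernel(thread_id, n, alpha, x, y, out):
    """Every thread runs this same code on the element matching its index."""
    if thread_id < n:  # threads beyond the end of the data do nothing
        out[thread_id] = alpha * x[thread_id] + y[thread_id]


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [10.0, 10.0, 10.0, 10.0, 10.0]
out = [0.0] * len(x)
# Launch 8 threads in groups of 4 for 5 elements; the last 3 threads stay idle.
simt_launch(scale_add_kernel, 8, 4, len(x), 2.0, x, y, out)
print(out)
```

Launching more threads than there are elements, and guarding with an index check, is a common pattern because thread counts are usually rounded up to a multiple of the group size.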