Processors used for personal and business computers started out as scalar processors, where the registers and functional units were meant for single elements of data. Because of higher demand from data-intensive applications like multimedia, processor manufacturers looked at ways they could leverage the benefits of vector processing.
The result was a new processor design that maintained the scalar functionality but added components of vector processors. These additions are known as SIMD extensions.
Every major processor company has some form of SIMD extension. As data-intensive tasks have become more mainstream, the need to improve the performance of these extensions has grown.
The x86 instruction set architecture introduced SIMD extensions in the late 1990s with Streaming SIMD Extensions (SSE). The architecture added instructions that worked on 8 registers labeled xmm0 through xmm7. These registers were 128 bits long and could hold up to four 32-bit floating-point values.
Over the years these SIMD extensions were updated with more instructions and wider registers, including zmm registers that are 512 bits long. These additional capabilities have been essential to keeping up with the increasing amount of data that common applications process.
The workspace contains opcodes from the x86 ISA.
The first instruction, ADD, takes the scalar value of register R2 and adds it to the scalar value of R1. The result is also scalar and is placed back into the first operand.
The second set of instructions operates on vector-style registers, called packed here.
ADDPS was one of the first instructions included with the x86 SIMD extensions. It adds the value of the 128-bit xmm2 register to the value of the 128-bit xmm1 register, treating each register as four packed 32-bit floating-point values. Similar to ADD, the result is placed back into the first operand.
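The packed add described above can be sketched in C with the SSE intrinsics from xmmintrin.h. This is a minimal illustration, not code from the workspace: the helper name packed_add4 is hypothetical, and the fallback loop is an assumption added so the sketch still builds on targets without SSE. On an SSE target, compilers usually turn _mm_add_ps into a single ADDPS (or its VEX-encoded form when AVX is enabled).

```c
#include <stddef.h>

#if defined(__SSE__) || defined(_M_X64) || (defined(_M_IX86_FP) && _M_IX86_FP >= 1)
#include <xmmintrin.h>   /* SSE intrinsics: __m128, _mm_add_ps, ... */
#define HAVE_SSE 1
#else
#define HAVE_SSE 0
#endif

/* Hypothetical helper: dst[i] = dst[i] + src[i] for four packed floats.
   As with ADDPS, the first operand is both a source and the destination. */
void packed_add4(float dst[4], const float src[4])
{
#if HAVE_SSE
    __m128 a = _mm_loadu_ps(dst);   /* plays the role of xmm1 */
    __m128 b = _mm_loadu_ps(src);   /* plays the role of xmm2 */
    a = _mm_add_ps(a, b);           /* four 32-bit additions at once */
    _mm_storeu_ps(dst, a);          /* result overwrites the first operand */
#else
    /* Portable fallback that models the same lane-by-lane behavior. */
    for (size_t i = 0; i < 4; ++i)
        dst[i] += src[i];
#endif
}
```

Calling packed_add4 on {1, 2, 3, 4} and {10, 20, 30, 40} leaves {11, 22, 33, 44} in the first array, while the second array is unchanged.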
VADDPS is an updated version of the ADDPS instruction. Using newer zmm registers that support up to 512 bits of data, this instruction can operate on up to sixteen 32-bit floating-point values. There is a third operand, so the values in zmm2 and a second source register are added and the result is placed in the register named by the destination operand, which means neither source has to be overwritten.
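The three-operand form can be sketched the same way. The helper name packed_add16 is hypothetical, and the sketch assumes the AVX-512F intrinsics from immintrin.h are used only when the compiler is targeting AVX-512F (for example with -mavx512f on GCC or Clang); otherwise it falls back to a plain loop that models the same sixteen-lane behavior.

```c
#include <stddef.h>

#if defined(__AVX512F__)
#include <immintrin.h>   /* AVX-512F intrinsics: __m512, _mm512_add_ps, ... */
#endif

/* Hypothetical helper: dst[i] = a[i] + b[i] for sixteen packed floats.
   As with three-operand VADDPS, the destination is named separately,
   so neither source array is modified. */
void packed_add16(float dst[16], const float a[16], const float b[16])
{
#if defined(__AVX512F__)
    __m512 va = _mm512_loadu_ps(a);       /* first source, 512 bits */
    __m512 vb = _mm512_loadu_ps(b);       /* second source, 512 bits */
    __m512 vsum = _mm512_add_ps(va, vb);  /* sixteen 32-bit additions */
    _mm512_storeu_ps(dst, vsum);          /* separate destination */
#else
    /* Portable fallback that models the same lane-by-lane behavior. */
    for (size_t i = 0; i < 16; ++i)
        dst[i] = a[i] + b[i];
#endif
}
```

Keeping the destination separate from the sources is the practical difference from the two-operand sketch of ADDPS: a value that is needed again later does not have to be copied before the addition.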