A supercomputer is a computer that, at the time it was built, is at the forefront of processing capacity, particularly speed of calculation. As with all technologies, today's wonder supercomputer fast becomes tomorrow's standard (ordinary) computer.
Supercomputer Evolution
Supercomputer technologies are evolving just as rapidly as other computer technologies. In fact it is some of these “other” computing technologies that are helping to drive the supercomputer.
From the 1970s through the mid-1980s, supercomputers were built mainly from vector processors working in parallel, typically between four and sixteen CPUs.
The next phase of the supercomputer evolution saw the introduction of massive parallel processing and a drift away from vector processors.
Now we find that instead of using “specialist” processors in their design, the supercomputers of today and tomorrow are based on "off the shelf" server-class microprocessors, such as the IBM PowerPC, Intel Itanium, or AMD x86-64.
The modern supercomputer is firmly based around massively parallel processing by clustering very large numbers of commodity processors combined with custom interconnects.
Vector Processing
Vector processing is when the processor takes one instruction and applies it to multiple data items or data sets. It works best when very large data sets are involved. Some vector processing instructions are very complex, which saves considerable instruction decoding time on large data sets but brings little benefit to simpler processing that does not involve large data sets.
Because of this, modern CPUs have vector processing capabilities built into them: the vector unit runs alongside the main scalar processor and is supplied with data by programs that “know” it is there.
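The saving in instruction decoding described above can be sketched with a toy model. This is not how a real CPU is programmed; the function names and the vector width of 4 are assumptions made purely for illustration. The model counts how many instructions have to be decoded to add two arrays.

```python
# Toy model (not a real CPU): count the instructions decoded
# while adding two arrays element by element.

def scalar_add(a, b):
    """Scalar style: one add instruction is decoded per element."""
    out, decoded = [], 0
    for x, y in zip(a, b):
        decoded += 1              # decode one scalar add
        out.append(x + y)
    return out, decoded

def vector_add(a, b, width=4):
    """Vector style: one add instruction is decoded per group of `width` elements."""
    out, decoded = [], 0
    for i in range(0, len(a), width):
        decoded += 1              # decode one vector add
        out.extend(x + y for x, y in zip(a[i:i + width], b[i:i + width]))
    return out, decoded

a = list(range(16))
b = list(range(16, 32))
scalar_result, scalar_decodes = scalar_add(a, b)
vector_result, vector_decodes = vector_add(a, b)

print(scalar_decodes, vector_decodes)  # 16 4
```

Both functions produce the same sums; only the number of decoded instructions differs, and the gap widens as the data set grows.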
Single Instruction, Multiple Data (SIMD)
The modern Graphics Processing Unit (GPU) uses a type of vector processing named Single Instruction, Multiple Data (SIMD). Because a single instruction is fetched and decoded once and then applied to many data items, this technique saves a lot of processing cycles. Intel's SSE is an example of SIMD processing.
Multiple Instruction, Multiple Data (MIMD)
Several processors or cores each execute their own instruction stream on their own data at the same time. Each stream may itself use vector instructions on vectorised data sets.
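MIMD is usually described as several processing units each running its own instruction stream on its own data. A minimal sketch of that structure follows, using Python's standard thread pool. The three worker functions and their data are hypothetical, and in standard CPython the threads take turns rather than running truly in parallel, so this shows the shape of the idea rather than real hardware MIMD.

```python
from concurrent.futures import ThreadPoolExecutor

# Three different "instruction streams" (hypothetical examples).
def total(values):
    return sum(values)

def largest(values):
    return max(values)

def count_even(values):
    return sum(1 for v in values if v % 2 == 0)

# Each worker is given its own function and its own data set.
jobs = [
    (total, [1, 2, 3, 4]),
    (largest, [7, 42, 5]),
    (count_even, [2, 3, 4, 6, 9]),
]

with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
    futures = [pool.submit(func, data) for func, data in jobs]
    results = [f.result() for f in futures]

print(results)  # [10, 42, 3]
```

Contrast this with SIMD, where every data item would be pushed through the same single operation.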
As a matter of interest your average home PC processes more data while you watch a short video than all of the 1970s supercomputers put together.
Current Supercomputer Hierarchical Architecture
The supercomputer of today is built on a hierarchical design in which a number of clustered computers are joined by ultra-high-speed optical network interconnections (a switching fabric).
Each cluster member is a computer composed of a number of Multiple Instruction, Multiple Data (MIMD) multiprocessors and runs its own instance of an operating system.
Each of these multiprocessors has multiple processing cores of which the application software is oblivious. These multicore processors share tasks using Symmetric MultiProcessing (SMP) and Non-Uniform Memory Access (NUMA).
Each core is a Single Instruction, Multiple Data (SIMD) processor capable of running a number of instructions simultaneously and many SIMD instructions per nanosecond.
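One way to get a feel for this hierarchy is to multiply out the fan-out at each level. The sketch below does that; every count in it is hypothetical and chosen only to keep the arithmetic easy to follow, and the `Level` class and `parallel_width` function are names invented for this example.

```python
from dataclasses import dataclass

@dataclass
class Level:
    name: str
    fan_out: int   # how many units this level contributes per unit of the level above

# Hypothetical counts, not taken from any real machine.
hierarchy = [
    Level("cluster nodes", 1024),
    Level("multiprocessors per node", 2),
    Level("cores per multiprocessor", 4),
    Level("SIMD lanes per core", 4),
]

def parallel_width(levels):
    """Multiply the fan-out of every level to get the total number of data lanes."""
    width = 1
    for level in levels:
        width *= level.fan_out
    return width

print(parallel_width(hierarchy))  # 32768
```

Even with modest numbers at each level, the product grows quickly, which is why the layered design scales so well.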
Supercomputer Performance - FLOPS
The performance of “normal” computers is measured in terms of Millions of Instructions Per Second (MIPS). Supercomputer performance, on the other hand, is measured in terms of Floating Point Operations Per Second (FLOPS).
A floating point number is a number expressed in scientific notation (a basic number, a base and an exponent). For example 4.5546 x 10^14. In this example I used the standard scientific notation which uses a base of 10. Binary or base 2 is also used.
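As a small illustration, the example number above can be split into its basic number and exponent in both base 10 and base 2. This is a sketch using only Python's standard math module; the variable names are my own.

```python
import math

value = 4.5546e14   # the example number from the text

# Base 10: a basic number (significand) times a power of ten.
exp10 = math.floor(math.log10(abs(value)))
sig10 = value / 10 ** exp10
print(f"{sig10:.4f} x 10^{exp10}")   # 4.5546 x 10^14

# Base 2: math.frexp returns (m, e) with value == m * 2**e and 0.5 <= |m| < 1.
sig2, exp2 = math.frexp(value)
print(f"{sig2} x 2^{exp2}")

# Putting the two parts back together recovers the original number.
rebuilt = math.ldexp(sig2, exp2)
```

The base 2 form is the one the hardware actually stores; the base 10 form is simply easier for people to read.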
With the enormous processing power of a modern supercomputer the number of floating point operations that it executes every second is very high, so we use the SI prefix system to make these numbers more manageable for the human mind.
Mega = 10^6, Giga = 10^9, Tera = 10^12, Peta = 10^15, Exa = 10^18 and Zetta = 10^21.
The reason we use floating point notation is that it enables us, and the computer, to deal with incredibly large or long numbers that could not otherwise be handled.
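To show how these prefixes tame large numbers, here is a short sketch that turns a raw operations-per-second figure into a prefixed one. The function name `format_flops` and the table layout are assumptions for illustration only.

```python
# SI prefixes from the list above, largest first.
PREFIXES = [
    ("zetta", 10**21), ("exa", 10**18), ("peta", 10**15),
    ("tera", 10**12), ("giga", 10**9), ("mega", 10**6),
]

def format_flops(flops):
    """Express a floating point operations per second figure with the largest fitting prefix."""
    for name, factor in PREFIXES:
        if flops >= factor:
            return f"{flops / factor:g} {name}flops"
    return f"{flops:g} flops"

print(format_flops(596e12))   # 596 teraflops
print(format_flops(1e15))     # 1 petaflops
```

Writing 596 teraflops is far easier on the eye than 596,000,000,000,000 operations per second.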
The Fastest Supercomputer
The November 2007 edition of the Top500 list placed IBM's Blue Gene/L as the fastest supercomputer running. The Blue Gene/L consists of a cluster of 65,536 computers, each with two processors, each of which processes two data streams concurrently. The IBM Blue Gene/L has a peak processing capacity of 596 teraflops. The Cray XT4 with 101.7 teraflops was second.
IBM Claims One Petaflops Blue Gene/P
The chip inside IBM's Blue Gene/P supercomputer consists of four PowerPC 450 cores running at 850MHz each whereas that in the IBM Blue Gene/L had two PowerPC cores running at 700MHz.
Each 2' x 2' Blue Gene/P PCB holds 32 of these quad core PowerPC chips and can crunch its way through 435 billion operations per second. Each 6' rack can hold 32 of these PCBs. The one petaflops IBM Blue Gene/P supercomputer comes with 294,912 processors and takes up 72 racks in all.
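The figures quoted above can be checked against each other with a few lines of arithmetic. The sketch below uses only numbers from the text and assumes that the "operations" counted per PCB are floating point operations; the constant names are my own.

```python
RACKS = 72
BOARDS_PER_RACK = 32      # PCBs per rack
CHIPS_PER_BOARD = 32      # quad core chips per PCB
CORES_PER_CHIP = 4
OPS_PER_BOARD = 435e9     # "435 billion operations per second"

# Total cores across the whole machine.
cores = RACKS * BOARDS_PER_RACK * CHIPS_PER_BOARD * CORES_PER_CHIP

# Total operations per second across all PCBs.
total_ops = RACKS * BOARDS_PER_RACK * OPS_PER_BOARD

print(cores)              # 294912
print(total_ops / 1e15)   # roughly 1.0, i.e. about one petaflops
```

The core count matches the 294,912 "processors" quoted, which suggests that figure counts individual cores rather than chips, and the per-board rate multiplied across 72 racks comes out at just over one petaflops.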