IBM’s custom BlueGene/Q low-power supercomputer, which can deliver more than 2 GFLOPS/W (2 billion floating-point operations per second for every watt consumed), holds all of the top 20 slots of the latest Green500 List.

Two custom IBM BlueGene/Q supercomputers (Power BQC 16C, 1.60 GHz models) tie for the top spot at 2.1088 GFLOPS/W, each with a total power usage of 41.10 kilowatts. One is jointly operated by the U.S. Department of Energy, the National Nuclear Security Administration, and the Lawrence Livermore National Laboratory in Livermore, Calif.; the other is operated by the IBM Thomas J. Watson Research Center in Yorktown Heights, N.Y.

Other BlueGene/Q configurations hold the next 18 slots on the Green500, with the supercomputers in slots 3 through 7 operating at 2.1086 GFLOPS/W and consuming 82.2 kilowatts.
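Because the Green500 metric is performance divided by power, the efficiency and power figures quoted above imply a sustained performance for each system. The short Python sketch below works through that arithmetic. The function name `implied_gflops` is hypothetical, the resulting performance numbers are derived here rather than reported by the Green500, and the second calculation assumes the 82.2-kilowatt figure applies to a single system in slots 3 through 7.

```python
def implied_gflops(gflops_per_watt: float, power_kw: float) -> float:
    """Return the sustained performance (in GFLOPS) implied by an
    efficiency in GFLOPS/W and a power draw in kilowatts."""
    watts = power_kw * 1000.0  # convert kilowatts to watts
    return gflops_per_watt * watts


# Figures quoted in the article for the two top-ranked systems.
top_system = implied_gflops(2.1088, 41.10)

# Assumption: 82.2 kW is the draw of one system in slots 3 through 7.
next_tier = implied_gflops(2.1086, 82.2)

print(f"Top-ranked system: about {top_system / 1000:.1f} TFLOPS")
print(f"Slots 3-7 system:  about {next_tier / 1000:.1f} TFLOPS")
```

Under these assumptions, the top-ranked systems work out to roughly 86.7 TFLOPS and the larger configurations to roughly 173.3 TFLOPS: about twice the performance at twice the power, with nearly identical efficiency.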

“IBM Blue is flexing its greenness with 38 of the top 50 spots on the Green500,” said Wu Feng, a computer science researcher at Virginia Tech and co-founder of the Green500 List. “Perhaps even more impressive is that IBM did it with custom BlueGene/Q supercomputers as well as commodity-based supercomputers, both accelerator-based and non-accelerator-based.”

The Green500 has ranked the energy efficiency of the world’s 500 fastest supercomputers since its 2007 debut, serving as a complement to the well-known Top500. It is released twice a year, in June and November. Performance data on the supercomputers is compiled from public resources, as well as from input from the research labs.

The Green500 was founded by Feng, an associate professor with Virginia Tech’s Department of Computer Science and Bradley Department of Electrical and Computer Engineering, and Kirk W. Cameron, an associate professor of computer science. Both departments are within the Virginia Tech College of Engineering.

Further down the list, a supercomputer based on the Intel Many Integrated Core (MIC) processor, now known as Intel Xeon Phi, debuts on the Green500 at No. 21 as the most energy-efficient commodity accelerator-based, or many-core, supercomputer. The first 20 spots are all custom IBM supercomputers.

The Intel system is also an accelerator-based supercomputer, similar to Virginia Tech’s commodity-based supercomputer HokieSpeed and Nagasaki University’s DEGIMA cluster in Japan. HokieSpeed was ranked as the most energy-efficient commodity-based supercomputer in the United States, while DEGIMA was ranked as the greenest supercomputer overseas in the fall 2011 Green500 List. However, HokieSpeed is accelerated by NVIDIA graphics processing units, commonly called GPUs, while DEGIMA is accelerated by AMD/ATI Radeon GPUs, said Feng.

The next greenest commodity-based supercomputers are GPU-accelerated systems: an AMD-based machine at No. 22 and an NVIDIA-based machine at No. 23. Overall, 19 of the top 50 machines on the Green500 are accelerator-based: 17 use accelerators manufactured by NVIDIA, and one each uses accelerators from AMD and Intel.

“IBM is ‘talking the talk’ and ‘walking the walk’ with both custom BlueGene/Q supercomputers topping the Green500 and commodity-based supercomputers, both accelerator-based and non-accelerator-based, landing in the top 50,” said Feng. “In total, IBM landed 38 of the top 50 spots on the Green500.”

In the near future, Feng said, IBM’s BlueGene/Q machines, along with graphics processor unit-accelerated supercomputers and non-accelerated but efficient commodity central processing unit-based supercomputers, will continue to dominate the upper echelons of the Green500.

“While central processing units and graphics processing units are currently discrete and distinct components in a compute node, we should expect to see the arrival of supercomputing systems with CPUs and GPUs fused onto a single processor,” Feng said. “Much as the floating-point unit started as a discrete co-processor and then was incorporated into the central processing unit, so will the graphics processing unit go from being a discrete co-processor to being fused into a single processing device.”