This week the folks at NVIDIA are making it clear that the Tesla K20 family of GPU accelerators is ready for action, and riding in on the wave of power comes Titan - K20-accelerated and named the world's fastest supercomputer just this morning. The Titan supercomputer works with a beastly 18,688 NVIDIA Tesla K20X GPU accelerator units and makes it clear that this family is more than ready to raise the processing roof in more ways than one. In addition to being billed as the fastest GPU accelerator in the world, the K20X working inside Titan has been revealed as the new #1 entry on the Green500 list for energy efficiency.
It's a big day for NVIDIA with the Tesla K20 architecture being reintroduced in its final form powered by CUDA - also known as "the world's most pervasive parallel programming model." NVIDIA backs this claim up with 8,000 institutions with CUDA developers, 1,500,000 CUDA downloads, and a massive 395,000,000 GPUs shipped with CUDA built in. With 629 university courses being taught on CUDA across 62 countries, it's safe to say that it's here for some time to come.
The K20 family also delivers next-level performance on scientific applications - this being exactly why the Titan supercomputer uses the architecture for the bulk of its processing. The 2011 Gordon Bell Prize winner for computational simulation reached 3.1 petaflops (3.08 petaflops on the K computer), while NVIDIA's new effort brings on 10+ petaflops here in 2012.
Both the Tesla K20 and the Tesla K20X work with a single GK110 Kepler GPU with your favorite features - Dynamic Parallelism and Hyper-Q! These units offer more than one teraflop of peak double precision performance and deliver 10 times the performance of a single CPU - a claim NVIDIA bases on the following: "Ws-lsMs performance comparison between single E5-2687W @ 3.10GHz vs single Tesla K20X. Tesla K20X > 650 gigaflops."
There's also a Tesla K10 model out there, you should know, with a memory size of 8GB per board and SMX inside, but without the Dynamic Parallelism and Hyper-Q that the K20 and K20X add. The K10 (which has been on the market for some time now) has a peak double precision floating point performance of 0.19 teraflops and is made for servers only - its peak single precision floating point performance, on the other hand, is 4.58 teraflops. The K20 rings in at 1.17 teraflops and 3.52 teraflops for double and single precision floating point performance respectively. The K20X nabs 1.31 teraflops and 3.95 teraflops.
The K20 has 5GB of memory per board while the K20X has 6GB, and both devices have just the one GK110 GPU while the K10 has two GK104 units inside. The K20 units are made for massive tasks like financial computing, computational chemistry and physics, and satellite imaging. The K10, on the other hand, is made for seismic, image, and signal processing, as well as video analytics.
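To put those numbers side by side, here's a minimal Python sketch that collects the quoted specs and works out two derived figures: the ratio of double to single precision peak for each board, and the theoretical combined double precision peak of Titan's 18,688 K20X units. The names `SPECS`, `dp_to_sp_ratio`, and `aggregate_peak_dp_pflops` are made up for illustration, and the aggregate is simple multiplication of peak figures from this article - not a measured benchmark result.

```python
# Peak figures exactly as quoted in the article.
# Throughput is in teraflops; memory is in GB per board.
SPECS = {
    "K10":  {"gpus": "2x GK104", "memory_gb": 8, "dp_tflops": 0.19, "sp_tflops": 4.58},
    "K20":  {"gpus": "1x GK110", "memory_gb": 5, "dp_tflops": 1.17, "sp_tflops": 3.52},
    "K20X": {"gpus": "1x GK110", "memory_gb": 6, "dp_tflops": 1.31, "sp_tflops": 3.95},
}

# Number of K20X accelerators in the Titan supercomputer, per the article.
TITAN_K20X_COUNT = 18_688


def dp_to_sp_ratio(model):
    """Return peak double precision divided by peak single precision."""
    spec = SPECS[model]
    return spec["dp_tflops"] / spec["sp_tflops"]


def aggregate_peak_dp_pflops(model, count):
    """Return the combined peak double precision of `count` boards, in petaflops.

    This is a theoretical upper bound from GPUs alone (assumption: perfect
    scaling, no CPU contribution), not a measured result.
    """
    return SPECS[model]["dp_tflops"] * count / 1000.0


if __name__ == "__main__":
    for model, spec in SPECS.items():
        print(
            f"{model}: {spec['gpus']}, {spec['memory_gb']}GB, "
            f"DP/SP ratio {dp_to_sp_ratio(model):.2f}"
        )
    titan = aggregate_peak_dp_pflops("K20X", TITAN_K20X_COUNT)
    print(f"Titan K20X theoretical peak: {titan:.2f} petaflops (double precision)")
```

The ratios make the split in target workloads easy to see: the K10 leans almost entirely on single precision, while the K20 and K20X keep roughly a third of their single precision peak when working in double precision.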
The NVIDIA Tesla K20 family of GPU accelerators is ready for action this week - shipping today and available for order from your favorite computer store. NVIDIA is working with Appro, ASUS, Cray, Eurotech, Fujitsu, HP, IBM, Quanta Computer, SGI, Supermicro, T-Platforms, Tyan, and NVIDIA reseller partners as well - you'll have no shortage of choices on your hands. Grab a K20 as fast as you can!