By Joel Hruska
For decades, computer chips became smaller and more efficient by
shrinking the size of various features and finding ways to pack more
transistors into a smaller area of silicon. As die shrinks have become
more difficult, companies have turned to 3D die stacking and
technologies like HBM (High Bandwidth Memory) to improve performance.

We’ve talked a great deal about HBM and HBM2
in the past few years, but photographic evidence of the die savings is a
bit harder to come by. SK Hynix helpfully had some HBM memory on
display at GTC this year, and Tweaktown caught photographic evidence of
8Gb of GDDR5 compared with a 1GB HBM stack and a 4GB HBM2 stack.

Image by TweakTown
The one quibble I have with the Hynix display is that the
labeling mixes GB and Gb. The HBM2 package is significantly larger than
the HBM1 chip, but still much smaller than the 8Gb of GDDR5, despite
packing 4x more memory into its diminutive form factor.
We don’t expect HBM2 to hit market until the tail end of
this year or the beginning of next; GDDR5 is expected to have one last
hurrah with the launch of AMD’s Polaris
this year. These space savings, however, illustrate why both AMD and
Nvidia are moving to HBM2 at the high end. Smaller dies mean smaller GPUs
with higher memory densities for consumer, professional, and scientific
applications. Technologies like GDDR5X,
which rely on 2D planar silicon, can’t compete with the capacity
advantage of layering multiple chips on top of each other and connecting
them with TSVs (through silicon vias). GDDR5 will continue to be used
for budget and midrange cards this generation, but HBM2 will likely
replace it over the long term as prices fall, lower-end cards require
more VRAM, and manufacturer yields improve.
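The bandwidth side of HBM's advantage comes from trading per-pin speed for interface width: each HBM/HBM2 stack exposes a very wide 1024-bit interface running at a modest per-pin rate, while GDDR5 chips use narrow 32-bit interfaces clocked much faster. A minimal sketch of the peak-bandwidth arithmetic, using the JEDEC-spec 1024-bit/2Gbps figures for HBM2 and a typical 7Gbps GDDR5 pin rate (the specific card configurations are illustrative assumptions, not from the article):

```python
def bandwidth_gb_s(bus_width_bits, data_rate_gbps):
    """Peak bandwidth in GB/s: total pins * per-pin rate, divided by 8 bits/byte."""
    return bus_width_bits * data_rate_gbps / 8

# GDDR5: 32-bit interface per chip; a hypothetical 256-bit card uses 8 chips
# at a typical 7 Gbps per pin.
gddr5_card = bandwidth_gb_s(bus_width_bits=256, data_rate_gbps=7.0)   # 224 GB/s

# HBM2: one stack exposes a 1024-bit interface at only 2 Gbps per pin.
hbm2_stack = bandwidth_gb_s(bus_width_bits=1024, data_rate_gbps=2.0)  # 256 GB/s

# Four stacks on one interposer reach roughly 1 TB/s.
hbm2_gpu = 4 * hbm2_stack                                             # 1024 GB/s
```

Running the wide interface at a low clock is also what keeps HBM's I/O power down, since driving a pin at 2Gbps costs far less energy than driving it at 7Gbps.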

Over the long term, though, even HBM2 isn’t enough to feed
the needs of next-generation exascale systems. The slide above is from
an Nvidia presentation on high performance computing (HPC) and the
energy requirements of DRAM subsystems. Shifting to HBM drives a
significant improvement in I/O power and an absolute improvement in
total power consumption for the DRAM subsystem. HBM2 draws less power to
provide 1TB/s of bandwidth than GDDR5 uses to provide 200GB/s.
Unfortunately, straightforward scaling of the HBM2 interface
won’t keep future memory standards from exceeding GDDR5’s power
requirements. Long-term, additional improvements and process node
shrinks are still necessary — even if die-stacking has replaced planar
silicon die shrinks as the primary performance driver.
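The power comparison above boils down to energy per bit moved: DRAM subsystem power is roughly bandwidth times the energy cost of transferring each bit. A hedged sketch of that arithmetic, using illustrative energy-per-bit figures (roughly 20pJ/bit for GDDR5 and 3.5pJ/bit for HBM2 are my assumptions for this example, not numbers from the article or the Nvidia slide):

```python
def dram_power_watts(bandwidth_gb_s, energy_pj_per_bit):
    """Approximate DRAM subsystem power: (bits moved per second) * (joules per bit)."""
    bits_per_second = bandwidth_gb_s * 8e9      # GB/s -> bits/s
    return bits_per_second * energy_pj_per_bit * 1e-12  # pJ -> J

# Illustrative assumption: GDDR5 at ~20 pJ/bit delivering 200 GB/s.
gddr5_watts = dram_power_watts(200, 20.0)    # ~32 W

# Illustrative assumption: HBM2 at ~3.5 pJ/bit delivering 1 TB/s.
hbm2_watts = dram_power_watts(1000, 3.5)     # ~28 W
```

Under these assumptions, HBM2 moves five times the data for slightly less power, which is why the energy-per-bit metric, rather than raw bandwidth, dominates exascale memory planning.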