2024-06-01
NVIDIA is leveraging the strength of Grace Hopper to establish itself as the hardware of choice for AI supercomputing, showing it will no longer be limited to GPUs alone.
On May 12, Nvidia announced that nine new supercomputers around the world are using Nvidia Grace Hopper superchips to accelerate scientific research and discovery, thereby promoting the high-performance computing (HPC) industry to shift to being driven by AI.
The nine supercomputers coming online include France's EXA1-HE, Poland's Helios, Switzerland's Alps, Germany's JUPITER, DeltaAI at the University of Illinois Urbana-Champaign (UIUC) in the United States, and Japan's Miyabi.
Together, these nine systems will provide 200 exaflops of processing power (that is, 200 quintillion floating-point operations per second).
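As a quick unit check on that headline figure, one exaflop is 10^18 floating-point operations per second, so 200 exaflops works out to 2×10^20 operations per second:

```python
# 1 exaflop = 10**18 floating-point operations per second (FLOPS).
EXA = 10**18
combined_flops = 200 * EXA
print(f"{combined_flops:.0e} FLOPS")  # 2e+20 FLOPS, i.e. 200 quintillion ops/s
```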
In addition, Isambard-AI and Isambard 3 at the University of Bristol in the UK, as well as systems at Los Alamos National Laboratory and the Texas Advanced Computing Center in the United States, have also begun using Grace Hopper hardware and its platform. Among them, the HPE Cray EX2500, the first phase of Isambard-AI, is equipped with 168 GH200 superchips, making it one of the most efficient supercomputers built to date.
The remaining 5,280 chips are expected to be delivered to the Isambard-AI system this summer, boosting its performance by roughly 32 times and advancing data analysis, drug discovery, climate research, and other AI-for-science fields.
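The "roughly 32 times" claim follows directly from the chip counts reported above, since performance here is assumed to scale with the number of GH200 superchips installed:

```python
# Phase 1 of Isambard-AI has 168 GH200 superchips; 5,280 more are due this summer.
phase1_chips = 168
additional_chips = 5280
total_chips = phase1_chips + additional_chips

scale_up = total_chips / phase1_chips  # assumes performance scales with chip count
print(total_chips, round(scale_up, 1))  # 5448 ~32.4
```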
NVIDIA's Grace Hopper superchip architecture is the first truly heterogeneous acceleration platform that combines the high performance of Hopper GPUs and the versatility of Grace CPUs in a single chip, purpose-built for accelerated computing and generative AI.
[Figure: GH200 chip architecture diagram]
The GH200 chip in this series delivers formidable AI and high-performance computing capability: a single GH200 combines a 72-core Grace CPU with an H100 GPU and up to 624GB of total memory.
For exascale high-performance computing or trillion-parameter AI models, the transmission speed between chips is almost as important as the computing power of the chips themselves. High-speed, seamless communication between every GPU in a server cluster is required to achieve massive acceleration.
NVIDIA's NVLink technology is designed to solve this communication problem. The CPU and GPU in the GH200 are connected via NVLink-C2C, providing 900GB/s of bandwidth, 7 times that of fifth-generation PCIe.
On a single server, dual GH200 chips connected via NVLink provide 3.5 times the GPU memory capacity and 3 times the bandwidth of an H100-based system.
However, Nvidia has not disclosed the price of the GH200. For reference, the current official price of the H100 series is about US$40,000.
In the past two years, Nvidia has continued to deploy in the fields of servers and high-performance computing, competing with AMD, Intel and other companies.
Although Nvidia's GPU business is booming and highly profitable, with the company controlling nearly the entire AI GPU market, entering high-performance computing also matters, because providing hardware and platforms for supercomputing systems is a huge and lucrative business.
Currently, countries around the world are increasing investment in data, infrastructure, etc. to build more efficient supercomputing systems. These supercomputing centers and technology giants can all become potential users of Grace Hopper hardware and its platform.
To this end, NVIDIA built the Grace series of data center CPUs from scratch based on the Arm architecture, aiming to create high-performance computing and AI superchips.
However, in the HPCC benchmarks released in February, Grace still lagged behind Intel's latest Sapphire Rapids CPU, being faster in only three of eight tests.
Still, some articles pointed out that Grace has advantages in heat dissipation and cost, both key factors when building a data center.
Launched in August last year, the latest generation of the Grace Hopper superchip is the world's first processor to feature HBM3e memory, with a capacity of 141GB, and is designed to handle "the world's most complex generative AI workloads, covering large language models, recommender systems and vector databases".
NVIDIA CEO Jensen Huang wore his iconic leather jacket and launched this product on the podium of SIGGRAPH 2023, the world's top computer graphics conference.
The differences between HBM (High Bandwidth Memory) generations are primarily in transfer speed rather than capacity. Compared with the HBM3 memory used by AMD, HBM3e is about 50% faster, increasing the data transfer rate in Grace Hopper from the original 4TB/s to 5TB/s.
In addition to the Grace Hopper series, NVIDIA is also ambitiously expanding more product lines to meet computing needs at different levels and scenarios.
Nvidia's next-generation Blackwell chips, which Jensen Huang showed at the GTC conference in March this year, are one example. The GB200 model combines a Grace CPU with two B200 GPUs to deliver 5 petaflops (five quadrillion floating-point operations per second) of processing power, compared with just 1 petaflop of raw compute for the H200 GPU.
"Barron's" analyst Tae Kim wrote on Twitter that HSBC analysts estimate the cost of a GB200 chip may be as high as $70,000, and that Nvidia prefers to sell customers servers integrating multiple chips rather than individual chips, which will further raise the average price per chip.
For example, the GB200 NVL36 server is equipped with 36 GB200 chips, with an average selling price of about US$1.8 million, and the NVL72 server equipped with 72 chips may be sold for US$3 million.
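The implied per-chip price can be worked out from those server figures; this is a back-of-envelope illustration using the chip counts and prices exactly as the article reports them:

```python
# Server prices and chip counts as quoted (HSBC estimates via Tae Kim).
nvl36_price, nvl36_chips = 1_800_000, 36
nvl72_price, nvl72_chips = 3_000_000, 72

print(nvl36_price // nvl36_chips)        # 50000  -> ~$50,000 per chip in an NVL36
print(round(nvl72_price / nvl72_chips))  # 41667  -> ~$41,667 per chip in an NVL72
```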
High-performance computing (HPC) is one of the most important tools driving advances in scientific computing. From weather forecasting and energy exploration to computational fluid dynamics and the life sciences, researchers are converging traditional simulation methods with artificial intelligence, machine learning, big data analytics, and edge computing to solve important scientific questions.
[Figure: High-performance computing for weather modeling]
Ian Buck, vice president of hyperscale and high-performance computing at Nvidia, said in a statement: "AI is accelerating climate change research, speeding drug discovery, and enabling breakthroughs in dozens of other fields. Nvidia Grace Hopper systems are becoming an essential part of high-performance computing because they can transform entire industries while improving energy efficiency."
References:
https://www.tomshardware.com/tech-industry/supercomputers/nvidia-announces-supercomputers-based-on-its-grace-hopper-platform-200-exaflops-for-ai
https://www.extremetech.com/computing/nvidia-gh200-superchip-is-now-powering-9-supercomputers
https://nvidianews.nvidia.com/news/nvidia-grace-hopper-ignites-new-era-of-ai-supercomputing