NVIDIA, the leading maker of graphics processing units (GPUs), has unveiled the NVIDIA HGX H200, a new computing platform designed for AI and high-performance computing (HPC) workloads. The platform features the NVIDIA H200 Tensor Core GPU, whose advanced memory is built to efficiently handle the large datasets behind generative AI and HPC tasks.
The announcement has garnered significant attention, with major cloud service providers such as AWS, Google Cloud, Microsoft Azure, and Oracle Cloud Infrastructure among the early adopters set to deploy H200-based instances in 2024. Other service providers, including CoreWeave, Lambda, and Vultr, have also joined as early adopters.
One of the standout features of the NVIDIA H200 is that it builds on the Hopper architecture and is the first GPU to incorporate HBM3e memory. This faster, larger memory accelerates generative AI and large language models while promising advances in scientific computing for HPC workloads.
The HBM3e-powered NVIDIA H200 offers 141 GB of memory at a bandwidth of 4.8 terabytes per second: nearly double the capacity and 2.4 times the bandwidth of its predecessor, the NVIDIA A100 GPU.
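The ratios quoted above can be checked with quick arithmetic. As a rough sketch (assuming the A100 baseline is the 80 GB SXM variant with roughly 2.0 TB/s of HBM2e bandwidth, a figure not stated in the article):

```python
# Back-of-the-envelope check of the H200 vs. A100 memory comparison.
# A100 figures assume the 80 GB SXM variant (~2.0 TB/s HBM2e).
h200_capacity_gb, h200_bandwidth_tbs = 141, 4.8
a100_capacity_gb, a100_bandwidth_tbs = 80, 2.0

capacity_ratio = h200_capacity_gb / a100_capacity_gb      # ~1.76x, "nearly double"
bandwidth_ratio = h200_bandwidth_tbs / a100_bandwidth_tbs  # 2.4x

print(f"capacity: {capacity_ratio:.2f}x, bandwidth: {bandwidth_ratio:.1f}x")
```

The numbers line up with the article's claim: roughly 1.76 times the capacity and exactly 2.4 times the bandwidth.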
Ian Buck, Vice President of Hyperscale and HPC at NVIDIA, emphasized the crucial role of efficient data processing in creating intelligence with generative AI and HPC applications. He stated, “With NVIDIA H200, the industry’s leading end-to-end AI supercomputing platform just got faster to solve some of the world’s most important challenges.”
Furthermore, the NVIDIA Hopper architecture continues to raise the bar through ongoing software enhancements, showcasing the company’s commitment to innovation. Recent releases, such as the NVIDIA TensorRT-LLM open-source libraries, demonstrate the platform’s perpetual drive for performance improvements.
The H200 is expected to significantly boost inference speed on Llama 2, a 70-billion-parameter large language model, compared to the H100. Future software updates are anticipated to deliver further performance gains and solidify the H200’s position as a leader in the field.
The NVIDIA H200 will be available in NVIDIA HGX H200 server boards, offering four- and eight-way configurations. These boards are fully compatible with both the hardware and software of HGX H100 systems. Additionally, the H200 is integrated into the NVIDIA GH200 Grace Hopper Superchip with HBM3e, allowing for deployment in diverse data center environments, including on-premises, cloud, hybrid-cloud, and edge.
NVIDIA’s global ecosystem of partner server manufacturers, including ASRock Rack, ASUS, Dell Technologies, Eviden, GIGABYTE, Hewlett Packard Enterprise, Ingrasys, Lenovo, QCT, Supermicro, Wistron, and Wiwynn, can upgrade their existing systems with the H200.
The HGX H200, powered by NVIDIA NVLink and NVSwitch high-speed interconnects, offers unparalleled performance across various application workloads, including LLM training and inference for models exceeding 175 billion parameters.
An eight-way HGX H200 provides over 32 petaflops of FP8 deep learning compute and 1.1 TB of aggregate high-bandwidth memory, delivering strong performance in both generative AI and HPC applications.
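The aggregate figures follow directly from the per-GPU specifications. A minimal sketch of the arithmetic (the per-GPU FP8 figure is derived here by dividing the quoted board total, not stated independently in the article):

```python
# Aggregate specs of an eight-way HGX H200 board, from per-GPU figures.
gpus = 8
hbm_per_gpu_gb = 141            # H200 HBM3e capacity per GPU

aggregate_hbm_tb = gpus * hbm_per_gpu_gb / 1000   # 1.128 TB, quoted as "1.1 TB"
fp8_board_pflops = 32                              # quoted board total
fp8_per_gpu_pflops = fp8_board_pflops / gpus       # ~4 PFLOPS per GPU, derived

print(f"aggregate HBM3e: {aggregate_hbm_tb:.3f} TB")
print(f"FP8 per GPU: ~{fp8_per_gpu_pflops:.0f} PFLOPS")
```

Eight GPUs at 141 GB each give 1,128 GB, which rounds to the 1.1 TB aggregate the article cites.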
When combined with NVIDIA Grace CPUs and an ultra-fast NVLink-C2C interconnect, the H200 contributes to the creation of the GH200 Grace Hopper Superchip with HBM3e. This integrated module is specifically designed to cater to giant-scale HPC and AI applications.
Overall, the introduction of the NVIDIA HGX H200 represents a significant advancement in the field of AI and HPC. With its advanced memory capabilities, increased capacity, and improved bandwidth, it is poised to accelerate the development of generative AI models and enhance the performance of HPC workloads. The collaboration with major cloud service providers and the support from a wide range of server manufacturers further solidify the H200’s position as a pioneering solution in the industry.