At its GPU Technology Conference, Nvidia unveiled the Blackwell B200, a powerful GPU designed for AI workloads, alongside the GB200, a "superchip" that pairs two B200 GPUs with a Grace CPU. Both mark a significant upgrade over the company's previous-generation H100 AI chip, promising improved performance and efficiency.
The B200 packs 208 billion transistors and delivers up to 20 petaflops of FP4 compute. Compared with the H100, Nvidia says the GB200 offers 30 times the performance on LLM inference workloads while using 25 times less energy, and it beats the H100 by a factor of seven on a GPT-3 LLM benchmark.
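As a rough illustration of what those ratios mean for a fixed inference job, here is a minimal Python sketch. The H100 baseline figures are invented placeholders; only the 30x and 25x factors come from the numbers above.

```python
# Illustrative only: the baseline time and energy below are made-up
# placeholders, not official H100 specs. Only the ratios are quoted figures.
h100_time_s = 300.0       # hypothetical time for some LLM inference batch
h100_energy_j = 210_000   # hypothetical energy for the same batch (700 W * 300 s)

gb200_time_s = h100_time_s / 30       # 30x faster, per Nvidia's claim
gb200_energy_j = h100_energy_j / 25   # 25x less energy, per Nvidia's claim

print(f"GB200 time:   {gb200_time_s:.1f} s")    # -> 10.0 s
print(f"GB200 energy: {gb200_energy_j:.0f} J")  # -> 8400 J
```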
To illustrate the scale of the improvement, Nvidia says training a 1.8-trillion-parameter model would previously have required 8,000 Hopper GPUs drawing 15 megawatts of power. A cluster of just 2,000 Blackwell GPUs can do the same job on only 4 megawatts.
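Notably, per-GPU power is roughly unchanged in that comparison; the savings come from needing a quarter as many GPUs. A quick back-of-the-envelope check, assuming (as the comparison implies but does not state outright) equal training time for both clusters:

```python
# Back-of-the-envelope check of the training comparison above.
hopper_gpus, hopper_mw = 8_000, 15
blackwell_gpus, blackwell_mw = 2_000, 4

hopper_kw_per_gpu = hopper_mw * 1_000 / hopper_gpus          # ~1.9 kW/GPU (system-level)
blackwell_kw_per_gpu = blackwell_mw * 1_000 / blackwell_gpus # ~2.0 kW/GPU

gpu_ratio = hopper_gpus / blackwell_gpus  # 4x fewer GPUs
power_ratio = hopper_mw / blackwell_mw    # 3.75x less total power draw
print(f"{gpu_ratio:.0f}x fewer GPUs, {power_ratio:.2f}x less power, "
      f"{hopper_kw_per_gpu:.2f} vs {blackwell_kw_per_gpu:.2f} kW per GPU")
```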
To address communication bottlenecks, Nvidia also developed a new network switch chip with 50 billion transistors, capable of linking up to 576 GPUs with 1.8 TB/s of bidirectional bandwidth. The aim is to cut the time GPUs spend waiting on each other so that more of it goes to actual computation.
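To see why per-GPU bandwidth at that scale matters, consider the standard ring all-reduce cost model, shown below. This is a generic estimate, not Nvidia's own math; the gradient size and FP16 assumption are illustrative, and the 1.8 TB/s figure is split into 0.9 TB/s per direction.

```python
# Generic ring all-reduce cost model (not Nvidia-specific).
def ring_allreduce_seconds(payload_bytes: float, n_gpus: int,
                           bw_bytes_per_s: float) -> float:
    """Time for one ring all-reduce: each GPU sends and receives
    2 * (n-1)/n of the payload, limited by per-link bandwidth."""
    return 2 * (n_gpus - 1) / n_gpus * payload_bytes / bw_bytes_per_s

params = 1.8e12          # 1.8T parameters, as in the training example above
grad_bytes = params * 2  # assumption: FP16/BF16 gradients, 2 bytes each
bw = 0.9e12              # 0.9 TB/s each way of the 1.8 TB/s bidirectional link

print(f"~{ring_allreduce_seconds(grad_bytes, 576, bw):.2f} s per full gradient sync")
```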
Nvidia is providing complete systems for companies interested in the GB200. The GB200 NVL72 rack, for example, combines 36 Grace CPUs and 72 Blackwell GPUs in a single liquid-cooled unit. Going bigger, the DGX SuperPod for DGX GB200 combines multiple NVL72 systems into one deployment with 288 CPUs, 576 GPUs, and 240 TB of memory.
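Those SuperPod totals are consistent with eight NVL72 racks, as a small sanity check shows. The constant names below are mine, not Nvidia product identifiers.

```python
# Sanity check of the rack math above.
NVL72_CPUS, NVL72_GPUS = 36, 72           # one liquid-cooled NVL72 rack

SUPERPOD_CPUS, SUPERPOD_GPUS = 288, 576   # quoted DGX SuperPod totals
racks = SUPERPOD_GPUS // NVL72_GPUS       # -> 8 NVL72 systems

assert racks * NVL72_CPUS == SUPERPOD_CPUS
assert racks * NVL72_GPUS == SUPERPOD_GPUS
print(f"A DGX SuperPod at these figures corresponds to {racks} NVL72 racks.")
```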
Major tech companies including Oracle, Amazon, Google, and Microsoft have already announced plans to offer the NVL72 racks through their cloud services. The Blackwell architecture behind the B200 is also expected to serve as the foundation for the upcoming RTX 5000 series, further extending its reach in AI computing.