GB200 Rack Supply Chain Requires Further Optimization, Peak Shipments Expected Between 2Q25 and 3Q25
December 18, 2024 | TrendForce
As the market closely follows the progress of NVIDIA’s GB200 rack-mounted solution, TrendForce’s latest research indicates that the supply chain requires additional time for optimization and adjustment. This is largely due to the higher design specifications of the GB200 rack, including its requirements for high-speed interconnect interfaces and thermal design power (TDP), which significantly exceed market norms. Consequently, TrendForce projects that mass production and peak shipments are unlikely to occur until between Q2 and Q3 of 2025.
The NVIDIA GB rack series, which includes the GB200 and GB300 models, features more complex technology and higher production costs, making it a preferred solution for large cloud service providers (CSPs). Other potential users include Tier-2 data centers, national sovereign cloud providers, and academic research institutions engaged in HPC and AI applications. The GB200 NVL72 is expected to become the most widely adopted model in 2025, potentially accounting for up to 80% of total deployments as NVIDIA ramps up its market push.
NVIDIA aims to boost the computational performance of AI and HPC server systems with its proprietary NVLink technology, enabling high-speed interconnections between GPU chips. The GB200 employs the fifth-generation NVLink, offering a significantly higher total bandwidth than PCIe 5.0—the current industry standard.
The TDP of the 2024-dominant HGX AI server typically ranges from 60 kW to 80 kW per rack, but the GB200 NVL72’s TDP reaches a staggering 140 kW per rack, effectively doubling power demands. Manufacturers are accelerating the adoption of liquid cooling solutions, as traditional air cooling methods are no longer sufficient for such high thermal loads.
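As a rough check on the "effectively doubling" figure, the per-rack numbers quoted above can be compared directly. The short Python sketch below uses only the values given in this article; the variable and function names are illustrative and not part of TrendForce's research.

```python
# Per-rack TDP figures quoted in the article, in kW.
HGX_RACK_KW = (60, 80)      # 2024-dominant HGX AI server rack: low and high end
GB200_NVL72_RACK_KW = 140   # GB200 NVL72 rack


def power_ratio(new_kw, old_kw):
    """Return how many times larger new_kw is than old_kw."""
    return new_kw / old_kw


low, high = HGX_RACK_KW
ratio_vs_high = power_ratio(GB200_NVL72_RACK_KW, high)             # 140 / 80
ratio_vs_low = power_ratio(GB200_NVL72_RACK_KW, low)               # 140 / 60
ratio_vs_mid = power_ratio(GB200_NVL72_RACK_KW, (low + high) / 2)  # 140 / 70

print(f"{ratio_vs_high:.2f}x to {ratio_vs_low:.2f}x (midpoint: {ratio_vs_mid:.2f}x)")
```

Against the midpoint of the HGX range the increase is a factor of two, and it spans roughly 1.75x to 2.33x across the quoted range.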
The advanced design requirements for the GB200 have raised concerns over potential delays in component availability and system shipments. TrendForce reports that the production of Blackwell GPU chips is progressing largely as expected, with only limited shipments in 4Q24.
Production volume is anticipated to ramp up gradually from 1Q25 onward. However, as components of the AI server system are still undergoing supply chain adjustments, 2024 year-end shipments are expected to fall short of industry expectations. Consequently, TrendForce forecasts that the peak shipment period for the GB200 full-rack system will be postponed to between Q2 and Q3 of 2025.
Liquid cooling has become essential, as the GB200 NVL72’s 140 kW TDP surpasses the limits of traditional air-cooled solutions. Adoption of liquid-cooling components is gaining momentum, with major industry players investing heavily in R&D for liquid cooling technologies.
Notably, coolant distribution unit (CDU) suppliers are working to improve cooling efficiency by expanding rack sizes and developing more efficient cold plate designs. Sidecar CDUs are currently capable of dissipating between 60 kW and 80 kW, but future designs are expected to double or even triple that cooling capacity. Meanwhile, the development of liquid-to-liquid in-row CDU systems has enabled cooling performance to exceed 1.3 MW, with further improvements anticipated as demands for computational power continue to rise.
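To put these cooling capacities in context, the back-of-the-envelope Python sketch below compares them with a single 140 kW GB200 NVL72 rack. It is only an illustration under stated assumptions: the in-row figure is read as 1.3 megawatts, all rack heat is assumed to go through the liquid loop, and no redundancy or safety headroom is included. The function names are hypothetical.

```python
import math

RACK_TDP_KW = 140            # GB200 NVL72 rack TDP, from the article
SIDECAR_CDU_KW = (60, 80)    # current sidecar CDU dissipation range, from the article
IN_ROW_CDU_KW = 1300         # in-row liquid-to-liquid CDU, assuming 1.3 MW


def racks_covered(cdu_kw, rack_kw=RACK_TDP_KW):
    """Whole racks a CDU could nominally cover (no headroom, no redundancy)."""
    return math.floor(cdu_kw / rack_kw)


def scaled_range(base_range, factor):
    """Scale a (low, high) capacity range by a constant factor."""
    return tuple(kw * factor for kw in base_range)


doubled = scaled_range(SIDECAR_CDU_KW, 2)   # doubled sidecar capacity, in kW
tripled = scaled_range(SIDECAR_CDU_KW, 3)   # tripled sidecar capacity, in kW

print("Doubled sidecar range (kW):", doubled)
print("Tripled sidecar range (kW):", tripled)
print("Racks per in-row CDU:", racks_covered(IN_ROW_CDU_KW))
```

Under these assumptions, a doubled sidecar CDU (120 kW to 160 kW) would cover one 140 kW rack only at the upper end of its range, a tripled one (180 kW to 240 kW) would cover it across the whole range, and a 1.3 MW in-row unit would nominally serve about nine such racks.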