NVIDIA Expands AI Portfolio to Counter ASIC Push
March 18, 2026 | TrendForce
According to TrendForce’s latest findings on AI servers, major CSPs are increasing investment in self-developed chips. In response, NVIDIA shifted the focus of GTC 2026 toward deploying AI inference applications across multiple industries, marking a departure from its previous emphasis on cloud-based AI training.
NVIDIA is advancing a diversified product portfolio, which includes GPUs, CPUs, and LPUs, to address both training and inference workloads, while promoting rack-level integrated systems to drive growth across its supply chain ecosystem.
TrendForce reports that as CSPs like Google and Amazon increase their internal chip development, ASIC-based AI servers are forecast to represent 27.8% of all AI server shipments in 2026. This percentage is expected to grow to nearly 40% by 2030.
NVIDIA is promoting rack-scale solutions that integrate CPUs and GPUs, including GB300 and VR200 platforms, with scalability for inference workloads to reinforce its leadership in the AI market. At GTC 2026, the company introduced Vera Rubin, a highly vertically integrated system that combines seven chips and five rack configurations.
Memory vendors are expected to begin supplying HBM4 for Rubin GPUs in 2Q26, enabling NVIDIA to start shipping Rubin chips around 3Q26.
Meanwhile, shipments of NVIDIA’s GB300 and VR200 rack systems are progressing. The GB300 platform replaced the GB200 as the company’s flagship solution in 4Q25, and its shipment share is expected to reach nearly 80% in 2026. The VR200 rack system is projected to begin ramping shipments toward late 3Q26, although the actual timeline will depend on the production schedules of ODMs.
As AI evolves from generative models to agent-based architectures, the decode stage of token generation has become a major bottleneck due to latency and memory bandwidth constraints. NVIDIA is tackling this challenge by integrating technology from the Groq team, introducing the Groq 3 LPU designed specifically for low-latency inference workloads. Each chip integrates 500 MB of SRAM, and a full rack system can provide up to 128 GB of on-chip memory.
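The cited per-chip and per-rack figures imply a straightforward chip count. The quick check below assumes the 128 GB rack total is simply the sum of per-chip SRAM; the article does not state the actual number of LPUs per rack, so the result is an inference, not a confirmed spec.

```python
# Back-of-the-envelope check of the SRAM figures cited above.
# Assumption: rack on-chip memory = per-chip SRAM x chip count.
SRAM_PER_CHIP_GB = 0.5   # 500 MB per Groq 3 LPU
RACK_SRAM_GB = 128       # stated full-rack on-chip memory

implied_chips_per_rack = RACK_SRAM_GB / SRAM_PER_CHIP_GB
print(implied_chips_per_rack)  # 256.0
```

Under that assumption, a full rack would aggregate roughly 256 LPUs.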
However, the memory capacity of LPUs alone cannot accommodate the massive model parameters and KV cache required by systems such as Vera Rubin. NVIDIA therefore introduced “disaggregated inference” at GTC 2026, where the inference pipeline is divided into two stages through an AI factory operating system called Dynamo.
In agent-based AI workloads, the pre-fill and attention stages, which require intensive computation and large KV cache storage, are handled by Vera Rubin systems equipped with high-throughput and large-memory capacity. The decode and token generation stages, which are highly latency-sensitive and bandwidth-limited, are offloaded to LPU racks with expanded memory capacity.
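The two-stage split described above can be sketched as a simple router: prefill and attention run on a high-throughput, large-memory GPU pool, while latency-sensitive decode is offloaded to an LPU pool. This is a minimal illustration of the concept only; all class and method names are hypothetical and do not reflect Dynamo's actual API.

```python
from dataclasses import dataclass

@dataclass
class Request:
    prompt_tokens: int
    max_new_tokens: int

class GpuPool:
    """High-throughput, large-memory stage (Vera Rubin-class rack in the
    article): runs prefill and builds the KV cache. Hypothetical API."""
    def prefill(self, req: Request) -> dict:
        # A real system would hand off a KV cache reference, not a dict.
        return {"kv_cache_tokens": req.prompt_tokens}

class LpuPool:
    """Low-latency, bandwidth-bound stage (LPU rack in the article):
    generates tokens one decode step at a time. Hypothetical API."""
    def decode(self, kv_cache: dict, max_new_tokens: int) -> int:
        generated = 0
        while generated < max_new_tokens:
            # Each step appends to the KV cache and reads it back,
            # which is why decode is bandwidth- and latency-bound.
            kv_cache["kv_cache_tokens"] += 1
            generated += 1
        return generated

def serve(req: Request, gpus: GpuPool, lpus: LpuPool) -> int:
    kv = gpus.prefill(req)                       # stage 1: prefill/attention
    return lpus.decode(kv, req.max_new_tokens)   # stage 2: decode offload

print(serve(Request(prompt_tokens=1000, max_new_tokens=50),
            GpuPool(), LpuPool()))  # 50
```

The design point being illustrated: because the KV cache is built once on the GPU pool and only appended to during decode, the latency-critical loop can live entirely on hardware optimized for fast, small-batch memory access.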
The third-generation Groq LP30 chip, fabricated by Samsung, has entered full-scale production and is expected to begin shipping in 2H26. NVIDIA also plans to introduce a higher-performance LP40 chip in its next-generation Feynman architecture.