NVIDIA Expands AI Portfolio to Counter ASIC Push
March 18, 2026 | TrendForce
According to TrendForce’s latest findings on AI servers, major CSPs are increasing investment in self-developed chips. In response, NVIDIA shifted the focus of GTC 2026 toward deploying AI inference applications across multiple industries, marking a departure from its previous emphasis on cloud-based AI training.
NVIDIA is advancing a diversified product portfolio, which includes GPUs, CPUs, and LPUs, to address both training and inference workloads, while promoting rack-level integrated systems to drive growth across its supply chain ecosystem.
TrendForce reports that as CSPs like Google and Amazon increase their internal chip development, ASIC-based AI servers are forecast to represent 27.8% of all AI server shipments in 2026. This percentage is expected to grow to nearly 40% by 2030.
NVIDIA is promoting rack-scale solutions that integrate CPUs and GPUs, including GB300 and VR200 platforms, with scalability for inference workloads to reinforce its leadership in the AI market. At GTC 2026, the company introduced Vera Rubin, a highly vertically integrated system that combines seven chips and five rack configurations.
Memory vendors are expected to begin supplying HBM4 for Rubin GPUs in 2Q26, enabling NVIDIA to start shipping Rubin chips around 3Q26.
Meanwhile, shipments of NVIDIA’s GB300 and VR200 rack systems are progressing. The GB300 platform replaced the GB200 as the company’s flagship solution in 4Q25, and its shipment share is expected to reach nearly 80% in 2026. The VR200 rack system is projected to begin ramping shipments toward late 3Q26, although the actual timeline will depend on the production schedules of ODMs.
As AI evolves from generative models to agent-based architectures, the decode stage of token generation has become a major bottleneck due to latency and memory bandwidth constraints. NVIDIA is tackling this challenge by integrating technology from the Groq team, introducing the Groq 3 LPU designed specifically for low-latency inference workloads. Each chip integrates 500 MB of SRAM, and a full rack system can provide up to 128 GB of on-chip memory.
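The cited per-chip and per-rack figures imply a straightforward chip count. The quick check below assumes the 128 GB rack total is simply the sum of per-chip SRAM; the article does not state the actual number of LPUs per rack, so the result is an inference, not a confirmed spec.

```python
# Back-of-the-envelope check of the SRAM figures cited above.
# Assumption: rack on-chip memory = per-chip SRAM x chip count.
SRAM_PER_CHIP_GB = 0.5   # 500 MB per Groq 3 LPU
RACK_SRAM_GB = 128       # stated full-rack on-chip memory

implied_chips_per_rack = RACK_SRAM_GB / SRAM_PER_CHIP_GB
print(implied_chips_per_rack)  # 256.0
```

Under that assumption, a full rack would aggregate roughly 256 LPUs.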
However, the memory capacity of LPUs alone cannot accommodate the massive model parameters and KV cache required by systems such as Vera Rubin. NVIDIA therefore introduced “disaggregated inference” at GTC 2026, where the inference pipeline is divided into two stages through an AI factory operating system called Dynamo.
In agent-based AI workloads, the pre-fill and attention stages, which require intensive computation and large KV cache storage, are handled by Vera Rubin systems equipped with high-throughput and large-memory capacity. The decode and token generation stages, which are highly latency-sensitive and bandwidth-limited, are offloaded to LPU racks with expanded memory capacity.
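The two-stage split described above can be sketched as a simple router: prefill and attention run on a high-throughput, large-memory GPU pool, while latency-sensitive decode is offloaded to an LPU pool. This is a minimal illustration of the concept only; all class and method names are hypothetical and do not reflect Dynamo's actual API.

```python
from dataclasses import dataclass

@dataclass
class Request:
    prompt_tokens: int
    max_new_tokens: int

class GpuPool:
    """High-throughput, large-memory stage (Vera Rubin-class rack in the
    article): runs prefill and builds the KV cache. Hypothetical API."""
    def prefill(self, req: Request) -> dict:
        # A real system would hand off a KV cache reference, not a dict.
        return {"kv_cache_tokens": req.prompt_tokens}

class LpuPool:
    """Low-latency, bandwidth-bound stage (LPU rack in the article):
    generates tokens one decode step at a time. Hypothetical API."""
    def decode(self, kv_cache: dict, max_new_tokens: int) -> int:
        generated = 0
        while generated < max_new_tokens:
            # Each step appends to the KV cache and reads it back,
            # which is why decode is bandwidth- and latency-bound.
            kv_cache["kv_cache_tokens"] += 1
            generated += 1
        return generated

def serve(req: Request, gpus: GpuPool, lpus: LpuPool) -> int:
    kv = gpus.prefill(req)                       # stage 1: prefill/attention
    return lpus.decode(kv, req.max_new_tokens)   # stage 2: decode offload

print(serve(Request(prompt_tokens=1000, max_new_tokens=50),
            GpuPool(), LpuPool()))  # 50
```

The design point being illustrated: because the KV cache is built once on the GPU pool and only appended to during decode, the latency-critical loop can live entirely on hardware optimized for fast, small-batch memory access.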
The third-generation Groq LP30 chip, fabricated by Samsung, has entered full-scale production and is expected to begin shipping in 2H26. NVIDIA also plans to introduce a higher-performance LP40 chip in its next-generation Feynman architecture.