
Nvidia is gearing up for its next big leap in AI hardware.
The company said Sept. 8 that its Vera Rubin microarchitecture is now being taped out, ahead of a planned 2026 launch. A new variant, called Rubin CPX, will target AI workloads that demand massive context windows, according to Dave Salvator, Nvidia’s director of accelerated computing products.
“The Vera Rubin platform will mark another leap in the frontier of AI computing — introducing both the next-generation Rubin GPU and a new category of processors called CPX,” said Jensen Huang, founder and CEO of Nvidia, in a press release. “Just as RTX revolutionized graphics and physical AI, Rubin CPX is the first CUDA GPU purpose-built for massive-context AI, where models reason across millions of tokens of knowledge at once.”
The announcement came just before Nvidia revealed fresh MLPerf inference results on Sept. 9.
Nvidia announces new hardware and architecture
Some AI use cases involve context windows exceeding one million tokens, such as software development across codebases of more than 100,000 lines, or HD video generation. For these workloads, Nvidia will offer the Vera Rubin NVL144 CPX class of GPU starting in late 2026.
A variant of the Vera Rubin NVL144 built specifically for applications requiring long context windows, the CPX model delivers 8 exaflops of AI performance, 30 petaflops of NVFP4 compute for context processing, and 3x the exponent operations of Nvidia GB300 NVL72 systems. It also includes 128GB of GDDR7 memory, four NVENC encoders and four NVDEC decoders for generative video, and 100TB of fast memory.
“It unlocks a new tier of premium use cases like intelligent coding and video generation …” said Shar Narasimhan, Nvidia director of product marketing, AI and Data Center GPUs, at the pre-briefing.
Data center giga-scale reference designs can inform AI factory builds
Vera Rubin NDL 144 CPX can be thought of as part of a larger AI factory. On Sept. 9, Nvidia also announced plans to offer giga-scale reference designs for large data centers.
“This requires that we innovate and co-engineer with a broad set of infrastructure partners,” said Narasimhan.
Narasimhan added that Nvidia is entering a new era of designing data centers from the compute perspective in partnership with infrastructure companies. The company will provide reference designs covering architecture, engineering, and construction; design, simulation, and operations; power generation and storage; and mechanical, electrical, and plumbing.
Blackwell GPU sets a record on MLPerf benchmark
The MLPerf benchmark is a test organized by the MLCommons consortium and used by some companies to measure hardware and software performance on generative AI workloads.
The Nvidia Blackwell GPU set a new performance record on the Llama 3.1 405B Interactive benchmark, outperforming the Blackwell baseline by using a technique called disaggregated serving. This method splits inference into a compute-heavy prefill phase (processing the input context) and a memory-bound decode phase (generating output tokens), running each phase on separate GPUs, which allows improved performance from the same hardware.
“You can deliver that much more performance on the same platform,” Salvator said at the prebrief. “That performance can generate additional revenue for organizations that already have their solutions deployed.”
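The idea behind disaggregated serving can be seen in a minimal Python sketch. This is purely illustrative: the worker functions, the KVCache class, and the single-process flow are all hypothetical stand-ins, not Nvidia's implementation, which distributes the two phases across separate GPUs and transfers the attention cache between them.

```python
# Illustrative sketch of disaggregated serving (hypothetical names/logic):
# the compute-heavy "prefill" phase builds the attention key/value cache
# from the prompt, while the memory-bandwidth-bound "decode" phase reads
# that cache and generates tokens one at a time. In a real deployment,
# each phase runs on a separate pool of GPUs tuned for its bottleneck.

from dataclasses import dataclass

@dataclass
class KVCache:
    """Stands in for the key/value cache the prefill phase produces."""
    prompt: str
    num_tokens: int

def prefill_worker(prompt: str) -> KVCache:
    # Prefill processes the entire context in one pass; in practice this
    # is large batched matrix math suited to compute-dense hardware.
    return KVCache(prompt=prompt, num_tokens=len(prompt.split()))

def decode_worker(cache: KVCache, max_new_tokens: int) -> list[str]:
    # Decode consumes the transferred cache and emits tokens sequentially;
    # this phase is bound by memory bandwidth rather than raw compute.
    return [f"token_{i}" for i in range(max_new_tokens)]

def serve(prompt: str, max_new_tokens: int = 4) -> list[str]:
    cache = prefill_worker(prompt)               # phase 1: context processing
    return decode_worker(cache, max_new_tokens)  # phase 2: token generation

print(serve("Summarize the MLPerf results", 3))
```

Because the two phases have different bottlenecks, separating them lets each class of GPU stay busy with the work it is best at, which is the source of the extra throughput Salvator describes.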
Meanwhile, Microsoft showed the results of its experiment to speed up AI with an analog optical computer.