Nvidia Says Rubin Will Deliver 5x AI Inference Boost Over Blackwell
When it ships later this year, Nvidia’s upcoming Rubin GPU will sport 5x the NVFP4 inference performance and 3.5x the NVFP4 training performance of Blackwell, Nvidia CEO Jensen Huang said Monday at CES 2026, where Nvidia officially unveiled the Vera Rubin platform.
The AI revolution so far has run through Nvidia, the GPU chipmaker that has gobbled up 90% of the market for AI chips. Its current Blackwell GPU and Grace Blackwell CPU-GPU superchips have sold exceptionally well, and the company is looking to take that success to the next level with Vera Rubin.
“Vera Rubin arrives just in time for the next frontier of AI,” Huang said during his 1.5-hour CES keynote address. “I can tell you that Vera Rubin is in full production.”
Monday’s announcement was a mixture of old and new. Nvidia has been talking about its Vera Rubin superchip since June 2024, when it also started talking about NVLink-6, the scale-up interconnect used to build NVL72 systems. The company announced its Spectrum-X co-packaged optics chip (slated to ship in 2026) earlier this year, and launched its BlueField-4 data processing units (DPUs) in October.
Nvidia shared specs of its upcoming Rubin GPU
What we did not know were the performance marks of the new Rubin GPU, which have been kept under wraps up to this point. Huang also shared some color and context on what went into the “extreme co-design” behind the upcoming Vera Rubin NVL72 server, and why it was necessary.
The AI performance specs for Rubin are impressive. According to Nvidia, the new chip will deliver 50 petaflops of NVFP4 inference performance, 5 times more than Blackwell, and 35 petaflops of NVFP4 training performance, 3.5 times more. It will offer 22 TB per second of HBM4 memory bandwidth, a 2.8x improvement over Blackwell, and 3.6 TB per second of NVLink bandwidth per GPU, a 2x increase.
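Dividing each Rubin figure by its claimed speedup yields the Blackwell baselines those claims imply. A quick sketch, using only the numbers quoted above (the derived baselines are simple arithmetic, not Nvidia-published specs):

```python
# Rubin figures and speedup multiples as stated in the article.
rubin = {
    "nvfp4_inference_pflops": 50.0,  # 5x Blackwell
    "nvfp4_training_pflops": 35.0,   # 3.5x Blackwell
    "hbm4_bw_tbps": 22.0,            # 2.8x Blackwell
    "nvlink_bw_tbps": 3.6,           # 2x Blackwell
}
speedups = {
    "nvfp4_inference_pflops": 5.0,
    "nvfp4_training_pflops": 3.5,
    "hbm4_bw_tbps": 2.8,
    "nvlink_bw_tbps": 2.0,
}

# Implied Blackwell baseline = Rubin figure / claimed speedup.
implied_blackwell = {k: rubin[k] / speedups[k] for k in rubin}
for k, v in implied_blackwell.items():
    print(f"implied Blackwell {k}: {v:.2f}")
```

The implied baselines (10 petaflops of NVFP4 compute, roughly 8 TB/s of HBM bandwidth, 1.8 TB/s of NVLink bandwidth per GPU) line up with the figures Nvidia has publicized for Blackwell, so the speedup claims are internally consistent.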
The Vera CPU, which is based on an Arm design, will deliver twice the performance of the Grace CPU it replaces, according to Huang (who didn’t offer specifics). It will feature 88 custom Olympus cores running 176 threads, two per core, via Nvidia’s “spatial multi-threading.” It will also offer a 1.8 TB per second NVLink C2C connection, 1.5 TB of memory (3x that of Grace), and 1.2 TB per second of LPDDR5X memory bandwidth.
Huang also shared video of the first rack of a Vera Rubin NVL72 pod going online. The pod features 18 compute trays and nine NVLink switch trays, and weighs nearly 2 tons, he said. All told, it packs 220 trillion transistors and took 15,000 engineer-years to design.
Huang in front of a rack of Vera Rubin NVL72 servers at CES 2026
The NVL72 pod was an example of the kind of “extreme co-design” that Nvidia has been forced to do since Moore’s Law has slowed down, Huang said.
“We have a rule inside our company. It’s a good rule. ‘No new generation should have more than 1 or 2 chips change,’” Huang said during his keynote. “But the problem is … Moore’s Law has largely slowed, and so the number of transistors we can get year after year after year can’t possibly keep up with the 10 times larger models.”
As more AI tokens are generated and the costs come down, that puts pressure on Nvidia and other chipmakers to boost performance. Rubin features 1.6x more transistors than Blackwell, which is the starting point for the performance boost. But 1.6 doesn’t get you to 10x.
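The arithmetic behind that gap can be made explicit. Using only the figures quoted in this article, here is how much of the claimed 5x inference gain transistor count alone can explain, and how much must come from co-design:

```python
# Per the article: Rubin has 1.6x Blackwell's transistors, yet claims a
# 5x NVFP4 inference gain. The remainder must come from architecture.
transistor_ratio = 1.6
claimed_inference_gain = 5.0

architectural_gain = claimed_inference_gain / transistor_ratio
print(f"gain beyond transistor scaling: {architectural_gain:.3f}x")  # 3.125x
```

In other words, more than 3x of the claimed speedup has to come from changes other than transistor budget, which is the case Huang is making for redesigning every chip at once.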
“It is impossible to keep up with those kinds of rates, for the industry to continue to advance,” he said, “unless we deployed aggressive, extreme co-design: basically innovating across all of the chips, across the entire stack, all at the same time. Which is the reason why we decided that this generation, we had no choice but to design every chip over again.”
Nvidia shared specs of its upcoming Vera CPU
Huang pointed to Nvidia’s Tensor Cores, the specialized processing units within its GPUs that accelerate the matrix multiplication and accumulation (MMA) operations for AI workloads, as one of the main reasons the company will be able to deliver a 5x increase in AI inference performance with Rubin over Blackwell.
“It’s an entire processing unit that understands how to dynamically, adaptively adjust its precision and structure to deal with different levels of the transformer, so that you can achieve higher throughput wherever it’s possible to lose precision, and to go back to the highest possible precision, wherever you need to,” he said. “This is groundbreaking. I would not be surprised if the industry would like us to make this format and the structure an industry standard in the future. This is completely revolutionary. This is how we were able to deliver such a gigantic step up in performance even though we only have 1.6 times the number of transistors.”
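Nvidia hasn’t published the internals of NVFP4 or the Tensor Core logic Huang describes, but the general idea behind such low-precision formats, trading precision for throughput by storing values in a few bits with a shared per-block scale, can be sketched generically. The snippet below uses signed 4-bit integers as a stand-in (the real format is a 4-bit floating-point type); all function names are illustrative, not Nvidia APIs:

```python
import numpy as np

# Generic block-scaled 4-bit quantization sketch: one shared scale per
# small block of values, with low-bit payloads. This approximates the
# idea behind formats like NVFP4; it is NOT Nvidia's implementation.

def quantize_block_int4(x, block=16):
    """Quantize to signed 4-bit ints (-8..7) with one scale per block."""
    x = x.reshape(-1, block)
    scale = np.abs(x).max(axis=1, keepdims=True) / 7.0
    scale[scale == 0] = 1.0  # avoid division by zero on all-zero blocks
    q = np.clip(np.round(x / scale), -8, 7).astype(np.int8)
    return q, scale

def dequantize_block(q, scale):
    """Recover approximate FP32 values from quantized blocks."""
    return (q.astype(np.float32) * scale).reshape(-1)

x = np.random.default_rng(0).standard_normal(64).astype(np.float32)
q, s = quantize_block_int4(x)
x_hat = dequantize_block(q, s)
print(f"max abs round-trip error: {np.abs(x - x_hat).max():.3f}")
```

The payoff is that each value now occupies 4 bits instead of 32, so the same memory bandwidth and multiplier area move roughly 8x more values; the adaptive part Huang describes is deciding, layer by layer, where that precision loss is tolerable.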
Nvidia has not yet shared the full performance card for its Rubin GPUs. Some in the HPC community have been concerned that the Blackwell generation delivered less high-precision capability, such as 64-bit floating point (FP64) performance, than previous GPU generations. FP64 is critical for the traditional modeling and simulation workloads that have been the bread and butter of the supercomputing community for years.
Last month, Nvidia told HPCwire that it wasn’t abandoning 64-bit computing. We’ll likely have to wait until GTC 26 in March to see the FP64 performance specs, not to mention the power-consumption figures that are so critical these days.
This article first appeared on HPCwire.