NVIDIA | Blackwell Architecture, AI Infrastructure & Roadmap

NVIDIA's Engineering Philosophy

NVIDIA no longer treats the GPU as a standalone piece of silicon. The company's current engineering approach treats the entire data center network as a single unified supercomputer. The central bottleneck in modern AI training and inference is not raw single-die compute but communication latency across nodes. NVIDIA's answer is liquid-cooled, rack-scale integration built around NVLink switch chips and ConnectX-8 SuperNICs, designed to relieve that bottleneck before data ever reaches the processor.

The GB300 NVL72 rack system couples Grace CPUs and Blackwell Ultra GPUs into a fully liquid-cooled 72-GPU cluster that acts as one coherent memory pool. NVLink Switch chips let all 72 GPUs communicate with one another at full NVLink bandwidth. ConnectX-8 SuperNICs handle multi-tenant data offloading over 400GbE to 800GbE fabrics. NVIDIA's networking revenue grew more than 260% year-over-year as this architecture became the hyperscaler deployment standard.

That infrastructure demand is why NVIDIA's Data Center division accounts for roughly 91% of total company revenue. Consumer gaming, historically the company's identity, is now a secondary business line in financial terms. The RTX 50-series desktop cards launched in early 2025 sell to both gamers and AI researchers who need large local VRAM pools, so consumer GPU availability is partly constrained by the same AI compute buildout that drives hyperscale infrastructure investment.

Financial Snapshot

FY2026 Total Revenue: $215.9B
FY2026 Net Income: $120.1B
FY2026 Data Center Share: ~91%
FY2026 Gross Margin: 71.1%
Q1 FY2027 Revenue: $81.6B
Q1 FY2027 Data Center Revenue: $75.2B
Q1 FY2027 Gross Margin: 74.9%
Cash Reserves: $62B+

Source: NVIDIA FY2026 annual report and Q1 FY2027 earnings release. Data Center contribution includes compute and networking segments.
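As a quick sanity check on the snapshot, the ratios implied by the reported figures can be recomputed directly. The sketch below uses only the numbers listed above (in billions of USD). The helper name `share_pct` is an illustrative assumption, not anything drawn from NVIDIA's reporting.

```python
# Figures copied from the financial snapshot, in billions of USD.
FY2026_REVENUE = 215.9
FY2026_NET_INCOME = 120.1
Q1_FY2027_REVENUE = 81.6
Q1_FY2027_DATA_CENTER = 75.2


def share_pct(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`, rounded to one decimal place."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return round(100.0 * part / whole, 1)


# Net income as a share of revenue (net margin) for FY2026.
net_margin = share_pct(FY2026_NET_INCOME, FY2026_REVENUE)

# Data Center revenue as a share of total revenue for Q1 FY2027.
dc_share_q1 = share_pct(Q1_FY2027_DATA_CENTER, Q1_FY2027_REVENUE)

print(f"FY2026 net margin: {net_margin}%")
print(f"Q1 FY2027 Data Center share: {dc_share_q1}%")
```

On these inputs the Q1 FY2027 Data Center share comes out slightly above the ~91% full-year figure, and the FY2026 net margin is a little over half of revenue.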

Architecture Roadmap

Architecture: Process · Memory · Status
Hopper (H100 / H200): TSMC 4N · HBM2e / HBM3 · Production
Blackwell (B200 / GB300): TSMC 4N custom · HBM3e · Current
Vera Rubin (R200 / VR200): TSMC 3nm · HBM4, 22 TB/s · Late 2026
Rubin Ultra: TSMC 3nm+ · 12x HBM4 stacks · Mid-2027
Feynman: TBD · TBD · 2028 Target

Vera Rubin (late 2026): Moves to TSMC 3nm with 336B transistors across a multi-chip module that combines two compute dies with HBM4. Each HBM4 stack uses a 2,048-bit interface, for 22 TB/s of memory bandwidth per GPU. TDP ranges from 1,800W to 2,300W, which puts the part beyond practical air cooling. The paired Vera CPU features 88 custom Olympus cores with native FP8 vector execution for data-loading acceleration.
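To see how the 22 TB/s figure relates to the 2,048-bit per-stack interface, the per-pin data rate it implies can be worked out. This is a rough sketch, not a specification. The stack count of 8 is an assumption (the text above does not state it), terabytes are taken as decimal (10^12 bytes), and the function name is hypothetical.

```python
def per_pin_rate_gbps(total_tb_per_s: float, stacks: int, bus_bits_per_stack: int) -> float:
    """Per-pin data rate in Gb/s implied by an aggregate memory bandwidth.

    total_tb_per_s:     aggregate bandwidth per GPU in TB/s (decimal, 1 TB = 1e12 bytes)
    stacks:             number of HBM stacks per GPU (ASSUMED; not given in the text)
    bus_bits_per_stack: interface width of one stack in bits
    """
    if stacks <= 0 or bus_bits_per_stack <= 0:
        raise ValueError("stacks and bus_bits_per_stack must be positive")
    total_bits_per_s = total_tb_per_s * 1e12 * 8   # bytes/s -> bits/s
    total_pins = stacks * bus_bits_per_stack       # data pins across all stacks
    return total_bits_per_s / total_pins / 1e9     # bits/s per pin -> Gb/s


ASSUMED_STACKS = 8  # assumption for illustration only

rate = per_pin_rate_gbps(22.0, ASSUMED_STACKS, 2048)
per_stack_tb_s = 22.0 / ASSUMED_STACKS

print(f"Implied per-pin rate: {rate:.2f} Gb/s")
print(f"Implied per-stack bandwidth: {per_stack_tb_s:.2f} TB/s")
```

Under the 8-stack assumption this works out to about 10.7 Gb/s per pin and 2.75 TB/s per stack. A different stack count scales both figures inversely.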

Feynman (2028): Teased by NVIDIA leadership as engineered explicitly for physical AI, multi-agent foundation models, and extreme-scale simulation frameworks.

NVIDIA Coverage

Tech

Turning the Tables | Google Borrows Nvidia Financial Playbook to Break AI Chip Monopoly

Max DeLeonardis · Jun 20, 2026

Tech

Blackwell Commences Its Reign | NVIDIA GeForce RTX 50-Series Architecture, Specs, and Pricing

Max DeLeonardis · Jun 4, 2026

Tech

Nvidia GeForce RTX 5070 Review | The Midrange Blackwell Dilemma in 2026

Max DeLeonardis · May 30, 2026
