Hardware scaled

AI accelerators / advanced compute

GPU-dominated AI compute fractured into specialized lanes between 2024 and 2026 — wafer-scale (Cerebras), inference ASICs (Groq, Etched), photonic interconnect (Lightmatter), and per-hyperscaler custom silicon (TPU/Trainium/Maia/MTIA) — pulling 20–70% of opex out of training and inference.

What to watch next

Rubin (2H 2026) and the Cerebras-OpenAI 750 MW deployment going live; Lightmatter Passage co-packaged optics reaching production; neuromorphic chips (Loihi 3, NorthPole) crossing the practical edge-AI threshold; whether the NVIDIA-Groq integration cements inference dominance.

Key sub-ideas & techniques

GPU scaling (Hopper → Blackwell → Rubin) — NVIDIA's flagship line keeps multiplying effective AI throughput each generation: Blackwell (208B transistors, ~20 PFLOPS FP4 per B200) ships at scale in 2025, and Rubin (~3× Blackwell, 1.2 EFLOPS FP8 training per NVL144 rack) follows in 2H 2026. [source]
Wafer-scale processors — Cerebras WSE-3 (5nm, 4 trillion transistors, 900K cores, 44 GB on-chip SRAM, 21 PB/s memory bandwidth) keeps compute and memory on a single wafer, removing chip-to-chip data movement inside the processor; Cerebras positions it for training multi-trillion-parameter models. [source]
Inference-specialized ASICs — Fixed-function silicon optimized for narrow inference workloads — Groq LPU's deterministic conveyor-belt architecture and Etched's transformer-only Sohu ASIC compete with GPUs on $/token. [source]
Hyperscaler custom silicon — Cloud platforms have abandoned GPU monoculture: Google Trillium / TPU v6, AWS Trainium 3, Meta MTIA 300/500, Microsoft Maia 200 each deliver hyperscaler-specific compute economics. [source]
Photonic & neuromorphic compute — Lightmatter's photonic processor and IBM NorthPole / Intel Loihi 3 neuromorphic chips break with the GPU paradigm — using photons or spiking neurons for orders-of-magnitude better energy per inference for the right workloads. [source]
Google four-partner inference TPU supply chain — Google is diversifying its custom AI silicon by adding Marvell (memory processing unit + next-gen inference TPU) alongside Broadcom and MediaTek, signaling a hyperscaler push to commoditize inference accelerators and reduce NVIDIA dependence. [source]
NVIDIA RTX Spark (N1X) client AI superchip — Arm-CPU + Blackwell-GPU SoC with 128GB unified memory bringing data-center-class local AI inference (120B-param LLMs, 1M-token context) to consumer Windows PCs. [source]
Cerebras Systems — Wafer-scale AI accelerator (CS-3/WSE-3) maker expanding European inference capacity
Azure AMD Helios (UALink) — 72 MI455X GPUs, 31TB HBM4, ~260TB/s intra-rack, open UALink-over-Ethernet fabric [source]
Co-packaged optics / silicon photonics — Silicon-photonics foundries hit a mass-production inflection for co-packaged optics — TrendForce (Jul 30, 2026) reports UMC/SILITH shipped first mass-produced 12-inch SiPh wafers and TSMC's COUPE platform entered production, relieving copper-interconnect bandwidth limits for AI accelerators. [source]
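
Several figures in this list are rack-level or wafer-level totals, and a quick division makes them easier to compare. The sketch below is a back-of-envelope check, not vendor data. It uses only numbers quoted above and assumes decimal units (1 TB = 1,000 GB), an even split across the 72 Helios GPUs, and 2 bytes per parameter for the WSE-3 estimate. The helper name per_unit is hypothetical.

```python
# Back-of-envelope checks using only figures quoted in the list above.
# Assumptions (not from the source): decimal units, even split across GPUs,
# and 16-bit (2-byte) weights for the SRAM capacity estimate.

def per_unit(total: float, units: int) -> float:
    """Evenly divide a rack-level total across its accelerators."""
    return total / units

# Azure AMD Helios rack: 72 GPUs, 31 TB HBM4, ~260 TB/s intra-rack.
helios_gpus = 72
hbm_per_gpu_gb = per_unit(31_000, helios_gpus)      # GB of HBM4 per GPU
fabric_per_gpu_tbs = per_unit(260, helios_gpus)     # TB/s of fabric per GPU

# Cerebras WSE-3: 44 GB on-chip SRAM, 21 PB/s memory bandwidth.
sram_bytes = 44e9
bytes_per_param = 2                                  # assumed 16-bit weights
params_in_sram_billions = sram_bytes / bytes_per_param / 1e9
sram_sweep_us = sram_bytes / 21e15 * 1e6             # time to read all SRAM once

print(f"Helios HBM4 per GPU:       ~{hbm_per_gpu_gb:.0f} GB")
print(f"Helios fabric per GPU:     ~{fabric_per_gpu_tbs:.1f} TB/s")
print(f"WSE-3 params held in SRAM: ~{params_in_sram_billions:.0f}B at 2 bytes each")
print(f"WSE-3 full SRAM sweep:     ~{sram_sweep_us:.1f} microseconds")
```

Under these assumptions the Helios figures work out to roughly 431 GB of HBM4 and 3.6 TB/s of fabric bandwidth per GPU, and the WSE-3's 44 GB of SRAM holds about 22B 16-bit parameters, so a multi-trillion-parameter model cannot keep all of its weights in on-chip SRAM at once.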

Current frontier

NVIDIA Rubin GPUs ship in 2H 2026 with ~3× Blackwell performance and 1.2 EFLOPS FP8 training capability per NVL144 rack; Rubin Ultra follows in 2027. [source]
Cerebras Systems signed a >$10B agreement with OpenAI (Jan 2026) for ~750 MW of inference compute through 2028 and filed for IPO at ~$23B valuation in April 2026. [source]
NVIDIA acquired Groq for ~$20B (Dec 2025) and integrated the LPU into its 2026 GTC stack; Groq 3 LPU achieves ~1,500 tokens/sec on agentic AI inference. [source]
Hyperscaler custom silicon scaled across all four majors: Google Trillium (4× LLM training perf vs v5e, 2× HBM bandwidth), AWS Trainium 3 (2.52 PFLOPS MXFP8, 144 GB HBM3e), Meta MTIA (4× HBM bandwidth across generations), Microsoft Maia 200 (10 PFLOPS FP4). [source]
Intel Loihi 3 (Jan 2026) delivers 8M neurons / 64B synapses at 4nm running at ~1.2W peak vs 300W+ for GPU equivalents — neuromorphic compute crossing the threshold for practical edge AI. [source]
Cerebras' $3.5B IPO at ~$26.6B valuation, anchored by a $20B+ OpenAI compute deal, validates wafer-scale silicon as a credible alternative to GPU dominance for AI inference. [source]
OpenAI + Broadcom 'Jalapeño' reticle-sized inference ASIC, design-to-tape-out in ~9 months, perf/watt 'substantially better' than SoTA, gigawatt-scale with Microsoft from end-2026. [source]
SK hynix sampling 12-high HBM4E (16 Gbps/pin, 48 GB/stack, >20% better power, +17% heat resistance vs HBM4) for next-gen AI accelerators. [source]
NVIDIA fully liquid-cooled AI server (closed-loop, 45°C supply, near-zero water); est. >$4M/yr savings at 50 MW. [source]
AMD+Cerebras disaggregated inference (Helios prefill + WSE token-gen): up to 5x tokens/sec/watt vs WSE-only on a 1T-param model; Cerebras Cloud H2 2026. [source]
Peking Univ. 40nm phase-change memristor chip: full neural dynamical system in 2.12ms; up to 24.7x more power-efficient; claimed 478x vs A100 on 3D cortical reconstruction (Science 2026). [source]
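
A few of the headline numbers above imply derived figures that are easy to work out. The sketch below is illustrative arithmetic only. The 2048-bit stack interface is an assumption: it is the JEDEC HBM4 width, and whether HBM4E keeps it is not stated in the source. The 300 W GPU figure is the lower bound quoted above, and all variable names are hypothetical.

```python
# Derived figures from the frontier items above. Illustrative arithmetic only.

# SK hynix HBM4E: 16 Gbps/pin, 48 GB per 12-high stack.
pin_rate_gbps = 16
interface_bits = 2048        # ASSUMPTION: HBM4-style 2048-bit stack interface
stack_bandwidth_gbs = pin_rate_gbps * interface_bits / 8   # GB/s per stack
gb_per_die = 48 / 12                                       # GB per DRAM die

# Intel Loihi 3: ~1.2 W peak vs 300 W+ for GPU equivalents.
power_ratio = 300 / 1.2      # lower bound, since the GPU figure is "300W+"

# NVIDIA liquid-cooled server: est. >$4M/yr savings at 50 MW.
savings_per_mw_year = 4_000_000 / 50                       # USD per MW per year

print(f"HBM4E bandwidth per stack: {stack_bandwidth_gbs:.0f} GB/s")
print(f"HBM4E capacity per die:    {gb_per_die:.0f} GB")
print(f"Loihi 3 power advantage:   at least {power_ratio:.0f}x")
print(f"Cooling savings:           ${savings_per_mw_year:,.0f} per MW-year")
```

With the assumed 2048-bit interface, one HBM4E stack would deliver about 4.1 TB/s, and the 12-high, 48 GB stack implies 4 GB per die. The Loihi 3 comparison comes to at least 250× lower peak power, and the cooling estimate to roughly $80,000 per MW per year.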

Key people

Jensen Huang · CEO & Co-founder · NVIDIA [source]
Andrew Feldman · Co-founder & CEO · Cerebras Systems [source]
Jonathan Ross · Founder · Groq (acquired by NVIDIA Dec 2025) [source]
Jim Keller · CEO · Tenstorrent [source]
Norm Jouppi · Engineering Fellow; lead architect, Google TPU · Google [source]
Nick Harris · Co-founder & CEO · Lightmatter [source]

Startups & labs to watch

Tenstorrent · STARTUP · Series B (2024) led by Khosla Ventures and others — Jim Keller's open-source RISC-V + Metalium ASIC effort, pursuing a GPU alternative for hyperscale clusters via 12×400 Gbps Ethernet scale-out; Blackhole successor scaling. [source]
Rebellions Inc. · STARTUP · Series C 2025, $1.4B valuation; backed by Arm, Samsung Ventures, and Synopsys — Korean accelerator maker with a 4nm UCIe-Advanced quad-chiplet architecture (REBEL-Quad) and 144 GB HBM3e, claiming ~3.2× tokens/W vs H200. [source]
FuriosaAI (RNGD next-gen) · STARTUP · Raised capital via LG and OpenAI partnerships — Korean AI ASIC maker on TSMC 5nm; the founder declined Meta's $800M acquisition offer (March 2025) and is pursuing a 2027 IPO. [source]
Lightmatter (photonic compute) · STARTUP · Series D ~$400M (2024); SoftBank Vision Fund, GV — Shipping the Envise photonic processor and Passage L200/L20 co-packaged optics; the leading commercial bet on photonic interconnect for AI datacenters. [source]
SambaNova Systems (SN50 inference accelerators) · COMPANY · $1B Series F (first tranche) at an $11B valuation, led by General Atlantic (July 8, 2026) — Won JPMorgan Chase as an on-prem inference customer, a notable enterprise inference-ASIC win vs NVIDIA. [source]