State of AI Compute Index v4 (June 2025)
Chip counts per paper and research topic, and updated NVIDIA cluster sizes.
Today, we release v4 of the Compute Index in collaboration with Zeta Alpha. You'll now find updated counts as of June 2025 for AI research papers using chips from NVIDIA, TPUs, Apple, Huawei, AMD, ASICs, FPGAs, and AI semi startups, as well as updates to A100 and H100/H200 cluster sizes. We also include new data on the most and least commonly used chips for specific research topic areas.
A breather at the peak
2024 was a blockbuster year for AI research: AlphaFold 3, Llama 3, new synthetic environments, and many frontier model releases. According to our analysis using Zeta Alpha, nearly 49,000 open-source AI research papers mentioned the use of a specific AI accelerator, such as NVIDIA GPUs, TPUs, or AMD chips, up 58% YoY.
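For readers curious about the mechanics, here is a minimal sketch of how accelerator mentions could be detected in paper text. It assumes simple keyword matching; the actual Zeta Alpha pipeline is not described in this post, and the patterns and vendor groupings below are illustrative only.

```python
import re

# Minimal sketch of accelerator-mention detection via keyword matching.
# The real Zeta Alpha pipeline is not described in this post; these
# patterns and vendor groupings are illustrative assumptions.
CHIP_PATTERNS = {
    "NVIDIA": re.compile(r"\b(A100|H100|H200|GH200|V100|RTX\s?3090|RTX\s?4090|Jetson)\b", re.I),
    "Google TPU": re.compile(r"\bTPU(?:\s?v\d\w?)?\b", re.I),
    "AMD": re.compile(r"\bMI\s?(250|300X?)\b", re.I),
    "Huawei": re.compile(r"\bAscend\b", re.I),
}

def chips_mentioned(text: str) -> set[str]:
    """Return the vendors whose accelerators appear in a paper's text."""
    return {vendor for vendor, pattern in CHIP_PATTERNS.items() if pattern.search(text)}

# Example: a hypothetical abstract mentioning two vendors.
abstract = "We fine-tune on 8x NVIDIA H100 GPUs and evaluate on a TPU v5e pod."
print(chips_mentioned(abstract))  # e.g. {'NVIDIA', 'Google TPU'}
```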
2025, however, looks a bit different, at least so far. Based on counts through June 1, 2025, and a volume-adjusted projection for the rest of the year, we expect a full-year total of 43,300 papers citing NVIDIA, AMD, TPUs, and large AI chip startups. This would be an 11% decline from the prior year, the first such drop in six years.
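As a rough illustration of what a volume-adjusted projection might look like (the post does not spell out the exact method, so the scaling assumption and the numbers below are hypothetical):

```python
# Hypothetical sketch of a volume-adjusted full-year projection. We assume
# year-to-date counts are scaled by the share of annual paper volume that
# historically arrives by the June 1 cutoff.

def project_full_year(ytd_count: int, ytd_volume_share: float) -> int:
    """Scale a year-to-date count by the typical share of annual
    volume published by the cutoff date."""
    if not 0 < ytd_volume_share <= 1:
        raise ValueError("ytd_volume_share must be in (0, 1]")
    return round(ytd_count / ytd_volume_share)

# Illustrative numbers only: if ~41% of a typical year's papers appear by
# June 1, a year-to-date count of ~17,750 projects to ~43,300 for the year.
print(project_full_year(17_750, 0.41))  # -> 43293
```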
While some of this slowdown can be attributed to timing mismatches between hardware deployment and publication, it likely also reflects a growing reluctance among large industrial AI labs to publish their latest work. As competitive pressures and safety concerns mount, many frontier research groups are opting to keep breakthroughs private or delay publication, impacting overall visibility in the open-source paper ecosystem.
Overall, though, NVIDIA remains dominant: its chips appear in nearly 90% of all accelerator mentions in 2025. But that share has drifted down from a 2023 peak of 94%, and total NVIDIA-citing papers are forecast to drop to 38,735 this year, down 13% YoY. Meanwhile, AMD is growing nearly 100% YoY, buoyed by MI300X deployments, while Google TPUs show a slight YoY decline despite the introduction of the v6.
So what does this mean? We think it is mostly a story of cycle timing rather than momentum loss. Research kicked off on H100 and H200 clusters in late 2024 won't be published until late Q3 or Q4 2025. In parallel, we see growing use of shared compute clouds and managed APIs, where authors are less likely to specify the underlying silicon. And for the academic long tail, rising GPU costs have made full-stack training harder to justify, especially for teams without access to institutional compute credits. This, along with the increasing availability of strong open-weight models, better small-scale baselines, and evolving norms around reproducibility, could be nudging more researchers toward inference and lightweight fine-tuning workflows on open-weight models.
NVIDIA is still king, but might be tapering
The composition of individual NVIDIA chip mentions is shifting in subtle ways:
- H100/H200 mentions are up 115% YoY, reflecting late-stage adoption of 2023-24 builds, even as growth moderates into 2025.
- Jetson citations are up 33% YoY, possibly due to robotics and edge AI interest in low-power inference.
- V100 usage is declining for another year, continuing its slide since the 2023 peak.
- The RTX 3090 is losing ground to the 4090 (up 30% YoY), having peaked in 2024.
Research topic signals by AI chip
We then classified each paper by its research topic to surface topic trends associated with specific accelerators. The dataset covers 6,356 papers published between January 1 and June 1, 2025. Topic labels were assigned using GPT-4o-mini, and we filtered the results to include only chip-topic pairs with at least three papers. To focus the analysis on meaningful differentiation, we excluded pairs whose percentage-point difference fell between -1 and +1 relative to the corpus-wide baseline. The results reveal clear skews between specific chips and research areas.
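Concretely, here is a minimal sketch of how these percentage-point skews could be computed. The exact procedure isn't fully specified in this post, so the function below and its demo data are assumptions for illustration.

```python
from collections import Counter

# Sketch of the chip-topic skew computation, under assumptions the post
# implies but does not fully spell out: for each (chip, topic) pair, compare
# the topic's share among that chip's papers to its share across the whole
# corpus, in percentage points (pp).
def topic_skews(papers, min_pair_count=3, dead_zone_pp=1.0):
    """papers: iterable of (chip, topic) pairs, one per paper-chip mention."""
    papers = list(papers)
    total = len(papers)
    topic_counts = Counter(topic for _, topic in papers)  # corpus-wide topic counts
    chip_totals = Counter(chip for chip, _ in papers)     # papers per chip
    pair_counts = Counter(papers)                         # papers per (chip, topic) pair

    skews = {}
    for (chip, topic), n in pair_counts.items():
        if n < min_pair_count:
            continue  # drop chip-topic pairs with fewer than three papers
        chip_share = 100 * n / chip_totals[chip]
        base_share = 100 * topic_counts[topic] / total
        diff = chip_share - base_share
        if abs(diff) < dead_zone_pp:
            continue  # exclude differences inside the -1..+1 pp dead zone
        skews[(chip, topic)] = round(diff, 2)
    return skews

# Toy example: Jetson papers skew toward robotics relative to the corpus.
demo = [("Jetson", "robotics")] * 4 + [("H100", "LLMs")] * 5 + [("H100", "robotics")]
print(topic_skews(demo))  # {('Jetson', 'robotics'): 50.0, ('H100', 'LLMs'): 33.33}
```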
Of note, LLM-focused research papers most commonly use the AMD MI300, MI250, Huawei Ascend, and NVIDIA H100/H200 chips. Meanwhile, robotics research overwhelmingly uses the NVIDIA Jetson.
By contrast, LLM-focused papers least frequently use ASICs, the NVIDIA Jetson, the NVIDIA 4090, and the Apple M1. Diffusion-model papers rarely use FPGAs, either.
If we look at each major research topic individually and ask which chips are over- or under-represented relative to the corpus-wide baseline, here are the results:
- 3D models: most associated with the NVIDIA 4090 (+2.91 pp), least associated with FPGAs (-4.76 pp)
- Computer vision: most associated with the Jetson (+3.42 pp), least associated with the H100/H200 (-4.10 pp)
- Diffusion models: most associated with the Huawei Ascend (+4.25 pp), least associated with FPGAs (-5.62 pp)
- Edge computing: most associated with the Jetson (+8.94 pp)
- LLMs: most strongly associated with the MI300 (+42.53 pp), least associated with ASICs (-9.77 pp)
- Multimodal models: most associated with the M4 (+8.32 pp), least associated with the Jetson (-1.72 pp)
- Post-quantum cryptography: most associated with ASICs (+3.28 pp)
- Quantization and reasoning: each most associated with the MI300 (+8.38 pp and +7.14 pp, respectively), least associated with the Jetson (-1.17 pp and -1.72 pp, respectively)
- Reinforcement learning: most associated with the Huawei Ascend (+7.78 pp) and the NVIDIA Jetson (+4.19 pp), least associated with FPGAs (-2.97 pp)
- Robotics: most associated with the Jetson (+19.59 pp), least associated with the V100 (-3.32 pp) and the H100/H200 (-2.87 pp)
- Speech: most associated with the M4 (+7.26 pp)
Startup chips: still a niche, but the fastest-growing one
The most bullish trend in our dataset can be found amongst the startup silicon cohort: Cerebras, Groq, Graphcore, SambaNova, Cambricon, and Habana. Collectively, they show +19% YoY growth, with an estimated 695 papers citing their hardware in 2025, up from 586 in 2024.
Their absolute share, however, remains tiny: just 1.6% of all papers that mention any AI accelerator. Even so, their trajectories are notable:
- Cerebras WSE-3 benefits from open-sourcing large-scale SlimPajamas-2 training runs.
- Groq's LPU attracts a wave of interest from academic inference work, driven by viral low-latency demos.
- Habana Gaudi-2 continues to show up in AWS-funded academic projects.
- Graphcore is collapsing post-acquisition, with 2025 paper mentions down sharply from their 2022 peak.
Updated: Major H100 and GH200 cluster deployments
We've also updated our tracking of large-scale A100 and H100/H200 deployments with several new high-profile additions:
- Berzelius (Sweden): 752 H100 at Linköping University (source)
- Shaheen III Phase-2 (Saudi Arabia): 2,800 H100 at KAUST (source)
- Israel-1 (Israel): NVIDIA R&D cluster with 2,048 H100 (source)
- Microsoft Eagle (USA): 14,400 H100 (source)
- JUPITER Booster (Germany): 24,000 GH200 (source)
- NVIDIA Eos (USA): 4,608 H100 (source)
These systems are expected to influence both model training scale and downstream research pipelines throughout late 2025 and into 2026.
Full A100 and H100 cluster charts are available in the live charts at www.stateof.ai/compute.
Looking ahead: what might swing the charts next
Several known unknowns will shape the next update:
- New datacenter regions in the Gulf: Major NVIDIA GPU clusters in the UAE and Saudi Arabia are ramping in 2025, including KAUST's Shaheen III and G42's growing footprint. These may begin to appear more visibly in paper metadata by early 2026.
- Stargate (USA): The first phase of Stargate is expected to go live later this year. If operational as planned, it will represent the first large-scale deployment of a vertically integrated AI-native datacenter built around liquid-cooled NVIDIA infrastructure.
- H200 and B100: As these chips enter wider circulation, expect a Q4 bounce in NVIDIA mentions.
- MI300X: Already present in tuning pipelines; the next test is whether it sees real training workloads.
- Groq's compiler UX: If porting friction keeps dropping, academic teams may adopt LPU-backed inference at higher rates.
- FPGAs: EU-funded open hardware initiatives could breathe new life into an accelerator category that's languished since 2020.
Conclusion
After two years of rapid scale-up in compute mentions, 2025 looks quieter, though likely not for long. NVIDIA remains the default. Startup silicon continues to climb from a low base. And the next wave of hardware adoption is already underway, just not yet visible in the preprint timelines.
A few notes
- We take the view that usage of chips in AI research papers (early adopters) is a leading indicator of industry usage.
- Papers using AI semi startup chips almost all have authors from the startup.
- 2025 figures = real counts through June 1, 2025 + a volume-adjusted full-year forecast. Data from Zeta Alpha's open-source AI paper index.
See the live charts here: www.stateof.ai/compute
CUDA is still king 👑