For a few years, the “AI chip war” story has been told like a switch: either you can buy Nvidia, or you can’t.
But 2025 looks less like a switch and more like a market migration.
On April 1, 2026, Reuters reported IDC data showing that Chinese GPU and AI chip vendors shipped about 1.65 million AI accelerator cards in China in 2025, roughly 41% of the market. Nvidia still led shipments with about 2.2 million cards (55% share), while AMD shipped about 160,000 cards (roughly 4%). Across Nvidia, AMD, and Chinese vendors, total shipments in China in 2025 were about 4 million cards.
That shift matters for builders because it changes what “deploying in China” means at the infrastructure layer: not just export-control compliance, but hardware diversity, software portability, and vendor strategy.
What actually happened (in plain terms)
IDC’s shipment data (as summarized by Reuters) points to two simultaneous trends:
- Nvidia remains the market leader in China—but with a meaningfully smaller share than its historical dominance.
- Domestic vendors crossed a scale threshold where “buy Chinese” is no longer just policy intent; it’s visible in unit volumes.
Reuters’ summary also breaks out the domestic stack:
- Huawei led domestic shipments (about 812,000 chips—roughly half of domestically branded shipments).
- Alibaba’s T-Head shipped about 265,000.
- Baidu’s Kunlunxin and Cambricon shipped about 116,000 each.
The immediate driver is familiar: successive U.S. export controls cut China off from Nvidia’s most advanced products, while Beijing has pushed agencies and companies toward domestic alternatives.
But the builder implication is more subtle: the China AI compute market is becoming structurally multi-accelerator.
Why this is a bigger deal than a “market share” chart
If you read the story as “Nvidia is losing,” you’ll miss the most actionable point:
The unit volumes imply real production deployments, not just lab pilots.
At that scale, the market starts to develop its own gravity:
- procurement teams demand roadmaps, support, and availability
- integrators build repeatable system SKUs around domestic parts
- software teams have to pick which runtimes are first-class, and which are “best effort”
That is how platform shifts become sticky—especially when policy incentives and supply constraints both point in the same direction.
Who is affected (and how)
Builders shipping AI products in China
If your product depends on GPU capacity in-region, you’re moving into a world where “CUDA-only” is a risk posture, not a default.
You should expect:
- more variance in accelerator availability by province/provider
- more heterogeneity in inference performance vs. cost vs. power
- more requests for “works on vendor X” proofs, not just “works on Nvidia”
Infra teams outside China (yes, you too)
Even if you never deploy in China, the market split changes the global ecosystem:
- it pulls engineering attention toward non-Nvidia toolchains
- it encourages forks in kernels, compilers, and inference engines
- it shifts how vendors think about “portable AI” as a selling point
This is the same export-control reality we saw from a different angle in Supermicro Smuggling Claims Show How Messy AI Chip Export Controls Have Become. When access is constrained, the market doesn’t stop—it reroutes.
Nvidia (and the “China SKU” era)
Nvidia’s remaining lead suggests two things can both be true:
- developers still prefer the Nvidia stack where they can get it
- the addressable market is increasingly defined by what’s licensable and shippable, not what’s best on a benchmark
If you’re tracking Nvidia’s roadmap, this is a reminder that product strategy and geopolitics now co-design the SKU list—an undercurrent behind much of the platform messaging in our Nvidia GTC 2026 keynote recap.
What changes next (the builder checklist)
If you build and operate AI systems that might need to run across multiple accelerator families, here is the practical checklist to adopt now—before you’re forced into it mid-migration.
1) Treat “accelerator choice” like a portability requirement
The goal isn’t perfect portability across every chip. It’s avoiding a single-vendor hard dependency in the places where constraints are foreseeable.
Tactics that usually pay off:
- keep model execution behind a thin internal “inference backend” interface
- standardize model packaging and pre/post-processing (tokenization, safety filters, logging) outside the backend
- build a performance/cost/power envelope for each workload, not just a “fastest GPU” target
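As a sketch of the first tactic, the "thin internal inference backend interface" might look like the following. This is illustrative only: the names (`InferenceBackend`, `EchoBackend`) are hypothetical, and a real backend would wrap a specific vendor runtime.

```python
from abc import ABC, abstractmethod
from typing import Any


class InferenceBackend(ABC):
    """Thin seam between product code and accelerator-specific runtimes."""

    @abstractmethod
    def load(self, model_path: str) -> None:
        """Load model weights into the target accelerator."""

    @abstractmethod
    def infer(self, inputs: Any) -> Any:
        """Run one inference call."""


class EchoBackend(InferenceBackend):
    """Stand-in backend for tests; real ones wrap a vendor runtime."""

    def load(self, model_path: str) -> None:
        self.model_path = model_path

    def infer(self, inputs: Any) -> Any:
        return inputs


def run(backend: InferenceBackend, model_path: str, inputs: Any) -> Any:
    # Product code depends only on the interface; swapping the backend
    # (Nvidia, Huawei Ascend, etc.) becomes a config change, not a rewrite.
    backend.load(model_path)
    return backend.infer(inputs)
```

The point of the seam is that tokenization, safety filters, and logging live outside it, so qualifying a new accelerator means implementing one class, not auditing the whole codebase.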
2) Make kernel and runtime choices explicit
Many teams accidentally bake their accelerator assumption into the most expensive-to-change layer: custom kernels and runtime tooling.
If you’re doing any of the following, you need a portability plan:
- hand-written CUDA kernels or CUDA-only extensions
- tight coupling to Nvidia-specific inference runtimes
- operational tooling that assumes Nvidia observability primitives
Even if you stay Nvidia-first, treat alternatives as disaster recovery for procurement, not as an ideological choice.
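One way to keep the accelerator assumption out of the expensive layer is to route backend selection through a single, explicit chooser with an ordered preference list. A minimal sketch, with placeholder probes (real ones would check drivers, devices, and runtime versions):

```python
from typing import Callable, Dict, List

# Registry of availability probes per accelerator family.
# The probe bodies here are stubs; real probes would detect
# the installed runtime for each family.
PROBES: Dict[str, Callable[[], bool]] = {
    "cuda": lambda: False,    # e.g. probe for an Nvidia runtime
    "ascend": lambda: False,  # e.g. probe for a Huawei Ascend runtime
    "cpu": lambda: True,      # always-available fallback
}


def select_backend(preferences: List[str]) -> str:
    """Return the first preferred backend whose probe succeeds."""
    for name in preferences:
        probe = PROBES.get(name)
        if probe is not None and probe():
            return name
    raise RuntimeError(f"no available backend among {preferences}")
```

With this in place, "Nvidia-first with a procurement fallback" is a one-line preference list rather than a scattered set of hard-coded assumptions.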
3) Re-plan procurement as a software problem
When a market becomes multi-accelerator, procurement stops being "buy the same thing again." It becomes:
- qualifying multiple accelerator classes
- validating that your inference stack matches your target hardware mix
- maintaining deployment profiles (power, cooling, rack density, networking) across more than one design point
This is where infra teams can create a compounding advantage: the teams that can deploy across heterogeneous accelerators can keep shipping when supply (or policy) moves.
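To make "deployment profiles" concrete, here is one way to represent a design point in code. The fields and numbers are hypothetical placeholders, not real vendor specs:

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DeploymentProfile:
    """One design point: an accelerator class and its facility envelope."""
    accelerator: str
    cards_per_node: int
    kw_per_node: float   # power draw per node
    nodes_per_rack: int  # limited by cooling and rack density

    def racks_for(self, total_cards: int) -> int:
        """Racks needed to host a given card count under this profile."""
        cards_per_rack = self.cards_per_node * self.nodes_per_rack
        return -(-total_cards // cards_per_rack)  # ceiling division


# Two hypothetical design points; real numbers come from vendor specs.
profile_a = DeploymentProfile("vendor-a", cards_per_node=8,
                              kw_per_node=10.0, nodes_per_rack=4)
profile_b = DeploymentProfile("vendor-b", cards_per_node=8,
                              kw_per_node=6.5, nodes_per_rack=6)
```

Maintaining profiles like these per accelerator class is what lets capacity planning answer "what does this order actually cost us in racks and power" for more than one vendor at a time.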
The durable takeaway
The Reuters/IDC shipment data is not just a snapshot of China.
It’s a preview of what happens when compute becomes strategic: the market turns into a fragmented platform landscape, and builders who want consistent delivery have to engineer for it.
If you only remember one thing, make it this:
The real risk isn’t “Nvidia vs. China.” The risk is “single-stack vs. multi-stack,” and the transition cost lands on builders first.