AI Daily
Geopolitics • Thursday, 21 May 2026

China's Plan to Win the AI Race Without Beating Nvidia

By AI Daily Editorial • Thursday, 21 May 2026

For three years the standard story about AI competition has been a chip story. American export controls would deny China the most advanced silicon, the gap in raw performance would widen, and the United States would walk into the inference era with an unassailable lead. A pair of developments this week suggests Beijing has quietly stopped contesting that frame and started competing on the parts of the stack where it can win. A new strategic analysis in The Diplomat lays out the doctrine. A 1.54-exaflop supercomputer built entirely from Huawei ARM cores, reported by TechRadar, is the proof of concept.

The Diplomat piece, by analysts of Sino-US competition, borrows a phrase from Mao to describe the approach: encircle the cities from the countryside. The "cities" are the high-end training market dominated by Nvidia's H-series chips and the frontier reasoning models of OpenAI, Anthropic and Google. The "countryside" is the much larger global market for inference: the routine generation of tokens for coding, document processing, customer service and enterprise automation. Rather than fight the United States where it is strongest, China is undercutting it where it is most exposed.

The numbers in the analysis are pointed. DeepSeek reportedly trained its V3 model for $6 million, using a tenth of the compute Meta used for a comparable Llama 3.1 model. MiniMax's M2.5 model scores within 0.6 percentage points of Anthropic's Claude Opus 4.6 on the SWE-Bench coding benchmark, yet costs roughly one-twentieth as much per task. On OpenRouter, the world's largest LLM API aggregator, Chinese open-weight models passed American models in weekly token volume for the first time in February, and by the end of that month they accounted for 61 percent of all token consumption on the platform. OpenRouter's chief operating officer noted that the heaviest users of Chinese tokens for agentic workflows were American firms.

That last detail is the strategic punchline. When a U.S. enterprise routes an agentic call through MiniMax or Zhipu's GLM-5, it is not just buying a cheaper input. It is embedding Chinese inference into its operational stack, fine-tuning workflows against those models, and accumulating switching costs that compound with use. Because the leading Chinese models are open-weight, even an outright API ban would not sever the dependency: companies can self-host the same models on domestic hardware and keep the cognitive plumbing intact.

The hardware story this week sharpens the same point. China's new LineShine supercomputer, reported by TechRadar via Tom's Hardware, packs 2.45 million ARMv9 cores across 20,480 nodes and delivers 1.54 exaflops of AI performance without a single GPU. It is less power-efficient than an equivalent Nvidia cluster, and almost no one outside China would build something like it by choice. But that is the point. China is willing to accept lower density and higher power consumption in exchange for independence from CUDA and the Nvidia supply chain, and it has the cheap electricity to absorb the trade-off. By the end of 2025, China's installed generation capacity had reached 3.89 billion kilowatts, with wind and solar contributing 47 percent of the total, and Chinese industrial electricity prices run 30 to 50 percent below American ones.

Put the strands together and a coherent strategy emerges. Algorithmic efficiency closes part of the silicon gap. Cheap, abundant power closes another part. Aggressive pricing on open-weight models captures the routine inference workloads that pay the bills. CPU-based supercomputing provides a domestic fallback if Nvidia is permanently denied. None of this delivers the most advanced model in the world, and on pure mathematical reasoning the American labs still lead clearly. But the analysis argues the United States is busy fortifying the layer of the stack that mattered in 2022, while the contest has moved to inference watts, energy policy and ecosystem lock-in. Whether Washington has the political bandwidth to refocus on grid build-out, inference-stack research and a "trusted token" alliance with the EU, Japan and the Gulf is now the question. The chip war was not a strategic error. It is just no longer the war being fought.

Sources