There is a paradox at the heart of the AI hardware moment: the companies most invested in reducing their dependence on NVIDIA are also the ones deepening it. Meta just debuted its latest in-house silicon, Microsoft is shipping its Maia 200 inference chip, OpenAI has struck a deal with Broadcom to manufacture custom accelerators at gigawatt scale — and yet NVIDIA just announced a new supercomputer platform, expanded its Cosmos AI model suite, and remains the only company capable of supplying the raw training infrastructure that makes all of this possible in the first place.
Meta's MTIA chip — the Meta Training and Inference Accelerator — is now in active data center deployment. Meta has been clear that the goal is not to replace NVIDIA entirely, but to run inference workloads on cheaper, more predictable custom silicon while reserving GPU clusters for training. The economics are compelling: once a model is trained and its behavior is stable, the cost of serving millions of user queries matters enormously at Meta's scale. Custom inference chips, even if they require significant engineering investment upfront, can pay for themselves many times over.
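The arithmetic behind that claim is easy to sketch. The numbers in the minimal Python example below are illustrative assumptions, not Meta's actual costs: a one-time engineering outlay, a per-query serving cost on GPUs versus on custom silicon, and the query volume at which the investment breaks even.

```python
# Back-of-envelope break-even for custom inference silicon.
# Every figure here is an illustrative assumption, not a reported number.

ENGINEERING_COST = 500e6         # assumed one-time design-and-deployment cost ($)
GPU_COST_PER_1K_QUERIES = 0.40   # assumed serving cost on general-purpose GPUs ($)
ASIC_COST_PER_1K_QUERIES = 0.10  # assumed serving cost on purpose-built silicon ($)

savings_per_1k = GPU_COST_PER_1K_QUERIES - ASIC_COST_PER_1K_QUERIES
breakeven_queries = ENGINEERING_COST / savings_per_1k * 1_000

print(f"break-even at ~{breakeven_queries:.2e} queries")
# ~1.67e12 queries: a vast number, but within reach for a platform
# serving billions of users; past it, every query is pure savings.
```

Shift any of the assumed costs and the break-even point moves with them, but the shape of the argument survives: at sufficient volume, the upfront engineering cost becomes a rounding error.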
OpenAI's arrangement with Broadcom tells a similar story, though at an almost surreal scale. The two companies announced they are collaborating to deploy ten gigawatts of OpenAI-designed accelerators — a figure that would, if fully realized, represent a meaningful fraction of global data center capacity. OpenAI designs the chips; Broadcom handles manufacturing coordination. The strategy is a direct hedge against the supply constraints and pricing power that come with dependence on a single dominant supplier, though the deployment timeline remains long.
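A rough sanity check conveys what ten gigawatts means in hardware terms. The per-accelerator power draw below is an assumed round number, not a specification for the actual OpenAI part:

```python
# Rough sanity check on a 10 GW deployment.
# The watts-per-accelerator figure is an assumption for illustration only.

TOTAL_POWER_W = 10e9           # 10 gigawatts, the announced target
WATTS_PER_ACCELERATOR = 1_500  # assumed: chip plus its share of cooling,
                               # networking, and power-delivery overhead

accelerators = TOTAL_POWER_W / WATTS_PER_ACCELERATOR
print(f"~{accelerators / 1e6:.1f} million accelerators")  # ~6.7 million
```

Millions of chips, each of which has to be fabbed, packaged, racked, and powered, which is one reason the timeline stretches out.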
Microsoft's approach is more immediate. The Maia 200, announced in January, is already being used internally for AI inference workloads. Microsoft has been notably careful about how it positions the chip — not as a replacement for NVIDIA hardware, which remains central to Azure's AI offerings, but as a complementary layer for specific, high-volume inference tasks where its own silicon can be optimized. It is as much a defensive move as an offensive one: owning some slice of the compute stack provides negotiating leverage even if it never becomes the dominant path.
And then there is NVIDIA. While the hyperscalers race to build alternatives, Jensen Huang's company announced both the Rubin platform — six new chips anchoring what it calls an AI supercomputer — and a major expansion of its Cosmos world foundation models, including new physical AI capabilities and Cosmos Reason, a model that applies chain-of-thought reasoning to video data. NVIDIA's Nemotron 3 open model family adds yet another layer to its software ecosystem. The pattern is deliberate: every new capability NVIDIA ships makes its ecosystem harder to walk away from, even for companies actively trying to do so.
The deeper tension here is about where the real value in AI infrastructure will ultimately accrue. The bullish case for custom silicon is that as AI workloads become more predictable and commoditized, purpose-built chips will win on efficiency. The case for NVIDIA is that AI is still moving fast enough that flexibility and raw capability matter more than efficiency — and that by the time a custom chip reaches production, the model it was designed for has already been superseded. Neither argument is clearly winning yet, but the billions being spent suggest the hyperscalers are not willing to bet entirely on either outcome.
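The tradeoff can be made concrete with a toy model: a flexible GPU fleet versus a custom chip that is cheaper per unit of work but only arrives after a design lag, and whose target workload may be superseded before it ships. All parameters below are invented for illustration:

```python
# Toy model of the flexibility-vs-efficiency tradeoff.
# All parameters are invented for illustration; none are real figures.

HORIZON_YEARS = 5          # planning horizon
DESIGN_LAG_YEARS = 2       # assumed time from chip design start to production
GPU_COST_PER_YEAR = 1.00   # normalized annual serving cost on GPUs
ASIC_COST_PER_YEAR = 0.40  # normalized annual serving cost on custom silicon
ASIC_UPFRONT = 1.20        # normalized one-time engineering cost

def total_cost(workload_stable: bool) -> tuple[float, float]:
    """Return (gpu_total, asic_total) over the horizon.

    If the workload shifts before the custom chip ships, the chip
    never deploys and its engineering cost is sunk on top of full
    GPU spend.
    """
    gpu_total = GPU_COST_PER_YEAR * HORIZON_YEARS
    if workload_stable:
        asic_years = HORIZON_YEARS - DESIGN_LAG_YEARS
        asic_total = (ASIC_UPFRONT
                      + GPU_COST_PER_YEAR * DESIGN_LAG_YEARS
                      + ASIC_COST_PER_YEAR * asic_years)
    else:
        asic_total = ASIC_UPFRONT + gpu_total
    return gpu_total, asic_total

for stable in (True, False):
    gpu, asic = total_cost(stable)
    print(f"workload stable={stable}: GPU-only {gpu:.2f} vs custom-chip path {asic:.2f}")
# stable=True : 5.00 vs 4.40  -> custom silicon wins
# stable=False: 5.00 vs 6.20  -> the bet backfires
```

The whole bet turns on a single boolean: whether the workload the chip was designed for still exists when the chip arrives.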
What makes this moment particularly interesting is the public framing. Each announcement — Meta's MTIA, OpenAI's Broadcom deal, Microsoft's Maia — is presented as a sign of independence and strategic maturity. In practice, all three companies continue to spend prodigiously on NVIDIA hardware. The chips they are building and the chips they are buying solve different problems. NVIDIA, for its part, seems content to let customers believe they are reducing dependence while continuing to expand the capabilities only its own platform can deliver.