For most of the past three years, running a frontier AI model meant one of two things: paying a cloud provider for API access, or operating a data centre at significant scale. A convergence of hardware and model architecture is quietly dismantling that assumption. NVIDIA's DGX Spark and DGX Station, shipping this spring from a roster of major OEMs, are designed to run frontier-class models from a box that sits on or under your desk. The timing is not accidental — the open-source model ecosystem has matured to the point where there are competitive models worth running locally.
The architectural shift that makes this possible is Mixture of Experts (MoE). Traditional dense models activate every parameter for every token — computationally expensive and difficult to run on anything short of a server. MoE models activate only a subset of specialized "expert" subnetworks for each token, achieving comparable output quality at a fraction of the active compute. NVIDIA notes that MoE now accounts for over 60% of open-source model releases this year, including DeepSeek-R1 and Mistral's recent large model. The practical upshot: models that would have required a data centre rack two years ago can now run on hardware with a consumer-adjacent form factor and price point.
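The routing idea is easy to see in code. Below is a toy sketch of a top-k MoE layer, with made-up dimensions and randomly initialized weights; it is not any real model's architecture, just an illustration of the principle that a router scores all experts but the forward pass touches only the chosen few.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

class MoELayer:
    """Toy Mixture-of-Experts layer: a learned router scores every expert,
    but only the top-k experts actually run for a given token."""
    def __init__(self, d_model, d_hidden, n_experts, top_k, seed=0):
        rng = np.random.default_rng(seed)
        self.top_k = top_k
        self.n_experts = n_experts
        self.router = rng.standard_normal((d_model, n_experts)) * 0.02
        # Each expert is a small two-layer MLP (hypothetical sizes).
        self.w_in = rng.standard_normal((n_experts, d_model, d_hidden)) * 0.02
        self.w_out = rng.standard_normal((n_experts, d_hidden, d_model)) * 0.02

    def forward(self, x):
        """x: (d_model,) vector for one token. Returns (output, experts used)."""
        logits = x @ self.router                   # score all experts (cheap)
        top = np.argsort(logits)[-self.top_k:]     # keep only the top-k
        gates = softmax(logits[top])               # renormalize their weights
        out = np.zeros_like(x)
        for gate, e in zip(gates, top):            # run ONLY the selected experts
            h = np.maximum(x @ self.w_in[e], 0.0)  # ReLU MLP
            out += gate * (h @ self.w_out[e])
        return out, top

layer = MoELayer(d_model=16, d_hidden=32, n_experts=8, top_k=2)
token = np.random.default_rng(1).standard_normal(16)
out, used = layer.forward(token)
print(f"experts used per token: {len(used)} of {layer.n_experts}")
```

With 2 of 8 experts active, roughly a quarter of the expert weights participate in any one token's forward pass, which is the whole trick: total parameter count (and model quality) can grow while per-token compute stays bounded.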
NVIDIA's DGX Spark — available from ASUS, Dell, HP, and others — is aimed at researchers and developers who want to iterate locally without cloud latency or per-token costs. DGX Station steps up for teams running larger models or heavier workloads. The pitch is straightforward: cloud AI is convenient but expensive at scale, and for organizations with data sensitivity or compliance requirements, routing proprietary data through third-party APIs has always been uncomfortable. Local inference sidesteps both problems.
The open-source model side of this story is evolving fast. India's Sarvam AI released a new generation of models this week that illustrates how distributed the frontier has become — 30-billion and 105-billion parameter versions trained with a specific focus on South Asian languages and document parsing, alongside speech and vision models. Sarvam's bet is that the economics of open-source development, combined with purpose-specific training, can produce models that outperform general-purpose frontier models for particular use cases. It's a bet being made by dozens of labs globally.
The implications of this decentralization are worth sitting with. The AI industry has been structured, for practical purposes, around a small number of very large cloud providers controlling access to frontier capability. Regulation, liability, and terms of service all flow through those chokepoints. A world where frontier-quality models run on local hardware and are freely downloadable is structurally different. Governments that want to audit AI systems used within their borders can't just ask the cloud provider. Enterprises that want fine-tuned models trained on proprietary data can do that without sharing the data externally. Developers in markets with poor or expensive internet connectivity gain access to capable models where previously they had none at all.
There are limits to the optimism. Running a 100-billion parameter model locally still requires serious hardware — DGX Station is not a laptop, and the price point reflects that. The quality gap between the best locally-runnable models and the top closed frontier models (GPT-5, Gemini Ultra, Claude Opus) is real, even if it's narrowing. And open models introduce their own governance challenges: a model you can download can be fine-tuned in ways that remove safety constraints, a problem for which regulators have yet to find good answers.
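The hardware requirement is mostly a memory-footprint question, and the back-of-envelope arithmetic is simple. A rough sketch (weights only — it deliberately ignores KV cache, activations, and runtime overhead, which add meaningfully on top):

```python
def model_memory_gb(n_params, bits_per_param):
    """Approximate weight memory for an n_params-parameter model
    stored at bits_per_param precision. Weights only: excludes
    KV cache, activations, and framework overhead."""
    return n_params * bits_per_param / 8 / 1e9  # bits -> bytes -> GB

for bits, label in [(16, "fp16/bf16"), (8, "int8"), (4, "4-bit quantized")]:
    print(f"100B params @ {label:>15}: ~{model_memory_gb(100e9, bits):.0f} GB")
# fp16/bf16 -> ~200 GB, int8 -> ~100 GB, 4-bit -> ~50 GB
```

Even aggressively quantized, a 100B-parameter model wants tens of gigabytes of fast memory just to hold its weights, which is why this class of workload lands on a DGX Station rather than a laptop.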
But the direction of travel seems clear. The combination of MoE architectures, competitive open-weight models, and purpose-built local inference hardware is bringing the frontier considerably closer to the desk. The cloud won't disappear — for the largest workloads and the latest frontier models, it remains the only practical option — but the assumption that AI capability requires cloud access is eroding faster than most enterprises have planned for.