Qualcomm Will Sell Meta a Server CPU — Your Inference Cost Floor Just Moved

Qualcomm unveiled its first data center chips and signed Meta as the launch customer for a server CPU shipping in 2028. The Nvidia-GPU/x86 duopoly that has defined data center compute is giving way to custom silicon and Arm-based cores. If you’re building on a model API, this is not chip trivia: it’s the floor under your inference cost.

What Happened

On June 24, 2026, Qualcomm rolled out its first data center silicon. Two pieces matter. One is the Dragonfly C1000, a server CPU built on Qualcomm’s own Oryon cores: more than 250 of them, clocking above 5 GHz, with a claimed performance-per-watt gain of more than 2x over existing benchmarks. The other is the Dragonfly AI300, an inference-focused accelerator that the company says delivers 4x to 8x better performance-per-watt than existing GPU-based architectures. Both reach commercial availability in 2028, and Meta will start deploying the CPU across its server fleet in the second half of that year. Meta CEO Mark Zuckerberg endorsed the deal himself, saying Meta is “excited to continue partnering with Qualcomm as they design the next generation of CPUs.”

Why this is more than one vendor’s product launch: data center compute has run on two rails for years. Nvidia GPUs handle training and inference; Intel and AMD x86 chips run the server brains around them. Qualcomm bends both at once — a CPU built on cores it grew in mobile rather than x86, and an accelerator aimed at inference instead of Nvidia’s general-purpose GPU. And Qualcomm isn’t alone. Amazon has Graviton CPUs and Trainium training chips, Google has TPUs, and OpenAI just unveiled its own inference chip with Broadcom. Every hyperscaler is moving down the stack, away from the “general-purpose GPU plus x86” default. This is one frame of that shift.

What This Means for Founders

On the surface it reads as a chipmaker turf war. But for a founder building on a model API, the bottom of your cost structure is moving. Inference already eats north of 20% of revenue at many AI-native companies. More kinds of silicon competing to run that inference means downward price pressure over time — good news in the long run. But be cold about timing. Dragonfly ships in 2028. Your cloud bill doesn’t drop tomorrow; this is a trailer for a supply structure that changes in two or three years. The question is how today’s decisions account for it.

If you’re in the Valley, the read is sharper, because the diversification is already a wave: Amazon, Google, Microsoft (Maia), Meta (MTIA), OpenAI, and now Qualcomm-via-Meta all running serious custom-silicon programs. That’s the same wave YC keeps warning seed founders about — a thin wrapper over a single provider’s API is exactly what gets squeezed when the providers integrate down and prices fragment. The counterintuitive part: cheaper inference dollars in 2028 only become your margin if your code isn’t nailed to one chip or one model today. Model access was never a moat, and neither is an inference pipeline bolted to specific silicon. The companies that can swap backends when supply diversifies are the ones who capture the price drop instead of watching their provider keep it.
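What “not nailed to one model” can look like in code, as a minimal sketch rather than a recommended design: the task calls a router, and the router decides which backend runs it. Everything named here (InferenceRouter, provider_a, provider_b) is hypothetical; the two backends are stand-in functions, and a real one would wrap a provider SDK or HTTP call.

```python
from dataclasses import dataclass
from typing import Callable, Dict

# A backend is any callable that turns a prompt into a completion.
# Hypothetical stand-ins below; a real backend would wrap a provider SDK or HTTP call.
Backend = Callable[[str], str]


@dataclass
class InferenceRouter:
    """Sends each request to whichever registered backend is currently active."""
    backends: Dict[str, Backend]
    active: str

    def complete(self, prompt: str) -> str:
        # Application code calls this and never imports a provider directly.
        return self.backends[self.active](prompt)

    def switch(self, name: str) -> None:
        # Swapping providers is a one-line change (or a config flag) for callers.
        if name not in self.backends:
            raise KeyError(f"unknown backend: {name}")
        self.active = name


def provider_a(prompt: str) -> str:  # hypothetical provider
    return f"[provider_a] {prompt}"


def provider_b(prompt: str) -> str:  # hypothetical provider
    return f"[provider_b] {prompt}"


router = InferenceRouter(
    backends={"provider_a": provider_a, "provider_b": provider_b},
    active="provider_a",
)

first = router.complete("summarize this ticket")
router.switch("provider_b")  # same task, different backend
second = router.complete("summarize this ticket")
```

The point of the shape is that product code depends only on complete(). Moving to a cheaper backend later touches the registry, not every call site.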

What You Can Do Now

First, don’t pin inference to one provider or one chip. Put an abstraction layer in front so the same task can run on a different model or backend; when supply fragments in 2028, that’s the difference between swapping and being stuck.

Second, don’t get hypnotized by “8x per watt”; model your own per-token unit economics directly. Cheaper chips don’t automatically widen your margin, because the provider still sets the price.

Third, write 2028 into your roadmap. If you’re signing a two-year cloud commitment now, use the coming supply shift as a negotiating lever.

Fourth, build the moat on assets you control: your data, integrations wired deep into the workflow, and domain knowledge. Those are the things that survive anyone halving the cost of inference. The more the giants consolidate the bottom of the stack, the more the narrow, deep slots in specific industries and regulatory regimes are left open.
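The second point above, modeling your own per-token unit economics, can be sketched in a few lines. Every number here is a made-up placeholder (token counts, per-million-token prices, revenue per request, and the 50% chip cost drop), and pass_through is an assumed knob for how much of a chip-level saving the provider hands on to you. Swap in your own figures.

```python
def cost_per_request(input_tokens: int, output_tokens: int,
                     price_in_per_m: float, price_out_per_m: float) -> float:
    """Inference cost of one request, given prices in dollars per million tokens."""
    return (input_tokens / 1_000_000 * price_in_per_m
            + output_tokens / 1_000_000 * price_out_per_m)


def gross_margin(revenue_per_request: float, cost: float) -> float:
    """Fraction of revenue left after paying for inference."""
    return (revenue_per_request - cost) / revenue_per_request


def price_after_chip_savings(price: float, chip_cost_drop: float,
                             pass_through: float) -> float:
    """Provider price if only `pass_through` of a chip-level cost drop reaches you."""
    return price * (1 - chip_cost_drop * pass_through)


# Placeholder inputs, not real pricing.
INPUT_TOKENS, OUTPUT_TOKENS = 2_000, 500
PRICE_IN, PRICE_OUT = 3.0, 15.0   # dollars per million tokens
REVENUE = 0.05                    # dollars earned per request
CHIP_COST_DROP = 0.5              # assume the chip halves the provider's cost

baseline_cost = cost_per_request(INPUT_TOKENS, OUTPUT_TOKENS, PRICE_IN, PRICE_OUT)
baseline_margin = gross_margin(REVENUE, baseline_cost)

# Margin under three pass-through assumptions: none, half, all of the saving.
margins = {}
for pass_through in (0.0, 0.5, 1.0):
    new_in = price_after_chip_savings(PRICE_IN, CHIP_COST_DROP, pass_through)
    new_out = price_after_chip_savings(PRICE_OUT, CHIP_COST_DROP, pass_through)
    new_cost = cost_per_request(INPUT_TOKENS, OUTPUT_TOKENS, new_in, new_out)
    margins[pass_through] = gross_margin(REVENUE, new_cost)

for pass_through, margin in margins.items():
    print(f"pass-through {pass_through:.0%}: gross margin {margin:.1%}")
```

With zero pass-through the margin does not move at all, however efficient the chip is. That is the case for modeling the price you actually pay rather than the headline performance-per-watt figure.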