Frontier Models Leak Through Distillation — The Real Moat for a Company Built on an API

Anthropic has named Alibaba: from April 22 to June 5, it says, roughly 25,000 fake accounts ran more than 28.8 million exchanges with Claude to harvest its capabilities. If a frontier model’s capabilities can be distilled, and the model caught up with, just by collecting enough outputs, what exactly is the moat for a founder who built on an API?

What happened

Anthropic alleges that Alibaba and its AI lab Qwen ran the largest “distillation attack” the company has seen against Claude. The campaign spanned April 22 to June 5, 2026, using roughly 25,000 fraudulent accounts to generate more than 28.8 million exchanges with Claude. The target was the most valuable capabilities — software engineering and agentic reasoning, the cornerstones of Anthropic’s latest Mythos Preview model. Distillation means collecting a strong model’s outputs and training a weaker one on them: outside actors repeatedly prompt an advanced model, scrape its reasoning patterns and response structure, then train their own model on those responses — bypassing the enormous R&D and compute cost of building a model from scratch. Anthropic says this dwarfs prior cases it flagged: DeepSeek around 150,000 exchanges, Moonshot AI over 3.4 million, MiniMax more than 13 million. Alibaba had not commented as of reporting. Washington is moving too: senators are drafting an amendment to defense legislation that would sanction Chinese firms that improperly access U.S. AI outputs.
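A quick back-of-the-envelope calculation helps put the account and exchange figures above in perspective. The sketch below uses only the numbers Anthropic cites, rounded as reported. The inclusive day count and the even spread of traffic across accounts are simplifying assumptions made here, not reported facts.

```python
from datetime import date

# Figures as alleged by Anthropic; both are approximate in the reporting.
ACCOUNTS = 25_000
EXCHANGES = 28_800_000
START, END = date(2026, 4, 22), date(2026, 6, 5)


def per_account_load(exchanges: int, accounts: int, start: date, end: date) -> dict:
    """Average load per account, ASSUMING traffic was spread evenly."""
    days = (end - start).days + 1  # count both the first and last day
    per_account = exchanges / accounts
    return {
        "days": days,
        "per_account": per_account,
        "per_account_per_day": per_account / days,
    }


load = per_account_load(EXCHANGES, ACCOUNTS, START, END)
print(load["days"])                           # 45
print(load["per_account"])                    # 1152.0
print(round(load["per_account_per_day"], 1))  # 25.6
```

On these rough numbers, each account averages about 1,150 exchanges in total, or roughly 26 a day. Under the even-spread assumption, that is a volume that may not stand out for any single account, which suggests why the scale only becomes visible when accounts are viewed in aggregate.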

What it means for founders

This isn’t a fight between two giants; it’s a question about what an AI moat actually is. On one side is a company that builds frontier models. It burns hundreds of millions to forge top capabilities, but those capabilities flow out as outputs through the API, and outputs, collected in volume, can be cloned via distillation. If Anthropic’s claim holds, model capability sits behind a thin wall of terms of service and account bans: locked by contract, not by technology. On the other side is a founder who built a product on that API, and here the uncomfortable truth comes into view. If capability leaks this easily, no one monopolizes model access for long. The best model is a temporary edge, not a permanent moat. In a world where a fast follower distills its way to within months of you, “we use the smartest model” is not a line of defense. So the real moat sits outside the model, in the things distillation can’t copy: proprietary data, integrations buried deep in the workflow, switching costs, distribution and brand, and regulatory trust. There’s a second risk for API consumers, too. As vendors tighten terms and monitor usage patterns to stop distillation, even honest heavy users can get caught in the net. Abnormal call patterns, bulk synthetic-data generation, and training on outputs are the kinds of behavior that can trip those checks. You need to know in advance where your legitimate usage crosses a terms-of-service line.

What you can do now

First, decouple your moat from the model. If your deck lists “we use the best model” as a differentiator, rewrite it. Models are rented, and what you rent, your competitor can rent too. Second, invest in assets that can’t be replicated: unique data, workflow integration, switching costs, distribution. That is where the things distillation can’t scrape live. Third, actually read the terms of the model you use, especially whether they allow training other models on outputs or bulk-generating synthetic data. Crossing that line unknowingly and having your account cut off is, for your product, an outage. Fourth, if your product is the one being distilled (you expose a fine-tuned model or proprietary prompt assets through an API), make rate limits, output watermarking, and anomaly detection your line of defense. Any company that ships capability as outputs is a potential distillation target.
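For the fourth point, here is a minimal sketch of two of the defenses named above: a per-key rate limit and a simple anomaly check. Every name in it (SlidingWindowLimiter, flag_coordinated_groups, the group and account IDs, the thresholds) is hypothetical and chosen for illustration. The grouping signal, such as accounts sharing a payment method, is an assumption about what you might have on hand, not a description of how any vendor detects abuse. Output watermarking is not covered.

```python
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Per-key request limiter over a rolling time window (illustrative)."""

    def __init__(self, max_requests: int, window_seconds: float):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # key -> timestamps of accepted requests

    def allow(self, key: str, now: float) -> bool:
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False  # over the limit; rejected requests are not recorded
        hits.append(now)
        return True


def flag_coordinated_groups(usage, min_accounts: int, min_total: int) -> list:
    """Flag groups whose combined volume is high even if each account is modest.

    usage: iterable of (group_id, account_id, request_count) tuples, where
    group_id is whatever linking signal you have (an assumption here).
    """
    accounts = defaultdict(set)
    totals = defaultdict(int)
    for group_id, account_id, count in usage:
        accounts[group_id].add(account_id)
        totals[group_id] += count
    return sorted(
        g for g in totals
        if len(accounts[g]) >= min_accounts and totals[g] >= min_total
    )


# Usage: three requests per minute allowed for one key.
limiter = SlidingWindowLimiter(max_requests=3, window_seconds=60)
results = [limiter.allow("acct-1", t) for t in (0, 10, 20, 30, 61)]
print(results)  # [True, True, True, False, True]

# Usage: many small accounts behind one shared signal add up.
usage = [
    ("card-A", "a1", 40), ("card-A", "a2", 35), ("card-A", "a3", 50),
    ("card-B", "b1", 500),
]
flagged = flag_coordinated_groups(usage, min_accounts=3, min_total=100)
print(flagged)  # ['card-A']
```

The two checks are complementary by design. A per-key limit caps what any one account can pull, while the group check looks for volume that is split across many accounts, each staying under the per-key limit. Real thresholds would need tuning against your own legitimate traffic so that honest heavy users are not cut off.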