DeepSeek stole our tech... says OpenAI
Based on Fireship's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.
OpenAI and Microsoft are reportedly accusing DeepSeek of IP theft tied to distillation using OpenAI API outputs, allegedly violating OpenAI’s terms of service.
Briefing
OpenAI and Microsoft are reportedly accusing DeepSeek of intellectual-property theft, specifically alleging that DeepSeek used “distillation” techniques to fine-tune models using outputs from OpenAI—an approach they say violates OpenAI’s terms of service. The dispute matters because it strikes at the boundary between legitimate model training and copying: distillation can transfer capabilities from a large, expensive model to a smaller one, but the allegation is that DeepSeek did it using OpenAI’s API outputs at scale.
The claims come amid a broader shockwave from DeepSeek’s rapid rise. A Chinese hedge fund is described as having built a state-of-the-art reasoning model that reportedly surpassed OpenAI’s model while spending only $5.5 million, then offering a 100% discount code that undercut Big Tech’s pricing power. That narrative frames the current controversy as more than a technical dispute—it’s also a fight over whether AI progress is being throttled by expensive infrastructure and whether open, low-cost training can break the dominance of companies pushing massive data-center spending.
So far, hard evidence has not been publicly detailed, but screenshots circulating online allegedly show DeepSeek producing responses that look indistinguishable from ChatGPT. Critics argue that this isn’t automatically proof, since similar content can be learned organically from widely available text. Microsoft, however, is said to have observed activity in China involving large-volume extraction of data from the OpenAI API, with accounts potentially linked to DeepSeek. In this telling, DeepSeek becomes “Robin Hood” for some—stealing from the rich to empower cheaper, more accessible models—while OpenAI frames it as unauthorized appropriation.
In the transcript’s framing, distillation itself is not inherently controversial. Models can legitimately be distilled from open models like Llama and Qwen, and even OpenAI models can be distilled in principle, so long as the process doesn’t rely on the OpenAI API to build a rival model. The core accusation, then, concerns not the concept of distillation but the alleged source and method.
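For context, the distillation technique at the center of the dispute is a standard idea: train a small “student” model to match the output distribution of a larger “teacher.” A minimal sketch of the classic temperature-softened distillation loss (illustrative only—this is not OpenAI’s or DeepSeek’s actual training code, and real pipelines operate on neural-network logits at scale):

```python
import math

def softmax(logits, temperature=1.0):
    # Temperature > 1 softens the distribution, exposing the teacher's
    # relative preferences among non-top answers ("dark knowledge").
    scaled = [x / temperature for x in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(x - m) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    # KL(teacher || student) over temperature-softened distributions.
    # In practice this term is combined with an ordinary cross-entropy
    # loss on ground-truth labels, and gradients update only the student.
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Toy example: the student's logits roughly track the teacher's,
# so the loss is small but nonzero.
teacher = [4.0, 1.0, 0.2]
student = [3.5, 1.2, 0.1]
print(round(distillation_loss(teacher, student), 4))
```

The alleged misuse maps onto the teacher side of this loop: instead of distilling from an open model’s logits, the claim is that API-generated outputs served as the teacher signal.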
The controversy unfolds alongside a fast-moving China-vs-China model race. New releases are cited, including Alibaba’s Qwen 2.5 Max and another model, Kimi 1.5, both described as outperforming major Western competitors on benchmarks. Meanwhile, DeepSeek is criticized for heavy censorship but also noted as relatively jailbreakable for skilled prompt engineers. A separate technical talking point highlights DeepSeek’s claimed 10x efficiency gains from bypassing CUDA and programming Nvidia GPUs directly at the PTX (parallel thread execution) level.
The transcript closes by emphasizing a larger trend: open source is gaining ground, and developers should build products on top of it. It also promotes PostHog as an open-source, self-hostable analytics and experimentation tool, positioning it as a practical way to ship better features as the AI landscape accelerates.
Cornell Notes
OpenAI and Microsoft are reportedly accusing DeepSeek of IP theft tied to “distillation,” alleging DeepSeek used OpenAI API outputs to fine-tune models in a way that violates OpenAI’s terms. Distillation can legitimately transfer knowledge from one model to another, but the dispute centers on the alleged source and scale of extracted outputs. Public proof is described as limited—screenshots circulate, but similar outputs can appear from organic training data—while Microsoft is said to have observed large-volume API extraction activity linked to accounts in China. The allegations land as DeepSeek’s efficiency and low training cost fuel a broader open-model race, with new Chinese releases like Alibaba’s Qwen 2.5 Max and Kimi 1.5 adding pressure on Western leaders. The stakes are both technical (how models are trained) and economic (whether open, cheaper systems can undercut incumbents).
- What does “distillation” mean in this dispute, and why is it central to the accusation?
- Why aren’t screenshots alone considered decisive evidence?
- What specific behavior does Microsoft allegedly observe that links the activity to DeepSeek?
- How does the transcript portray DeepSeek’s efficiency and hardware approach?
- What other criticisms and constraints are mentioned beyond IP theft?
- How do new model releases change the stakes of the OpenAI–DeepSeek dispute?
Review Questions
- What distinction does the transcript draw between legitimate distillation and the alleged misuse of OpenAI API outputs?
- Why might a model produce “ChatGPT-like” responses without any direct copying from OpenAI?
- How do DeepSeek’s claimed efficiency gains (avoiding CUDA) relate to its broader competitive narrative?
Key Points
- 1. OpenAI and Microsoft are reportedly accusing DeepSeek of IP theft tied to distillation using OpenAI API outputs, allegedly violating OpenAI’s terms of service.
- 2. Screenshots of overlapping responses are treated as weak evidence because similar outputs can emerge from organic training on widely available text.
- 3. Microsoft is said to have observed large-volume extraction from the OpenAI API by accounts in China, potentially linked to DeepSeek.
- 4. Distillation is portrayed as acceptable when it transfers knowledge from open models like Llama and Qwen, but disputed when it relies on OpenAI API outputs to build a rival.
- 5. DeepSeek’s rise is framed around low training cost, aggressive pricing, and claimed 10x efficiency gains from bypassing CUDA in favor of Nvidia’s PTX instructions.
- 6. The transcript highlights additional controversies: heavy censorship, jailbreakability, and privacy concerns when using the web version.
- 7. Open-source model momentum is emphasized, with new releases like Alibaba’s Qwen 2.5 Max and Kimi 1.5 intensifying competitive pressure.