
The FBI is investigating DeepSeek and NVIDIA, plus the complete story of the rise of DeepSeek


Based on AI News & Strategy Daily | Nate B Jones's video on YouTube. If you like this content, support the original creators by watching, liking and subscribing to their content.

TL;DR

DeepSeek was founded in May 2023 as a spinout from High-Flyer and inherited a major GPU base: about 10,000 NVIDIA A100 GPUs obtained in 2021.

Briefing

DeepSeek’s rapid rise is being fueled by a head start in high-end AI hardware—plus aggressive scaling—and that combination is now drawing U.S. scrutiny over possible export-control workarounds. Founded in May 2023 as a spinout from China’s High-Flyer hedge fund, DeepSeek inherited a major GPU pipeline: High-Flyer had already secured about 10,000 NVIDIA A100 GPUs in 2021 and transferred that hardware base to the new company. That early access helped DeepSeek push model capability quickly, including through open-source releases that surprised many observers who hadn’t tracked its papers or performance climb.

The cost narrative DeepSeek has offered—reporting a training run cost of $5.58 million using 2,000 GPUs over 55 days—has been treated as both plausible and incomplete. Analysts note that the figure likely reflects only the incremental cost of the training run, not the broader bill required to build and deploy frontier models: R&D labor, electricity and cooling, data acquisition and curation, and the infrastructure needed for storage and compute management. SemiAnalysis estimates DeepSeek’s annual AI budget at roughly $500 million, underscoring that a single training-run number is only a slice of total spending. Even when training costs fall over time, the “marginal differences” still matter: starting slightly ahead in the race can compound as models improve faster and faster.

Hardware constraints are central to the competitive picture. Multiple analyses cited in the discussion suggest DeepSeek has access to around 50,000 NVIDIA Hopper-class GPUs, including H800s (a China-specific H100 variant with lower performance) and H20s (a restricted variant designed to be sold to China while staying compliant). The A100 stockpile from 2021 also remains relevant. In addition, the account points to possible cloud compute agreements with Chinese providers, allowing DeepSeek to scale even if individual chips are slower than full-power H100s.

That scaling story intersects with the investigation: the FBI is reportedly looking at a “Singapore back door,” a nickname tied to NVIDIA’s unusually large share of sales to Singapore. The concern is that some GPUs shipped to Singapore may be re-labeled and re-exported back to China, potentially bypassing export controls. If that pattern is confirmed, tighter chip export regulations could follow—an outcome that would matter most because DeepSeek’s advantage may depend on access gained during the 2021–2023 window. Releasing a widely known model after navigating restrictions for a period doesn’t guarantee the same access will persist.

Overall, DeepSeek’s engineering momentum appears real, but the sustainability of its edge may hinge less on one training-run cost figure and more on whether hardware access remains stable. The current scrutiny functions as a forward-looking constraint: the competitive gap could narrow if chip supply tightens, even if DeepSeek’s recent releases look strong today.

Cornell Notes

DeepSeek’s fast ascent traces back to early GPU access and aggressive scaling. Spun out in May 2023 from High-Flyer, it inherited about 10,000 NVIDIA A100 GPUs secured in 2021, giving it a hardware head start. DeepSeek has claimed a $5.58 million training run (2,000 GPUs, 55 days), but analysts argue that number covers only incremental training and omits major costs like R&D labor, electricity/cooling, data work, and infrastructure. Estimates from SemiAnalysis place DeepSeek’s annual AI budget around $500 million, suggesting training-run figures are only a fraction of total investment. The FBI’s attention to a “Singapore back door” raises the possibility that export-control workarounds could be tightened, which would directly affect DeepSeek’s future ability to scale.

How did DeepSeek get an early advantage in compute, and why does that matter for model progress?

DeepSeek was founded in May 2023 as a spinout from High-Flyer, which had already integrated AI into trading strategies. High-Flyer invested in hardware by obtaining about 10,000 NVIDIA A100 GPUs in 2021 and handing them to DeepSeek as part of the initial setup. That early GPU base allowed DeepSeek to begin aggressive development sooner than competitors who lacked comparable access. In a fast-moving model race, earlier compute access can compound because improvements accelerate over time.

Why do analysts treat DeepSeek’s $5.58 million training cost as incomplete?

The $5.58 million figure is described as the incremental cost of the training run—2,000 GPUs over 55 days. It does not include major categories that typically drive total model spending: R&D labor, electricity and cooling, data acquisition and curation, and infrastructure/storage costs needed to manage large datasets and training pipelines. SemiAnalysis-style budgeting estimates (around $500 million annually) reinforce that a single training-run number is only a slice of the overall budget.

What hardware mix is suggested for DeepSeek, and how does it affect performance?

The discussion cites an estimate that DeepSeek has access to roughly 50,000 NVIDIA Hopper-class GPUs, including H800s (a China-specific version of the H100 with lower performance) and H20s (a restricted variant intended to be sold to China). It also notes a large A100 stockpile from 2021. Even if these chips are slower than full-power H100s, DeepSeek could offset performance gaps by scaling quantity, at least for a period.

What is the “Singapore back door,” and why is it tied to the FBI investigation?

The “Singapore back door” is a nickname for concerns about NVIDIA’s unusually large share of sales attributed to Singapore. The worry is that some shipments to Singapore may be re-labeled or re-routed and re-exported to China in ways that bypass export controls. The FBI’s reported interest suggests authorities are examining whether that channel is being used to obtain restricted GPUs.

How could export-control changes affect DeepSeek’s competitive position?

If export controls tighten, DeepSeek’s ability to keep scaling hardware could be constrained. The argument is that DeepSeek’s recent advantage may rely on access gained during the 2021–2023 period. Releasing a major model after successfully navigating restrictions for a while doesn’t guarantee the same access will continue; the impact would likely show up with future training cycles and model releases.

Review Questions

  1. What costs are missing from a reported “training run” number, and why does that change how you interpret competitiveness?
  2. How does a hardware head start (like early A100 access) compound over time in frontier model development?
  3. What evidence would be needed to confirm whether a “Singapore back door” is actually being used to route restricted GPUs back to China?

Key Points

  1. DeepSeek was founded in May 2023 as a spinout from High-Flyer and inherited a major GPU base: about 10,000 NVIDIA A100 GPUs obtained in 2021.

  2. DeepSeek’s reported $5.58 million training cost likely reflects only incremental training, not the full cost of frontier model development (labor, electricity/cooling, data work, and infrastructure).

  3. SemiAnalysis estimates DeepSeek’s annual AI budget at roughly $500 million, implying that training-run figures are only a fraction of total spending.

  4. Analyses suggest DeepSeek has access to around 50,000 NVIDIA Hopper-class GPUs, including H800s and H20s, plus a large A100 stockpile.

  5. The FBI investigation centers on a “Singapore back door,” raising concerns about whether GPUs shipped to Singapore are being re-exported to China in ways that could bypass export controls.

  6. If export controls tighten, DeepSeek’s scaling advantage could erode, especially since recent releases may benefit from earlier access that may not persist.

Highlights

High-Flyer’s 2021 acquisition of roughly 10,000 NVIDIA A100 GPUs became a hardware foundation for DeepSeek’s aggressive model development after its 2023 founding.
DeepSeek’s $5.58 million training-run figure is treated as incomplete because it omits R&D labor, electricity/cooling, data acquisition/curation, and infrastructure/storage costs.
Estimates place DeepSeek’s GPU access around 50,000 Hopper-class units, including China-specific H800s and restricted H20s, enabling scale even with lower per-chip performance.
The FBI’s attention to the “Singapore back door” targets concerns that Singapore shipments may be re-routed back to China, potentially triggering tighter export controls.