A Chinese AI model built on a shoestring budget has shocked Silicon Valley and presented a major challenge to Donald Trump.
DeepSeek, a language model that can generate human-like conversation, was released on the same day as Mr Trump’s inauguration.
It has since been tested against some of America’s most powerful AI (artificial intelligence) models, such as ChatGPT, and in some cases has come out on top.
Experts warned that the breakthrough was a “wake-up call to America”, which has been battling to prevent China competing at the top level of an AI arms race.
Concerns have also been raised that DeepSeek has built-in censorship and refuses to answer sensitive political questions about China and Xi Jinping, the country’s leader.
Shortly after his inauguration, Mr Trump announced a $500 billion (£400 billion) AI investment project, dubbed “Stargate”, in co-operation with US firms including OpenAI, which created ChatGPT.
DeepSeek’s new model comes despite a plan by Joe Biden’s administration to hamper China’s AI capabilities, in hopes of denying it the political influence and military supremacy which could come from being the first to achieve what is known as superintelligence.
DeepSeek said it had taken just two months and less than $6 million (£4.8 million) to build a model more advanced than many of its Western competitors.
It was developed as a side project by a maverick hedge fund manager who invested heavily in Nvidia, one of America’s most sophisticated makers of the computer chips that are crucial for AI models.
The fund manager, Liang Wenfeng, reportedly has close links to the Chinese Communist Party.
Mr Trump placed America’s ambition to become the “world capital of artificial intelligence” at the centre of his inauguration last week, reserving the front row at the Capitol Rotunda for tech billionaires developing AI.
On the same day, DeepSeek released its breakthrough R1 open-source language model to little fanfare. Mr Liang’s start-up appeared to have immediately and unexpectedly closed the gap with the US and publicly thwarted the US government’s attempts to stifle Chinese innovation.
“DeepSeek R1 is one of the most amazing and impressive breakthroughs I’ve ever seen,” said Marc Andreessen, the Silicon Valley venture capitalist who has been advising Mr Trump.
DeepSeek claimed to have used 2,048 second-rate Nvidia H800 chips and $5.6 million (£4.5 million) to build what is known as a reasoning-focused model.
For comparison, Mark Zuckerberg’s Meta used 16,000 first-class Nvidia H100 chips to build its Llama 3.1 model.
In an interview with Time magazine earlier this year, Dario Amodei, chief executive of the Amazon-backed AI developer Anthropic, estimated the cost of building a frontier model in 2024 at $1 billion (£800 million), with the next generation costing closer to $10 billion (£8 billion).
Yet DeepSeek outperformed Meta’s and Anthropic’s models, as well as OpenAI’s GPT-4o, in some benchmarks covering accuracy, coding and complex problem-solving.
“DeepSeek is a wake-up call for America,” Alexandr Wang, chief executive of San Francisco-based Scale AI, said, calling for the US to innovate faster and tighten export controls on chips.
Mr Wang, who attended Mr Trump’s inauguration and previously secured a $250 million (£200 million) defence contract, took out a full-page advertisement in The Washington Post last week imploring the president to “win the AI war”.
“DeepSeek … is the top-performing, or roughly on a par with the best American models,” he warned in an interview with CNBC, adding his belief that China had obtained thousands of first-class chips despite export bans.
Mr Biden had curtailed exports of the best chips for training AI models to block China from competing with the US, but Mr Wang believes those controls were circumvented.
Gina Raimondo, the former US secretary of commerce, initially championed the ban and sanctions but later conceded that “trying to hold China back is a fool’s errand”, instead advocating for rampant innovation to stay ahead.
‘High-quality people’
Announcing his $500 billion (£400 billion) “Stargate” AI investment last week, Mr Trump said the four-year project was “big money and high-quality people”.
Mr Zuckerberg followed suit by announcing plans to spend up to $65 billion (£52 billion) on AI infrastructure in 2025, while Elon Musk’s xAI set out intentions to expand its Colossus supercomputer to use more than one million computer chips to train his own Grok AI language model.
The Chinese government has announced a comparatively modest $8.2 billion (£6.6 billion) investment fund for AI projects, according to the South China Morning Post.
Yet DeepSeek’s intent has been matched by Alibaba, which launched its QwQ model in November and is said to be hot on the heels of its US counterparts, while Chinese homegrown chips including those designed by Huawei are also improving rapidly.
“The only strike against it is some half-baked PRC censorship,” Barrett Woodside, co-founder of AI hardware company Positron, told the Wall Street Journal of DeepSeek, referring to the People’s Republic of China.
The model has drawn criticism online by appearing to refuse to answer sensitive questions about China or mention Xi Jinping.
Mr Woodside explained that such responses could actually be removed as other developers can freely modify the code.
Close links to Chinese government
Nevertheless, Mr Liang enjoys a close relationship with the CCP, having been invited on Jan 20 by Li Qiang, China’s second-most powerful leader, to discuss how homegrown companies could close the gap with the US.
“We have to develop the top talent ourselves,” Mr Liang said in an interview last year.
Mr Liang made a fortune by harnessing AI to identify patterns which affect stock prices.
“When humans make investment decisions, it’s an art, and they just do it by the seat of their pants. When computer programs make such decisions, it’s a science, and it has the optimal solution,” the eccentric billionaire said in a 2019 speech.
In 2021, he started bulk-buying Nvidia graphics processing units on the side, while running his High-Flyer trading fund.
“When we first met him, he was this very nerdy guy with a terrible hairstyle talking about building a 10,000-chip cluster to train his own models. We didn’t take him seriously,” one of Mr Liang’s business partners told the Financial Times.
His DeepSeek model was published not for commercial gain but to advance research: an accompanying paper reveals its methods and explains the breakthroughs, rather than protecting them as intellectual property.
By doing so, DeepSeek has reinvigorated AI developers, sending excitement and anxiety through Silicon Valley in equal measure.
Jim Fan, a senior research scientist at Nvidia, hailed the breakthrough, saying a “non-US company is keeping the original mission of OpenAI alive – truly open, frontier research that empowers all”.