The newly launched DeepSeek models, introduced this month, are known for their high speed and affordability.
DeepSeek-R1, the most recent addition, is already shaking up the AI landscape, putting pressure on industry leaders like OpenAI, Google, and Meta. The ripple effect was felt in the stock market, with chipmaker Nvidia seeing a drop in its share prices.
Here’s what we know about this game-changing AI from China.
DeepSeek made its debut in November 2023 with the launch of DeepSeek Coder, an open-source AI model designed for coding tasks.
Shortly after, it introduced DeepSeek LLM to compete with major AI models, followed by DeepSeek-V2 in May 2024, which gained popularity for its high performance and affordability.
This forced major Chinese tech giants like ByteDance, Tencent, Baidu, and Alibaba to lower the pricing of their AI models to stay competitive.
DeepSeek-V2 was succeeded by DeepSeek-Coder-V2, a more advanced model featuring 236 billion parameters.
Built for complex coding prompts, it offers a substantial context window of up to 128,000 tokens.
For reference, a "token" is a chunk of text — a whole word, part of a word, or even a single character. A larger context window allows the model to process longer texts, making it useful for analyzing lengthy documents, books, or intricate conversations.
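To build intuition for tokens and context windows, here is a toy sketch. It approximates token counts with a naive whitespace split — real models use subword tokenizers such as BPE, so actual counts differ — and checks a text against the 128,000-token window cited above.

```python
# Toy illustration of tokens and a context window.
# NOTE: real tokenizers (e.g. BPE) split text into subwords, so this
# naive whitespace split is only an approximation for intuition.

CONTEXT_WINDOW = 128_000  # tokens the model can attend to at once

def count_tokens(text: str) -> int:
    """Approximate the token count by splitting on whitespace."""
    return len(text.split())

def fits_in_context(text: str) -> bool:
    """Check whether a text's rough token count fits the window."""
    return count_tokens(text) <= CONTEXT_WINDOW

sample = "A larger context window lets a model read longer documents."
print(count_tokens(sample))     # a small number of tokens
print(fits_in_context(sample))  # short text easily fits
```

A text that exceeds the window must be truncated or split into chunks before the model can process it, which is why a larger window matters for long documents.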
DeepSeek has continued to enhance its technology with the release of DeepSeek-V3 and DeepSeek-R1.
DeepSeek-V3, with 671 billion parameters, requires significantly fewer computational resources than its competitors while still performing exceptionally well in benchmark tests.
The latest model, DeepSeek-R1, launched this month, specializes in advanced tasks such as reasoning, coding, and mathematics, making it a strong competitor to OpenAI’s latest ChatGPT models.
Despite its rapid advancements, DeepSeek remains primarily focused on research and has no immediate plans for commercialization, according to Forbes.
One of DeepSeek’s biggest advantages is that it is completely free for users.
Unlike OpenAI’s latest models or Anthropic’s Claude Sonnet, which require paid subscriptions, DeepSeek remains fully accessible.
Although Google’s Gemini offers a free version, it is limited to older models, whereas DeepSeek currently has no such restrictions.
DeepSeek's chatbot can be accessed through its dedicated interface at chat.deepseek.com.
Users simply enter their queries, and the chatbot generates responses.
For more in-depth searches, there is a “deep think” option, which scans a wider range of sources for detailed insights.
However, unlike ChatGPT, which relies on a curated set of sources, DeepSeek’s search function may pull information from lesser-known websites, increasing the risk of misinformation. Users should verify the accuracy of results when using the chatbot.
A major concern with DeepSeek is data privacy. Like other AI models, it collects user data, which is likely stored on servers in China.
As with any AI service, users should avoid sharing sensitive information with the chatbot.
Since DeepSeek is open-source, independent researchers can review its code to assess security risks. More information on potential security concerns is expected in the near future.
DeepSeek-R1 and other models have been released largely as open source, meaning developers can access and modify the code to customize the AI for specific use cases. However, its training data remains proprietary.
By contrast, OpenAI’s latest models, such as o1, are fully closed-source and available only through paid subscription plans ranging from $20 to $200 per month.
DeepSeek has managed to develop cutting-edge AI models despite US restrictions on advanced chip exports to China.
A key factor in its success is a strategic partnership with AMD, a US chipmaker. According to Forbes, DeepSeek used AMD Instinct GPUs and ROCm software for crucial stages of training DeepSeek-V3.
Additionally, reports indicate that DeepSeek stockpiled a significant number of Nvidia A100 chips—banned for export to China—before US sanctions took effect. Some estimates suggest the company has between 10,000 and 50,000 units.
To counter hardware limitations, DeepSeek engineers focused on optimizing their algorithms to be more efficient.
While OpenAI’s ChatGPT is believed to require around 10,000 Nvidia GPUs for training, DeepSeek claims to have achieved similar performance with just 2,000 GPUs.
DeepSeek’s emergence has sent shockwaves through the AI industry.
Alexandr Wang, CEO of Scale AI—a company providing training data for AI models used by OpenAI and Google—described DeepSeek as “an earth-shattering model” at the World Economic Forum in Davos.
However, concerns have been raised in the West about the geopolitical implications of China’s AI advancements.
Ross Burley, co-founder of the Centre for Information Resilience, warned that integrating Chinese AI technology into Western systems could pose security risks.
“We’ve seen time and again how Beijing weaponizes its tech dominance for surveillance, control, and coercion,” he stated, citing concerns about spyware, cyber campaigns, and AI-driven censorship.
Others argue that the release of DeepSeek is not just a technological milestone but also a strategic political move amid ongoing tensions between China and the US.
Gregory Allen, director of the Wadhwani AI Center at the Center for Strategic and International Studies, compared DeepSeek’s launch to Huawei’s release of a new smartphone during US-China trade negotiations in 2023.
“China wants to demonstrate that US export controls are ineffective,” Allen told the Associated Press.
Regardless of political implications, DeepSeek has firmly established itself as a major player in the AI space, challenging Western tech giants with its rapid innovation and cost-effective approach.