DeepSeek Hype: All That Glitters Is Not Gold

January 2025

“In the middle of every difficulty lies opportunity.” – Albert Einstein

China has taken this sentiment to heart, transforming adversity into innovation. Despite the US export controls restricting advanced technology such as semiconductors, China is rewriting the playbook on affordable AI development. A prime example is DeepSeek, a disruptive Chinese startup reshaping the global AI race. DeepSeek has unveiled its DeepSeek-R1, an open-source AI reasoning model, released under an MIT license for free commercial use and modification—a bold move that has US tech giants scrambling to respond. Leveraging a stockpile of NVIDIA A100 GPUs secured before US restrictions took effect, DeepSeek combined these with lower-grade chips, slashing training costs to just $5.6 million. The result is a model that challenges the dominance of Silicon Valley’s priciest innovations, demonstrating that resource constraints can spark ingenuity.

Accessible via a free chat app or API, DeepSeek-R1 has also been scaled down into six smaller versions capable of running locally on laptops, democratizing AI access. Adding to its arsenal, the company recently unveiled Janus-Pro-7B, an open-source image-generation model that surpasses DALL-E 3 and Stable Diffusion in multiple benchmarks, positioning DeepSeek as a formidable player in affordable and cutting-edge AI development. This is no isolated success. China’s tech ambitions are accelerating at an unprecedented pace. With over 300 generative AI models already approved for public use and a target of releasing over 50 global AI standards by 2026, China is positioning itself as the undisputed leader in the global AI race by 2030.

It is not just about AI; from sixth-generation fighter jets to artificial sun prototypes, China is signaling its intent to dominate the critical technologies shaping the future. Yet, this tech resurgence comes against the backdrop of a declining manufacturing empire. Once the world’s factory, China is steadily losing ground to countries such as Vietnam, India, and Mexico, which are capitalizing on tax breaks, subsidies, and nearshoring trends. Rising labor costs, geopolitical tensions, and intellectual property concerns have spurred industries to explore alternative manufacturing hubs.

China, however, is not retreating—it is recalibrating. The nation is now betting on homegrown technological leadership to reclaim its global position. The AI push is not just a technological endeavor but a geopolitical gambit. With the US and its allies aiming to maintain a stranglehold on critical technologies, China’s ambitions represent a direct challenge to the West’s supremacy.

DeepSeek: Breakthroughs and Limitations

DeepSeek has captured global attention with its radical breakthroughs in AI efficiency, emerging as a formidable challenger to US tech dominance. However, this technological marvel is not without its flaws, raising critical questions about its suitability for high-stakes environments.

Breakthroughs: Redefining Efficiency in AI Development

    1. Mixture-of-experts (MoE) architecture: DeepSeek’s flagship model, DeepSeek-V3, features an MoE architecture with an astounding 671 billion parameters. Crucially, its innovation lies in activating only 37 billion parameters per token. This intelligent system dynamically engages specialized modules only when necessary, slashing computational overhead while maintaining impressive performance.
    2. FP8 mixed precision training: DeepSeek has pioneered the use of 8-bit floating-point (FP8) precision, accelerating computations and reducing memory usage. By lowering numerical precision from 32-bit to 8-bit floating-point representation for much of the training pipeline, DeepSeek achieves a 75% reduction in memory costs while preserving model accuracy—a significant leap in AI training efficiency.
    3. Multitoken prediction (MTP): With MTP, DeepSeek predicts several tokens at a time rather than one token per step, effectively doubling its generation speed. This innovation significantly enhances inference efficiency with minimal trade-offs in accuracy, making it a frontrunner for resource-optimized AI applications.
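The sparse-activation idea behind point 1 can be illustrated with a minimal sketch of top-k MoE routing in plain Python. The toy linear gate, expert functions, and top-k value below are illustrative assumptions, not DeepSeek’s actual implementation:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def moe_forward(token, experts, gate_weights, top_k=2):
    """Route a token through only the top-k experts (sparse activation).

    token        : input vector for one token
    experts      : list of callables, each mapping a vector to a vector
    gate_weights : one weight vector per expert for a toy linear gate
    """
    # Gating scores: one score per expert.
    scores = [sum(w * x for w, x in zip(ws, token)) for ws in gate_weights]
    probs = softmax(scores)
    # Select the top-k experts; the rest stay inactive for this token,
    # so only a fraction of total parameters is ever exercised.
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:top_k]
    # Renormalize the selected gates and mix the active experts' outputs.
    norm = sum(probs[i] for i in top)
    out = [0.0] * len(token)
    for i in top:
        y = experts[i](token)
        out = [o + (probs[i] / norm) * yi for o, yi in zip(out, y)]
    return out, top
```

The compute saving follows directly: with four experts and top_k=2, only half the expert parameters run per token; at DeepSeek-V3’s scale, the same mechanism activates roughly 37 billion of 671 billion parameters.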

Figure 1: DeepSeek AI models’ performance across reasoning, coding, mathematics, and data analysis tasks is comparable with private models from leading players

The Limitations: Trust Under Scrutiny

    1. Censorship concerns: Like other Chinese AI models such as Baidu’s Ernie or ByteDance’s Doubao, DeepSeek is programmed to avoid politically sensitive topics, reflecting the Chinese government’s strict censorship policies. This poses a significant barrier to its acceptance on the global stage, where transparency and open access are key. However, DeepSeek’s base model appears to have been trained on accurate data sources, with censorship introduced through an additional safeguarding layer, likely to comply with domestic regulations. This dual-layer approach raises questions about its usability and credibility in international markets.
    2. Opaque cost disclosures: The much-touted $5.6 million training cost tells only part of the story. Critical expenses, including data collection, synthetic data generation, R&D failures, and staff salaries, are missing from this calculation. This lack of transparency raises questions about the true cost of DeepSeek’s so-called efficiency.
    3. Data privacy risks: DeepSeek’s reliance on remote servers in China for data processing raises alarm bells about data privacy and security. There is growing suspicion that the model could be a data mining operation in disguise, with potential legal battles looming.
    4. Copyright infringement: Microsoft alleges that DeepSeek illegally distilled OpenAI’s models using its API, a potential breach of OpenAI’s terms of service and intellectual property rights. If proven, this could lead to legal action that challenges DeepSeek’s legitimacy. Additionally, DeepSeek’s use of transformer architectures mirrors cases such as the New York Times versus OpenAI, where unauthorized data usage for AI training is under scrutiny. The Italian Data Protection Authority has already opened a GDPR inquiry into DeepSeek, and more regulatory challenges are likely on the horizon.
    5. Security vulnerabilities: Recent audit reports from responsible AI watchdogs highlight DeepSeek-R1’s significant vulnerabilities, including biases, generation of insecure code, and toxic or harmful content, including CBRN (chemical, biological, radiological, and nuclear) material. These flaws make it unsuitable for high-risk scenarios, such as autonomous decision-making in healthcare or finance, without rigorous testing and safeguards. Furthermore, issues such as language inconsistencies, harmful behavior, and multiturn performance degradation render it unfit for agentic AI deployments, posing heightened risks in critical environments.

Figure 2: DeepSeek-R1 red teaming report

China’s technological ambitions have long been a matter of global debate, and DeepSeek adds a new layer to this narrative. While it showcases China’s ability to innovate under constraints, it also highlights the risks of rushing technology to market without robust safeguards.

Implications for the Industry

The AI revolution is built on the assumption that computing and energy needs would grow exponentially, fueling massive investments in data centers and energy infrastructure. This belief has, in turn, bolstered the valuation of energy and semiconductor stocks. But on January 27, 2025, those assumptions were shaken to the core. The stock market reacted brutally to DeepSeek’s performance benchmarks that rival leading proprietary models—at a fraction of the cost and energy consumption. The Nasdaq tumbled 3%, wiping off nearly $1 trillion in value from tech stocks, with NVIDIA leading the collapse. Investors panicked over the potential disruption, fearing that DeepSeek’s efficiency could upend the AI hardware and energy investment boom. Constellation Energy, which has been aggressively expanding its AI-related energy capacity, saw its stock nosedive by more than 20%. Broadcom and other AI-dependent firms followed suit, caught in the wake of this unexpected disruption.

While the market’s knee-jerk reaction suggests an existential threat to incumbents, the reality is far more nuanced. DeepSeek is not here to replace OpenAI, Google, or Anthropic overnight, but it does signal a shift in AI economics. The company’s breakthrough efficiency exposes how AI dominance is no longer solely dictated by computing power and capital but by smarter architectures. This is not a death knell for existing players but an inflection point—a wake-up call that AI can be built in less costly, less power-hungry ways. And the industry is taking notice. Rather than resisting, some have already embraced DeepSeek’s vision:

  • AIonOS, Indosat, and DeepSeek are collaborating to launch an AI center of excellence (CoE) in Indonesia. The center will leverage DeepSeek’s open-source models to develop affordable, sector-specific AI for agriculture and travel.
  • Former Intel CEO Pat Gelsinger is betting on DeepSeek-R1, integrating it into Gloo’s AI service, Kallm, instead of OpenAI models.

If DeepSeek or models like it were to replace incumbent solutions from OpenAI, Anthropic, or Google, global AI energy consumption might see a net reduction. However, as history has shown with Jevons Paradox, increased efficiency often leads to greater overall adoption and usage. The ability to run sophisticated AI at lower costs and with reduced energy inputs will likely democratize access, driving more applications, broader deployment, and, ultimately, sustained demand for AI infrastructure.

However, in its current form, DeepSeek remains a risky proposition for high-stakes applications. Issues surrounding transparency, data security, and ethical safeguards must be addressed before integration into the business environment. The future of AI will not be dictated solely by raw performance—it will be shaped by the delicate balance between innovation and responsible deployment.

The Road Ahead: Building the Future of AI Adoption

    1. Hybrid LLM Architectures: Collaboration, Not Competition
      The future of AI is not about pitting one model against another but about leveraging the collective strengths of multiple models for specific use cases. Hybrid LLM architectures are emerging as the go-to solution for efficiency and cost optimization. For example, DeepSeek-R1 shines in planning and structuring complex coding tasks, while Claude Sonnet excels in execution, debugging, and iteration. This pairing demonstrates how combining specialized capabilities, such as DeepSeek’s cost efficiency and Claude’s execution speed, can optimize productivity. In the long term, these architectures will define how AI applications are developed and deployed across industries.
    2. Interface as the Real Moat: Winning User Loyalty
      In the race for AI supremacy, user-friendly interfaces are proving to be a more durable competitive advantage than performance benchmarks. Companies such as DeepSeek, Meta, and Anthropic might focus on outperforming each other with cutting-edge models, but the true differentiator lies in owning the user experience. ChatGPT became a household name not because it had the best model but because it was intuitive and seamlessly integrated into daily workflows. Similarly, startups prioritizing interface design and usability are positioning themselves as leaders, creating tools that become indispensable for users. As frontier models increasingly become interchangeable commodities, the battle will shift to the interface layer, where user loyalty and habitual reliance form an unshakable moat. Technology evolves quickly, but the tools people instinctively reach for will remain powerful.
    3. Application Innovation: The True Differentiator
      While underlying models will continue to evolve and disrupt benchmarks, the real value will lie in the application layer. Enterprises will increasingly prioritize AI solutions that solve complex, domain-specific challenges and deliver tangible outcomes. The focus will shift from developing standalone models to creating integration-friendly applications tailored to industries and use cases. These applications must capture the unique nuances of each sector, ensuring relevance and usability. As enterprises adopt modular AI architectures, the onus will fall on applications to act as the seamless bridge between models and business objectives.
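The hybrid pairing described in point 1 can be sketched as a simple planner/executor pipeline. The two `call_*` functions below are hypothetical stand-ins, not real API clients; in practice each would wrap a call to a reasoning model (e.g. DeepSeek-R1) and an execution-focused model (e.g. Claude), respectively:

```python
def call_planner(task: str) -> list[str]:
    """Stand-in for a reasoning model that decomposes a task into steps.

    A real implementation would send the task to a planning-oriented
    model's API; here we just split on commas for illustration.
    """
    return [f"step {i}: {part.strip()}"
            for i, part in enumerate(task.split(","), 1)]

def call_executor(step: str) -> str:
    """Stand-in for an execution-focused model that carries out one step."""
    return f"done [{step}]"

def hybrid_pipeline(task: str) -> list[str]:
    """Plan once with the cheaper reasoning model, then execute each step."""
    plan = call_planner(task)                 # cost-efficient planning pass
    return [call_executor(s) for s in plan]   # execution/iteration pass
```

The design choice this sketches: routing each phase of a workflow to the model best suited (and cheapest) for it, rather than sending every request to a single frontier model.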

Conclusion

DeepSeek’s rise marks an untapped opportunity rather than an existential crisis. The shake-up could lead to more innovation, lower costs, and a more distributed AI landscape, loosening big tech’s grip on AI development. More significantly, it has reignited interest in reinforcement learning over traditional supervised fine-tuning—a shift that moves AI closer to true autonomy and artificial general intelligence. But for now, using DeepSeek in high-risk scenarios remains a gamble as its security, reliability, and responsible AI benchmarks are still in question.

Now, the real question remains whether the US’s export controls and innovation lead are enough to contain China’s AI surge or if DeepSeek and AI models of its ilk, such as Qwen 2.5-Max or Kimi k1.5, will mark the dawn of a new technological superpower. As the AI race intensifies, it is clear that China’s ambitions are as unyielding as they are disruptive.


By Anupam Govil (Managing Partner), Chandrika Dutt (Research Director)