Three reasons why DeepSeek’s new model matters

Reuters previously reported that Chinese government officials recommended that DeepSeek integrate Huawei chips in its training process. And this pressure fits a broader pattern in China’s industrial policy: Strategic sectors are often pushed, and sometimes effectively required, to align with national self-reliance goals. But there’s a particular urgency when it comes to AI. Since 2022,…

Three reasons why DeepSeek’s new model matters
Three reasons why DeepSeek’s new model matters

Reuters previously reported that Chinese government officials recommended that DeepSeek integrate Huawei chips in its training process. And this pressure fits a broader pattern in China’s industrial policy: Strategic sectors are often pushed, and sometimes effectively required, to align with national self-reliance goals. But there’s a particular urgency when it comes to AI. Since 2022, US export controls have cut Chinese firms off from Nvidia’s most powerful chips, and they later also restricted access to downgraded China-market versions. Beijing’s response has been to accelerate the push for a domestic AI stack, from chips to software frameworks to data centers.

Chinese authorities have reportedly been pushing data centers and public computing projects to use more domestic chips, including through reported bans on foreign-made chips, sourcing quotas, and requirements to pair Nvidia chips with Chinese alternatives from companies such as Huawei and Cambricon. 

Still, replacing Nvidia is not as simple as swapping one chip for another. Nvidia’s advantage lies not only in its chips, but in the software ecosystem developers have spent years building around them. Moving to Huawei’s Ascend chips means adapting model code, rebuilding tools, and proving that systems built around those chips are stable enough for serious use.

To be clear, DeepSeek does not appear to have fully moved beyond Nvidia. The company’s technical report reveals that it is using Chinese chips to run the model for inference, or when someone asks the model to complete a task. But Liu Zhiyuan, a computer science professor at Tsinghua University, told MIT Technology Review that DeepSeek appears to have adapted only part of V4’s training process for Chinese chips. The report does not say whether some key long-context features were adapted to domestic chips, so Liu says V4 may still have been trained mainly on Nvidia chips. Multiple sources who spoke on the condition of anonymity, due to political sensitivity around these issues, told MIT Technology Review that Chinese chips still don’t perform as well as Nvidia chips but are better suited for inference than training.

DeepSeek is also tying the future costs of V4 to this hardware shift. The company says V4-Pro prices could fall significantly after Huawei’s Ascend 950 supernodes begin shipping at scale in the second half of this year. 

If that works, V4 could be an early sign that China is successfully building a parallel AI infrastructure.

source

About the Author

Leave a Reply

Your email address will not be published. Required fields are marked *

About the Author

Easy WordPress Websites Builder: Versatile Demos for Blogs, News, eCommerce and More – One-Click Import, No Coding! 1000+ Ready-made Templates for Stunning Newspaper, Magazine, Blog, and Publishing Websites.

BlockSpare — News, Magazine and Blog Addons for (Gutenberg) Block Editor

Search the Archives

Access over the years of investigative journalism and breaking reports