DeepSeek V4 seen having bigger impact than R1 thanks to strong price-performance

[DigitalToday reporter Chi-gyu Hwang] Chinese AI company DeepSeek has released its new AI model, V4. Some observers say it could have a bigger impact than R1, the reasoning model the company released last year, because it offers frontier-level performance as an open-source model while costing far less than Opus 4.7 or GPT-5.5.

DeepSeek V4 Pro is a 1.6 trillion-parameter mixture-of-experts (MoE) model with 49 billion active parameters and a context length of 1 million tokens. V4 Flash has 284 billion total parameters and 13 billion active parameters. Both models were trained on about 33 trillion tokens and showed performance close to Opus 4.7 and GPT-5.5 on the MMLU Pro, GPQA Diamond and SWE-Bench benchmarks.
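
To put the parameter counts above in perspective, the share of each model that is active per token can be worked out directly. The following is a minimal sketch using only the figures quoted in this article; the dictionary layout and the function name `active_fraction` are illustrative assumptions, not anything published by DeepSeek.

```python
# Parameter counts as quoted in the article, in billions.
# The structure and names here are illustrative assumptions.
MODELS = {
    "V4 Pro": {"total_b": 1600, "active_b": 49},
    "V4 Flash": {"total_b": 284, "active_b": 13},
}


def active_fraction(total_b: float, active_b: float) -> float:
    """Return the share of an MoE model's parameters used per token."""
    return active_b / total_b


for name, params in MODELS.items():
    share = active_fraction(params["total_b"], params["active_b"])
    print(f"{name}: {share:.1%} of parameters active per token")
# V4 Pro: 3.1% of parameters active per token
# V4 Flash: 4.6% of parameters active per token
```

On these figures, only a few percent of each model's parameters are active for any given token, which is one commonly cited reason MoE models can be cheaper to serve than dense models of similar total size.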

Matthew Berman, who runs the YouTube channel Forward Future, recently wrote on the social media platform X (formerly Twitter), "Economic logic is pulling U.S. companies toward DeepSeek," and added, "GPT-5.5 and Opus 4.7 cost about $30 per 1 million output tokens, while DeepSeek is much cheaper than that. Because it is open source, fine-tuning and self-hosting are also possible. Most corporate work does not require cutting-edge AI performance, so there is a strong incentive to choose DeepSeek, which is 'good enough.'"

Geopolitical risk could be a problem. He said, "If U.S. companies build AI strategies based on Chinese open-source models, they could face a serious situation if Chinese AI companies change the architecture or block access. There is also concern that, just as social media began in the United States and shaped discourse around the world, if Chinese models become the foundation, Chinese cultural bias could be embedded in AI."

Views differ on the impact of U.S. government export controls targeting China. According to DeepSeek's own paper, V4 Pro service capacity will be limited until the second half of this year, when supernode expansion takes place, suggesting controls are having some effect.

But the constraints may instead spur algorithmic innovation, making it possible to build models cheaply even on lower-cost GPUs.

U.S. AI model developers such as Anthropic and the U.S. government have warned that China is conducting large-scale distillation using U.S. AI models (distillation: a method of making a new model by using outputs from existing AI models as training data).

On this, Berman said, "DeepSeek used answers from U.S. AI models 150,000 times for training, far fewer than MoonshotAI's 3.4 million and Minimax's 13 million, so some analyses say it is hard to explain this level of performance through distillation alone."

He said, "The conclusion is twofold. The United States must become more active in developing open-source models, and OpenAI and Anthropic must cut prices much faster. If U.S. companies weigh performance against cost, DeepSeek is far more advantageous right now."

Chi-gyu Hwang delight@d-today.co.kr
