DeepSeek unveils open-source DSpark technology to boost AI inference speed by up to 85 percent


Chi-gyu Hwang (황치규)

published 2026-06-30 07:23:27


Chinese AI company DeepSeek has unveiled DSpark, an open-source technology that speeds up responses from large language models, VentureBeat reported on June 29 local time.

DSpark focuses on optimising the inference process without changing the model itself.

The report said existing AI chatbots generate text sequentially, one piece (token) at a time. With DSpark, a small, fast auxiliary model predicts the next few tokens ahead, and a larger model verifies those predictions in a single batch. If the predictions are correct, the larger model confirms multiple tokens at once, which speeds up output. If a prediction is wrong, only that part is discarded and generation resumes from there.
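The draft-then-verify loop described here is commonly known as speculative decoding. The following is a minimal sketch of the greedy form of that idea, not DSpark's actual code: the function names, the toy models and the draft length `k` are all assumptions made for illustration, and the verification loop stands in for what a real system would run as one batched pass of the large model.

```python
from typing import Callable, List, Tuple

Token = int
# Hypothetical model interface: given the context, return the greedy next token.
NextToken = Callable[[List[Token]], Token]


def speculative_step(context: List[Token], draft: NextToken,
                     target: NextToken, k: int) -> List[Token]:
    """Return the tokens confirmed in one draft-then-verify round."""
    # 1. The small model guesses k tokens ahead, one after another.
    guesses: List[Token] = []
    for _ in range(k):
        guesses.append(draft(context + guesses))

    # 2. The large model checks each guessed position. A real system would do
    #    this in one batched forward pass; the loop here only imitates that.
    accepted: List[Token] = []
    for guess in guesses:
        expected = target(context + accepted)
        if guess != expected:
            # 3. First wrong guess: keep the large model's token, drop the rest.
            accepted.append(expected)
            return accepted
        accepted.append(guess)
    return accepted


def generate(prompt: List[Token], draft: NextToken, target: NextToken,
             k: int, max_new: int) -> Tuple[List[Token], int]:
    """Generate max_new tokens; also report how many verify rounds were needed."""
    out = list(prompt)
    rounds = 0
    while len(out) - len(prompt) < max_new:
        out.extend(speculative_step(out, draft, target, k))
        rounds += 1
    return out[: len(prompt) + max_new], rounds


# Toy stand-ins for real models (assumptions, for demonstration only).
def toy_target(ctx: List[Token]) -> Token:
    return (ctx[-1] + 1) % 10          # counts 0, 1, ..., 9, 0, ...


def toy_draft(ctx: List[Token]) -> Token:
    # Agrees with the target except right after a 4, where it guesses wrong.
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


tokens, rounds = generate([0], toy_draft, toy_target, k=4, max_new=12)
```

In this toy run the output is identical to what the large model would produce on its own, but it takes fewer verification rounds than tokens generated, which is where the speedup comes from.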

In DeepSeek's own tests, perceived response speed improved by 60 to 85 percent and total system throughput rose by as much as 661 percent, the report said.

DSpark rests on two core techniques. First, the auxiliary model predicts multiple tokens at once while taking surrounding context into account, which improves accuracy. Second, the system dynamically adjusts the verification range based on server load, verifying more predictions when traffic is light and skipping those likely to be wrong when the system is busy, VentureBeat said.
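The report does not describe the actual policy DSpark uses for load-aware verification. As a rough sketch of the general idea only, the hypothetical function below shrinks the number of drafted tokens sent for verification as load rises and raises the confidence a draft must have to be kept. The name `select_for_verification`, the linear rules and the constants `max_len` and `0.9` are assumptions invented for this example.

```python
from typing import List, Tuple


def select_for_verification(drafts: List[Tuple[int, float]], load: float,
                            max_len: int = 8) -> List[int]:
    """Pick which drafted tokens to hand to the large model.

    drafts: (token, draft-model confidence) pairs in generation order.
    load:   0.0 means idle, 1.0 means saturated (assumed scale).
    """
    load = min(max(load, 0.0), 1.0)
    # Lighter traffic leaves room to verify a longer run of drafts.
    budget = max(1, round(max_len * (1.0 - load)))
    # Busier servers demand higher confidence before spending compute.
    min_conf = 0.9 * load

    chosen: List[int] = []
    for token, conf in drafts[:budget]:
        if conf < min_conf:
            break  # later drafts build on this one, so stop at the first weak guess
        chosen.append(token)
    return chosen


drafts = [(11, 0.95), (12, 0.80), (13, 0.50), (14, 0.30)]
idle = select_for_verification(drafts, load=0.0)
moderate = select_for_verification(drafts, load=0.5)
busy = select_for_verification(drafts, load=0.75)
```

With these made-up numbers, an idle server verifies all four drafts, a moderately loaded one drops the least confident guess, and a busy one verifies only the first two.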

DSpark can be applied not only to DeepSeek's own V4 model but also to other open-source models such as Alibaba's Qwen and Google's Gemma. DeepSeek released the DSpark code, training pipelines and checkpoints under the MIT License, so anyone can use them for research and commercial purposes, VentureBeat said.

Chi-gyu Hwang (황치규) delight@d-today.co.kr
