Cerebras cuts a 163-second response to about 5 seconds, says the GPU era is over

AI chip designer Cerebras has added the 1-trillion-parameter open-weight model Kimi K2.6 to its enterprise inference service, reaching 981 tokens per second, a pace it says is the world’s fastest. Generating 500 output tokens from a 10,000-token input took 5.6 seconds on its service, compared with 163.7 seconds on the official Kimi endpoint. The company, which is pursuing an IPO, reported 2025 revenue of $510 million and net profit of $238 million.
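
For readers who want to check the arithmetic behind these figures, the short Python sketch below derives the implied speedup, the end-to-end output rate, and the net margin. It uses only the numbers quoted above. The variable names are illustrative, and the end-to-end rate is a rough derived figure, not one reported by Cerebras.

```python
# Figures as quoted in the article.
OFFICIAL_SECONDS = 163.7   # official Kimi endpoint: 10,000-token input, 500 output tokens
CEREBRAS_SECONDS = 5.6     # Cerebras, same workload
OUTPUT_TOKENS = 500
REVENUE_MUSD = 510         # 2025 revenue, millions of US dollars
NET_PROFIT_MUSD = 238      # 2025 net profit, millions of US dollars

# How many times faster the whole request completed.
speedup_x = OFFICIAL_SECONDS / CEREBRAS_SECONDS

# Output tokens divided by total wall-clock time (derived, not reported).
cerebras_rate = OUTPUT_TOKENS / CEREBRAS_SECONDS
official_rate = OUTPUT_TOKENS / OFFICIAL_SECONDS

# Net profit as a share of revenue.
net_margin = NET_PROFIT_MUSD / REVENUE_MUSD

print(f"Speedup: {speedup_x:.1f}x")                               # Speedup: 29.2x
print(f"End-to-end rate, Cerebras: {cerebras_rate:.1f} tokens/s")  # 89.3 tokens/s
print(f"End-to-end rate, official: {official_rate:.1f} tokens/s")  # 3.1 tokens/s
print(f"Net margin: {net_margin:.1%}")                            # Net margin: 46.7%
```

The end-to-end rate of roughly 89 tokens per second is well below the 981 tokens per second headline figure. The article does not explain the gap, but the 5.6 seconds presumably also covers processing the 10,000-token input.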