Search results for Aime
Industry
Human intern narrowly beats Figure AI humanoid in 10-hour parcel sorting test
A public test comparing the productivity of a humanoid robot and a human worker ended with a narrow win for the human intern. Figure AI’s F.03 sorted 12,732 packages in 10 hours, while intern Aime handled 12,924. Average handling times differed by about 0.04 seconds per package. The robot briefly led while the intern took breaks. Figure AI CEO Brett Adcock said it would be the last time a human wins.
AI & Enterprise
DeepSeek V4 Pro nears GPT-5-level performance, rated best Chinese AI model
DeepSeek’s latest model, DeepSeek V4 Pro, is about 8 months behind top U.S. AI models, a U.S. government-affiliated assessment showed. The Center for AI Standards and Innovation (CAISI) under NIST said the model delivered the highest performance among Chinese AI models, scoring about 200 points above Kimi K2.5. CAISI also rated it more cost-efficient than comparable models, saying it outperformed OpenAI’s GPT-5.4 Mini on 5 of 7 benchmarks.
AI & Enterprise
OpenAI unveils GPT-5.5 Instant with personalised answers drawing on past chats and Gmail
OpenAI has replaced ChatGPT’s default model with its newly released GPT-5.5 Instant, TechCrunch reported on May 5. OpenAI said the model reduces hallucinations in sensitive areas such as law, medicine and finance while maintaining low latency. On the AIME 2025 math test it scored 81.2, up from 65.4. It also improved on the MMMU-Pro benchmark. The model can reference past chats, files and Gmail for more personalised answers.