Lightweight 'Needle' model targets AI agents for low-cost smartphones

Cactus Compute has unveiled Needle, a 26-million-parameter tool-calling AI model that can run locally on smartphones and other small devices. According to online outlet Gigazine, Needle was developed by distilling the tool-calling function of Google's Gemini-3.1-Flash-Lite. The model targets on-device use, with prefill processing at 6,000 tokens per second and decoding at 1,200 tokens per second. It is distributed on GitHub and Hugging Face under the MIT license.
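
As a rough illustration of what the reported throughput figures could mean in practice, the sketch below estimates the latency of a single tool call as prefill time plus decode time. Only the two throughput numbers come from the report; the function name and the workload (a 600-token prompt and a 60-token tool call) are hypothetical, and the estimate ignores model loading, tokenization, and any other overhead.

```python
# Throughput figures as reported for Needle.
PREFILL_TOKENS_PER_SEC = 6_000
DECODE_TOKENS_PER_SEC = 1_200


def estimate_latency_seconds(prompt_tokens: int, output_tokens: int) -> float:
    """Estimate latency as prefill time plus decode time (no other overhead)."""
    prefill = prompt_tokens / PREFILL_TOKENS_PER_SEC
    decode = output_tokens / DECODE_TOKENS_PER_SEC
    return prefill + decode


# Hypothetical workload: 600 prompt tokens (tool schemas plus the user
# request) and a 60-token tool call as output. These sizes are assumptions.
latency = estimate_latency_seconds(600, 60)
print(f"{latency * 1000:.0f} ms")
```

Under these assumed sizes the estimate is dominated by prefill, so in this simple model the amount of tool-schema text placed in the prompt matters more than the length of the generated call.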