Xiaomi's research team has unveiled 'HarnesX,' a framework that automatically improves AI agent harnesses, VentureBeat reported on June 24 local time. The approach can significantly boost AI performance without changing the model itself.
A harness is a software structure that connects large language models to external environments. It includes prompts, tools, memory management and execution flow. As corporate AI agents handle more complex long-term tasks, harness design becomes more important, but until now it has had to be built and improved manually.
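To make the four harness components concrete, here is a minimal sketch in Python. It does not show HarnesX's actual code, which has not been released. The `Harness` class, its field names, and the `tool:argument` reply convention are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Harness:
    """Hypothetical harness that wraps a model with the four parts named above."""
    system_prompt: str                        # prompts
    tools: dict[str, Callable[[str], str]]    # tools the model may call
    memory: list[str] = field(default_factory=list)  # memory management
    max_steps: int = 5                        # execution flow: step budget

    def run(self, model: Callable[[str], str], task: str) -> str:
        """Loop: build the context, ask the model, run a tool or return the answer."""
        for _ in range(self.max_steps):
            context = "\n".join([self.system_prompt, *self.memory, task])
            reply = model(context)
            # Assumed convention: "tool_name:argument" requests a tool call.
            name, _, arg = reply.partition(":")
            if name in self.tools:
                self.memory.append(f"{name} -> {self.tools[name](arg)}")
            else:
                return reply  # anything else is treated as the final answer
        return "step limit reached"


def fake_model(context: str) -> str:
    """Stand-in for an LLM: calls the tool once, then answers from memory."""
    return "answer is 5" if "add -> 5" in context else "add:2+3"


harness = Harness(
    system_prompt="You are a calculator agent.",
    tools={"add": lambda arg: str(sum(int(x) for x in arg.split("+")))},
)
result = harness.run(fake_model, "What is 2+3?")
```

The model function never changes in this sketch. Only the prompt, tool set, memory policy, and step budget would be edited, which is the sense in which a harness can be improved without touching the model.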
HarnesX is designed with a modular structure that allows harnesses to be replaced and evolved independently. The core is an automatic optimisation engine called AEGIS. AEGIS consists of a four-stage pipeline: a digester that analyses agent execution logs, a planner that explores improvement directions, an evolver that generates code-level modifications, and a critic-gate that blocks side effects.
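The four-stage pipeline can be sketched roughly as follows. This is an illustrative toy under stated assumptions, not the AEGIS implementation: the function names, the `HarnessVersion` class, the log format, and the scoring function are all hypothetical. In the real system each stage would presumably be driven by the meta agent rather than by simple string rules.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class HarnessVersion:
    """Hypothetical harness snapshot: a prompt plus a version counter."""
    system_prompt: str
    version: int = 0


def digest(logs: list[str]) -> list[str]:
    """Digester: reduce raw execution logs to the failures worth acting on."""
    return [line for line in logs if "ERROR" in line]


def plan(failures: list[str]) -> list[str]:
    """Planner: propose one improvement direction per distinct failure."""
    return [f"address: {f}" for f in dict.fromkeys(failures)]


def evolve(harness: HarnessVersion, proposal: str) -> HarnessVersion:
    """Evolver: emit a modified candidate (here, just a prompt edit)."""
    return HarnessVersion(harness.system_prompt + f"\n# {proposal}", harness.version + 1)


def critic_gate(current: HarnessVersion, candidate: HarnessVersion,
                score: Callable[[HarnessVersion], float]) -> HarnessVersion:
    """Critic-gate: accept the candidate only if it does not score worse."""
    return candidate if score(candidate) >= score(current) else current


def run_cycle(harness: HarnessVersion, logs: list[str],
              score: Callable[[HarnessVersion], float]) -> HarnessVersion:
    """One pass of digest -> plan -> evolve -> gate over a batch of logs."""
    for proposal in plan(digest(logs)):
        harness = critic_gate(harness, evolve(harness, proposal), score)
    return harness


def toy_score(h: HarnessVersion) -> float:
    """Assumed evaluator: timeout handling helps, the JSON change regresses."""
    return float("timeout" in h.system_prompt) - float("json" in h.system_prompt)


logs = ["INFO start", "ERROR tool timeout", "ERROR tool timeout", "ERROR bad json"]
final = run_cycle(HarnessVersion("You are an agent."), logs, toy_score)
```

In this toy run the timeout-related change is accepted and the JSON-related change is rejected by the gate because it lowers the score, so `final` ends at version 1.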
In tests across 15 model-benchmark combinations, HarnesX improved performance in 14 of them, recording an average gain of 14.5 percent. The open-source model Qwen3.5-9B improved by 44 percent on an implementation planning benchmark and by 18.2 percent on a software engineering benchmark.
HarnesX also supports a co-evolution approach that advances harness evolution and model training at the same time. Execution data generated during the harness improvement process was used for reinforcement learning of the model, yielding an additional average performance gain of 4.7 percent. Co-evolution can be applied only to open-source models.
The research team used Claude Opus 4.6 as the meta agent and Claude Sonnet 4.6 and GPT-5.4 as the task agents. One limitation the team identified is that a powerful frontier model is currently needed as the meta agent. The team plans to release the code in a future update.