Key characteristics of AI agents and infrastructure implications. [Photo: KAIST]

KAIST said on July 5 that a research team led by Yoo Min-soo (유민수), a chair professor in the School of Electrical Engineering, systematically analysed for the first time the compute resources and power consumption AI agents use in real service environments.

The team defined AI agents not as simple programs but as a new type of workload that data centre servers and graphics processing units (GPUs) must continuously handle. On that basis, it analysed the computation and energy consumption generated during actual execution of AI agents.

The analysis found that, unlike the conventional step-by-step inference approach known as Chain-of-Thought (CoT), AI agents repeatedly call large language models (LLMs) while carrying out tasks. Computation and response time increased as the agent ran the language model multiple times to make new judgments or generate answers.

Response times for AI agents were up to 153.7 times longer than with the conventional approach. The analysis also found that GPUs sat idle, performing no computation, for up to 54.5 percent of total run time while external tools such as internet search or code execution carried out their tasks. This points to a new form of inefficiency: the more complex the work AI performs, the less fully expensive GPUs may be used.
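The idle share described above can be illustrated with a minimal sketch. It assumes an agent run is a sequence of steps, each either an LLM call (GPU busy) or an external tool call (GPU waiting). The `Step` type, the `gpu_idle_fraction` helper and the trace values are hypothetical, chosen only for illustration, and are not taken from the team's benchmark.

```python
from dataclasses import dataclass


@dataclass
class Step:
    kind: str       # "llm" = GPU computing, "tool" = GPU waiting on an external tool
    seconds: float  # wall-clock duration of the step


def gpu_idle_fraction(steps):
    """Return the share of total run time during which the GPU only waits."""
    total = sum(s.seconds for s in steps)
    if total == 0:
        return 0.0
    idle = sum(s.seconds for s in steps if s.kind == "tool")
    return idle / total


# Hypothetical trace: the agent alternates LLM calls with tool calls.
trace = [
    Step("llm", 2.0),
    Step("tool", 3.0),   # e.g. an internet search
    Step("llm", 1.5),
    Step("tool", 1.5),   # e.g. code execution
    Step("llm", 2.0),
]

idle_share = gpu_idle_fraction(trace)
print(f"GPU idle for {idle_share:.1%} of the run")
```

In this made-up trace the GPU waits for 4.5 of 10 seconds. The 54.5 percent figure in the study is a measured maximum, not something this sketch reproduces.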

The team also analysed the electricity used by AI agents at data centre scale. An AI agent built on an LLM with 70 billion parameters, a size used in current commercial AI services, consumed an average of 348.41 watt-hours of electricity to handle a single question. That was 136.5 times higher than the simple Q&A method of conventional generative AI.
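As a rough back-of-envelope check, the two figures above imply a per-question energy for the conventional Q&A method. The sketch below assumes "136.5 times" is a plain ratio of per-question energy. That reading, and the helper name `implied_baseline_wh`, are assumptions, not details confirmed by the report.

```python
AGENT_WH_PER_QUESTION = 348.41  # reported average for the 70-billion-parameter agent
REPORTED_RATIO = 136.5          # assumed to mean agent energy / baseline energy


def implied_baseline_wh(agent_wh, ratio):
    """Energy per question for the simple Q&A baseline, under the ratio assumption."""
    return agent_wh / ratio


baseline = implied_baseline_wh(AGENT_WH_PER_QUESTION, REPORTED_RATIO)
print(f"Implied baseline: about {baseline:.2f} Wh per question")
```

Under that assumption, the conventional method would use roughly 2.55 watt-hours per question.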

The team also analysed a future scenario assuming 13.7 billion AI agent requests per day. In that case, power demand for AI data centres was estimated at about 198.9 gigawatts. That far exceeds the multi-gigawatt scale of the AI data centres being pursued around the world and amounts to about half of average electricity consumption across the United States.
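The reported figures are consistent with a simple calculation, sketched here under stated assumptions: every request costs the reported average of 348.41 watt-hours, requests are spread evenly over 24 hours, and no extra overhead such as cooling is added. Whether the team computed its estimate this way is not stated in the report. The function name `average_power_gw` is illustrative.

```python
REQUESTS_PER_DAY = 13.7e9       # scenario from the article
WH_PER_REQUEST = 348.41         # reported average energy per agent question
HOURS_PER_DAY = 24


def average_power_gw(requests_per_day, wh_per_request):
    """Average power in gigawatts if requests are spread evenly over a day."""
    daily_energy_wh = requests_per_day * wh_per_request
    average_power_w = daily_energy_wh / HOURS_PER_DAY  # Wh per hour = W
    return average_power_w / 1e9


power = average_power_gw(REQUESTS_PER_DAY, WH_PER_REQUEST)
print(f"Average power demand: about {power:.1f} GW")  # about 198.9 GW
```

The daily energy in this scenario is about 4,773 gigawatt-hours, which averages out to roughly 198.9 gigawatts of continuous demand.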

The team said the study shows that competitiveness in the AI era is expanding from "smarter AI" to "more efficient AI". It said co-design is needed to jointly optimise not only AI model performance but also AI chips, data centres and power infrastructure. Such co-design can serve as a core technology for cutting AI service operating costs and building sustainable AI infrastructure.

Yoo said, "In a future era when AI agents are widespread, an integrated approach that jointly designs and optimises not only AI data centre infrastructure but also AI agent models and power infrastructure will become important." He added, "Research and investment are needed so that this approach can dramatically reduce end users' costs for AI services and build sustainable AI infrastructure."

The study was carried out with Kim Ji-in (김지인), a doctoral student in the School of Electrical Engineering, as first author. The results were presented in February at the 32nd IEEE International Symposium on High-Performance Computer Architecture (HPCA), an international conference in computer system design.

The team said it has released the AI agent implementation used in the paper, along with the benchmark for performance comparison and evaluation, as open source so that researchers worldwide can use them in follow-up research.

The study was supported by the Institute of Information & Communications Technology Planning & Evaluation (IITP) through its SW Star Lab programme, the K-Cloud technology development project using AI semiconductors and the leading technology development project to advance AI semiconductor-based data centres, as well as by Samsung Electronics' Future Technology Incubation Center.

Copyright © DigitalToday. All rights reserved. Unauthorized reproduction and redistribution are prohibited.