KAIST cuts AI chip cooling energy by 90 percent

Structure of a manifold microchannel cooling device for cooling high-heat semiconductor chips. [Photo: KAIST]

A KAIST research team has developed a technology that cuts the energy needed to cool artificial intelligence (AI) semiconductors to one-tenth of the previous level. Because it is compatible with existing semiconductor manufacturing processes, it is drawing attention as a way to reduce power consumption at AI data centres.

KAIST said on Monday that teams led by mechanical engineering professor Seong-jin Kim (김성진) and AX department professor Ik-jin Lee (이익진) developed a highly efficient liquid-cooling technology that combines microchannels with a manifold that supplies coolant inside the chip.

As AI semiconductors become more powerful, the heat generated by chips increases. Air cooling alone is reaching its limits for high-performance semiconductors such as next-generation graphics processing units (GPUs), so liquid cooling that runs coolant directly through chips is emerging as an alternative.

The manifold microchannel (MMC) structure used by the team removes heat by running coolant through fine channels thinner than a strand of hair. A manifold splits the coolant supply across multiple inlet points, which shortens the distance the coolant travels and improves cooling efficiency. In existing designs, however, coolant tends to concentrate in some channels, so cooling performance varies from channel to channel. The team combined a computational model with precise simulations to design a structure that lets coolant flow evenly through all channels while minimising energy loss.
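The uneven-flow problem described above can be illustrated with a simple parallel-channel model. The sketch below is not the team's computational model. It assumes laminar flow and an identical pressure drop across every channel, so each channel's share of the flow is inversely proportional to its hydraulic resistance. The function names and the resistance values are hypothetical, chosen only for illustration.

```python
def flow_fractions(resistances):
    """Share of the total flow carried by each parallel channel.

    Assumes laminar flow and the same pressure drop across every channel,
    so the flow in a channel is proportional to 1 / resistance.
    """
    conductances = [1.0 / r for r in resistances]
    total = sum(conductances)
    return [g / total for g in conductances]


def maldistribution(fractions):
    """Ratio of the largest to the smallest channel flow (1.0 means even)."""
    return max(fractions) / min(fractions)


# Hypothetical relative resistances, not taken from the KAIST study.
uneven = flow_fractions([1.0, 2.0, 4.0, 4.0])
even = flow_fractions([2.0, 2.0, 2.0, 2.0])

print([round(f, 3) for f in uneven], round(maldistribution(uneven), 2))
print([round(f, 3) for f in even], round(maldistribution(even), 2))
```

With these illustrative numbers, the lowest-resistance channel takes half of the total flow while the two highest-resistance channels take one-eighth each. Equalising the resistances gives every channel the same share.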

The team fabricated the optimised structure on a silicon wafer and verified its performance. The coefficient of performance (COP), a measure of heat removed per unit of energy input, reached 106,000. In other words, 1 unit of cooling energy removes 106,000 units of heat. That is more than 10 times the previous best level, reported in the international journal Nature in 2020.
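To put the COP figure in concrete terms, the short calculation below applies the definition given above (heat removed divided by energy input). The 100 W heat load is a hypothetical value chosen for illustration, and the function names are not from the study. Only the COP of 106,000 and the "more than 10 times" comparison come from the article.

```python
def cop(heat_removed_w, input_power_w):
    """Coefficient of performance: heat removed per unit of energy input."""
    return heat_removed_w / input_power_w


def input_power_for(heat_removed_w, cop_value):
    """Input power needed to remove a given heat load at a given COP."""
    return heat_removed_w / cop_value


REPORTED_COP = 106_000   # figure reported in the article
heat_load_w = 100.0      # hypothetical heat load, for illustration only

power_w = input_power_for(heat_load_w, REPORTED_COP)
print(f"Input power for {heat_load_w:.0f} W of heat: {power_w * 1000:.3f} mW")

# "More than 10 times higher" implies the previous best COP was below this bound.
previous_best_upper_bound = REPORTED_COP / 10
print(f"Previous best COP was below {previous_best_upper_bound:,.0f}")
```

At the reported COP, removing the assumed 100 W would take a little under one milliwatt of input power.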

The device achieved this performance using only room-temperature water, without relying on boiling, nanoscale surface treatment, or expensive materials such as diamond. It can also be adopted on current semiconductor production lines without large-scale investment in new facilities. The technology was verified on an experimental chip measuring 5 mm by 5 mm. The team explained that the same design principle can be applied to large AI semiconductors up to 7.5 cm by 7.5 cm, including GPUs and tensor processing units (TPUs).
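The gap between the test chip and the largest target chip is easier to see as an area ratio. The calculation below uses only the two sizes quoted above. The closing comment about heat load assumes equal heat flux (heat per unit area) on both chips, which is an illustrative assumption rather than a figure from the study.

```python
def area_mm2(size_mm):
    """Area of a rectangular chip given (width, height) in millimetres."""
    width, height = size_mm
    return width * height


test_chip_mm = (5.0, 5.0)      # experimental chip from the article
large_chip_mm = (75.0, 75.0)   # 7.5 cm by 7.5 cm, converted to millimetres

area_ratio = area_mm2(large_chip_mm) / area_mm2(test_chip_mm)
print(f"Test chip: {area_mm2(test_chip_mm):.0f} mm^2")
print(f"Large chip: {area_mm2(large_chip_mm):.0f} mm^2")
print(f"Area ratio: {area_ratio:.0f}x")

# Assumption: at equal heat flux, the total heat to remove scales with area,
# so the large chip would carry area_ratio times the test chip's heat load.
```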

In an experiment applying the technology to cold plates used in data centres, the team confirmed cooling performance more than 30 percent higher than existing methods. The team said it expects the approach can later be applied to high-performance chips in the class of Vera Rubin, Nvidia's next-generation AI platform.

Seong-jin Kim said, "In the AI era, competitiveness depends not only on semiconductor performance but also on how effectively heat is controlled." He added, "We hope this will serve as a key technology for reducing power consumption at AI data centres."

The research was supported by the mid-career researcher support programme of the National Research Foundation of Korea and by a specialised research lab project for ultra-high heat flux cooling systems, funded by the Agency for Defense Development and led by the Korea Institute for Defense Technology Promotion Research. The results were published on Sunday in the international journal Energy Conversion and Management.

Jin-ho Lee jhlee26@d-today.co.kr
