Red Hat, an open-source solutions company, said on Monday it has worked with SoftBank Corp. to integrate llm-d into SoftBank's AI-RAN orchestrator AITRAS.
The company said llm-d is an open-source framework jointly established by Red Hat and other companies. It is designed to boost performance and efficiency by dynamically and intelligently distributing large language model (LLM) inference in radio access network (RAN) environments.
Red Hat said that as the technical implementation of AI-RAN becomes a reality, telecommunications operators are focusing not only on running AI and RAN on the same hardware, but also on how to manage and scale them efficiently.
To commercialise AI-RAN, telecommunications operators must be able to run AI workloads with the same level of flexibility as cloud-native network functions (CNFs) and applications.
With that in mind, Red Hat and SoftBank are pursuing AI-RAN cooperation using llm-d and vLLM.
vLLM supports high-performance model deployment on a single GPU node and has positioned itself as an open-source leader in AI inference. But it has limitations in managing model deployment in complex multi-node environments, which llm-d was developed to address. llm-d uses Kubernetes to orchestrate vLLM across multiple nodes, extending vLLM's efficiency to distributed environments and enabling production-grade AI inference.