[Photo: Datadog]

Datadog, an AI-based observability and security platform company, said on Wednesday it will formally launch GPU monitoring.

The company said GPU monitoring provides integrated visibility across the AI stack in a single solution. Customers can identify GPU workloads where bottlenecks occur and reduce costs.

Most GPU tools currently provide only basic metrics on device status. They do not show bottlenecks caused by imbalanced resource use between departments or explain why training and inference workloads fail. Visibility is also lacking on which devices are idle or being used inefficiently.

The company said GPU monitoring provides a unified view that platform engineering teams and machine learning teams can use to analyse issues together, helping them scale AI cost-effectively.

Keyword

#Datadog #GPU monitoring #AI #machine learning #observability
Copyright © DigitalToday. All rights reserved. Unauthorized reproduction and redistribution are prohibited.