https://developer.nvidia.cn/dcgm
NVIDIA DCGM | NVIDIA Developer
https://developer.nvidia.com/blog/monitoring-gpus-in-kubernetes-with-dcgm/
Monitoring GPUs in Kubernetes with DCGM | NVIDIA Technical Blog
Aug 21, 2022 - Monitoring GPUs is critical for infrastructure or site reliability engineering (SRE) teams who manage large-scale GPU clusters for AI or HPC workloads.
https://developer.nvidia.com/dcgm
NVIDIA DCGM | NVIDIA Developer
Manage and Monitor GPUs in Cluster Environments