Robuta

Automated HPC/AI compute node health-checks Integrated with the SLURM scheduler | Microsoft...
https://techcommunity.microsoft.com/blog/azurehighperformancecomputingblog/automated-hpcai-compute-node-health-checks-integrated-with-the-slurm-scheduler/3113454
Oct 26, 2022 - It is best practice to run health-checks on compute nodes before running jobs; this is especially important for tightly coupled HPC/AI applications. The...
node health checks