NVLink link down
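An NVLink link on a multi-GPU host is reporting the down state. GPU-to-GPU bandwidth on that host is reduced; if the affected GPU participates in NCCL collectives, the entire training or inference job slows down, because a collective completes only when every participating rank has finished.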
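To confirm the condition by hand, one option is to read the per-link status that `nvidia-smi nvlink --status` prints. The following is a minimal sketch, not the check this rule itself runs. It assumes the common output layout, where each GPU has a `GPU <n>: ...` header followed by `Link <n>: <speed> GB/s` lines for active links and some other text (for example `<inactive>`) for links that are not up. The exact format can vary by driver version, and the helper names `find_inactive_links`, `read_nvlink_status`, and `main` are illustrative.

```python
import re
import subprocess

# Assumed line shapes in `nvidia-smi nvlink --status` output (may vary by driver).
GPU_RE = re.compile(r"^GPU (\d+):")
LINK_RE = re.compile(r"^\s*Link (\d+):\s*(.+?)\s*$")


def find_inactive_links(status_text):
    """Return (gpu_index, link_index) pairs whose status line reports no speed."""
    inactive = []
    gpu = None
    for line in status_text.splitlines():
        gpu_match = GPU_RE.match(line)
        if gpu_match:
            gpu = int(gpu_match.group(1))
            continue
        link_match = LINK_RE.match(line)
        if link_match and gpu is not None:
            link = int(link_match.group(1))
            state = link_match.group(2)
            # Assumption: an active link reports a bandwidth such as "25 GB/s".
            if "GB/s" not in state:
                inactive.append((gpu, link))
    return inactive


def read_nvlink_status():
    """Run nvidia-smi and return its NVLink status output as text."""
    result = subprocess.run(
        ["nvidia-smi", "nvlink", "--status"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def main():
    for gpu, link in find_inactive_links(read_nvlink_status()):
        print(f"GPU {gpu} link {link} is not active")
```

Calling `main()` on the affected host lists the links that are not reporting a speed. A link can also show as inactive because it is not wired on that board, so compare the result against a known-good host with the same hardware before treating it as a fault.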
Remediation
When this rule fires on one of your servers, the dashboard alert detail page renders the full remediation guidance: the command to run, what to verify afterward, and Furnace's annotation for your specific distro and hardware. Sign in at app.glassmkr.com to see the live alert.