nvlink_link_down P1 GPU

NVLink link down

An NVLink link on a multi-GPU host is in the down state. GPU-to-GPU bandwidth is reduced; if the affected GPU participates in NCCL collectives, the entire training or inference job slows down, because a collective can only proceed as fast as its slowest participant.
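
To confirm the condition by hand, one option is to inspect the per-link status that `nvidia-smi nvlink --status` prints. The sketch below is not how the rule itself is implemented (the source does not say); it assumes the common output layout in which each GPU header line is followed by `Link N: <bandwidth> GB/s` for active links and `Link N: <inactive>` for links that are not up. The function names and the sample text are hypothetical, and treating every non-bandwidth status as "down" is an assumption that may flag links left unused by design.

```python
import re
import subprocess

# Assumed line shapes in `nvidia-smi nvlink --status` output.
GPU_RE = re.compile(r"^GPU (\d+):")
LINK_RE = re.compile(r"^\s*Link (\d+):\s*(.+?)\s*$")


def find_down_links(status_text):
    """Return (gpu_index, link_index) pairs whose status is not a bandwidth figure."""
    down = []
    gpu = None
    for line in status_text.splitlines():
        gpu_match = GPU_RE.match(line)
        if gpu_match:
            gpu = int(gpu_match.group(1))
            continue
        link_match = LINK_RE.match(line)
        if link_match and gpu is not None:
            # Active links report a speed such as "25.781 GB/s".
            if "GB/s" not in link_match.group(2):
                down.append((gpu, int(link_match.group(1))))
    return down


def read_nvlink_status():
    """Run nvidia-smi on the local host and return its raw text output."""
    result = subprocess.run(
        ["nvidia-smi", "nvlink", "--status"],
        capture_output=True, text=True, check=True,
    )
    return result.stdout


# Hypothetical sample used for illustration only, not captured from a real host.
SAMPLE = """\
GPU 0: Example GPU (UUID: GPU-00000000)
     Link 0: 25.781 GB/s
     Link 1: 25.781 GB/s
     Link 2: <inactive>
GPU 1: Example GPU (UUID: GPU-11111111)
     Link 0: 25.781 GB/s
     Link 1: 25.781 GB/s
"""
```

On a host with the NVIDIA driver installed, `find_down_links(read_nvlink_status())` would list the suspect links; the parser can also be exercised offline against `SAMPLE`.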

Remediation

When this rule fires on one of your servers, the dashboard alert detail page renders the full remediation guidance: the command to run, what to verify afterward, and Furnace's annotation for your specific distribution and hardware. Sign in at app.glassmkr.com to see the live alert.