ecc_errors P1 Hardware (BMC/IPMI)

ECC memory errors

The memory controller reported one or more uncorrectable ECC errors. An uncorrectable error means data in memory was corrupted and could not be repaired, which usually indicates a failing DIMM. Replace the affected DIMM immediately.
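To cross-check the alert from the host itself, the following is a minimal sketch that reads the Linux EDAC counters from sysfs. It assumes a Linux server with an EDAC driver loaded, so that /sys/devices/system/edac/mc/ is populated. It is not how this rule collects its data (the rule reads from the BMC over IPMI), and the function names read_edac_counts and controllers_with_uncorrectable are hypothetical, chosen for illustration only.

```python
from pathlib import Path

# Assumption: standard Linux EDAC sysfs location. It is only populated
# when an EDAC driver for the platform's memory controller is loaded.
EDAC_ROOT = "/sys/devices/system/edac/mc"


def read_edac_counts(root=EDAC_ROOT):
    """Return {controller: {"ue": int|None, "ce": int|None}} from EDAC sysfs.

    "ue" is the uncorrectable error count, "ce" the correctable error count.
    A value of None means the counter file was missing or unreadable.
    """
    counts = {}
    for mc in sorted(Path(root).glob("mc[0-9]*")):
        entry = {}
        for key, filename in (("ue", "ue_count"), ("ce", "ce_count")):
            try:
                entry[key] = int((mc / filename).read_text().strip())
            except (OSError, ValueError):
                entry[key] = None
        counts[mc.name] = entry
    return counts


def controllers_with_uncorrectable(counts):
    """List the memory controllers whose uncorrectable count is above zero."""
    return [name for name, c in counts.items() if c["ue"]]


if __name__ == "__main__":
    counts = read_edac_counts()
    if not counts:
        print("No EDAC memory controllers found (is an EDAC driver loaded?)")
    for name, c in counts.items():
        print(f"{name}: uncorrectable={c['ue']} correctable={c['ce']}")
    bad = controllers_with_uncorrectable(counts)
    if bad:
        print("Uncorrectable ECC errors reported on:", ", ".join(bad))
```

A nonzero uncorrectable count on any controller is consistent with this alert. An empty result only means EDAC data is unavailable on the host, not that the memory is healthy, so the BMC event log remains the reference.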

Remediation

When this rule fires on one of your servers, the dashboard alert detail page renders the full remediation guidance: the command to run, what to verify afterward, and Furnace's annotation for your specific distro and hardware. Sign in at app.glassmkr.com to see the live alert.