
Introducing Glassmkr: bare metal monitoring built by operators

You rented a dedicated server because you wanted control. What you got instead is a machine that only tells you it's alive by answering ping. The drive is accumulating reallocated sectors. A fan slowed down last week. The RAID array lost a member yesterday and the rebuild is stressing the surviving disks right now. Your hosting provider doesn't know. Nagios didn't ship with a rule for it. Datadog costs more than the server.

This is the gap Glassmkr fills.

What Glassmkr is

Glassmkr is monitoring for people who run their own hardware. Three tools, one philosophy: collect what actually breaks physical servers, alert on the priorities that matter, and don't pretend the solution is a cloud APM that was never designed for this workload.

The three products work together or standalone.

Forge is the SaaS dashboard. It receives health data from your servers, stores history, renders fleet views, and sends alerts. 38 opinionated alert rules across hardware, storage, network, and OS layers. AI-powered health analysis that runs on our own GPU, not a third party's. Free for up to 3 servers, $3/node/month after that.

Bench is a set of MCP servers for infrastructure. It gives Claude Code, Cursor, or any MCP client structured access to your Netdata instance, your IPMI sensors, your Proxmox cluster. 38+ tools across 3 MCP servers. MIT licensed, npm install, zero config. For operators who want their AI agents to actually understand the hardware underneath.

Crucible is the open-source collector. One curl | bash install. Reads from smartctl, ipmitool, mdadm, /proc, /sys. Pushes a complete health snapshot to Forge every 5 minutes. 90MB RSS. No kernel modules, no eBPF, no root hooks into your application stack. Available on npm and Docker Hub. MIT licensed. Run it standalone, pipe to your own systems, or pair it with Forge.

Where this came from

The team behind Glassmkr has spent a decade operating bare metal infrastructure across 67 global locations. Every alert, every threshold, every diagnostic in the product comes from real operational experience. The 38 alert rules are not theoretical coverage; they are the things we have been woken up by. The IPMI parsing handles vendor quirks because we have hit them. The RAID degradation detection fires on member loss rather than on degraded performance, because member loss is the event you actually care about at 3 AM.

We built Glassmkr over the first four months of 2026. Every feature decision, every rule priority, every dashboard layout was driven by either our own production pain or feedback from early users running it. We do not build features we have not encountered. We do not write checks for failure modes that do not exist. Opinionated coverage beats configurable generality when the goal is catching real problems before they become outages.

The architecture

Your server runs Crucible as a systemd service. Every 5 minutes it gathers SMART attributes for every drive, IPMI sensor readings, RAID array state, per-core CPU (not just aggregate, the per-core breakdown that catches IRQ pinning and single-threaded saturation), memory and swap, network interface stats, filesystem state, security posture, and pending updates. It pushes this snapshot over HTTPS to Forge.
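To make the per-core point concrete, here is a minimal sketch of how per-core utilization can be derived from two readings of /proc/stat. This is an illustration, not Crucible's actual code: the function names are hypothetical, and the only thing taken from Linux itself is the field order of the cpuN lines (user, nice, system, idle, iowait, irq, softirq, steal).

```python
def parse_proc_stat(text: str) -> dict[str, tuple[int, int]]:
    """Return {core: (busy_jiffies, total_jiffies)} for each per-core line."""
    cores: dict[str, tuple[int, int]] = {}
    for line in text.splitlines():
        fields = line.split()
        # Per-core lines are "cpu0", "cpu1", ...; the bare "cpu" line is the aggregate.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        # First eight counters: user nice system idle iowait irq softirq steal.
        # Guest time is already included in user time, so it is left out.
        values = [int(v) for v in fields[1:9]]
        if len(values) < 5:
            continue  # unexpected layout, skip rather than guess
        idle = values[3] + values[4]  # idle + iowait
        total = sum(values)
        cores[fields[0]] = (total - idle, total)
    return cores


def per_core_utilization(before: str, after: str) -> dict[str, float]:
    """Busy fraction (0.0 to 1.0) per core between two /proc/stat readings."""
    first, second = parse_proc_stat(before), parse_proc_stat(after)
    usage: dict[str, float] = {}
    for core, (busy_after, total_after) in second.items():
        if core not in first:
            continue  # core came online between samples
        busy_before, total_before = first[core]
        elapsed = total_after - total_before
        usage[core] = (busy_after - busy_before) / elapsed if elapsed > 0 else 0.0
    return usage
```

On a two-core machine where one core is pinned at 100% and the other is idle, the aggregate reads 50% and looks healthy, while the per-core view shows the saturated core directly.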

Forge evaluates the 38 alert rules against the snapshot, compares it to history, and either fires new alerts or closes resolved ones. Each alert is assigned a priority level: P1 for imminent data loss, P2 for service-impacting, P3 for degrading, P4 for informational. Alert cards include evidence links, diagnostic commands, and recent trend data. Notifications go to Slack, Telegram, or email.
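The fire-or-close cycle can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Forge's implementation: the snapshot keys, the two example rules, and the priorities assigned to them are all hypothetical. Only the four priority levels come from the description above.

```python
from dataclasses import dataclass
from enum import IntEnum
from typing import Callable


class Priority(IntEnum):
    P1 = 1  # imminent data loss
    P2 = 2  # service-impacting
    P3 = 3  # degrading
    P4 = 4  # informational


@dataclass(frozen=True)
class Rule:
    name: str
    priority: Priority
    check: Callable[[dict], bool]  # True while the condition is firing


# Hypothetical rules over hypothetical snapshot keys, for illustration only.
RULES = [
    Rule(
        "raid_member_lost",
        Priority.P1,
        lambda s: s.get("raid_active_members", 0) < s.get("raid_expected_members", 0),
    ),
    Rule("swap_in_use", Priority.P4, lambda s: s.get("swap_used_bytes", 0) > 0),
]


def reconcile(
    snapshot: dict, open_alerts: set[str], rules: list[Rule] = RULES
) -> tuple[list[str], list[str]]:
    """Compare firing rules with currently open alerts.

    Returns (to_fire, to_close): rules newly firing, and open alerts
    whose condition no longer holds in this snapshot.
    """
    firing = {rule.name for rule in rules if rule.check(snapshot)}
    to_fire = sorted(firing - open_alerts)
    to_close = sorted(open_alerts - firing)
    return to_fire, to_close
```

Keeping the set of open alerts separate from the rule checks is what lets an alert fire once when a condition appears and close once when it clears, instead of re-notifying on every 5-minute snapshot.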

The AI analysis runs on our own NVIDIA L4 in Amsterdam, serving Gemma 4 26B-A4B over a private WireGuard network. Your server's data never touches a third-party API. We have written separately about why we self-host the model and how we chose it; the short version is that it would be ironic to send IPMI sensor data and hardware serials to an external cloud just to find out whether your infrastructure is healthy.

Pricing

The agent (Crucible) and the AI tools (Bench) are MIT licensed and free. Always.

Forge Free covers up to 3 servers, all 38 alert rules, 7-day history, and all notification channels. If that is enough for your setup, you are done.

Forge Pro is $3/node/month. You get longer history, the AI health analysis, more notification routing options, and priority support. There is no per-metric surcharge, no alert-volume tiering, and no cloud-scale pricing math. You pay for the nodes you have.

Start

Install Crucible on a server:

curl -sf https://forge.glassmkr.com/install | bash

The install script registers the server with Forge, sets up the systemd service, and begins collecting data within a few minutes. Browse to forge.glassmkr.com to see your fleet.

If you just want the agent and not the SaaS, Crucible is on GitHub and npm. Pipe its output wherever you want it.

If you are an AI-tooling person looking at the MCP side, Bench is at bench.glassmkr.com. Each MCP server does one thing well.

What we are not

We are not a Datadog replacement for cloud workloads. If your infrastructure is Kubernetes on EKS, use something else.

We are not a SaaS that hides the collector in a proprietary binary. Read the Crucible source. Audit exactly what leaves your server.

We are not trying to be everything. No application performance monitoring, no distributed tracing, no log aggregation. Glassmkr does hardware and OS health. It does not pretend to be observability.

What is next

Near-term roadmap: GPU monitoring via NVIDIA SMI, deeper NVMe controller telemetry, storage controller support beyond mdadm, and trend-based alerting that fires on patterns rather than instantaneous threshold crossings. The trend-alerting piece is the largest in-flight feature. Statistical detection of reallocated-sector growth, fan-speed drift, temperature baseline shifts, anything where the signal is the trajectory rather than the value.
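To show what "trajectory rather than value" means, here is a minimal sketch using a least-squares slope over recent samples. The feature is still in flight and the method is not settled, so treat least squares as an assumption chosen for illustration; the function names and the min_slope parameter are hypothetical.

```python
def slope_per_sample(values: list[float]) -> float:
    """Least-squares slope of values against their sample index (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0  # a single point has no trajectory
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(values))
    denominator = sum((i - mean_x) ** 2 for i in range(n))
    return numerator / denominator


def is_trending_up(values: list[float], min_slope: float) -> bool:
    """True when the fitted growth per sample is at least min_slope."""
    return slope_per_sample(values) >= min_slope


# A drive that has sat at 8 reallocated sectors is stable; one that went
# 0 -> 1 -> 3 -> 6 is getting worse even though its latest value is lower.
stable_drive = [8, 8, 8, 8]
failing_drive = [0, 1, 3, 6]
```

A fixed threshold of, say, 10 reallocated sectors would stay silent on both drives. A slope check separates them: the stable drive has a slope of zero, while the failing drive is gaining about two sectors per sample.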

Every addition follows the same rule: only ship alerts for failure modes we have actually encountered in production.

For operators, by operators

Glassmkr exists because the monitoring tools that worked for cloud-native teams did not work for us. We built what we needed. If you run bare metal and recognize the description above, try it. The Free tier is genuinely free. The paid tier is priced to be affordable at any fleet size.