Glassmkr Documentation

From zero to monitoring in 5 minutes. One agent, 38 alert rules, no inbound ports.

#What Glassmkr Monitors

Glassmkr is a monitoring agent for bare metal and dedicated servers. The agent collects hardware and OS metrics every 5 minutes and pushes them to the Glassmkr dashboard, where 38 alert rules evaluate each snapshot automatically.

Hardware

IPMI sensors (temperature, fan speed, voltage, power draw), IPMI SEL event log, ECC memory errors, PSU redundancy status

Storage

SMART health and wear level, disk space and inodes, RAID array status, ZFS pool health and scrub errors, filesystem read-only detection, I/O errors and latency

Network

Interface errors and drops, link speed negotiation, bandwidth saturation, bond slave status, conntrack table usage

OS

CPU per-core utilization and iowait, load averages, RAM and swap, OOM kills, clock drift, NTP sync, systemd failed units, file descriptor exhaustion, unexpected reboots

Security

SSH root password authentication, firewall status, pending security updates, kernel vulnerabilities, reboot required flag, unattended upgrades configuration

38 alert rules evaluate on every collection cycle. All rules are included on every plan, including Free.

#Quick Start

Docker (recommended)

# 1. Create config directory
sudo mkdir -p /etc/glassmkr

# 2. Add your collector key (get it from glassmkr.com after signing up)
sudo tee /etc/glassmkr/collector.yaml << 'EOF'
server_url: https://forge.glassmkr.com
collector_key: col_YOUR_KEY_HERE
interval: 300
EOF

# 3. Download and start
curl -O https://raw.githubusercontent.com/glassmkr/crucible/main/docker-compose.yml
docker compose up -d

# 4. Verify
docker compose logs glassmkr-crucible

The container runs with --privileged and network_mode: host for IPMI, SMART, and bond monitoring. See Security for details.

npm alternative

npm install -g @glassmkr/crucible
sudo glassmkr-crucible --config /etc/glassmkr/collector.yaml

Requires Node.js 24+. The system packages smartmontools, ipmitool, and dmidecode are needed for full hardware monitoring.

Your server appears in the dashboard within 5 minutes.

#Alert Rules Reference

OS (9 rules)

Rule	Trigger	Severity
`ram_high`	≥ 90% used warning, ≥ 95% critical. Configurable threshold.	Warning / Critical
`cpu_high`	≥ 90% utilization warning, ≥ 98% critical	Warning / Critical
`load_high`	Load average > 2x core count	Warning
`cpu_iowait_high`	≥ 20% iowait. Configurable.	Warning
`oom_kills`	Any OOM kill detected	Critical
`clock_drift`	Offset > 1 second	Warning
`swap_high`	> 50% swap used	Warning
`ntp_not_synced`	NTP daemon not running or clock not synced	Warning
`unexpected_reboot`	Server restarted unexpectedly	Event
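
The two-tier thresholds in this table can be read as a plain comparison. The following is a minimal Python sketch under that reading, not the Glassmkr rule engine: the function names are hypothetical, the default thresholds are copied from the table, and the table does not say which load average (1, 5, or 15 minutes) `load_high` uses, so the caller passes one in.

```python
from typing import Optional


def classify(value: float, warning: float, critical: float) -> Optional[str]:
    """Return "critical", "warning", or None for a >= style two-tier rule."""
    if value >= critical:
        return "critical"
    if value >= warning:
        return "warning"
    return None


def ram_high(used_percent: float) -> Optional[str]:
    # Defaults from the table: >= 90% warning, >= 95% critical.
    return classify(used_percent, warning=90.0, critical=95.0)


def cpu_high(utilization_percent: float) -> Optional[str]:
    # Defaults from the table: >= 90% warning, >= 98% critical.
    return classify(utilization_percent, warning=90.0, critical=98.0)


def load_high(load_average: float, core_count: int) -> Optional[str]:
    # Table: load average strictly above 2x core count is a warning (single tier).
    return "warning" if load_average > 2 * core_count else None
```

On an 8-core server, for example, a load average of 16.0 sits exactly at the limit and does not trigger `load_high`; anything above it does.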

Storage (8 rules)

Rule	Trigger	Severity
`disk_space_high`	≥ 85% warning, ≥ 95% critical. Configurable.	Warning / Critical
`smart_failing`	Reallocated/pending sectors or health != PASSED	Critical
`nvme_wear_high`	≥ 85% wear warning, ≥ 95% critical. Configurable.	Warning / Critical
`raid_degraded`	Any degraded or failed RAID array	Critical
`disk_latency_high`	Average latency > 100 ms	Warning
`filesystem_readonly`	Any mounted filesystem is read-only (excluding expected ones)	Critical
`inode_high`	≥ 90% inodes used	Warning
`disk_io_errors`	I/O errors detected in dmesg	Critical

Network (5 rules)

Rule	Trigger	Severity
`interface_errors`	Hardware errors > 0 per interval, drops > 500	Warning
`link_speed_mismatch`	Interface negotiated below expected speed	Warning
`interface_saturation`	> 80% of link capacity utilized	Warning
`conntrack_exhaustion`	> 80% of conntrack table used	Warning
`bond_slave_down`	A bond member interface is down	Critical

Hardware / IPMI (5 rules)

Rule	Trigger	Severity
`cpu_temperature_high`	> 80 °C warning, > 90 °C critical	Warning / Critical
`ecc_errors`	Correctable > 0 warning, uncorrectable > 0 critical	Warning / Critical
`psu_redundancy_loss`	PSU status not OK	Critical
`ipmi_sel_critical`	Critical SEL entries detected	Critical
`ipmi_fan_failure`	Fan speed below minimum threshold	Critical

ZFS (2 rules)

Rule	Trigger	Severity
`zfs_pool_unhealthy`	Pool state != ONLINE	Critical
`zfs_scrub_errors`	Scrub detected errors	Warning

Security (6 rules)

Rule	Trigger	Severity
`ssh_root_password`	Root login with password enabled	Warning
`no_firewall`	No active firewall detected	Warning
`pending_security_updates`	> 0 security updates pending	Info
`kernel_vulnerabilities`	Active kernel vulnerabilities	Warning
`kernel_needs_reboot`	Kernel update requires reboot	Info
`unattended_upgrades_disabled`	Auto-updates not configured	Info

Service Health (3 rules)

Rule	Trigger	Severity
`systemd_service_failed`	Any systemd unit in failed state	Warning
`fd_exhaustion`	> 80% of system file descriptors used	Warning
`server_unreachable`	Server missed 2+ check-ins (server-side watchdog)	Critical

State alerts auto-resolve when the condition clears. Event alerts (unexpected_reboot) stack occurrences and have a Resolve button. Acknowledged alerts still auto-resolve.
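
The difference between state and event alerts can be sketched as a small state machine. This is an illustration of the behavior described above, not the dashboard's implementation; the `Alert` class and both functions are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    rule: str
    kind: str                 # "state" or "event"
    occurrences: int = 1
    acknowledged: bool = False
    resolved: bool = False


def on_snapshot(alert: Alert, condition_active: bool) -> Alert:
    """Apply one evaluation cycle to an open alert."""
    if alert.resolved:
        return alert
    if alert.kind == "state":
        # State alerts resolve on their own once the condition clears,
        # whether or not someone acknowledged them.
        if not condition_active:
            alert.resolved = True
    elif condition_active:
        # Event alerts stack: a repeat adds an occurrence to the open alert.
        alert.occurrences += 1
    return alert


def resolve_manually(alert: Alert) -> Alert:
    # Stands in for the Resolve button on event alerts.
    alert.resolved = True
    return alert
```

Note that an event alert stays open when the condition is absent from a snapshot; only the manual step closes it.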

#Configuration Reference

# Required
server_url: https://forge.glassmkr.com
collector_key: col_YOUR_KEY_HERE

# Optional
interval: 300          # Collection interval in seconds (default: 300)
# hostname: my-server  # Override auto-detected hostname
# modules:             # Disable specific collection modules
#   ipmi: false
#   smart: false
#   zfs: false
#   security: false

server_url: The Glassmkr ingest endpoint. Always https://forge.glassmkr.com for the hosted service.
collector_key: Your server's authentication token. Generated when you add a server in the dashboard. Prefixed with col_.
interval: How often (in seconds) the agent collects and pushes a snapshot. Default is 300 (5 minutes). Minimum is 60.
hostname: Override the auto-detected hostname. Useful when the system hostname is generic or changes between reboots.
modules: Disable individual collection modules. Set any module to false to skip it. The agent will not attempt to read sensors for disabled modules.
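
To catch mistakes before starting the agent, the documented constraints can be checked against an already-parsed config mapping. The sketch below is an assumption about what a reasonable pre-flight check looks like; `validate_config` is a hypothetical helper, and how the agent itself reacts to invalid values is not documented here.

```python
from typing import Any, Dict

DEFAULT_INTERVAL = 300   # seconds, documented default
MIN_INTERVAL = 60        # seconds, documented minimum


def validate_config(cfg: Dict[str, Any]) -> Dict[str, Any]:
    """Check a parsed collector.yaml mapping and fill in the default interval."""
    for key in ("server_url", "collector_key"):
        if not cfg.get(key):
            raise ValueError(f"missing required key: {key}")
    if not str(cfg["collector_key"]).startswith("col_"):
        raise ValueError("collector_key should start with col_")

    interval = cfg.get("interval", DEFAULT_INTERVAL)
    if not isinstance(interval, int) or interval < MIN_INTERVAL:
        raise ValueError(f"interval must be an integer >= {MIN_INTERVAL}")

    # modules is optional; each entry is a module name mapped to true/false.
    modules = cfg.get("modules") or {}
    for name, enabled in modules.items():
        if not isinstance(enabled, bool):
            raise ValueError(f"modules.{name} must be true or false")

    return {**cfg, "interval": interval, "modules": modules}
```

The function takes a plain dictionary so it stays independent of any particular YAML parser.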

#System Requirements

Operating System: Linux with systemd. Tested on Debian 11/12, Ubuntu 20.04 to 24.04, Rocky 8/9, AlmaLinux 8/9.
Runtime: Docker (recommended) or Node.js 24+.
Privileges: Root access required for IPMI, SMART, and /proc system reads.
Network: Outbound HTTPS on port 443 to forge.glassmkr.com. No inbound ports needed.
Resource usage: Approximately 90 MB RSS. Varies by hardware; servers with more IPMI sensors or drives use slightly more.
Optional packages (npm install only): smartmontools, ipmitool, dmidecode for full hardware monitoring. Modules whose packages are missing are silently skipped.
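
Because missing tools are skipped without an error, it can help to check for them up front on npm installs. This is a small sketch using only the Python standard library; the helper names are hypothetical. It looks for the binaries rather than the packages (`smartctl` is the binary shipped by smartmontools).

```python
import shutil
from typing import Dict, Iterable, List, Optional

# Binary names; smartctl comes from the smartmontools package.
HARDWARE_TOOLS = ("smartctl", "ipmitool", "dmidecode")


def find_tools(names: Iterable[str] = HARDWARE_TOOLS) -> Dict[str, Optional[str]]:
    """Map each tool name to its resolved path, or None if it is not on PATH."""
    return {name: shutil.which(name) for name in names}


def missing_tools(found: Dict[str, Optional[str]]) -> List[str]:
    """Return the sorted names of tools that were not found."""
    return sorted(name for name, path in found.items() if path is None)
```

Run the check as the same user the agent runs as, since `PATH` often differs between root and unprivileged accounts.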

#How It Works

Your server: the agent reads /proc, /sys, smartctl, and ipmitool to collect CPU, RAM, disk, SMART, IPMI, network, ZFS, and security data.

Transport: HTTPS / TLS, every 5 minutes.

Glassmkr: dashboard, 38 rules, notifications, and AI analysis, backed by PostgreSQL + ClickHouse on EU dedicated servers.

The agent is MIT open source: github.com/glassmkr/crucible
Agent pushes outbound only, opens no inbound ports
Snapshots contain hardware metrics only, no user data
Dashboard runs on EU dedicated servers, no cloud providers
AI analysis runs on a self-hosted GPU, no external AI providers
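
As an illustration of what reading /proc involves, the sketch below derives a RAM usage percentage from /proc/meminfo. It is not taken from the agent's source; the function names are hypothetical, and using `MemAvailable` as the basis for "used" is an assumption about how such a metric is commonly computed.

```python
from typing import Dict


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo content into a {field: value} mapping.

    Most fields are reported in kB; a few (such as HugePages_Total) are counts.
    """
    fields: Dict[str, int] = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        if sep and parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def ram_used_percent(meminfo: Dict[str, int]) -> float:
    # MemAvailable is the kernel's estimate of memory usable without swapping.
    total = meminfo["MemTotal"]
    available = meminfo["MemAvailable"]
    return 100.0 * (total - available) / total


def read_ram_used_percent(path: str = "/proc/meminfo") -> float:
    """Read the live value on a Linux host."""
    with open(path, encoding="ascii") as handle:
        return ram_used_percent(parse_meminfo(handle.read()))
```

Keeping the parser separate from the file read makes it possible to test against captured text without touching the host.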

#Notification Channels

Email

Free + Pro. Alerts delivered from [email protected].

Telegram

Free + Pro. Bot messages with alert details and direct links.

Slack

Pro only. Block Kit formatted messages with severity colors.

Webhooks

Pro only. POST JSON to any URL you configure.

All channels support per-priority filtering (P1 to P4). Agent update notifications send major version alerts to everyone; patch notifications are opt-in.
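
For the webhook channel, any HTTP endpoint that accepts a JSON POST will do. The following is a minimal receiver sketch using only the Python standard library. The payload fields are not documented here, so the code decodes the body without assuming any particular keys; the function names, the port, and the response codes are all assumptions.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Any, Callable


def decode_alert(body: bytes) -> Any:
    """Decode a webhook body as JSON; the field layout is not assumed here."""
    return json.loads(body.decode("utf-8"))


def make_handler(on_alert: Callable[[Any], None]) -> type:
    """Build a request handler class that passes decoded bodies to on_alert."""

    class WebhookHandler(BaseHTTPRequestHandler):
        def do_POST(self) -> None:
            length = int(self.headers.get("Content-Length", "0"))
            try:
                on_alert(decode_alert(self.rfile.read(length)))
            except (UnicodeDecodeError, json.JSONDecodeError):
                self.send_response(400)   # body was not valid JSON
            else:
                self.send_response(204)   # accepted, nothing to return
            self.end_headers()

    return WebhookHandler


def make_server(port: int, on_alert: Callable[[Any], None]) -> HTTPServer:
    # Hypothetical listener; call .serve_forever() on the result to run it.
    return HTTPServer(("0.0.0.0", port), make_handler(on_alert))
```

For example, `make_server(8080, print).serve_forever()` would print each decoded payload. A production endpoint should sit behind TLS and verify that requests come from Glassmkr.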

#Pricing

Free

Up to 3 servers
All 38 alert rules
Email + Telegram notifications
7 days data retention
No credit card required

Pro $3/node/month

$3/node/month for every server
90 days data retention
Slack + webhooks
AI health analysis
MCP API access
Email support

3 nodes: $9/mo
10 nodes: $30/mo
25 nodes: $75/mo
50 nodes: $150/mo
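
The Pro figures above follow directly from the flat per-node rate. A short sketch of the arithmetic, with hypothetical function names:

```python
PRO_PRICE_PER_NODE = 3   # USD per node per month, from the Pro plan
FREE_MAX_SERVERS = 3     # the Free plan covers up to 3 servers


def pro_monthly_cost(nodes: int) -> int:
    """Monthly Pro cost in USD: every node is billed at the flat rate."""
    if nodes < 0:
        raise ValueError("nodes must be non-negative")
    return nodes * PRO_PRICE_PER_NODE


def fits_free_plan(nodes: int) -> bool:
    """True when the server count is within the Free plan limit."""
    return 0 <= nodes <= FREE_MAX_SERVERS
```

There is no volume tier in the listed prices, so the cost scales linearly with node count.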

Enterprise

Custom pricing and configuration. Contact [email protected].

#FAQ

Do I need to open any inbound ports?

No. The agent initiates all connections outbound over HTTPS (port 443). Your firewall rules do not need to change.

Does the agent work without IPMI?

Yes. If ipmitool is not installed or the BMC is not reachable, the IPMI module is silently skipped. All other monitoring continues normally.

What happens if connectivity is lost?

The server_unreachable rule fires after the server misses 2 consecutive check-ins, roughly 10 minutes at the default interval. When connectivity resumes, the agent continues pushing snapshots.
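
The "roughly 10 minutes" figure is two missed check-ins at the collection interval. The sketch below makes that arithmetic explicit. It assumes a check-in counts as missed once a full interval passes without a snapshot; the server-side watchdog may apply a grace period, and the function names are hypothetical.

```python
DEFAULT_INTERVAL = 300   # seconds
MISSED_CHECKINS = 2      # documented threshold for server_unreachable


def unreachable_after(interval: int = DEFAULT_INTERVAL) -> int:
    """Seconds of silence that correspond to the missed check-in threshold."""
    return MISSED_CHECKINS * interval


def is_unreachable(last_seen: float, now: float, interval: int = DEFAULT_INTERVAL) -> bool:
    # Assumption: no grace period beyond the two missed intervals.
    return now - last_seen >= unreachable_after(interval)
```

Under the same assumption, lowering `interval` to the 60-second minimum shortens the detection window to about 2 minutes.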

Can I self-host the dashboard?

The agent is MIT-licensed and fully open source. The dashboard and alert evaluation engine are SaaS-only.

How does pricing work mid-month?

Charges are prorated. Add a server mid-month and you are charged proportionally for the remaining days. Remove a server and the next bill reflects the change.
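
As a worked example of proportional charging, the sketch below computes the first charge for a server added partway through a month. It rests on assumptions the answer above does not confirm: that the billing period is the calendar month, that the day the server is added counts as a billed day, and that amounts are rounded to cents. `prorated_charge` is a hypothetical name.

```python
import calendar
from datetime import date

PRO_PRICE_PER_NODE = 3.0  # USD per node per month


def prorated_charge(added_on: date, price: float = PRO_PRICE_PER_NODE) -> float:
    """Charge for the rest of the month, counting the day the server is added."""
    days_in_month = calendar.monthrange(added_on.year, added_on.month)[1]
    remaining_days = days_in_month - added_on.day + 1
    return round(price * remaining_days / days_in_month, 2)
```

Under these assumptions, a server added on 16 April is billed for 15 of 30 days, or $1.50.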

Is my data stored in the EU?

Yes. All infrastructure, including the database servers and AI GPU, runs on dedicated servers in EU data centers.
