Health endpoints

The node exposes CometBFT RPC on port 26657 (localhost only):
| Endpoint | Description |
| --- | --- |
| http://localhost:26657/health | Returns {} if running |
| http://localhost:26657/status | Sync status, validator info, latest block |
| http://localhost:26657/net_info | Peer count and connections |
Check sync state from the CLI (catching_up is false once the node is fully synced):
autheod status | jq '.SyncInfo.catching_up'
autheod status | jq '.SyncInfo.latest_block_height'
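
The same check can be scripted against the RPC endpoint instead of the CLI. The following is a minimal sketch, assuming the /status response follows the usual CometBFT JSON-RPC shape (a `result.sync_info` object with `catching_up` and a string-encoded `latest_block_height`). The function names `parse_status` and `fetch_status` are illustrative and not part of autheod.

```python
import json
from urllib.request import urlopen

STATUS_URL = "http://localhost:26657/status"  # endpoint listed in the table above


def parse_status(payload: dict) -> tuple[bool, int]:
    """Extract (catching_up, latest_block_height) from a /status response.

    Assumes the CometBFT JSON-RPC layout: payload["result"]["sync_info"].
    """
    sync = payload["result"]["sync_info"]
    # The RPC encodes heights as decimal strings, so convert explicitly.
    return bool(sync["catching_up"]), int(sync["latest_block_height"])


def fetch_status(url: str = STATUS_URL, timeout: float = 5.0) -> tuple[bool, int]:
    """Query the node's RPC and return its sync state."""
    with urlopen(url, timeout=timeout) as resp:
        return parse_status(json.load(resp))
```

Calling `fetch_status()` on the node host returns a pair such as `(False, 1234)` once the node has caught up; the height value here is only an example.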

Enable Prometheus metrics

Edit config/config.toml:
[instrumentation]
prometheus = true
prometheus_listen_addr = "127.0.0.1:26660"
Restart the node, then confirm the endpoint is serving metrics:
sudo systemctl restart autheod
curl -s http://localhost:26660/metrics | head -20
Port 26660 must never be exposed publicly. Proxy it through your monitoring stack over a private network only.
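
For ad hoc checks without a full Prometheus install, the metrics endpoint can be read directly. This is a rough sketch of a parser for the Prometheus text format, assuming simple gauge and counter lines; it drops labels and sums values that share a metric name, so it is not suitable for histograms. `parse_metrics` is a hypothetical helper, not something shipped with the node.

```python
def parse_metrics(text: str) -> dict[str, float]:
    """Parse Prometheus text exposition into {metric_name: value}.

    Labels are discarded; samples sharing a metric name are summed.
    """
    values: dict[str, float] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):  # skip blanks, HELP and TYPE lines
            continue
        if "{" in line:
            # name{label="..."} value [timestamp]
            name = line[: line.index("{")]
            rest = line[line.rindex("}") + 1 :]
        else:
            # name value [timestamp]
            name, _, rest = line.partition(" ")
        fields = rest.split()
        if not fields:
            continue
        try:
            value = float(fields[0])  # an optional timestamp may follow the value
        except ValueError:
            continue  # ignore lines this simple parser cannot read
        values[name] = values.get(name, 0.0) + value
    return values
```

Feeding it the body returned by `curl -s http://localhost:26660/metrics` yields a dictionary that can be inspected by metric name.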

Critical metrics

| Metric | Alert threshold | Meaning |
| --- | --- | --- |
| tendermint_consensus_height | No increase for >60s | Block production stopped |
| tendermint_consensus_validator_power | = 0 | Validator jailed or tombstoned |
| tendermint_p2p_peers | < 3 | Too few peers; sync at risk |
| tendermint_consensus_rounds | Consistently > 1 | Network consensus trouble |
| tendermint_mempool_size | > 4,500 | Mempool near capacity |
| process_resident_memory_bytes | > 85% of total RAM | Memory pressure; OOM risk |
| go_goroutines | > 5,000 | Possible goroutine leak |
| process_open_fds | > 90% of limit | Increase LimitNOFILE |
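
To make the thresholds concrete, here is a small sketch that applies the point-in-time rules from the table to a dictionary of metric values. It assumes the caller supplies the host's total RAM and the process file-descriptor limit (`total_ram_bytes` and `fd_limit` are hypothetical parameters). The height and rounds rules are left out because they depend on history rather than a single sample.

```python
from typing import Callable


def check_thresholds(
    metrics: dict[str, float], total_ram_bytes: int, fd_limit: int
) -> list[str]:
    """Return one message per breached threshold; missing metrics are skipped."""
    rules: list[tuple[str, Callable[[float], bool], str]] = [
        ("tendermint_consensus_validator_power", lambda v: v == 0,
         "Validator jailed or tombstoned"),
        ("tendermint_p2p_peers", lambda v: v < 3,
         "Too few peers; sync at risk"),
        ("tendermint_mempool_size", lambda v: v > 4500,
         "Mempool near capacity"),
        ("process_resident_memory_bytes", lambda v: v > 0.85 * total_ram_bytes,
         "Memory pressure; OOM risk"),
        ("go_goroutines", lambda v: v > 5000,
         "Possible goroutine leak"),
        ("process_open_fds", lambda v: v > 0.90 * fd_limit,
         "Increase LimitNOFILE"),
    ]
    alerts = []
    for name, breached, meaning in rules:
        value = metrics.get(name)
        if value is not None and breached(value):
            alerts.append(f"{name}: {meaning}")
    return alerts
```

In a production setup these conditions would normally live in the monitoring stack's alert rules; the function is only meant to show how each row maps to a comparison.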

Grafana setup

1. Install Prometheus and configure scraping
   Add http://localhost:26660/metrics as a scrape target.
2. Install Grafana and add data source
   Add Prometheus at http://localhost:9090.
3. Import community dashboard
   Dashboards → Import → Dashboard ID 11036 (Cosmos/CometBFT community dashboard).
4. Configure alerts
   Set up alerts (Prometheus alerting rules routed through Alertmanager) for validator_power = 0, peer count < 3, and block height stall.
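
Unlike the other two alerts, the block height stall depends on time rather than a single value. The sketch below illustrates that logic, assuming height samples arrive with a timestamp in seconds. `HeightStallDetector` is a hypothetical class written for illustration; an alerting rule in the monitoring stack would express the same condition declaratively.

```python
from typing import Optional


class HeightStallDetector:
    """Flags a stall when the height has not increased for more than max_age seconds."""

    def __init__(self, max_age: float = 60.0) -> None:
        self.max_age = max_age
        self._height: Optional[int] = None
        self._last_increase: Optional[float] = None

    def observe(self, height: int, now: float) -> bool:
        """Record a sample and return True if block production looks stalled."""
        if self._height is None or height > self._height:
            # First sample or a new block: reset the stall timer.
            self._height = height
            self._last_increase = now
            return False
        return now - self._last_increase > self.max_age
```

A polling loop would call `observe(height, time.time())` on each scrape and raise an alert the first time it returns True.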

Log monitoring

sudo journalctl -u autheod -f
sudo journalctl -u autheod -f | grep '"level":"error"'
Key log patterns:
| Pattern | Meaning |
| --- | --- |
| "level":"error" | Any error condition |
| "jailed" | Validator jailed |
| "tombstoned" | Permanent ban; act immediately |
| "license" with error | License state transition failure |

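The pattern table can also be applied programmatically, for example when forwarding logs to a notifier. This is a minimal sketch, assuming each log entry is a JSON object that may be preceded by a journald prefix. The label names it returns and the function `classify_log_line` are illustrative choices, not part of the node's tooling.

```python
import json


def classify_log_line(line: str) -> list[str]:
    """Return labels for the key log patterns that match one log line."""
    entry: dict = {}
    start = line.find("{")  # skip any journald prefix before the JSON body
    if start != -1:
        try:
            parsed = json.loads(line[start:])
            if isinstance(parsed, dict):
                entry = parsed
        except json.JSONDecodeError:
            pass  # fall back to plain substring matching below
    is_error = entry.get("level") == "error" or '"level":"error"' in line
    text = line.lower()

    labels = []
    if is_error:
        labels.append("error")
    if "jailed" in text:  # caveat: also matches words such as "unjailed"
        labels.append("jailed")
    if "tombstoned" in text:
        labels.append("tombstoned")
    if "license" in text and is_error:
        labels.append("license-error")
    return labels
```

Lines can be piped in from `sudo journalctl -u autheod -f` and any entry labelled "tombstoned" escalated immediately.
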
Monitoring checklist

  • Prometheus enabled and scraping successfully
  • Grafana dashboard imported (ID 11036)
  • Alert configured for validator power = 0
  • Alert configured for peer count < 3
  • Alert configured for block height stall (>60s)
  • Alert configured for RAM usage >85%
  • Log monitoring set up for error-level events