personal-site/src/articles/linux_tools.md
jgrogan 060c7de471
Add some self hosting notes
2025-02-09 15:34:48 +00:00

Logging and Monitoring

Local log files: /var/log/syslog and the rest of /var/log.

  • eBPF

  • Zabbix

  • InfluxDB

  • Grafana

  • Prometheus (+ Grafana + Loki as stack)

  • TimescaleDB

  • Alertmanager

  • Loki

  • Graphite

  • Spiceworks

  • CrowdSec

  • Netdata

  • Node Exporter (Prometheus node_exporter)

  • ELK: Elasticsearch, Logstash, Kibana

https://grafana.com/blog/2016/01/05/logs-and-metrics-and-graphs-oh-my/

Setting up Grafana: https://grafana.com/docs/grafana/latest/setup-grafana/installation/docker/

Setting up Prometheus: https://github.com/prometheus/prometheus
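
One way to get custom host checks into Prometheus is node_exporter's textfile collector, which reads `*.prom` files from the directory given by `--collector.textfile.directory`. The sketch below is a minimal, stdlib-only illustration of writing such a file. The metric name `host_reboot_required`, the file name `host_checks.prom`, and both helper functions are assumptions made for this example, not part of any existing setup.

```python
import os
import tempfile
from pathlib import Path


def render_gauges(gauges: dict) -> str:
    """Render {name: (help_text, value)} in the Prometheus text exposition format."""
    lines = []
    for name, (help_text, value) in sorted(gauges.items()):
        lines.append(f"# HELP {name} {help_text}")
        lines.append(f"# TYPE {name} gauge")
        lines.append(f"{name} {value}")
    return "\n".join(lines) + "\n"


def write_atomically(text: str, target: Path) -> None:
    """Write to a temp file in the same directory, then rename it into place,
    so a scrape never reads a half-written file."""
    fd, tmp_name = tempfile.mkstemp(dir=target.parent, suffix=".tmp")
    with os.fdopen(fd, "w") as handle:
        handle.write(text)
    # mkstemp creates the file as 0600; relax it so another user can read it.
    os.chmod(tmp_name, 0o644)
    os.replace(tmp_name, target)


if __name__ == "__main__":
    # Hypothetical metric and output directory, for illustration only.
    out_dir = Path(tempfile.mkdtemp())
    text = render_gauges({"host_reboot_required": ("1 if a reboot is pending", 0)})
    write_atomically(text, out_dir / "host_checks.prom")
    print(text, end="")
```

Run from cron or a systemd timer, a script like this would refresh the file on a schedule, and the values would then show up alongside the usual node_exporter metrics.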

Some things to measure:

  • apt status (for security/critical updates that haven't been run yet)
  • reboot needed (presence of /var/run/reboot-required)
  • fail2ban jail status (how many are in each of our defined jails)
  • CPU usage
  • MySQL: active and long-running processes, number of queries
  • iostat numbers
  • disk space
  • SSL cert expiration date
  • domain expiration date
  • reachability (ping, domain resolution, specific string in an HTTP response)
  • Application-specific checks (WordPress, Drupal, CRM, etc)
  • postfix queue size
  • apt/yum/fwupd/... pending updates
  • mail queue length and the size of root's mailbox: both are indicators of things going wrong silently
  • pending reboot after kernel update
  • certain kinds of log entries (block device read errors, OOM kills, core dumps)
  • network checksum errors, dropped packets, martians
  • presence or absence of USB devices: desktops should have a keyboard and mouse, servers usually shouldn't, and USB storage is sometimes forbidden
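
A few of the checks above are simple enough to sketch with the Python standard library alone. This is a rough illustration, not a finished monitoring script: the function names are made up for this example, the reboot flag path is the one from the list, and the certificate check assumes the host serves TLS on port 443 with a certificate that passes default verification.

```python
import shutil
import socket
import ssl
from datetime import datetime, timezone
from pathlib import Path
from typing import Optional

REBOOT_FLAG = Path("/var/run/reboot-required")  # flag file named in the list above


def reboot_required(flag: Path = REBOOT_FLAG) -> bool:
    """True when the reboot-required flag file exists."""
    return flag.exists()


def disk_used_percent(path: str = "/") -> float:
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def days_until(not_after: str, now: Optional[datetime] = None) -> float:
    """Days left before a certificate 'notAfter' string such as
    'Jan  5 09:34:43 2018 GMT' (negative if already expired)."""
    expires = datetime.fromtimestamp(ssl.cert_time_to_seconds(not_after), tz=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return (expires - now).total_seconds() / 86400


def cert_days_remaining(host: str, port: int = 443, timeout: float = 5.0) -> float:
    """Connect to host:port over TLS and return days until the certificate expires."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            cert = tls.getpeercert()
    return days_until(cert["notAfter"])


if __name__ == "__main__":
    # Local checks only; the certificate check needs network access.
    print("reboot required:", reboot_required())
    print(f"disk used on /: {disk_used_percent('/'):.1f}%")
```

Calling `cert_days_remaining("example.org")` would cover the SSL expiry item for a single host. An already expired certificate fails verification and raises an exception rather than returning a negative number, so a real check would need to handle that case as an alert in its own right.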

Further Reading

Auto updates