# Logging and Monitoring #

* rsyslog
* [syslog](https://en.wikipedia.org/wiki/Syslog): RFC 5424; logs in `/var/log/syslog` and `/var/log`
* eBPF
* Zabbix
* InfluxDB
* Grafana
* Prometheus (+ Grafana + Loki as a stack)
* TimescaleDB
* Alertmanager
* Loki
* Graphite
* Spiceworks
* CrowdSec
* Netdata
* Node Exporter (Prometheus `node_exporter`)
* ELK: Elasticsearch, Logstash, Kibana

Background on logs, metrics, and graphs: https://grafana.com/blog/2016/01/05/logs-and-metrics-and-graphs-oh-my/

Setting up Grafana: https://grafana.com/docs/grafana/latest/setup-grafana/installation/docker/

Setting up Prometheus: https://github.com/prometheus/prometheus

Some things to measure:

- apt status (security/critical updates that haven't been installed yet)
- reboot needed (presence of `/var/run/reboot-required`)
- fail2ban jail status (how many hosts are in each of our defined jails)
- CPU usage
- MySQL: active processes, long-running processes, number of queries
- iostat numbers
- disk space
- SSL cert expiration date
- domain expiration date
- reachability (ping, domain resolution, specific string in an HTTP response)
- application-specific checks (WordPress, Drupal, CRM, etc.)
- postfix queue size

* apt/yum/fwupd/... pending updates
* mail queue length and the size of root's mailbox: these indicate things going wrong silently
* pending reboot after a kernel update
* certain kinds of log entries (block device read errors, OOM kills, core dumps)
* network checksum errors, dropped packets, martians
* presence or absence of USB devices: desktops should have a keyboard and mouse, servers usually shouldn't, and USB storage is sometimes forbidden

## Further Reading ##

* https://www.redhat.com/en/blog/log-aggregation-rsyslog

# Auto updates
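
Two of the checks from the monitoring list also matter once updates run automatically: whether a reboot is pending and whether the disk is filling up. The following is a minimal sketch of both, using only the Python standard library. The function names, the 90% threshold, and the warning strings are illustrative assumptions, not part of any tool listed above; the marker path is the one named in the list and is used on Debian-family systems.

```python
import os
import shutil

# Marker file named in the checklist; present on Debian-family systems
# when an installed update needs a reboot to take effect.
REBOOT_MARKER = "/var/run/reboot-required"


def reboot_required(marker=REBOOT_MARKER):
    """Return True if the reboot marker file exists."""
    return os.path.exists(marker)


def disk_used_percent(path="/"):
    """Return the percentage of the filesystem holding `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def run_checks(marker=REBOOT_MARKER, path="/", disk_limit=90.0):
    """Return a list of warnings; an empty list means nothing to report.

    `disk_limit` is a hypothetical threshold in percent; pick one that
    suits the host.
    """
    warnings = []
    if reboot_required(marker):
        warnings.append("reboot required")
    used = disk_used_percent(path)
    if used >= disk_limit:
        warnings.append(f"disk {path} is {used:.0f}% full")
    return warnings


if __name__ == "__main__":
    for warning in run_checks():
        print(warning)
```

Run from cron or a systemd timer, a script like this prints nothing when all is well, so any output can be mailed or forwarded to whichever alerting tool is in use.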