# Logging and Monitoring #

* rsyslog
* [syslog](https://en.wikipedia.org/wiki/Syslog): RFC5424

* `/var/log/syslog`
* `/var/log`
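
RFC 5424 puts each message's facility and severity into a single PRI value at the start of the line, computed as `facility * 8 + severity`. A minimal Python sketch of decoding it follows. The short facility and severity keywords are chosen here for illustration (the RFC defines numeric codes with descriptions), and `decode_pri` and `pri_of_line` are hypothetical helper names, not part of any library.

```python
import re

# Severity keywords for numeric codes 0..7 (illustrative short names).
SEVERITIES = ["emerg", "alert", "crit", "err", "warning", "notice", "info", "debug"]

# Facility keywords for numeric codes 0..23 (illustrative short names).
FACILITIES = [
    "kern", "user", "mail", "daemon", "auth", "syslog", "lpr", "news",
    "uucp", "cron", "authpriv", "ftp", "ntp", "audit", "alert", "clock",
] + [f"local{i}" for i in range(8)]


def decode_pri(pri):
    """Split a PRI value into (facility, severity) keywords."""
    if not 0 <= pri <= 191:
        raise ValueError(f"PRI out of range: {pri}")
    facility, severity = divmod(pri, 8)
    return FACILITIES[facility], SEVERITIES[severity]


def pri_of_line(line):
    """Return the integer PRI from a line starting with '<N>', else None."""
    match = re.match(r"<(\d{1,3})>", line)
    return int(match.group(1)) if match else None


if __name__ == "__main__":
    # Sample line in RFC 5424 layout; the host and message are made up.
    sample = "<34>1 2003-10-11T22:14:15.003Z host.example.com su - ID47 - auth failure"
    print(decode_pri(pri_of_line(sample)))  # ('auth', 'crit')
```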

* eBPF

* Zabbix
* InfluxDB
* Grafana
* Prometheus (+ Grafana + Loki as stack)
* TimescaleDB
* Alertmanager
* Loki
* Graphite
* Spiceworks
* CrowdSec
* Netdata
* Node Exporter (Prometheus `node_exporter`)
* ELK: Elasticsearch, Logstash, Kibana


https://grafana.com/blog/2016/01/05/logs-and-metrics-and-graphs-oh-my/


Setting up Grafana: https://grafana.com/docs/grafana/latest/setup-grafana/installation/docker/

Setting up Prometheus: https://github.com/prometheus/prometheus

Some things to measure:

- apt status (security/critical updates that have not been installed yet)
- reboot needed (presence of `/var/run/reboot-required`)
- fail2ban jail status (how many hosts are banned in each of our defined jails)
- CPU usage
- MySQL: active and long-running processes, number of queries
- `iostat` numbers
- disk space
- SSL certificate expiration date
- domain expiration date
- reachability (ping, domain resolution, a specific string in an HTTP response)
- application-specific checks (WordPress, Drupal, CRM, etc.)
- postfix queue size
- apt/yum/fwupd/... pending updates
- mail queue length and the size of root's mailbox: both indicate things going wrong silently
- pending reboot after a kernel update
- certain kinds of log entries (block device read errors, OOM kills, core dumps)
- network checksum errors, dropped packets, martians
- presence or absence of USB devices: desktops should have a keyboard and mouse; servers usually should not. USB storage is sometimes forbidden.
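
A few of the checks above (reboot flag, disk space, certificate expiry) can be sketched in Python with the standard library only. This is a rough illustration, not a finished collector: the function names and the `example_*` metric names are hypothetical, and the output is plain `name value` lines in the Prometheus text format, which could for instance be written to a file for the Node Exporter textfile collector.

```python
import os
import shutil
import socket
import ssl
import time


def reboot_required(flag_path="/var/run/reboot-required"):
    """True if the reboot flag file exists (Debian/Ubuntu convention)."""
    return os.path.exists(flag_path)


def disk_used_percent(path="/"):
    """Percentage of the filesystem containing `path` that is in use."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.used / usage.total


def fetch_not_after(host, port=443, timeout=5.0):
    """Connect over TLS and return the certificate's notAfter string."""
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def cert_days_left(not_after, now=None):
    """Days until expiry, given a notAfter string such as 'Jan  5 09:34:43 2018 GMT'."""
    expires = ssl.cert_time_to_seconds(not_after)
    if now is None:
        now = time.time()
    return (expires - now) / 86400.0


def to_prometheus(metrics):
    """Render a {name: value} dict as Prometheus text-format lines."""
    return "".join(f"{name} {value}\n" for name, value in metrics.items())


if __name__ == "__main__":
    # Metric names below are placeholders chosen for this sketch.
    metrics = {
        "example_reboot_required": int(reboot_required()),
        "example_disk_used_percent": round(disk_used_percent("/"), 1),
    }
    print(to_prometheus(metrics), end="")
```

For the certificate check, `cert_days_left(fetch_not_after("example.org"))` would give the remaining days for a reachable host; it is left out of the main block so the sketch runs without network access.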

## Further Reading ##

* https://www.redhat.com/en/blog/log-aggregation-rsyslog

# Auto updates #