## What is logging?

Today I want to consider one important component of observability. While monitoring is a fairly well-understood topic, here I want to focus on logging: how to use the information in logs, and how to work with and aggregate events.

In a previous article, we discussed the difference between observability and monitoring. You can find it here: <https://hackernoon.com/observability-vs-monitoring-whats-the-difference>

Let's briefly review how Linux (and other Unix-like systems) writes messages to files.

Logs are text information generated by a running program. Imagine you run a program written in any language, and you want to see what it is doing at the moment. For this purpose, you can add lines like:

```clike
printf("Hello World\n");
```

in C, or:

```python
print("Hello world")
```

in Python, and so on.

Real programs have hundreds of such 'print' statements and output lots of information.

That is fine for interactive programs: when you run one, you see whatever you pass to the `print()` function. But what about daemons? They don't have a usable `stdout` or `stderr`. All interesting information should be written to a file called a log file. Traditionally, Linux has a special system for this: `syslog`.

:::tip
To write to it, use the `syslog()` syscall in C, the `syslog` module in Python, or the `logger` command in bash.
:::

There are facilities, severity levels, and so on that are used to differentiate messages. It's a pretty powerful system. If necessary, you can find a detailed description in `man syslog`.

Most logs are stored under the `/var/log` directory. `/var/log` is not a requirement, though; logs may be written anywhere.
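As a quick illustration of the tip above, here is a minimal sketch of writing to syslog from Python. The ident string `myapp` and the choice of facility are arbitrary examples, not anything your system requires:

```python
import syslog

# Open a connection to the system logger: "myapp" becomes the
# application-name prefix, LOG_PID appends the process id, and
# LOG_LOCAL0 is one of the facilities reserved for local use.
syslog.openlog("myapp", logoption=syslog.LOG_PID, facility=syslog.LOG_LOCAL0)

# Each message is tagged with a severity level.
syslog.syslog(syslog.LOG_INFO, "Service started")
syslog.syslog(syslog.LOG_ERR, "Something went wrong")

syslog.closelog()
```

With default configuration, these lines end up in the system journal (e.g. `/var/log/syslog` on Debian-based distros) prefixed with the timestamp, hostname, and `myapp[pid]`.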
Many applications write logs to their own locations, but syslog is a convenient approach: it allows separating files and locations, and even shipping logs to remote storage over the network (honestly, this depends on the implementation; most modern systems use rsyslog, which has such support).

Because logs are text, the many Linux utilities for working with text apply: `grep`, `uniq`, `sed`, `awk`, `tail`, `head`, etc. You should be familiar with them.

This is very nice: with this set of utilities we can analyze logs, search for the necessary info, and build various "top N" reports. But understand that this is a one-off effort; the next time you need the report, you have to build it all over again. It is annoying.

## Syslog

As was said above, Linux traditionally has a logging system called syslog. Syslog is a Unix subsystem for delivering messages to files. Syslog is also the main system journal. Depending on the Linux flavor, it is located at:

* `/var/log/syslog` (for Debian-based distros), or
* `/var/log/messages` (for Red Hat-like distros)

For a fuller picture, there are many other predefined destinations:

* `/var/log/auth.log` (Debian) or `/var/log/secure` (Red Hat) — authorization-related messages
* `/var/log/dmesg` — kernel messages
* `/var/log/cron` — cron jobs

and others.

Let's take a closer look at syslog, because it is the most well-known place for logging.
The `syslog()` system call lets developers not think about timestamps or which file the logs are written to:

```clike
syslog(LOG_LOCAL0 | LOG_ERR, "%s%s%s\n", strerr, ": ", strerror(err));
```

By default, messages are written to syslog prefixed with a timestamp, hostname, and application name:

```bash
Aug 23 13:28:17 vds swd: Parsing config file /etc/swd/swd.cfg
Aug 23 13:28:17 vds swd: Port number = 80
Aug 23 13:28:17 vds swd: Setting rootdir = /var/www
Aug 23 13:28:17 vds swd: Listen to 0.0.0.0
Aug 23 13:28:17 vds swd: Number of workers = 2
Aug 23 13:28:17 vds swd: Started OK, My PID = 26385
```

Of course, even though syslog records the application name, the file can become hard to read and can grow too fast. To mitigate this, there are at least two options:

* Redirect a specific application's log into its own file
* Use logrotate to rotate and compress logs

The best practice is to combine them: for every application, redirect its messages to a separate file, then rotate that file.

You can find good examples in `/etc/rsyslog.d/50-default.conf`, like this:

```
kern.* -/var/log/kern.log
```

which means: write all messages with facility `kern`, at all levels, into `/var/log/kern.log` (the leading `-` tells syslog not to sync the file after every write).

Severity levels, from most to least critical:

* emerg
* alert
* crit
* err
* warning
* notice
* info
* debug

For your own application, the best-fitting facilities are:

* local0 – local7
* user

:::warning
Messages may arrive at syslog asynchronously.
:::

Anyway, there is still the option to write custom logs wherever you prefer (even in a home directory). That said, the recommended place for custom logs is `/var/log/`.

`logrotate` is useful to keep logs from eating all your disk space if you store files locally. Nowadays, most systems transfer their logs to remote storage for many reasons. For the moment, just note that rsyslog (a modern implementation of syslog) can also send logs over the network to remote storage.
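To illustrate the rotate-and-compress practice above, here is a minimal logrotate sketch. The path `/var/log/myapp.log`, the retention settings, and the `postrotate` command are hypothetical examples, not defaults:

```
# Hypothetical example: rotate /var/log/myapp.log weekly, keep 8 archives
/var/log/myapp.log {
    weekly
    rotate 8
    compress
    delaycompress
    missingok
    notifempty
    postrotate
        systemctl kill -s HUP rsyslog.service
    endscript
}
```

The `postrotate` block asks rsyslog to reopen its file handles after rotation; `delaycompress` keeps the newest rotated file uncompressed so a writer that hasn't reopened the file yet doesn't corrupt the archive.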
## RSyslog

RSyslog is an abbreviation of "Rocket-fast system for log processing." It is very advanced:

* Multithreaded
* Supports TCP, UDP, TLS
* Can store logs in databases such as MySQL, PostgreSQL, Oracle, or Elasticsearch
* Can filter on any part of a log message
* Customizable output format

To extend its functionality, rsyslog has modules:

* input — collect messages from different sources
* output — deliver messages; the destination may be a local file or remote storage
* parser — parse messages
* modification — modify messages
* string generator — generate strings based on a message

Moreover, rsyslog allows creating rules based on filters and actions:

```
:msg,contains,"[UFW " /var/log/ufw.log
```

which filters messages (the `msg` property) containing `[UFW ` and writes them to the dedicated file `/var/log/ufw.log`.

There are a huge number of ways to build rules (and rulesets). You can reshape the logs of any application that writes to syslog (or just to a file) however you need. For full flexibility, rsyslog has a scripting language that allows creating complex rules for processing messages. Internally, rsyslog uses queues to improve performance in multithreaded mode, and it lets you define queues for actions in config files. On the one hand, such an approach can increase performance significantly; on the other hand, misconfigured queues can also cause performance degradation.

### Conclusion on rsyslog

As shown above, rsyslog is a high-performance, advanced logging system. It lets developers focus on writing their programs while relying on a solid logging subsystem. For administrators, a modern syslog is a useful tool for routing log flows, which is especially valuable in distributed systems, including shipping to popular storage like Elasticsearch for further analysis.

## Journald

Most people already use journald without even suspecting it.
Look at this command:

```bash
systemctl status nginx
```

It shows the status of the `nginx` web server together with the tail of its log. This is an example of using the journal.

For many years now, almost all Linux distros have shipped `systemd` instead of SysV init, along with `journald` as the default tool for working with logs. `journald` is a part of `systemd`.

### Features

* Binary logs (forgery protection)
* Requires no special setup
* Supports multi-line, multi-field logs
* Indexed data
* Centralized storage
* Supports both local storage types: disk and memory
* Rich functionality for compressing logs, freeing space, and forwarding messages

### What types of logs does journald collect?

* syslog
* systemd unit logs
* auditd logs
* logs submitted via the Journal API
* kernel logs (kmsg)

## Auditd

There is another important log: `auditd`. This system registers kernel events (configured in special files) and writes them to a log. There are many use cases for `auditd`. For more details, I invite you to read my article at __<https://medium.com/p/dda085551798>__

## Log shippers

We've considered two modern Linux logging systems that can transfer data to centralized storage: rsyslog and journald, both present by default. Each has its pros and cons.

There are many resources with detailed comparisons of rsyslogd and journald in terms of remote data transfer, performance, and so on.

But I would like to focus on another approach to storing logs remotely for further analysis: log shippers. These are lightweight processes that take log files as input, process them if necessary, extract the required info and/or transform it to a specific format, and then ingest it into remote storage. A well-known example is `filebeat` by Elastic. Filebeat is not the only one; there are many implementations from different developers.
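To give a feel for how small a shipper's configuration can be, here is a minimal filebeat sketch; the log path and the Elasticsearch host are hypothetical examples:

```yaml
filebeat.inputs:
  - type: filestream          # tail lines from log files
    id: myapp-logs            # hypothetical input id
    paths:
      - /var/log/myapp/*.log

output.elasticsearch:         # ship directly to Elasticsearch
  hosts: ["http://elastic.example.internal:9200"]
```

That is the whole job: read files, ship them. Parsing and enrichment can be layered on, but none of it is required to get started.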
If you ask me to choose among rsyslog, journald, and filebeat for transferring the messages of a specific application, I'll reply: "My choice is filebeat."

In my opinion, it is easier to configure and does only one thing. Of course, it is not a universal solution; you should find the one that fits your tasks.

## Pros and cons of remote storage

### **Pros of storing logs remotely:**

* Systems don't spend local disk space on log files
* With logs in one place, we can conveniently analyze data from all servers and build dashboards
* We protect logs from being removed accidentally or tampered with

### **If logs are stored locally:**

* They are inconvenient to analyze, especially if the system has hundreds of distributed application instances
* There is a risk of logs being removed if a server is compromised
* Logs create an additional load on the system

On the other hand, remote storage has a downside too: if the remote server is inaccessible, there is a risk of losing messages (depending on the implementation).