I’m sure I logged the error somewhere
I usually use a scheduling framework to run scheduled (duh) tasks. But this really isn’t a heavyweight app, and I didn’t want to add extra tooling and configuration. Opting for using cron to run my app tasks from outside look.
Cron is the built in linux scheduler. And a user “MyAppUser” has an optional crontab (read “cron table”) the crond (the cron daemon) reads configuration from. Mission accomplished, the tasks are running and everything is ok. Open source rocks and a quick win for me.
And then IT HAPPENED. Something critical was broken, really broken. And I really needed the logs from the command I was running through cron.
But where are they?
Following the 12 factor app recommendations My app logs to stdout and stderr streams, letting the streams be taken care of from outside. For the regular app logs, it’s supervisor that is handling the stdout and stderr streams, writing to log files, rotating, etc..
But supervisord does not run the scheduled tasks, but crond. So who takes care of my logs? I remembered seeing cron entries in syslog so I took a look there and found entries like:
Sep 23 10:00:01 ip-172–30–0–129 CRON[2021]: (myuser) CMD (/bin/bash /home/myuser/run_task.sh)
Sep 23 10:00:05 ip-172–30–0–129 CRON[2018]: (CRON) info (No MTA installed, discarding output)
OH. so cron did try to run my_task.sh but “discarding output”. But why? And what is an MTA and what does it have to do with cron logs?
Googling supplied an explanation. MTA is an acronym for “Message Transfer Agent” which usually translates an email sending service (POSTFIX is quite common as one). So Cron jobs output is logged through email. And if you don’t have the correct configuration it goes into an internal /var/mail/myuser file (In email format).
While I can see why having cron job reporting by email might look like a good idea in certain use cases, I have never actually encountered such a use case :). Actually configuring the MTA to send the emails would flood me with endless reporting emails with no value (I look at the logs when I want to search for something and not all the time). Not configuring it would flood /var/mail/myuser with this kind of output. We’ll get the text but without line by line order and context, date stamp (unless I take care for that in my app logging logic). But a worse problem is that I DONT WANT TO INSTALL AN EMAIL SERVER in my app server. Opening another possible attack vector, another thing to worry about and configure, this doesn’t look like an attractive option.
After ruling out installing an email service what can we do otherwise? I found lots of problematic answers to this problem, as least in my view. Most of them either suggesting log file handling in the applicative code (Would force me to handle double logging logic (the regular stdout and file logging , log to somewhere my user has access too and handle log rotation etc). Another common solution is to redirect output from the scheduled task to a manual log file:
15 15 * * 0–5 /bin/bash /home/myuser/run_task.sh >> /var/log/mylog.log 2>&1
This is a better solution but might not be good enough, logs are without a date, unless I add in the app. And again, this is a file that I need to take care of adding a logrotate configuration (which doesn’t look like a fun thing to do and adds another thing to configure and worry about)
The solution I opted for (adapted from a stackoverflow question I can’t find right now) was using the following:
15 15 * * 0–5 /bin/bash /home/myuser/run_task.sh 2>&1 | /usr/bin/logger -t tag_name
Try:
echo “hey there” 2>&1 | logger -t my_tag
And check /var/log/syslog for the result
Sep 24 09:37:47 ip-172–30–0–129 my_tag: hey there
So the result would be logging the output from cron tasks into a reliable log file with all the needed context of other things happening on the server in the same time.
Mission accomplished. Changing this logging configuration uncovered the bug I was hunting for soon enough.