Monitor your web server performance and health, in real-time, with netdata. Check the live demo.

Web server log files have existed for more than 20 years. All web servers, of all kinds, from all vendors, since the time NCSA httpd was powering the web, produce log files, saving in real-time all accesses to web sites and APIs.

Yet, after the appearance of online web analytics (such as Google Analytics), all these web server log files are mostly just filling our disks, rotated every night without any use whatsoever.

This is about to change!

I will show you how you can easily turn this “useless” log file into a powerful performance and health monitoring tool, capable of detecting, in real-time, most common web server problems, including:

- **too many redirects** (i.e. oops! this should not redirect clients to itself!)
- **too many bad requests** (i.e. oops! a few files were not uploaded!)
- **too many internal server errors** (i.e. oops! this release crashes too much!)
- **unreasonably too many requests** (i.e. oops! we are under attack!)
- **unreasonably few requests** (i.e. oops! call the network guys!)
- **unreasonably slow responses** (i.e. oops! the database is slow again!)
- **too few successful responses** (i.e. oops! help us God!)

install netdata

If you haven’t already, it is probably now a good time to install netdata.

netdata is a performance and health monitoring system for Linux, FreeBSD and macOS. netdata is real-time, meaning that everything it does is per second, so all the information presented is interactive and just a second behind.

If you install it on a system running a web server, it will detect it and automatically present a series of charts, with information obtained from the web server API, like these (these do not come from the web server log file):

netdata charts based on metrics collected by querying the _nginx_ API (i.e. _/stub_status_).

netdata, at the time of this writing, supports _apache_, _nginx_, _lighttpd_ and _tomcat_.
To obtain real-time information from a web server API, the web server needs to expose it. For directions on configuring your web server, check [_/etc/netdata/python.d/_](https://github.com/firehol/netdata/tree/master/conf.d/python.d). There is a file there for each web server.

tail the log!

netdata has a powerful web_log plugin, capable of incrementally parsing any number of web server log files. This plugin is automatically started with netdata and comes pre-configured for finding web server log files on popular distributions.

Its configuration is at [/etc/netdata/python.d/web_log.conf](https://github.com/firehol/netdata/blob/master/conf.d/python.d/web_log.conf). You can add one section there for each of your web server log files.

_Important_: Keep in mind netdata runs as user _netdata_. So, make sure user _netdata_ has access to the web server logs directory and can read the log file.

chart the log!

Once you have all log files configured and netdata restarted, you will get a section at the netdata dashboard for each log file defined, with the following charts.

responses by status

In this chart we tried to provide a meaningful status for all responses. So:

- **success** counts all the valid responses (i.e. 1xx informational, 2xx successful and 304 not modified).
- **error** counts 5xx, the internal server errors. These are very bad, they mean your web site or API is facing difficulties.
- **redirect** counts 3xx responses, except 304. All 3xx are redirects, but 304 means "not modified": it tells the browsers the content they already have is still valid and can be used as-is. So, we decided to account it as a successful response.
- **bad** counts bad requests that cannot be served.
- **other** counts all the other, non-standard, types of responses.

netdata real-time chart of web server responses by status, obtained from its log file

responses by type

Then, we group all responses by code family, without interpreting their meaning.
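The two groupings just described (by status and by type) can be sketched in a few lines of Python. This is only an illustration of the rules stated in this post, not the code of the web_log plugin: the function names `status_group` and `type_group` are made up for this example, and treating every 4xx as "bad" is an assumption taken from the alarm descriptions in this post.

```python
from collections import Counter


def status_group(code: int) -> str:
    """Map an HTTP status code to a 'responses by status' dimension."""
    if 100 <= code < 300 or code == 304:
        return "success"   # informational, successful and "not modified"
    if 300 <= code < 400:
        return "redirect"  # every 3xx except 304 (handled above)
    if 400 <= code < 500:
        return "bad"       # assumption: 4xx are the bad requests
    if 500 <= code < 600:
        return "error"     # internal server errors
    return "other"         # non-standard codes


def type_group(code: int) -> str:
    """Group by code family only, without interpreting the meaning."""
    family = code // 100
    return f"{family}xx" if 1 <= family <= 5 else "other"


# Hypothetical status codes seen during one collection interval.
codes = [200, 200, 304, 301, 404, 500, 999]
by_status = Counter(status_group(c) for c in codes)
by_type = Counter(type_group(c) for c in codes)
print(dict(by_status))
print(dict(by_type))
```

Note how 304 ends up under _success_ in the first grouping but under _3xx_ in the second, which is exactly the difference between the two charts.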
netdata real-time chart of web server responses by type, obtained from its log file

responses by code

And here we show the count of requests for each response code.

netdata real-time chart of web server responses by code, obtained from its log file

_Important_: If your application is using hundreds of non-standard response codes, your browser may become slow while viewing this chart, so we have added a configuration option to disable this chart.

bandwidth

This is a nice view of the traffic the web server is receiving and sending.

What is important to know for this chart is that the bandwidth used for each request and response is accounted at the time the log is written. Since netdata refreshes this chart every single second, you may have unrealistic spikes if the size of the requests or responses is too big. The reason is simple: a response may have needed 1 minute to be completed, but all the bandwidth used during that minute for the specific response will be accounted at the second the log line is written.

As the legend on the chart suggests, you can use FireQoS to set up QoS on the web server ports and IPs to accurately measure the bandwidth the web server is using. Actually, there may be a few more reasons to install QoS on your servers…

netdata real-time chart of web server bandwidth, obtained from its log file

_Important_: Most web servers do not log the request size by default. So, unless you have configured your web server to log the size of requests, the _received_ dimension will always be zero.

timings

netdata will also render the minimum, average and maximum time the web server needed to respond to requests.

Keep in mind that most web servers measure timings from the reception of the full request to the dispatch of the last byte of the response. So, they include network latencies of responses, but they do not include network latencies of requests.
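To make the bandwidth and timings behaviour above concrete, here is a minimal Python sketch of what a per-second summary of log lines could look like. It is not the plugin's implementation: `LogEntry` and `aggregate_second` are hypothetical names, and the field layout is an assumption made for this example.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class LogEntry:
    """One parsed access-log line (hypothetical structure, for illustration)."""
    bytes_sent: int                       # response size in bytes
    bytes_received: int = 0               # request size; 0 when not logged
    resp_time_ms: Optional[float] = None  # None when timings are not logged


def aggregate_second(entries: Iterable[LogEntry]) -> dict:
    """Summarize the log lines written during one collection interval."""
    entries = list(entries)
    times = [e.resp_time_ms for e in entries if e.resp_time_ms is not None]
    summary = {
        "sent": sum(e.bytes_sent for e in entries),
        "received": sum(e.bytes_received for e in entries),
    }
    if times:  # without logged timings there is nothing to chart
        summary.update(min=min(times), avg=sum(times) / len(times), max=max(times))
    return summary


# A 60 MB download that took a full minute is logged once, when it completes,
# so all 60 MB are accounted in the single second its log line is written.
quiet = aggregate_second([LogEntry(bytes_sent=2_000, resp_time_ms=12.0)])
spike = aggregate_second([
    LogEntry(bytes_sent=60_000_000, resp_time_ms=60_000.0),
    LogEntry(bytes_sent=2_000, resp_time_ms=8.0),
])
print(quiet["sent"], spike["sent"], spike["max"])
```

The `received` value stays at zero and the timing keys are simply absent when the log does not carry those fields, which mirrors the two _Important_ notes in this section.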
netdata real-time chart of web server response timings, obtained from its log file

_Important_: Most web servers do not log timing information by default. So, unless you have configured your web server to also log timings, this chart will not exist.

URL patterns

This is a very interesting chart. It is configured entirely by you.

netdata can map the URLs found in the log file into categories. You can define these categories by providing names and regular expressions in web_log.conf.

The configuration for my API URLs produces the following chart. The categories section is matched in the order given. So, pay attention to the order you give your patterns.

netdata real-time chart of web server URL patterns, obtained from its log file

HTTP methods

This chart breaks down requests by HTTP method used.

netdata real-time chart of web server HTTP request methods, obtained from its log file

IP versions

This one provides requests per IP version used by the clients (IPv4, IPv6).

netdata real-time chart of web server clients IP version, obtained from its log file

Unique clients

The last charts are about the unique IPs accessing your web server.

This one counts the unique IPs for each data collection iteration (i.e. unique clients per second).

netdata real-time chart of web server unique clients, obtained from its log file

And this one counts the unique IPs since the last netdata restart.

netdata real-time chart of web server unique clients, obtained from its log file

_Important_: To provide this information, the _web_log_ plugin keeps in memory all the IPs seen by the web server. Although this does not require much memory, if you have a web server with several millions of unique client IPs, we suggest disabling this chart.

watch the log!

The magic of netdata is that all metrics are collected per second, and all metrics can be used or correlated to provide real-time alarms.

Out of the box, netdata automatically attaches the following alarms to all web_log charts (i.e. to each log file individually):

| alarm | description | minimum requests | warning | critical |
|---|---|---|---|---|
| 1m_redirects | The ratio of HTTP redirects (3xx except 304) over all the requests, during the last minute. Detects if the site or the web API is suffering from too many or circular redirects (i.e. oops! this should not redirect clients to itself). | 120/min | > 20% | > 30% |
| 1m_bad_requests | The ratio of HTTP bad requests (4xx) over all the requests, during the last minute. Detects if the site or the web API is receiving too many bad requests, including _404_, not found (i.e. oops! a few files were not uploaded). | 120/min | > 30% | > 50% |
| 1m_internal_errors | The ratio of HTTP internal server errors (5xx) over all the requests, during the last minute. Detects if the site is facing difficulties to serve requests (i.e. oops! this release crashes too much). | 120/min | > 2% | > 5% |
| 5m_requests_ratio | The percentage of successful web requests of the last 5 minutes, compared with the previous 5 minutes. Detects if the site or the web API is suddenly getting too many or too few requests (i.e. too many = oops! we are under attack, or too few = oops! call the network guys). | 120/5min | > double, or < half | > 4x, or < 1/4x |
| web_slow | The average time to respond to requests, over the last 1 minute, compared to the average of the last 10 minutes. Detects if the site or the web API is suddenly a lot slower (i.e. oops! the database is slow again). | 120/min | > 2x | > 4x |
| 1m_successful | The ratio of successful HTTP responses (1xx, 2xx, 304) over all the requests, during the last minute. Detects if the site or the web API is performing within limits (i.e. oops! help us God!). | 120/min | < 85% | < 75% |

The _minimum requests_ column states the minimum number of requests required for the alarm to be evaluated.
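The ratio alarms listed above share one shape: skip evaluation when traffic is too low, then compare a percentage against a warning and a critical threshold. A minimal Python sketch of that logic follows. It is an illustration only and not how netdata's health engine is implemented; `ratio_alarm` and its parameters are hypothetical names, and reading "minimum requests" as "at least that many requests in the window" is an assumption.

```python
from typing import Optional


def ratio_alarm(matching: int, total: int, warn_pct: float, crit_pct: float,
                min_requests: int = 120, below: bool = False) -> Optional[str]:
    """Evaluate a ratio alarm over one time window.

    Returns None when the window holds fewer than `min_requests` requests
    (the alarm is not evaluated), otherwise "clear", "warning" or "critical".
    With below=True the alarm fires when the ratio drops under the thresholds.
    """
    if total < min_requests:
        return None
    pct = 100.0 * matching / total
    if below:
        if pct < crit_pct:
            return "critical"
        if pct < warn_pct:
            return "warning"
    else:
        if pct > crit_pct:
            return "critical"
        if pct > warn_pct:
            return "warning"
    return "clear"


# 1m_redirects: 30 redirects out of 120 requests is 25%, above the 20% warning.
print(ratio_alarm(30, 120, warn_pct=20, crit_pct=30))
# 1m_internal_errors: only 100 requests in the minute, so it is not evaluated.
print(ratio_alarm(3, 100, warn_pct=2, crit_pct=5))
# 1m_successful: 120 of 150 is 80%, below the 85% warning threshold.
print(ratio_alarm(120, 150, warn_pct=85, crit_pct=75, below=True))
```

The `below` flag exists only because 1m_successful is the one ratio alarm in the list that triggers on low values rather than high ones.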
We found that when the site is receiving requests above the minimum requests rate, these alarms are pretty accurate (i.e. no false-positives).

netdata alarms are user configurable. So, even [web_log](https://github.com/firehol/netdata/blob/master/conf.d/health.d/web_log.conf) alarms can be adapted to your needs.

Enjoy real-time performance and health monitoring!