How to add useful logging to your Drupal 7 code

A lot of times my job involves debugging problems on Drupal 7 production websites and find out why things aren’t working, or even worse: why do they only work some of the time.

The first thing I then do is check the Drupal logs, but sadly developers have a habit of not adding proper logging. So let me give you some pointers on how to add actual useful logging to your code.

If you are reading this post for the sake of analysing Drupal logs, you might also want to read this blog item about debugging Drupal 7 development.

Do I need to log something or just silently fail?

Let me start with stating that not everything that doesn’t go as expected needs to get logged. We want logs yes, but only useful ones.

Translating that to Drupal means that every exception you catch probably doesn’t need to end up in the logs. Example:

function build_some_render_array($field) {
  $build = array(); // Drupal render array

  // more stuff happens here

  $entity = $field['entity'];
  $entity_wrapper = entity_metadata_wrapper('node', $entity);
  try {
    $value = $entity_wrapper->field_some_name->value();

    // more code

    return $result;
  }
  catch (Exception $e) {
    // do nothing
  }
  return drupal_render($build);
}

What this means is that there most likely is no field called “field_some_name” for this entity. If you are passing different types of entities to this function, this can happen. This doesn’t mean something is wrong with this entity, we simply don’t use this field for building up the $build array.

The problem is that if you don’t try/catch this block, you hit the Drupal default exception handler. That results in a white page of death. So we catch it here, but don’t log anything as it’s not something we need to know about in debugging problems.

Use the proper error levels

In includes/common.inc you have this function defined:

/**
 * Returns a list of severity levels, as defined in RFC 3164.
 *
 * @return
 * Array of the possible severity levels for log messages.
 *
 * @see http://www.ietf.org/rfc/rfc3164.txt
 * @see watchdog()
 * @ingroup logging_severity_levels
 */
function watchdog_severity_levels() {
  return array(
    WATCHDOG_EMERGENCY => t('emergency'),
    WATCHDOG_ALERT => t('alert'),
    WATCHDOG_CRITICAL => t('critical'),
    WATCHDOG_ERROR => t('error'),
    WATCHDOG_WARNING => t('warning'),
    WATCHDOG_NOTICE => t('notice'),
    WATCHDOG_INFO => t('info'),
    WATCHDOG_DEBUG => t('debug'),
  );
}

These log levels are not something Drupal came up with, but are actually defined in the syslog RFC 3164 and are considered to be the standard severity levels for UNIX services:

Numerical     Severity
 Code
 0            Emergency: system is unusable
 1            Alert: action must be taken immediately
 2            Critical: critical conditions
 3            Error: error conditions
 4            Warning: warning conditions
 5            Notice: normal but significant condition
 6            Informational: informational messages
 7            Debug: debug-level messages

Adding the severity level is pretty obvious using these functions:

// Normal watchdog() call
function watchdog(
  $type,
  $message,
  $variables = array(),
  $severity = WATCHDOG_NOTICE,
  $link = NULL) {}

// Just pass the exception on to watchdog:
function watchdog_exception(
  $type,
  Exception $exception,
  $message = NULL,
  $variables = array(),
  $severity = WATCHDOG_ERROR,
  $link = NULL) {}

(Notice how watchdog() uses WATCHDOG_NOTICE as default, so you probably want to add a better severity level for all the calls you make in your code that are actually errors)

One of the ideas behind syslog is that you can attach an alert trigger to a severity level, causing it to send out an email or text message. While your Drupal site doesn’t have to use syslog (although it’s awesome and you should really look into it), using the same severity levels is a good way to be able to filter important things in your logs.

If you prefer Drupal sending out alerts when a certain level of severity is found, the watchdog module has hooks that allow you to create triggers. There are also existing modules that do a similar thing.

Use the correct watchdog calls

You always need to add 4 arguments to your watchdog() calls:

  • $type: Most of the time the name of your module or the name of the API you are calling. You are going to use this to filter log messages later, so make sure that if you log 5 steps of your process, all 5 entries use the same $type.
  • $message: As short as can be. In English! And with variable placeholders (values go in $variables)
  • $variables: The variables needed for the $message.
  • $severity: Scroll back up to the previous section. It’s always a good idea to add this yourself and not rely on the watchdog function’s default.
  • $link. Optional, but can be handy: the link to the page that trigged this log entry. For the node module this links to the node page itself.

Tip: Are you logging an error that was caught from an exception? Just use watchdog_exception ($module, $exception).

Good stuff to log

This is a list of things you really want to find in your logs:

CRUD operations

Just about everything you can CRUD should be logged (nodes, users, custom entities, etc …). Add as much possible variables as needed in the log entry, but never log passwords or other secret things!

Example message from the Drupal node module that logs these items with type “content”:

Watchdog log entry of a node edit action

Notice how there is a link at the bottom under “Operations”. This is the $link argument the node module passed to the watchdog() call when making this log entry:

API calls

Just like the previous CRUD items you should also log any outgoing API call your system makes. Yes, for a site that is API-driven this will mean a lot of log records, but trust me, it’s worth it.

Include a FULL list of parameters and a response code, combined with using the proper watchdog severity level:

  • API call that went ok? Log them as notice
  • API call that gave a notice? Log them as warning and include notice message.
  • API call that gave an error? Log them as alert or critical and include the error message.

If an error occurs, it’s even a good idea to write more error information into a separate log table in Drupal as this information is vital in debugging problems. You want this information available for calls in the past, so you can easily see what happened back when your customer filed a bug report.

Cache invalidation messages

The modules we use for integrating Drupal with varnish cache invalidation are varnish and expire. The varnish module handles the backend communication module with the varnish server while the expire module determines what should be invalidated and when.

These 2 modules log their actions to watchdog so you can see what ban commands are set to varnish and which urls expire decided to clear. This information is vital when you are debugging the “why is this page not being refreshed?” problems. Spoiler: it’s most of the time the Drupal anonymous page cache that you are not clearing before the expire call is made.

Calls and results of drush commands

Most of our import and sync scripts are written as separate drush commands and not just run from hook_cron or form submissions on the website. Everytime such a script is called it logs a message, even if it doesn’t do anything. We just want to know that it ran.

If you configure your Drupal cron to run via the Linux crontab as “drush cron”, you can already see that log message in your logs.

Bad stuff appearing in logs

Now this is the real import part. Logs are only good if they only have useful things and are not filled with garbage. This doesn’t mean anything about the total amount of log entries generated. If it’s a lot of entries, but all useful ones, it’s still good logging.

These things you really do not want to have clogging up your logs:

403 and 404 errors

There are a few files browsers and crawlers will request without your site linking to them. The best way to prevent those giving a 404 is to actually make those files, even if they are just empty files:

/robots.txt
/favicon.ico
/apple-touch-icon.png
/apple-touch-icon-precomposed.png
...

There are also scripts that generate all the touch icons for you from a single source image.

Another feature your site’s webserver should have is to turn off access logging for those files. This is an example of an nginx vhost:

location = /favicon.ico { log_not_found off; access_log off; }
location = /robots.txt { allow all; log_not_found off; access_log off; }

Enable the Drupal fast_404

In the default.settings.php there is a clear explanation about fast 404’s:

[php]
/**
 * Fast 404 pages:
 *
 * Drupal can generate fully themed 404 pages. However, some of these responses
 * are for images or other resource files that are not displayed to the user.
 * This can waste bandwidth, and also generate server load.
 *
 * The options below return a simple, fast 404 page for URLs matching a
 * specific pattern:
 * - 404_fast_paths_exclude: A regular expression to match paths to exclude,
 * such as images generated by image styles, or dynamically-resized images.
 * If you need to add more paths, you can add '|path' to the expression.
 * - 404_fast_paths: A regular expression to match paths that should return a
 * simple 404 page, rather than the fully themed 404 page. If you don't have
 * any aliases ending in htm or html you can add '|s?html?' to the expression.
 * - 404_fast_html: The html to return for simple 404 pages.
 *
 * Add leading hash signs if you would like to disable this functionality.
 */
$conf['404_fast_paths_exclude'] = '/\/(?:styles)\//';
$conf['404_fast_paths'] = '/\.(?:txt|png|gif|jpe?g|css|js|ico|swf|flv|cgi|bat|pl|dll|exe|asp)$/i';
$conf['404_fast_html'] = '<html xmlns="http://www.w3.org/1999/xhtml"><head><title>404 Not Found</title></head><body>
<h1>Not Found</h1>
The requested URL "@path" was not found on this server.</body></html>';

/**
 * By default, fast 404s are returned as part of the normal page request
 * process, which will properly serve valid pages that happen to match and will
 * also log actual 404s to the Drupal log. Alternatively you can choose to
 * return a 404 now by uncommenting the following line. This will reduce server
 * load, but will cause even valid pages that happen to match the pattern to
 * return 404s, rather than the actual page. It will also prevent the Drupal
 * system log entry. Ensure you understand the effects of this before enabling.
 *
 * To enable this functionality, remove the leading hash sign below.
 */
# drupal_fast_404();

I can’t explain it better than the function description already says. On very busy sites this can be a big performance increase and save your website from a lot of 404 notices, but it has side-effects you should be aware of.

PHP warnings about variables not defined

These truly are the worst kind. It’s a result of developers not initializing their variables correctly, not adding enough “what if stuff is empty?” checks or simple passing wrong variables to a function.

Examples:

  • Something going wrong in your own module or feature module. You NEED to fix this. In this case it’s a foreach() running on an empty array. This simply needs an if (!empty($array)) added before the foreach():
Warning: Invalid argument supplied for foreach() in feature_site_cron() (rule 1410 of /var/www/sites/all/modules/features/feature_site/feature_site.module)
  • Something going wrong in a contrib module. Find out why is’t going wrong, and if it’s your code or the contrib module’s code. If it’s in the contrib module you should probably fix this and submit a patch to the module on drupal.org:
Notice: Undefined index: google_analytics_reports_cache_length in google_analytics_reports_api_admin_submit() (rule 184 van /var/www/sites/all/modules/contrib/google_analytics_reports/google_analytics_reports_api/google_analytics_reports_api.pages.inc)
  • Something going wrong in core. You need to debug this back to your own code, as it’s probably not a problem in core but your (bad) code triggering it. Is it really a bug in core? Nice, you are now a real Drupal developer, submit a patch on Drupal core and have yourself a cold beer ;)
Notice: Undefined offset: 24 in timer_read() (rule 488 van /var/www/includes/bootstrap.inc)

These are just 3 examples, but you probably get the idea now. This thing should not be appearing in production logs, so check your logs often during development and fix these things!

Yes, it will cost you some time, but you will get that time back in multitudes when you need to debug your production site later on.

Things that cannot be put back in the right context, or are too generic in nature.

I hope it’s pretty clear you should not log things like “Something went wrong” without extra information. But I still see things like that appear on sites, so I’m adding it here for the sake of completeness.

That’s all, I hope you learned something!

Netiquette: email signature separator

It’s time for another netiquette rant: There are way too many people not using the double dash marker as the first line of their email signature.

The Usenet standards specify that a signature block should be delimited from the body of the message by a single line consisting of exactly two hyphens, followed by a space, followed by the end of line (i.e., “– \n”). This latter prescription, which goes by many names, including “sig dashes”, “signature cut line”, “sig-marker”, “sig separator” and “signature delimiter”, allows software to automatically mark or remove the sig block as the receiver desires.

Source: http://en.wikipedia.org/wiki/Signature_block

If you use this as the first line of you email signature, almost all mail software and webmail clients know not to display your signature again in replies and long mail threads. Gmail even displays a small triple-dots icon on the first email instead of showing the signature by default, so you don’t waste screen space on it.

These 3 characters (actually 4 if you include the newline) can make emailing so much more enjoyable, especially as a lot of people feel the need to include 4 different company contacts, 6 social media profiles, 3 inline logos and a 5 line disclaimer text in their emails. So please, add it to your signature. The Internet thanks you.

P.S. The next rant will be about “do not delete older conversations from an email thread”. It’s not 2000 anymore. Bandwidth and disk space cost close to nothing nowadays.

Debugging Drupal 7 development

There are a lot of ways to debug your Drupal code while developing or when it’s running on production. In this blog post I’ll give an overview of all the tools I’ve been using the past years, going from some simple Drupal or PHP commands to external web services.

1. Drupal and PHP commands

The Drupal devel module

The Drupal devel module is basically the first Drupal module you install when you are developing a new site. You generally do NOT wish to enable this on a staging or production site as it’s quite some overhead and might expose backend information to your visitors.

The devel module has all kind of logging and extra debug information:

  • A query log at the bottom of each page
  • Xhprof integration (see later in this blog post)
  • Execution time and memory usage stats per page
  • An extra tab on every node and term page with debug output of the current object (which is an dpm($entity))

It has a big set of PHP functions you can use in your own code to debug things. Check this demo page for an overview and examples.

It also has support for using FireBug console logging if you prefer to have your logs completely outside your document’s html code.

Besides showing debug information, there is also a sub-module to generate content: devel generate. This is something people always forget during development: make sure the amount of content in your development environment reflects the later production amount! This module even generates dummy images, so your content can look pretty live-like.

dpm (debug_backtrace())

This is also available in the devel module as the function ddebug_backtrace(), but I’m highlighting it here as it can be used in other non-Drupal PHP code too.

If you combine the devel module with the powerfull debug_backtrace() function from PHP, you have a neat way for a stack trace in your code. The stack trace is the complete path of functions there were called to finally end up at the point you enter the debug_backtrace in your code:

function foo($bar) {
  dpm (debug_backtrace()); // Show the backtrace up to here
  print_r (debug_backtrace()); // For non-Drupal code
  var_dump (debug_backtrace()); // Probably too much info
  ...
}

2. PHP extensions

Next to Drupal modules and PHP functions, there are also some extra PHP extensions you can install to make developing and debugging a whole lot easier.

xdebug

Xdebug is a PHP extension that allows you to debug and profile PHP scripts with an external program. Profiling means getting a clear overview of what functions were called in your script, how many times, how much memory they used and how much cpu time.

Xdebug also makes your PHP error messages a bit nicer with adding a table layout and a call stack:

A normal PHP fatal error:

A normal PHP fatal error

A PHP fatal error with the xdebug extension installed:

A PHP fatal error with the xdebug extension installed

To enable this you need to enable the xdebug (duh) module and set these options in your php.ini:

display_errors = on
html_errors = on

For an overview of xdebug debugging and profiling, check this blog post again, it has instructions and screenshots how to set it up.

Xdebug is quite a must-have on every PHP development server or environment. Do not install xdebug on your production server! You will never display errors on a production server, nor debug code with an external editor, so don’t waste resources on loading this on a production site.

hprof

Xhprof is a profiling extension for PHP, made by Facebook.

The output is quite raw and it’s not easy to install, but it can provide you with some valuable information. It used to be one of the first PHP profiling tools, and worked quite well, but nowadays there are a lot better tools available (see the next section of this blog item).

I consider this extension to be ‘past its prime’ and it’s only listed here because it’s a popular tool and I wanted to cover it in my list of available tools. There is also Drupal module for it, but I found the xhprof support in the devel module to be better.

The same remark as xdebug goes for xhprof as well: do not install it on a production server!

3. External webservices

New Relic

New Relic is an online tool that you can use to monitor and profile applications and servers. It supports PHP and Linux servers and I’ve gotten quite fond of it lately.

Because it’s a daemonized process, that doesn’t really impact the performance of your server, you can install it on both development and production servers. The main things I nowadays use this tool for are:

  • Profiling code on production servers: why is a page slow?
  • Monitoring performance of applications and servers, combined with alerting: if a site suddenly starts throwing PHP errors, or page loads are below a predefined threshold, we get an email.
  • Continous API and website testing: test scripted scenario’s every x minutes to see if applications are still working like they should.

I will write a separate blog post about New Relic soon, as there is just way too much to cover in this blog post. I do encourage you to make a free account and just started testing on your own. You can install it on your production server or on your local PHP vagrant box.

Blackfire.io

Blackfiro.io is the new kid on the block. It’s made by the creators of the Symfony framework and is mainly targetted at Symfony development, but you can use it for generic PHP development too.

I haven’t tested this for real projects yet, but it looks very promising so I’m mentioning it here. As Drupal 8 is built upon the Symfony kernel I expect this to become a popular debugging tool for Drupal. But we’ll have to see about that when Drupal 8 is actually released.

That’s all, I hope you learned something for this post.

Enforce SSL for an Apache vhost

Here’s a proper way to enforce SSL with the apache2 web server, without hardcoding the ServerName in the config. You can also use this in a .htaccess file:

<IfModule mod_rewrite.c>
   <IfModule mod_ssl.c>
     RewriteEngine on
     RewriteCond %{HTTPS} !^on$ [NC]
     RewriteRule . https://%{HTTP_HOST}%{REQUEST_URI} [L]
    </IfModule>
 </IfModule>

You need the rewrite module and ssl module enabled (obviously), which can be easily enabled with the a2enmod commands:

a2enmod ssl
a2enmod rewrite

Drupal watchdog logging: dblog vs syslog

Sometimes logging events to your Drupal database is not the best way to go and can be quite a performance killer. This is when you want to start using a syslog server.

Database logging (dblog)

A standard Drupal 7 installation comes with database logging enabled, so your watchdog() calls get logged in a permanent storage system. You can then filter these logs and use “drush ws” to view them from a console, and get valuable feedback from it. So it’s a pretty good thing to have.

You can fine-tune this logging on the Drupal log settings page (admin/config/development/logging) as shown below:

Drupal logging settings page

For a development site you mostly turn on the display of warnings and errors, while for a production site it should be turned off. You can also configure this via your settings.php file:

// Disable logging
$conf['error_level'] = 0;

// Set nr. of log items
$conf['dblog_row_limit'] = 10000;

If you have a big site, where a lot of things are logged, like node edit messages, user login messages, API calls, order checkouts etc… you want the setting for “Database log messages to keep” be as big as possible. Only the last log items are kept (cron clears older records for you), but if this number is too low, you might end up losing important data as it’s being pushed out by newer log entries.

For one of our sites this settings somehow got raised to 1000000 (one million), but that had quite a dramatic impact on our server performance. Below is a New Relic performance graph that shows the watchdog table is a huge bottleneck for database queries:

New Relic MySQL queries performance report of a Drupal site that has watchdlog log entries set to 1 million.

As a lot pages on this site add some kind of log action when using them, this was having quite a negative impact on the overall performance of the site.

Syslog

A better way to handle logs for such a big site, is using an external tool like syslog.

Syslog is a UNIX service, installed on every Linux server, that basically listens on a TCP or UDP port for log entries and then does something with it. Most of the time that’s writing the logs to a file (that then gets rotated on a daily base by logrotate), but you can also send them to a remote server, or do both.

If you have a lot of servers you want to monitor from one central place, a remote syslog server that collects all the logs is a good choice. You can set it up yourself using a Linux server, or you can use an external service like Papertrail.

Most consumer-based NAS solutions also have a syslog server installed, like my Synology that I’m using for my home network.

What we did for the website I mentioned before was now:

  • enable the Drupal syslog module
  • disable the Drupal dblog module
  • configure the Linux server to use a remote Papertrail syslog server

The basic configuration of the Drupal syslog module is quite ok, we’re using LOCAL0 to log here and that’s also what we are going to send to Papertrail:

Drupal syslog configuration

The server’s syslog entry looks like this (Papertrail will guide you through this setup process):

# Drupal sites log to LOCAL0 and get their logs sent to Papertrail
local0.* @XXXX.papertrailapp.com:XXXXX

We actually did this for 5 servers for this client, so we can now monitor all their logs via one central website on Papertrail.