Tiger Heron
Tiger Heron
Tiger Heron

High-performance debugging

Average: 5 (1 vote)

In this article, I will show you how to use PHP 5 input filters to support debug, trace and assert statements so that a one-line change disables these statements and restores your script to full production performance.

Background

The typical debug statement looks like this:

echo "<p>Hello, I am a debug statement</p>";

This is the workhorse of PHP debugging—at least, when we aren't using a debugger. We insert it as needed and remove it when done.

Sometimes the debug code takes some work to produce or has some permanent benefits. The easiest way to preserve it is by placing it in a comment:

// echo "<p>Hello, I am a complicated debug statement</p>";

This can be inconvenient if we frequently need to enable debugging or if we need to enable multiple debug statements. So the next variation is to enable debug statements based on a flag:

if ($debugEnabled) echo "<p>Hello, I am a debug statement</p>";

Now, we can turn a number of debug statements on or off with a single setting. But we've also added some overhead that occurs even when the debug statements are disabled. We have also added some extra typing that will make us less inclined to use this tool.

You can gain some flexibility by using:

debug("Hello, I am a debug statement");

This opens the door to creating an elaborate debug system, but adds a performance penalty. Not only do we have the overhead of the function call, but the function argument is evaluated each time the statement is executed, whether debugging is enabled or not. We're back to:

if ($debugEnabled) debug("Hello, I am a debug statement");

PHP provides an assert() function which helps check program logic and can be disabled in php.ini. This function could be subverted to serve as the base for a debug system, but it appears to function much like the debug function above, at least in PHP 5.2.4.

For code that services thousands of requests per hour, the small overhead of these debug statements is significant and they are rarely left in place.

Why use debug statements?

Before going further, I need to answer the obvious question: why not use debuggers, profilers or other tools instead of debug statements?

The answer is that it's not an either-or; each tool has its place. Debug statements are sometimes faster to use or provide information that is easier to understand. For example, the ability to easily trace all database queries and send the output to a log file might be more effective in resolving some problems than stepping through the code with a debugger.

The debug system I will present to you has these goals:

  • It should support debug, trace and assert statements.
  • It should be easily and completely disabled for production code.
  • It should be easy to enter debug statements.

Setting the stage

Let me start with the three simple debugging classes I will use in this article. Each class name is prefixed with "th" for "Tiger Heron".

The first is the thDebug class, which allows us to format the output appropriately for Web or CLI use:

class thDebug
{
    static public function debug($msg)
    {
        if (php_sapi_name() == 'cli') echo $msg . "n";
        else echo '<p>' . nl2br($msg) . '</p>';
    }
}

The second class is thTrace. Tracing can be enabled or disabled, and you can set a threshold for the level of messages that are displayed.

class thTrace
{
    static private $isEnabled = false;
    static private $level = 0;

    static public function enable($isEnabled) { self::$isEnabled = $isEnabled; }
    static public function isEnabled() { return self::$isEnabled; }
    static public function setLevel($level) { self::$level = $level; }

    static public function trace($msg, $level = 0)
    {
        if (self::$isEnabled && $level <= self::$level) {
            thDebug::debug($msg);
        }
    }
}

The final debug class is thAssert, which generates an message only if an assertion fails. The "precondition" and "postcondition" methods are aliases for assert. Assertions, like tracing, can be enabled or disabled at run-time.

class thAssert
{
    static private $isEnabled = false;

    static public function enable($isEnabled) { self::$isEnabled = $isEnabled; }
    static public function isEnabled() { return self::$isEnabled; }
    static public function assert($cond, $file, $line, $title='Assertion')
        { if (!$cond) thDebug::debug("$title failed! $file:$line");  }
    static public function precondition($cond, $file, $line)
       { self::assert($cond, $file, $line, 'Precondition'); }
    static public function postcondition($cond, $file, $line)
      { self::assert($cond, $file, $line, 'Postcondition'); }
}

To enable the entire debug system, we use:

class thDebugManager
{
    static private $isEnabled = false;
    static private $isFilterRegistered = false;

    static public function enable($isEnabled)
    {
        if (!$isFilterRegistered && $isEnabled) {
            require 'thDebugFilter.inc';
            $isFilterRegistered = true;
        }
        self::$isEnabled = $isEnabled;
    }

    static public function isEnabled() { return self::$isEnabled; }
}

The references to the filter are explained in the next section. It might seem to make more sense to place this code in the thDebug class, but we want to avoid loading any debug-related classes when they are not used. The thDebugManager contains the minimum amount of code that we must load all the time.

PHP input filters

We'll now introduce the key PHP 5 feature we'll use for our solution: the input filter. We will create a filter that will transform PHP code as it is loaded, essentially acting as a dynamic pre-processor.

I started with some code from Adam Trachtenberg's book, Upgrading to PHP 5, and tweaked it a bit to create a reusable abstract class. To create a new filter, simply subclass this class and implement the missing function.

abstract class thAbstractStreamFilter
    extends php_user_filter
{
    private $data;
    private $bucket;

    public function onCreate()
    {
        $this->data = '';
        return true;
    }

    public function filter($in, $out, &$consumed, $closing)
    {
        while ($bucket = stream_bucket_make_writeable($in)) {
            $this->data .= $bucket->data;
            $this->bucket = $bucket;
            $consumed = 0;
        }

        $consumed += strlen($this->data);
        $this->data = $this->filterData($this->data);
        $this->bucket->data = $this->data;
        $this->bucket->datalen = strlen($this->data);
        stream_bucket_append($out, $this->bucket);
        $this->data = '';
        return PSFS_PASS_ON;
    }

    abstract public function filterData($data);
}

I'll refer you to Adam's book for the details on input filters.

The Debug filter

Using the thAbstractStreamFilter class, I can finally create the core of my debug solution, the thDebugFilter:

class thDebugFilter
    extends thAbstractStreamFilter
{
    public function filterData($data)
    {
        $data =
            preg_replace(
                '!//CODEs*([^nr]*)!m',
                '$1', $data);
        $data =
            preg_replace(
                '!//DBGs*([^nr]*)!m',
                'thDebug::debug($1);', $data);
        $data =
            preg_replace(
                '!//TRACEs*([^nr]*)!m',
                'if (thTrace::isEnabled()) thTrace::trace($1);', $data);
        $data =
            preg_replace(
                '!//ASSERTs*([^nr]*)!m',
                'if (thAssert::isEnabled()) ' .
                'thAssert::assert($1, __FILE__, __LINE__);', $data);
        $data =
            preg_replace(
                '!//PRECONDs*([^nr]*)!m',
                'if (thAssert::isEnabled()) ' .
                'thAssert::precondition($1, __FILE__, __LINE__);', $data);
        $data =
            preg_replace(
                '!//POSTCONDs*([^nr]*)!m',
                'if (thAssert::isEnabled()) ' .
                'thAssert::postcondition($1, __FILE__, __LINE__);', $data);

        return $data;
    }
}

I've introduced some special "keywords" to PHP: //CODE, //DBG, //TRACE, //ASSERT, //PRECOND and //POSTCOND. When the filter is not used, these all look like PHP comments. When the filter is enabled, each of these statements is transformed into live code.

You could do everything with //CODE. This is used to hide or expose any arbitrary line of code. The other statements reduce the typing needed to use the debug classes.

Here are some examples and their translations

//CODE $x = $x + 10; 
  -> $x = $x + 10;
//DBG "This is a debug statement."
  -> thDebug::debug("This is a statement.");
//TRACE "Level 4 message", 4
  -> if (thTrace::isEnabled()) thTrace::trace("Level 4 message", 4, __FILE__, _LINE__);
//ASSERT $x == 10
  -> if (thAssert::isEnabled()) thAssert::assert($x == 10, __FILE__, __LINE__);
//PRECOND $data == null
  -> if (thAssert::isEnabled()) thAssert::precondition($x == 10, __FILE__, __LINE__);
//POSTCOND $data != null
  -> if (thAssert::isEnabled()) thAssert::postcondition($x == 10, __FILE__, __LINE__);

Before using an input filter, we have to register it. The following line does the job:

stream_filter_register('phpCode.thDebug','thDebugFilter');

Tying it all together with __autoload

Now that we have the filter, how do we use it? Here is one way:

function thRequire($file)
{
    if (thDebugManager::isEnabled()) {
        require 'php://filter/read=phpCode.thDebug/resource=' . $file;
    }
    else {
        require $file;
    }
}

You would call thRequire() in place of any require statement. You would need similar methods for require_once, include and include_once.

I prefer using anothe PHP 5 feature, the __autoload() method. It allows us to dispense with most include and require statements, at least for classes. When a class is needed but not defined, the function registered with __autoload will be given a chance to load the class. Most of my code is objected-oriented, so this works well for me.

Here is the class that that handles class loading. For this article, I have deliberately created a very basic autoloader: it assumes that the class name maps to a file with an "inc" extension and exists in the include path.

class thAutoload
{
    static public function load($className)
    {
        if (thDebugManager::isEnabled()) {
            require
                'php://filter/read=phpCode.thDebug/resource=' .
                $className . '.inc';
        }
        else {
            require $className . '.inc';
        }
        return true;
    }
}

I still need to register my autoload handler with __autoload(), but that would be bad style. There's only room for one __autoload() method, so to allow sharing with other packages, I use spl_autoload_register() instead:

spl_autoload_register(array('thAutoload', 'load'));

If we were really using other autoloaders, we should also check if the file exists and return false if it doesn't, so as to give other autoloaders a chance to try loading the class.

In any case, all the pieces are present and the solution is complete:

  • thAutoload and thDebugManager must be loaded with a normal require or include.
  • To enable debugging, call thDebugManager::enable(true) before loading any classes containing debug statements.

So how well does it perform?

One way to see if I achieved my goals is to itemize all the overhead imposed by the debugging system when it is disabled.

  • The compiler will have to compile the thDebugManager class.
  • The thAutoload class also a small bit of code that is compiled but not used.
  • One if statement and a function call is executed per loaded class.
  • The compiler has to discard some extra comment lines.

To me, the overhead appears acceptably low. However, for the highest performing scripts, where you count nanoseconds, this is still not good enough. Here are options for increasing performance:

  • Using __autoload() is slightly slower than direct loads, so the scripts could use thRequire().
  • The thDebugManager can make the $isEnabled variable public—bad OO style, but it eliminates a function call.
  • Need more speed? Comment out the unused parts of thRequire(). This involves changing a few lines, but it is far easier than commenting out all the debug statements, which is what you would be doing if you used debug statement in the typical fashion.
  • Need even more? Edit all thRequires into straight requires. If we're talking high-performance scripts, you shouldn't have very many of these anyway. While this is not very convenient, you can enable a lot of debugging capability with a few edits.
  • Finally, use an opcode cache. This eliminates all compiler overhead.

Getting the code

I've put all the code together in a ZIP file, along with a sample sorting class that uses the debugging features and a main program that can be run using the command-line version of PHP.

Download high-performance-debugging.zip ( 7KB)

Where to go from here

The ability to trasnform code dynamically opens up a lot of pre-processing possibilities: how about running the code through the C pre-processor? Or a macro processor like m4? Or even adding new syntax to the PHP language? I'll explore some options in later articles.

A special request for opcode cache developers

The extra filter processing creates undesirable overhead that could be easily eliminated with an opcode cache, but it is unclear how many opcode caches support input filters. The dynamic nature of input filters make it almost impossible to determine when the cached code needs to be updated (after all, I could create a filter where the generated code would vary depending on the time of day) and this could discourage developers from adding suport.

I hope instead that the developers will provide some way of enabling this capability with the caveat that the user must manage the cache, either manually or programmatically. Without some caching support, there is a limited future for using this technique in production code.

Post new comment

The content of this field is kept private and will not be shown publicly.
  • Web page addresses and e-mail addresses turn into links automatically.
  • Allowed HTML tags: <a> <em> <strong> <cite> <code> <ul> <ol> <li> <dl> <dt> <dd>
  • Lines and paragraphs break automatically.

More information about formatting options

CAPTCHA
This question is for testing whether you are a human visitor and to prevent automated spam submissions.

 

Privacy policy Tiger Heron LLCinfo@tigerheron.com • (503) 771-7724

Copyright © 2005-2007, Tiger Heron LLC