2005

dtracing PHP on Solaris

[Updated: just wanted to point out that Solaris 10 is free]

One of the things that Theo and I have been salivating over at OmniTI recently is this really cool tool on Solaris (and OpenSolaris) called DTrace.

DTrace is one of those tools that makes you wonder how you ever did anything before you'd heard of it. What is it? You can think of it as being something like strace that was exposed to the ultimate steroid mix at conception. Why is it better than strace and similar tools? It's non-invasive, fast, scriptable and extensible.

So, why am I posting about it here? I had the great pleasure of sitting down with Bryan Cantrill (Solaris kernel developer and one of the guys behind dtrace--a very animated, funny, smart guy) at OSCON, and together we produced a dtrace provider for PHP.

This is implemented as a PECL extension and can be installed with this simple invocation, if you've built your own php with pear support (recommended):

   # pear install dtrace

Once installed, you'll want to add a line to your php.ini file:

   extension=dtrace.so

Once it's loaded, restart your apache server, and you're ready to dtrace. If you run:

   # dtrace -l | grep php

You'll see a bunch of output like the following, for each apache child process:

   34412    php9915         dtrace.so                php_dtrace_execute function-return
   34413    php9915         dtrace.so       php_dtrace_execute_internal function-return
   34415    php9915         dtrace.so                php_dtrace_execute function-entry
   34416    php9915         dtrace.so       php_dtrace_execute_internal function-entry

What this shows is that process ID 9915 is running and offers up 4 possible probe points. The probe points wrap around the Zend engine execution routines (php_dtrace_execute and php_dtrace_execute_internal) and provide two classes of probes: function-entry and function-return. This means that we can monitor PHP whenever a function is about to be called (function-entry) and whenever a function has finished being called (function-return).

These probes have 3 parameters: arg0 is the name of the function being called, arg1 is the name of the file from which the call is being made, and arg2 is the line number in that file.
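To make the three arguments concrete, here is a minimal sketch that prints them for every call. The file name php_args.d is just something I picked for illustration, and the script assumes the probes behave exactly as described above (arg0 and arg1 are strings in the PHP process, arg2 is an integer).

```shell
# Write a small D script to disk; php_args.d is an arbitrary (hypothetical) name.
cat > php_args.d <<'EOF'
/* Fires on every function-entry probe, in every process that offers one. */
function-entry
{
    /* arg0 and arg1 point at strings inside the PHP process, so they
       must be copied in with copyinstr() before they can be printed. */
    printf("%s() called from %s:%d\n", copyinstr(arg0), copyinstr(arg1), arg2);
}
EOF

# Then, as root on the machine where the extension is loaded:
#   dtrace -q -s php_args.d
```

The -q flag only suppresses dtrace's default per-probe output, so you see just the printf lines. Expect a lot of output on a busy server.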

Now, let's say that you want to get an idea of which functions are being called in your application. The following dtrace line counts each call. It won't print anything immediately, as it is sitting there gathering information; run it for a while, then hit CTRL-C and it will spit out the summary.

   # dtrace -n function-entry'{@[copyinstr(arg0)] = count()}'

I get this output if I try to set up MediaWiki:

    dl                                                                1
    extension_loaded                                                  1
    version_compare                                                   1
    phpversion                                                        1
    ob_implicit_flush                                                 1
    install_version_checks                                            1
    is_writable                                                       1
    error_reporting                                                   1
    is_array                                                          1
    header                                                            1
    strpos                                                            1
    php_sapi_name                                                     1
    function_exists                                                   2
    dirname                                                           2
    ini_set                                                           2
    defined                                                           3
    file_exists                                                       4
    main                                                              9
    define                                                           74

Pretty cool, huh? We can immediately see that MediaWiki is calling define() a LOT across the space of just 2 page loads. If you were looking for things to optimize (and if this wasn't the rarely used setup page), you'd already have a good idea of what's going on, for very little effort. You can then refine your dtrace line to home in on the problem areas.
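As one possible refinement, the sketch below only counts calls to define() and groups them by the file and line they were made from. The predicate and the file name php_define_callers.d are my own illustration, built from the same three probe arguments; swap "define" for whatever function you are chasing.

```shell
# Write the D script to disk; the file name is a hypothetical choice.
cat > php_define_callers.d <<'EOF'
/* Only fire when the function being called is define(). */
function-entry
/copyinstr(arg0) == "define"/
{
    /* Key the aggregation on calling file (arg1) and line number (arg2). */
    @callers[copyinstr(arg1), arg2] = count();
}
EOF

# Run as root, exercise the app, then hit CTRL-C to print the summary:
#   dtrace -s php_define_callers.d
```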

You can also get an idea of code coverage with this one-liner, which summarizes the calls made, groups the information by filename, and pretty-prints a histogram showing the relative number of calls made from the various lines in your app:

   # dtrace -n function-entry'{@[copyinstr(arg1)] = lquantize(arg2, 0, 5000)}'
   /export/home/wez/public_html/mediawiki-1.4.7/config/index.php
           value  ------------- Distribution ------------- count
             114 |                                         0
             115 |@@@@@@                                   2
             116 |                                         0
             117 |@@@                                      1
             118 |                                         0
             119 |@@@                                      1
             120 |@@@                                      1
             121 |                                         0

This (abbreviated) output shows you the number of times a particular line of code was visited in config/index.php of MediaWiki, rendering the relative incidences as ASCII-art bars (the Solaris gang are big ASCII-art fanatics).

The really really cool thing about this is that it can aggregate the information across all the apache children running on your machine, transparently. The way that dtrace is implemented means that you can even have this module loaded into php permanently on production machines; it has no overhead when you're not running the dtrace command, and very very tiny overhead when you are.
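If you'd rather see the per-child breakdown than the combined totals, one option is to add DTrace's built-in pid variable to the aggregation key. This is only a sketch under the same assumptions as the one-liners above; php_by_pid.d is a made-up file name.

```shell
# Write the D script to disk; the file name is a hypothetical choice.
cat > php_by_pid.d <<'EOF'
/* One probe clause covers every apache child that has the provider loaded. */
function-entry
{
    /* pid is the process that fired the probe; keying on it splits
       the counts out per child instead of merging them. */
    @calls[pid, copyinstr(arg0)] = count();
}
EOF

# Run as root, then hit CTRL-C to print the summary:
#   dtrace -s php_by_pid.d
```

Dropping pid from the key gives you the machine-wide totals again.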

DTrace is a powerful tool for sysadmins and developers alike; I'm looking forward to making heavy use of it in the near future and beyond. I've barely scratched the surface here; if you want to learn more, check out Bryan's more detailed "dtrace and php" blog entry, where he shows how to view the complete call stack through php down into the kernel and back (neat!), and the DTrace Community.

DTrace is available on Solaris 10 and up (including OpenSolaris). I recommend experimenting with it, as it will revolutionize the way that you think about debugging and profiling.

Back from OSCON-let there be slides

[Updated: added alternate formats]

I've been a bit lax in getting these online; I bought a new HD for my laptop while in Portland, so it's been mildly difficult to get at the slides themselves.

Here are the slides from my PDO talk (powerpoint).

Here are the slides from my PDO talk (PDF).

Here are the slides from my PDO talk (Flash).

OSCON was a blast; good to see the usual gang again, good to put faces to people that I hadn't previously met, and good to meet new people. Looking forward to going again next year :-)

Responsibility and OpenSource

I just spent a while composing this response as a comment on Marcus Whitney's "Juvenile Demands and Criticism of Open Source Development". Since it turned out quite long, I thought that I'd turn it into a blog entry of my own. Marcus is indirectly referring to someone that I'm going to call Mr. E.

Mr. E seems to have some issues with the PHPSC, and perhaps something of an overlarge ego, both of which are annoying characteristics. However, you need to understand what motivates him. He likes PHP and wants it to be the best, and part of that is keeping its security record in tip-top condition.

When a PHP application is popular and is riddled with holes, PHP itself is tarnished by the reputation of that application. Mr. E devotes a lot of his time to listening out for news of problems, as well as searching them out and coming up with patches to address them. You can (and probably should) forgive him for getting annoyed when he's put in all that effort, often supplying a patch to address the problem, and nobody has taken any positive action.

While security problems remain unaddressed, in the real world, people are losing money through increased bandwidth costs and system reinstalls--and that's the best case. The worst case is that some sensitive data (credit card numbers, perhaps) is being collected and used illegally. This leads to all kinds of trouble for the victim and the people running the site.

My personal opinion on the responsibilities of OpenSource development is this: it's good, it's free, there's no warranty. If it works for you, that's great. If it breaks, you get to keep the pieces. The author doesn't owe you anything; you get what you pay for. If there are security flaws, so be it; there is nothing that says that the author has to push out a release immediately, nor is there anything that says that they have to handle the matter according to the proper form for disclosure of security problems.

When it comes to popular OpenSource projects, there is usually a decent-sized team (i.e. more than 1 guy) that looks out for it, fixing bugs, working on new features and so on. Still, they have no legally binding responsibility to you, the end user. That's still fine; you're still getting what you paid for.

Here's the differentiation: if the guys behind OpenSource projects behave responsibly, that makes their project into a Great Project.

If they're ignoring security advice and doing nothing about it, that tells you something about the goals of the people developing it. It doesn't necessarily make the project a bad project, it just doesn't make it a Great Project.

So, people installing applications have a choice: they're free to install any app they want, of course, but good sysadmins will generally only want to install Great Projects on their servers. How can they tell if a project is a Great Project or if it's a cool-sounding project run by people that don't really care about security? If you take a look at the phpBB site, you'd be forgiven for thinking that phpBB sounds like a mature, stable, solid and secure application.

Obviously, you can't trust the marketing material produced by the people that built the project.

So how else can you determine if it is a Great Project? By reading around and listening to others, that's how.

If nobody spoke out about the security problems of an otherwise Great Project, nobody would know about them. And in this regard, our Mr. E is doing the right thing--he wants the people behind the project to take action before the problem escalates up to security advisory status and is marked as "vendor took no action". When this happens, the reputation of the PHP project itself also suffers by association.

PDO PECL releases

As I've had a couple of people ask me for them recently, I sat down and cooked up PECL packages for PDO and its drivers tonight (except firebird, but...) so that they can be used with PHP 5.0.3 and up.

The PECL packages are literally the same code that's in CVS HEAD (and thus the CVS snapshots), so if you're running a PHP 5.1 beta, you should stick with the CVS snapshots.

If you're upgrading older PDO releases that you installed via PECL, you will need to upgrade PDO and all the PDO drivers that you had installed (the drivers check to make sure that they are compatible with the version of PDO you have installed when they are loaded; if they are not, then PHP/Apache will refuse to start).

Synergy

I just set up Synergy so that I can use my WinXP laptop keyboard and mouse to slide over onto my multi-head setup at work and control those machines.

I was going to use x2x (which works under cygwin), but it seems to only see the right hand head. A bit of Googling turned up Synergy and, aside from a little bit of a learning curve, it works great.

The trick is to run the server process on the machine that has the physical keyboard and mouse, and run client processes on the machines that you want to incorporate into your virtual multi-monitor setup.
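For the curious, a two-machine layout can be sketched like this. The screen names laptop and desktop are placeholders (Synergy expects them to match the machines' host names or configured aliases), and the exact way you launch things on Windows may differ from the commented command lines, so treat this as a rough outline rather than my actual config.

```shell
# Write a minimal Synergy config; "laptop" and "desktop" are hypothetical
# screen names standing in for your real host names.
cat > synergy.conf <<'EOF'
section: screens
    laptop:
    desktop:
end

section: links
    # Moving off the right edge of the laptop lands on the desktop...
    laptop:
        right = desktop
    # ...and moving off the left edge of the desktop comes back.
    desktop:
        left = laptop
end
EOF

# On the machine with the physical keyboard and mouse (the server):
#   synergys -f --config synergy.conf
# On each machine being controlled (the clients), pointing at the server:
#   synergyc -f laptop
```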

PDO intro on SitePoint

Well, it's nice to have publicity, but the SitePoint intro by David has a few factoids (and user comments) that are slightly "off", so here are some comments of my own to bolster his post:

  • PDO is Beta now, not alpha (if you're running PHP 5.1b3 that is)
  • Use pdo_drivers() if you want to see what drivers are available. The dl() function is not guaranteed to work in a lot of SAPIs.
  • David missed out on my article on PDO for IBM developerWorks, which is a bit more up to date than the OTN article.
  • PDO is all C-based native code, and will eventually replace the traditional database extensions.
  • PDO is data access abstraction rather than database abstraction. They are not the same thing; data access abstraction is making the way you get at the data the same, whereas database abstraction is making databases look the same. Pretty big difference.
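A quick way to try the pdo_drivers() suggestion from the list above is from the command line. This is a small sketch that assumes the php CLI is on your PATH; list_pdo_drivers is just a helper name I made up, and bear in mind that the CLI may read a different php.ini than your web server does.

```shell
# Hypothetical helper: print the PDO drivers that the php CLI can see.
list_pdo_drivers() {
    # Bail out politely if there is no php binary to ask.
    if ! command -v php >/dev/null 2>&1; then
        echo "php CLI not found on PATH"
        return 0
    fi
    # pdo_drivers() only exists when the PDO extension itself is loaded.
    php -r 'if (function_exists("pdo_drivers")) { print_r(pdo_drivers()); } else { echo "PDO is not loaded\n"; }'
}

list_pdo_drivers
```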

Remember: we're relying on you to get out there and play with PDO to uncover any bugs that might be lurking; it works fine for the test cases we have. One of the rules for good QA is to have people other than the people that built something do the testing; they're sure to use things in ways that the architect didn't consider on the first run. Please try it out and report any bugs/strange behaviour to bugs.php.net--Thank you!

EvilDesk, mini release

I've pushed 0.5.1 tonight; it fixes an uninstallation buglet that could leave you without a taskbar after uninstalling EvilDesk.

The only new feature is being able to select how many workspaces the alt-tab task switcher will cycle through; you can use up to 32, with the default being 4.

Visit the ChangeLog

EvilDesk, Release 5

I've updated EvilDesk yet again this weekend. The biggest new thing is making all the hotkeys (aside from alt-tab) user configurable.

Find out more on the EvilDesk Home Page (I've added a ChangeLog section for your tracking pleasure).

EvilDesk, Release 3

[Update: Release 4 is out]

I've updated my EvilDesk and included the user-definable context menu code I mentioned in the comments of my last post.

I've also created a new home for the project, so that I can group the docs together more easily.

I will continue to publish news about updates here on my blog, so if you're already subscribed here, you needn't do anything more to keep up to date on this project.