2004

June 21, 2004
in Blog
1 min read

Dead laptop disk == more linux hacking

Update2: moved code to http://bitbucket.org/wez/toshkey/overview/

Update: acpid now handles the brightness controls, displays the battery status in the ps list and emits power warnings once you're down to 15 minutes of power. I've also added a little non-root acpid client that will allow you to run your own stuff in response to hotkey events.

I suffered a dead (nearly; it's on its way out) laptop disk almost a week ago, and have been clawing my way back to normality.

As a side effect, I now own a Toshiba Satellite M30, which apparently has slightly more linux friendly hardware than my other Satellite (the one that's having issues).

One of the cool things is the toshiba_acpi module; it works in this model and allows access to the hotkeys so you can map them exactly as you like. Since running a standalone daemon for this sucks (you can choose either a python script or a slightly-overweight fnfxd), and since there was a feature request on the ToshibaAcpiDriver page for it, I've written this patch that adds toshiba key support to acpid (1.0.3).

Toshiba keys are exposed as button/toshiba events, followed by the 16-bit hex code for the key that was triggered, so stick some scripting magic into /etc/acpi/events and you're happily-a-mapping those keys.
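As a rough illustration of that "scripting magic", here is a minimal sketch of a handler script. It assumes the event text has the form "button/toshiba" followed by a space and the hex code, and that acpid passes it on via %e. The script path and the two key codes are hypothetical placeholders, not real Toshiba codes.

```shell
#!/bin/sh
# Hypothetical /etc/acpi/toshiba.sh, wired up by an acpid event file such as:
#   event=button/toshiba.*
#   action=/etc/acpi/toshiba.sh %e
# (acpid substitutes %e with the event text.)

handle_toshiba_event() {
    event="$*"
    case "$event" in
        "button/toshiba "*) code=${event##* } ;;   # keep the last field: the key code
        *) echo "ignored: $event"; return 0 ;;
    esac
    case "$code" in
        0101) echo "action: mute" ;;          # placeholder code, not a real Toshiba key
        0102) echo "action: lock-screen" ;;   # placeholder code
        *)    echo "unmapped key: $code" ;;
    esac
}

# Only dispatch when an event was actually passed on the command line.
if [ "$#" -gt 0 ]; then
    handle_toshiba_event "$@"
fi
```

Replace the echo lines with whatever commands you want each key to run. Unknown codes are reported rather than silently dropped, which makes it easier to discover the codes your own keys emit.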

June 16, 2004
in Blog
1 min read

Don't underestimate the power of xargs

I know this sounds lame, but I have never really needed it until now. I've recently been really busy with a new job taking up my working hours, and some renovation on our home taking up my time off, so I haven't been keeping on top of spam mail.

I run postfix as my MTA and have that apply spamc and anomy sanitizer to incoming mail before piping the mail into cyrus imapd. I use IMAP for mail as I could be using any of a number of different machines here to check my mail. This makes training the spam filter a little difficult, so I have a simple training-by-filing setup:

I have INBOX.junk, INBOX.junk.ham and INBOX.junk.spam. The first of these collects spam recognized by spamc during delivery. The user will then copy mail from junk into junk.ham if it was a false positive, or copy mail from INBOX into junk.spam if it wasn't recognized as spam.

Periodically, I run a script to train based on the contents of the ham and spam folders. Now, the problem I had was that I'd left way too much spam hanging around in the spam folder (the script doesn't do any pruning), and so it was failing due to the excessive number of command line arguments.

So, the moral of the story is: always use xargs for maintenance scripts, even if you don't think you'll need it.
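The training script itself isn't shown here, so the following is only a sketch of the xargs approach. The train_folder helper and the folder paths are hypothetical, and sa-learn (SpamAssassin's training tool) is an assumed choice of trainer. A shell glob expands to one enormous command line, whereas xargs splits the file list into as many invocations as it takes to stay under the kernel's argument-length limit.

```shell
#!/bin/sh
# Feed every message file under a folder to a training command in batches.
# xargs splits the file list across several invocations when needed, so the
# "Argument list too long" failure cannot occur, however full the folder is.
train_folder() {
    dir=$1
    shift
    # -print0 / -0 (GNU and BSD extensions) keep odd file names intact
    find "$dir" -type f -print0 | xargs -0 "$@"
}

# Hypothetical usage; the paths and the choice of sa-learn are assumptions:
#   train_folder /path/to/junk/spam sa-learn --spam
#   train_folder /path/to/junk/ham  sa-learn --ham
```

Passing the trainer as arguments keeps the helper generic, so the same function serves both the ham and the spam folder.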

June 6, 2004
in Conferences, php
1 min read

Come to the LAMP Area @LinuxTag

I will be giving a talk on PECL in the LAMP Area at LinuxTag this month. The talk will focus on why PECL is good for both users and developers. If you're interested in that, please come along.

I will be there all week, so there should be plenty of time to discuss cool things in PHP and/or hassle me to finish up one of the numerous things that I've started since the last conference run.

I'm looking forward to it: LinuxTag is easily the best conference in Europe (in my experience)--plenty of things to look at, in a good location, and with excellent company.

Hopefully there will be an alternative to Jolt Cola this year...

May 27, 2004
in php
1 min read

Anyone with BSDI?

If so, please, please, please (with sugar on top) try this patch to PHP 5 today.

Thanks!

May 22, 2004
in php
3 min read

First steps with PDO

Developing PDO and releasing an alpha has sparked a lot of interest already (probably helped along by George ;-)) and we got our first "how does it work" e-mail today. As it happens, I've already written an intro to PDO for the OTN but George informs me that they can take a while to publish.

Meanwhile, to avoid being swamped by mail as the word gets out, here are a couple of sample PDO scripts to get you started. Please keep in mind that it is alpha software (although it still works pretty well) and requires PHP 5 from CVS (RC3 will work too, but that isn't released until next week).

The API is "grown-up" by default; you get so called "unbuffered" result sets as standard and the prepare/bind/execute API is preferred, although there are some short-cuts already (and some more planned). Note that you don't need to do any quoting manually using bound parameters; it is handled for you. You do need to be careful with magic_quotes though (as always).

<?php
$dbh = new PDO('mysql:dbname=test;host=localhost', $username, $password);
// let's have exceptions instead of silence.
// other modes: PDO_ERRMODE_SILENT (default - check $stmt->errorCode() and $stmt->errorInfo())
//              PDO_ERRMODE_WARNING (php warnings)
$dbh->setAttribute(PDO_ATTR_ERRMODE, PDO_ERRMODE_EXCEPTION);
// one-shot query
$dbh->exec("create table test(name varchar(255) not null primary key, value varchar(255));");
?>

<?php
// insert some data using a prepared statement
$stmt = $dbh->prepare("insert into test (name, value) values (:name, :value)");
// bind php variables to the named placeholders in the query
// they are both strings that will not be more than 64 chars long
$stmt->bindParam(':name', $name, PDO_PARAM_STR, 64);
$stmt->bindParam(':value', $value, PDO_PARAM_STR, 64);
// insert a record
$name = 'Foo';
$value = 'Bar';
$stmt->execute();
// and another
$name = 'Fu';
$value = 'Ba';
$stmt->execute();
// more if you like, but we're done
$stmt = null;
?>

<?php
// get some data out based on user input
$what = $_GET['what'];
$stmt = $dbh->prepare('select name, value from test where name=:what');
$stmt->bindParam(':what', $what);
$stmt->execute();
// get the row using PDO_FETCH_BOTH (default if not specified as parameter)
// other modes: PDO_FETCH_NUM, PDO_FETCH_ASSOC, PDO_FETCH_OBJ, PDO_FETCH_LAZY, PDO_FETCH_BOUND
$row = $stmt->fetch();
print_r($row);
$stmt = null;
?>

<?php
// get all data row by row
$stmt = $dbh->prepare('select name, value from test');
$stmt->execute();
while ($row = $stmt->fetch(PDO_FETCH_ASSOC)) {
    print_r($row);
}
$stmt = null;
?>

<?php
// get data row by row using bound output columns
$stmt = $dbh->prepare('select name, value from test');
$stmt->execute();
$stmt->bindColumn('name', $name);
$stmt->bindColumn('value', $value);
while ($stmt->fetch(PDO_FETCH_BOUND)) {
    echo "name=$name, value=$value\n";
}
?>

Oh, how do you get and install it?

Grab a PHP 5 snapshot from http://snaps.php.net (or HEAD from CVS).

   ./configure --prefix=/usr/local/php5 --with-zlib ....
   make
   make install
   export PATH="/usr/local/php5/bin:$PATH"
   /usr/local/php5/bin/pear install -f PDO
   [ now add extension=pdo.so to php.ini ]
   /usr/local/php5/bin/pear install -f PDO_MYSQL
   [ now add extension=pdo_mysql.so to php.ini ]
   /usr/local/php5/bin/php -m

There are other drivers; search PECL for more. If you're running windows, just grab the win32 snap and the PDO dlls from PECL binaries for PHP 5.

Credits: thanks to Marcus, George, Ilia and Edin.

Please try to avoid asking too many questions about it; documentation will follow as soon as it is ready.

May 18, 2004
in php
1 min read

The Horrors of E-Mail

If you've ever wanted to send more than a basic email through PHP, or perform some processing on existing messages, then you might find my article in the May 2004 issue of php|architect a useful resource.

In the article, I provide you with a functional overview of the major RFCs, some hints on things to watch out for and tips on how best to do things to avoid your mail getting mangled in transit.

May 9, 2004
in Blog
1 min read

The Meatrix

I was chatting to Davey just now about how I used to enjoy Rowntrees (now owned by Nestle) Fruit Pastilles before I became a vegetarian (they're off limits now, since they contain gelatin), when I remembered this funny but sobering bit of flash.

May 8, 2004
in php
5 min read

pty support for proc_open() in PHP 5

[Update: this patch is now in CVS and will be in PHP5RC2]

Today started out good, then it got weird. I felt quite privileged, since a newcomer to the PHP world (employed by Zend) took the time to ask me what the procedure was for publishing an extension in PECL. I felt privileged because the last time a new Zend employee appeared no one said anything to anyone; no introductions; karma was granted and a big new extension just appeared in the repository.

So, today was good. I gave the newcomer a friendly welcoming email. As it turned out, their extension duplicated and/or superseded functionality in the (work-in-progress) cvsclient extension, so I suggested merging it. The newcomer had already contacted the author.

"This is great!" I thought to myself. It then became apparent that the new extension did its stuff by wrapping a CVS binary and parsing the output. While this approach works, it's not really the best thing to put into PECL. Here are two good reasons why:

Portability. fork(), exec()ing and piping is painful to get right and keep portable. I have some experience in this matter, having written this, which works under unix and win32.
Security. Doing string parsing and manipulation in C is painful. This pain is just one of the reasons why people use scripting languages instead of hand-coding C CGI. Well, strings in C isn't that painful, but secure, non-exploitable strings in C is. And it's easy to overlook something, or be off-by-one in your calculations for a buffer size.

So we invited the author to join us on IRC and talk about what to do. The conversation, in essence, consisted of Derick and Ilia making their opinions known (they were against the extension, largely due to the reasons I've mentioned above). I'll admit that this can be a bit intimidating (Derick and Ilia aren't known for beating around the bush ;-), but it wasn't a massacre. We learned that:

The extension was written as an exercise in learning how the engine works
it didn't take all that much time to write
yes, it probably could and should go into PEAR instead
HOWEVER! there is no way to emulate a pty from a PHP script, and this is required in order to send a password to the cvs binary

The conversation lulled, and then I got busy with work.

When I returned and read through the backlog, I could see that things had started to get weird. Zeev had arrived and was not happy that we hadn't allowed the extension to be committed yet. The reasons against were made clear, and the reasons for were made clear--it's a working extension that could be used by people right now, and the pty problem meant that it was not possible to implement fully in PEAR. As is typical with discussions between the senior PHP people, it wasn't going anywhere ;-)

Now, I'm quite liberal when it comes to OpenSource. So long as something isn't total garbage it has merits. We hadn't seen the code yet but, being an optimist, I decided that Zend wouldn't hire a total idiot to work on the Zend Engine--so the code probably wasn't garbage. I suggested that we merge the code with the cvsclient extension, and look at implementing the features natively over time, as Sara planned to do with cvsclient anyway. This was surely a definite win:

we retain a single CVS extension in PECL. This is good for our users, who don't want a gazillion bajillion variations of the same thing.
our CVS extension gets a load of new features by merging with this new one. Good for us, good for our users.
the purist element of PECL hackers would eventually be satisfied once all features were implemented natively.
The effort spent on the new extension is not wasted.

This was, I thought, warmly received as a good idea by all. Things having returned to normality, I resumed work. When I looked back at the channel a few minutes later, I saw that the discussion had continued and concluded with Zeev leaving in what I can only describe as a huff.

What happened next? A new module was added to CVS for "Public Zend Extensions". WTF? :) Totally bewildered by the crazy turn of events, I posted this message in response to the commit. The module was subsequently renamed to non-pecl and karma granted to the PHP core people.

So, what just happened? I'm not entirely sure. IMO, adding "non-pecl" is crazy. What's the point? PHP extensions are either in the main distro, in PECL, or not distributed by php.net (since we're not behind them). How are we going to manage "non-pecl"? How does it fit into our (PHP) procedures for QA, snapshots, distros and mirrors?

It seems a bit hasty.

Is there a point to all this??

Ah yes. Zeev said something along the lines of "if someone ports it to PEAR, you can delete it from CVS". With that in mind, I've written a patch for proc_open that adds pty support.

Using this patch, you can do something like this in your scripts:

<?php
    $p = proc_open("cvs -d:pserver:foobar@cvs.php.net:/repository login",
        array(
            0 => array("pty"),
            1 => array("pty"),
            2 => array("pty"),
        ),
        $pipes);
    // ... read stuff from $pipes[1] and $pipes[2] (stdout, stderr) ...
    // ... write stuff to $pipes[0] (stdin) ...
?>

What this does is similar to creating a pipe to the process, but instead creates master (for your script) and slave (for the process you're running) pty handles using the /dev/ptmx interface of your OS. This allows you to send to, and capture data from, applications that open /dev/tty explicitly--and this is generally done when interactively prompting for a password.

What you can't do is any terminal specific ioctl's from PHP land, so you can't make an xterm from PHP ... yet ;)

Another limitation (although not the fault of PHP), is that the pty stuff isn't portable to win32 (where console applications open CON$). As far as I can tell, there is no way to hook or emulate consoles under windows. Likewise, if your flavour of UNIX doesn't support the Unix98 pts interface, you can't use this feature either. The configure script detects the bits required, and everything is protected by #ifdefs in the code, so if you don't have the syscalls required, things should still build and work just as they did before.

Since we're in feature freeze for RC2, I haven't committed this yet. The patch is against HEAD, but should apply cleanly to most PHP 5 snaps or RC's you have around.

Credit where credit is due: the pty support is based on the equivalent part of the code from Shie's new "non-pecl" cvs extension. Thanks Shie for starting the day nicely, and for writing this code! I have a hunch that it was Shie's first day working for Zend and that this was just as weird a turn of events for Shie as for us.

May 8, 2004
in php
4 min read

Structured Errors in PHP

We don't have them, yet, but we might in PHP 5.1. Here's a possible vision that will make both procedural and OO programmers happy. Before the vision, I'll summarize the main problems that need to be solved.

Problems

Our error reporting mechanism consists only of severity level and textual error message. This makes it hard to handle specific errors from within our code.
Accessing the last error from outside of an error handler is not possible without writing some glue code for yourself. You can turn on the track_errors setting to store the error message (not severity) in a local variable, but this doesn't buy you much.
Since our regular mechanism is deficient, people want to use exceptions for everything. This is a problem, since exceptions are quite expensive and are not suitable to use for everything (such as E_NOTICE severity) for all extensions. It might make sense to do this for a particular extension in a particular piece of code in a particular application, but not globally. In addition, for people using code libraries, the library needs things to work one way, whereas the application wants things to work in another.

Solution

Introduce error code identifiers. These identifiers will be strings prefixed with the name of the extension and a colon. So, errors from the standard extension would have identifiers such as "standard:<errorcode>". These identifiers can be examined in the PHP code more easily than parsing error message text, and since they are prefixed by the extension name, it side-steps the event where two different libraries use the same error number for different error conditions--we won't suffer from those collisions.
The error identifiers would be raised along with the severity and textual message when the extension calls one of the php_error_docref style functions.
The error handling mechanism would populate an $_ERROR super-global with the severity, identifier and textual message. This allows the PHP script to suppress the usual error handler and make decisions based on the info it finds in $_ERROR, if they wish.

For OOP Programmers

Since all errors/warnings/notices are now structured, it would be really easy to map them to exceptions. This mapping needs to be controlled within the script using a kind of stacking state. When the engine starts running, the mapping state is set so that no errors are mapped to exceptions.
An application or library could then change this so that errors from a particular extension are mapped to exceptions, or so that all errors are mapped to exceptions by using a simple pattern matching rule. This state needs to be applied to a block of code, so that the setting is contained and doesn't mess with that of the calling code. The declare statement is ideal for this.

Sample for procedural programmers

<?php
function do_something_with_a_file($filename)
{
   // ensure that streams functions aren't mapped to exceptions
   // everything else retains its current exception mapping
   declare(exception_map='-standard:streams:*') {
      $fp = @fopen($filename, 'r');
      if (!$fp) {
           if ($_ERROR['code'] == 'standard:streams:E_NOENT') {
               // handle the case where the file doesn't exist
           }
      }
   }
   // now the declare block is finished, pop back to original
   // exception mapping state
}
?>

Sample for OO programmers

<?php
function do_something_with_a_file($filename)
{
   // ensure that streams functions are mapped to exceptions
   // everything else retains its current exception mapping
   declare(exception_map='+standard:streams:*') {
      try {
           $fp = @fopen($filename, 'r');
      } catch (Exception $e) {
           if ($e->getCode() == 'standard:streams:E_NOENT') {
               // handle the case where the file doesn't exist
           }
      }
   }
   // now the declare block is finished, pop back to original
   // exception mapping state
}
?>

As I hope you can see, this allows some flexibility in your code. You can code OO-style if you like. You can mix code snippets written in conflicting styles into your application, since well written libraries will localize their error handling preferences.

The exception mapping syntax used in the declare block should be quite simple to grasp; a plus or a minus character indicates if the pattern should be added to the mapping list, or excluded from it. The rest of the string is a simple glob-style string where an asterisk acts as a wildcard. To make multiple changes, without using multiple nested declare blocks, simply separate each one by commas in the string:

<?php
  // don't map any errors from ext/standard, except
  // for streams errors
  declare(exception_map='-standard:*,+standard:streams:*') { ... }
?>

I'm fairly happy with this idea; the only thing is that the syntax in the declare block is a bit weird; it might make sense to come up with an alternative language level keyword. The important point is that any changes made to the mapping stack are popped when control leaves that section of code. If we had the finally statement, then we could do something like this:

<?php
function do_something_with_a_file($filename)
{
   // ensure that streams functions aren't mapped to exceptions
   // everything else retains its current exception mapping
   push_exception_map('-standard:streams:*');
   try {
      $fp = @fopen($filename, 'r');
      if (!$fp) {
           if ($_ERROR['code'] == 'standard:streams:E_NOENT') {
               // handle the case where the file doesn't exist
           }
      }
   } finally {
      pop_exception_map();
   }
}
?>

I like this better; it's less stuff to hack into the engine.

In conclusion then, I think this possible solution will please pretty much all the people using PHP, whether they are fans of OO or procedural code--you can write your "Enterprise" level code regardless of your preference, and drop-in well written third-party components without worrying so much about how they handle errors.

I welcome your comments!

May 5, 2004
in Windows
1 min read

Should have known better

I installed Norton Antivirus 2004 Pro edition on my laptop recently (I actually paid for Pro so that I could give my younger sister a copy to run on my old laptop; you get two licenses for less than the cost of two copies of the regular version).

Well, I was surprised to find that NAV keeps Windows Messenger open in the background, so that it can screen it for bad things happening. That wouldn't be so bad except that it eats 20MB.

I was also surprised just now when Outlook Express (a necessary evil) decided to reply to a message that came in over news:// using Outlook. An ill-fated sequence of events led to this disaster. Having now corrected the configuration of the beast, I'm considering giving Outlook 2003 a second chance.

I just noticed the memory usage:

Outlook.exe 31MB
msmsgs.exe 22MB
winword.exe 23MB

Ouch. I don't even (officially) have Word or msmsgs (messenger) open; they are secretly loaded by other apps.

Surely you don't need much more than a couple of MB to check your email??