Help test Stats 1.5 beta

June 19, 2009 by Andy

At first we thought it was a good idea to use iframes to display reports in the Stats plugin. Since then we’ve seen a lot of problems with browsers and cookies. To help resolve these issues, and in anticipation of future features, I am updating the plugin and the WordPress.com stats reporting system to remove the iframes. I just posted 1.5 beta 1. If you host your own WordPress 2.7+ blog and you use the Stats plugin, why not contribute to its development by installing this testing version? Anyone can download the beta, but I don’t recommend it unless you can cope with potentially unstable software.

  • What are the risks of using this beta?
    You won’t lose any stats. If something goes horribly wrong it’s probably a bad download; just reinstall the latest version of Stats.
  • How does it work?
    The plugin connects to WordPress.com to get the stats reports when you request them. It uses the API key to authenticate.
  • Aside from fixing cookie problems, how is this better?
    Now it’s possible for anyone who can publish posts on your blog to see blog stats. They don’t have to be logged into a WordPress.com account. They only need the publish_posts capability (Author role) to view stats reports.
  • Where did the dropdown blog switcher go?
    Because the plugin uses a single API key to authenticate, the service doesn’t know whether the visitor is the owner of that key or some other user. So it doesn’t make much sense to show the list of blogs belonging to the API key owner. You can still use the switcher if you view your stats on any WordPress.com dashboard.
  • Where did the Stats Access panel go?
    This is also related to single API key authentication. Maybe in the future we will bring administrative access back to the plugin. Until then, we have left the Stats Access panel intact on WordPress.com dashboards. You might want to bookmark dashboard.wordpress.com if you need these features on a regular basis.
  • Will this be a required upgrade?
    You mean, will older versions of Stats be broken? Not by 1.5. Later versions may break compatibility, but for now you can keep using earlier versions of Stats if you like.
  • What if I install this and still see iframes?
    This happens because your server is unable to connect to WordPress.com. I set it up to use SSL (https) in the hopes that most hosts support this. If yours does not work, I’d like to hear from you and do some testing on your host.
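
If you want to check the connection yourself before writing in, a minimal sketch like the following can help. It only tests whether PHP on your host can open an SSL connection at all. The function name, the host wordpress.com, the port, and the timeout are assumptions for illustration; the plugin may contact a different endpoint.

```php
<?php
/**
 * Sketch: can PHP on this server open an SSL connection to a host?
 * The default port and timeout are illustrative assumptions.
 */
function can_connect_ssl(string $host, int $port = 443, float $timeout = 5.0): bool
{
    $errno = 0;
    $errstr = '';
    // "ssl://" needs PHP's openssl extension; @ hides the warning on failure.
    $conn = @stream_socket_client("ssl://{$host}:{$port}", $errno, $errstr, $timeout);
    if ($conn === false) {
        return false;
    }
    fclose($conn);
    return true;
}

// Example: var_dump(can_connect_ssl('wordpress.com'));
```

A result of false suggests that outbound SSL is blocked or that the openssl extension is missing, which is useful detail to include in a report.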

Idea drop: db::multi_query

February 11, 2009 by Andy

We usually do things sequentially in PHP. Any point where we can get two things done at the same time is an opportunity to reduce the total execution time. It is generally safe to execute processes in parallel when no process has a side effect that may impact any other process. Most MySQL SELECT statements fall into this class. Under certain conditions it might help to run these queries in parallel.

So I got to thinking: does PHP provide a function to issue a query and return immediately, instead of waiting while the database churns? Such a function would let me spread a batch of queries across several servers and then collect the results. It should not involve forking or making OS calls. Maybe mysql_unbuffered_query is that function.

Assuming it is, here is the basis for a parallel query system in PHP. It is obviously incomplete. I may complete it and try it out when I need to query a dataset partitioned across separate MySQL instances. The function connect_reserved($query) returns a locked MySQL link identifier, opening new connections as needed. The function release($query) removes the lock, returning the link to the pool of available database connections.

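function multi_query($queries) {
    $res = $ret = array();

    // Send each query on its own reserved link without buffering the rows.
    foreach ( $queries as $i => $query ) {
        $link = connect_reserved($query);
        $res[$i] = mysql_unbuffered_query($query, $link);
        $ret[$i] = array();
    }

    // Visit the open result sets in turn, taking one row from each per pass.
    do {
        foreach ( $res as $i => $r ) {
            if ( $row = mysql_fetch_row($r) ) {
                $ret[$i][] = $row;
            } else {
                // No rows left: return the link to the pool and stop polling it.
                release($queries[$i]);
                unset($res[$i]);
            }
        }
    } while ( count($res) );

    return $ret;
}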

I have not used mysql_unbuffered_query. For best results, it should return as soon as the server determines that the query is valid. I have assumed that it can return while the database is still looking for records. (If it cannot, this whole idea should be forgotten.) This oversimplified diagram helps illustrate how the queries run in parallel. The green line shows the beneficial overlap of query processing time.

[Diagram: quasi-parallel queries in PHP]
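
One candidate for the "issue a query and return immediately" function is mysqli on the mysqlnd driver, a different extension from the mysql_* functions used above. Its query() method accepts MYSQLI_ASYNC, and mysqli_poll() reports which connections have a result ready. The sketch below is untested against a live server. The function name and its two parameters are hypothetical: $links holds open mysqli connections, and $queries holds one SQL string per connection under the same keys.

```php
<?php
/**
 * Sketch: send one query per connection, then collect every result set.
 * Assumes mysqli on the mysqlnd driver. $links and $queries are hypothetical
 * parameters that share keys: an open mysqli link and its SQL string.
 */
function multi_query_async(array $links, array $queries): array
{
    $pending = array();
    $ret = array();

    // Dispatch: MYSQLI_ASYNC makes query() return before the result is ready.
    foreach ($queries as $i => $query) {
        $links[$i]->query($query, MYSQLI_ASYNC);
        $pending[$i] = $links[$i];
        $ret[$i] = array();
    }

    // Collect: poll the outstanding links and reap whichever have finished.
    while ($pending) {
        $read = array_values($pending);
        $error = array();
        $reject = array();
        if (mysqli_poll($read, $error, $reject, 1) === false) {
            break; // polling itself failed; stop instead of spinning
        }
        foreach ($read as $link) {
            $i = array_search($link, $pending, true);
            if ($i === false) {
                continue; // not a link we are waiting on
            }
            $result = $link->reap_async_query();
            if ($result instanceof mysqli_result) {
                $ret[$i] = $result->fetch_all(MYSQLI_NUM);
                $result->free();
            }
            unset($pending[$i]);
        }
        // Links with no outstanding async query will never become readable.
        foreach ($reject as $link) {
            $i = array_search($link, $pending, true);
            if ($i !== false) {
                unset($pending[$i]);
            }
        }
    }

    return $ret;
}
```

Unlike the row-by-row loop above, this version waits for each whole result set. The overlap comes from every server working on its query at the same time.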

Persistent PHP processes in Erlang/OTP

February 6, 2009 by Andy

Running PHP code from within Erlang is easy: os:cmd("php -r 'echo \"Hello, World!\";'"). This is fine when you need to run simple commands. When you demand more from PHP, this approach becomes awkward, wasteful, and eventually unusable. If your Erlang-to-PHP calls require large PHP applications, open connections to databases, or otherwise incur significant initialization overhead, you should maintain a pool of reusable PHP processes.

My first complete application for Erlang/OTP is php_app. It manages a pool of persistent PHP processes and provides a simple API to evaluate PHP code. I designed php_app to be robust and easy to use. It’s so easy, in fact, that I now use it to debug WordPress functions from within Erlang. Here is a sample session using start/0 and eval/1:

$ erl
Eshell V5.6.4  (abort with ^G)
1> php:start().
ok
2> php:eval("echo 'Hello, World!';
2>           trigger_error('Uh-oh!');
2>           return array(true, true);").
{ok,<<"Hello, World!">>,
    [{0,true},{1,true}],
    <<"Uh-oh!">>,continue}
3> 

In the resulting tuple we have the output, the return value, and the last error. The atom continue indicates that the PHP process is eligible for reuse, determined by its size in memory after evaluating my code. In the next example I’ll reserve a PHP process to demonstrate persistence and what happens when we hit the memory limit.

3> Ref = php:reserve().
#Ref<0.0.0.52>
4> php:eval("$a = array_fill(0, 200000, rand());
4>           return count($a);", Ref).
{ok,<<>>,200000,<<>>,continue}
5> php:eval("$a = array_merge($a, array_fill(0, 200000, rand()));
5>           return count($a);", Ref).
{ok,<<>>,400000,<<>>,break}
6> php:eval("return count($a);", Ref).
{ok,<<>>,0,<<"Undefined variable:  a">>,continue}
7> php:release(Ref).
ok
8>

The function reserve/0 removes a PHP process from the pool and returns a key that is used in eval/2. Without a key, we can’t be sure that the same PHP process will evaluate our next string of code. Notice the correct return value of 400000 and the atom break which indicates that the PHP process has been restarted because it exceeded the memory usage limit. Our Ref now points to a fresh PHP process. The reservation remains valid.

There are a few other return tuples: one for timeouts, one for parse errors, and one for exits. That last one includes fatal errors. They can’t be trapped. You’ll just have to refer to your error logs. (You do write code with a terminal tailing all your error logs, don’t you?) Here are some more bullet points to keep in mind:

  • Never define a PHP function without first testing function_exists because you will get a fatal error every time. This is by design.
  • Be mindful of escaping quotes and control characters. User input is the enemy. Test.
  • This app was written by an Erlang novice. Do not underestimate its potential destructive power.
  • Even so, it’s in use on a production system that makes lots of PHP calls.
  • The php module has EDoc for all of the API functions. The HTML version is included for completeness.
  • If you modify the PHPLOOP, you should restart the PHP processes. Try php:restart_all().

The code is here. All you have to do is compile it. I like to make:all(). The configuration is in php.app.

If you would like to contribute changes to the code or documentation, I will be happy to hear from you. My OTP stuff could benefit from a more experienced set of hands.

Hybrid nodetree plugin for ejabberd

December 21, 2008 by Andy

A current project of mine involves the Jabber extension PubSub (XEP-0060) and I chose ejabberd to power it. Ejabberd, the software that powers jabber.org, is written in Erlang. I am having a fantastic adventure learning Erlang. The beginner mind is where I feel most comfortable. But that’s another post.

Recently I wanted to make part of a nodetree virtual. A virtual nodetree is one that doesn’t store any nodes. Thus you can have infinitely many “virtual” nodes with zero records in the pubsub_node table. For any large set of identically configured, transient nodes, this is an ideal setup.

Ejabberd only supports one nodetree plugin per virtual host, so I created a nodetree plugin that lets me use as many different nodetree plugins as I like. A simple pattern match determines which nodetree plugin to call and calls it. Here is the pattern matching function:

nodetree(["dogs",_|_]) ->
	nodetree_dogs;
nodetree(["cats",_|_]) ->
	nodetree_cats;
nodetree(_) ->
	nodetree_default.

And this is how the rest of the plugin looks:

create_node(Key, Node, Type, Owner, Options) ->
    Nodetree = nodetree(Node),
    Nodetree:create_node(Key, Node, Type, Owner, Options).

Now I can have as many nodetree plugins as I want on a single virtual host. As a bonus, I don’t have to touch mod_pubsub when I add a new nodetree plugin. All I do is reload the nodetree_hybrid module and mod_pubsub never misses a beat. Erlang, you’re my hero.

Upgrade Memcached Before WordPress

November 19, 2008 by Andy

Self-hosted WordPress and WordPress MU administrators: if you are using the memcached object cache (a prerequisite for batcache), upgrade it before upgrading WordPress. There is a bug that keeps the old db_version in the options cache, preventing WordPress from remembering that it has been upgraded, and this causes it to ask you to upgrade again. In a pinch you can resolve the problem by restarting the memcached daemon.

The memcached object cache can’t be automatically upgraded because it’s not a normal plugin. Make sure to use the right version: 1.0 (sockets) or 2.0 (PECL). Only one line was changed, so you might prefer to update by hand: 1.0, 2.0.

If you aren’t sure whether you are using memcached, look for a file named object-cache.php in wp-content. If that file exists, look inside to see if the plugin name is “Memcached”.
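
The check above can be scripted. This is a minimal sketch, not part of WordPress: the function name is hypothetical, and it assumes the drop-in declares itself with a standard "Plugin Name: Memcached" header near the top of the file.

```php
<?php
/**
 * Sketch: does wp-content/object-cache.php look like the Memcached drop-in?
 * Assumes a "Plugin Name: Memcached" header; other header styles won't match.
 */
function uses_memcached_object_cache(string $wpContentDir): bool
{
    $file = rtrim($wpContentDir, '/') . '/object-cache.php';
    if (!is_readable($file)) {
        return false; // no object cache drop-in installed
    }
    // Plugin headers sit near the top, so reading the first 8 KB is enough.
    $head = (string) file_get_contents($file, false, null, 0, 8192);
    return preg_match('/Plugin Name:\s*Memcached/i', $head) === 1;
}

// Example: var_dump(uses_memcached_object_cache('/path/to/wp-content'));
```

A result of false only means the header was not found, so it is still worth opening the file if one exists.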