Automattic Stats for self-hosted WordPress

The new Automattic Stats plugin is available for download. It lets self-hosted WordPress bloggers use the exact same traffic metrics system we provide to WordPress.com users. It tracks post and page views, referrers, search terms, and clicks on your external links. It takes moments to install if you already have a WordPress blog and a WordPress.com API key. And it’s totally free.

Although the code is almost exclusively my work, I must give thanks for Matt’s guidance, Barry’s systems wrangling, and Rudy’s barbecue, each of which was indispensable.

The rest of this post will cover technical details of the system, how it works and why it’s cool. If you have a question I didn’t answer, leave a comment and I’ll do my best to answer.

How does it work?

The plugin adds a tiny image to your blog, and that image is hosted on our servers. Every time your blog is viewed by a browser with JavaScript enabled, the browser downloads that image and we see a new line in our server logs. We then process the server logs and insert the data into a big MySQL database that we use to generate the lists and charts on your stats page.

There’s a little more to it than that. The plugin adds the post ID and referrer to the image URL so we know what the visitor is looking at and where they came from. We examine the referrer and if it looks like a search engine, we sift out the search terms and save those instead. Our servers also communicate with your blog from time to time, such as when you update the title of a post.
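To make the flow concrete, here is a rough sketch in Python. It is not the plugin or server code. The endpoint, the parameter names (`blog`, `post`, `ref`), and the list of search engines are all assumptions made up for illustration. It shows the three ideas described above: put the post ID and referrer into the image URL, read them back out of the logged request, and keep the search terms instead of the referrer when the referrer looks like a search engine.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Hypothetical endpoint and parameter names, for illustration only.
PIXEL_ENDPOINT = "https://stats.example.com/g.gif"

# Hypothetical map of search engine domains to their query parameter.
SEARCH_ENGINES = {
    "google.com": "q",
    "bing.com": "q",
    "search.yahoo.com": "p",
}


def build_pixel_url(blog_id, post_id, referrer):
    """Return the image URL a page would request for one view."""
    params = {"blog": blog_id, "post": post_id, "ref": referrer}
    return PIXEL_ENDPOINT + "?" + urlencode(params)


def extract_search_terms(referrer):
    """Return the search terms if the referrer looks like a search engine, else None."""
    parts = urlsplit(referrer)
    host = (parts.hostname or "").lower()
    for domain, param in SEARCH_ENGINES.items():
        if host == domain or host.endswith("." + domain):
            terms = parse_qs(parts.query).get(param)
            if terms and terms[0].strip():
                return terms[0].strip()
    return None


def parse_view(pixel_url):
    """Turn a logged pixel URL back into a record ready for storage."""
    query = parse_qs(urlsplit(pixel_url).query)
    referrer = query.get("ref", [""])[0]
    terms = extract_search_terms(referrer)
    return {
        "blog": int(query["blog"][0]),
        "post": int(query["post"][0]),
        # Search terms are saved instead of the raw referrer.
        "referrer": None if terms else (referrer or None),
        "search_terms": terms,
    }
```

Calling `parse_view(build_pixel_url(1, 42, "https://www.google.com/search?q=wordpress+stats"))` yields a record with the search terms filled in and no referrer, while an ordinary referrer such as another blog is stored unchanged.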

What makes it fast?

When you run your own stats system, your blog server has to do a lot of extra work to track each visit. We take that load off your server to keep it snappy.

By serving the JavaScript from stats.wordpress.com, we take advantage of the browser cache so that no matter how many blogs are visited, the script is loaded only once per week.
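One common way to get that behavior is a cache lifetime of one week on the script response. The Python sketch below only illustrates the arithmetic and the shape of such headers. The function name and the exact header values are assumptions, not a description of what stats.wordpress.com actually sends.

```python
from datetime import timedelta

ONE_WEEK = timedelta(weeks=1)


def cache_headers(max_age=ONE_WEEK):
    """Build response headers that let browsers reuse the script for max_age."""
    seconds = int(max_age.total_seconds())
    return {
        "Content-Type": "application/javascript",
        # Assumed policy: any cache may keep the script for the full lifetime.
        "Cache-Control": f"public, max-age={seconds}",
    }
```

Because every blog references the same script URL, a browser that has fetched it once can reuse the cached copy on any other blog until the lifetime expires.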

Clicks are reported asynchronously. Rather than the more common method of mangling URLs and forcing the visitor to wait during a redirect, the click stats are tracked using elements of AJAX. Your hrefs are safe and your visitors experience no delays.

The tracking GIF loads fast because WordPress.com infrastructure just plain rocks.

What’s with the smiley face?

When we started developing stats for WordPress.com in 2005, Matt thought it would be cute. That’s his artwork.

No doubt, people will want to hide the smiley face. There are wrong ways to do this. Basically, anything that causes the image not to be loaded by the browser will break your stats.

Applying “display:none” to the image will break your stats. Don’t do it. If you want to hide the smiley face, add this to your stylesheet:

img#wpstats{width:0px;height:0px;overflow:hidden}

Why do my links point to WordPress.com?

All stats reports are rendered by our servers. We designed it this way for a lot of reasons. It’s faster this way because your server doesn’t have to connect to our server every time you look at your stats. It’s also better because we can update the reporting UI without forcing you to upgrade your plugin.

How much traffic can you handle?

The stats hardware is currently handling millions of views every day and we’re nowhere near capacity. We built this system with growth in mind. The software is ready to run on as many servers as we allocate for the purpose. Growing pains are inevitable but if we’ve done our job, you will never feel them.

Can I install this on my non-WordPress sites?

The short answer is that the system only supports WordPress blogs.

The long answer is that anyone with a thorough understanding of WordPress and XML-RPC could clone the plugin to work with other blogging platforms. I can’t prevent it, I won’t discourage it, I do expect it, and I don’t even mind it. There are pitfalls, however, and I do not plan to document the requirements. Here be dragons.

Anyone found abusing the system, causing undue loads on the servers, or inflicting headaches on me or Barry or anyone else, will be subject to having their API Key revoked and their name written in giant, fiery letters across the night sky to be cursed by all who see it. Please don’t abuse this free service.

Why does the date change before/after midnight?

To keep things fast and consistent, we are ignoring time zones and keeping all stats in UTC.
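A small Python sketch shows the effect. The function name, the visitor's UTC offset, and the dates are made up for illustration. A view is filed under the UTC calendar date, so for a blogger eight hours behind UTC the stats day rolls over at 4 pm local time rather than at midnight.

```python
from datetime import datetime, timedelta, timezone


def stats_day(view_time):
    """Return the UTC calendar date a timezone-aware view time is counted under."""
    if view_time.tzinfo is None:
        raise ValueError("view_time must be timezone-aware")
    return view_time.astimezone(timezone.utc).date()


# Hypothetical blogger eight hours behind UTC.
local = timezone(timedelta(hours=-8))
afternoon_view = datetime(2007, 5, 4, 15, 30, tzinfo=local)  # 23:30 UTC, same day
evening_view = datetime(2007, 5, 4, 17, 30, tzinfo=local)    # 01:30 UTC, next day

print(stats_day(afternoon_view))  # 2007-05-04
print(stats_day(evening_view))    # 2007-05-05
```

Both views happen on the same local day, yet they land in different daily totals because only the UTC date is considered.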