After Sabbatical at Automattic

In May of my 10th year at Automattic the company adopted a paid sabbatical policy: 5 years on, 3 months off. It wasn’t easy to consider how to spend this opportunity. It took a couple of weeks to decide that my first sabbatical should be personal, free-form and soon.

The policy was still new and evolving. One early amendment was a planning requirement: if taking 3 months off, give notice 3 months in advance. I got in before that went into effect (or abused my senior privilege, maybe) and chose July, August and September of 2015.

It took the month of June to clean up after myself, document things and pass leadership of our in-house Hadoop projects to Xiao Yu. I offered to be on call for emergencies but Xiao knew that it was just the workahol talking. No ping would issue.

The first couple of weeks were a period of withdrawal. Soon I was able to stop scanning my inbox for important requests. One work-related tweet momentarily elevated my pulse but it was actually nothing.

We took the kids to Vermont to visit friends and family for a month. This was our shortest Vermont summer trip to date. We used to stay for fall foliage but the kids are in school now. Automattic lets me work from anywhere, so in past years I would just rent office space and keep working. But this time it was a real summer vacation for the whole family.

On the way home we crossed into Canada to see Niagara Falls. That’s one box checked, but probably worth doing again in a few years.

Other than a few Swift tutorials and false starts with iOS ideas, I didn’t code at all. Everything I did was in Xcode so my Emacs hands will need some exercise.

One significant event was my oldest child starting kindergarten. Every weekday morning at 7:15 we walk to school. This is my first externally enforced daily routine in more than ten years. This rekindled my interest in knowing what time it is throughout the day. So I got an Apple Watch.

Automattic can change a lot in just three months. People come and go, projects advance, priorities evolve. Three things are making the reintegration easy: full documentation, good search and great coworkers. It’s good to be back.

Want to work here or work with me? Apply!

Batcache for WordPress

[I meant to publicize this after a period of quiet testing and feedback but the watchdogs at WLTC upended the kitten bag and forced my hand. Batcache comes with all the usual disclaimers. If you try it on a production server expect the moon to fall on your head.]

People say WordPress can’t perform under pressure. The way most people set it up, that’s true. For those who host their blog for $7.99 a month (do they also run Vista on an 8086?) the best bet is to serve static pages rather than dynamic pages. Donncha’s WP-Super-Cache does that brilliantly. I’ve seen it raise a server’s capacity for blog traffic by one hundred times or more. It’s a cheapskate’s dream.

WP-Super-Cache is good for anyone with a single web server with a writable wp-content/cache directory. To them, the majority, I say use WP-Super-Cache. What about enterprises with multiple servers that don’t share disk space? If you can’t or won’t use file-based caching, I have something for you. It’s based on what WordPress.com uses. It’s Batcache.

Batcache will protect you

Batcache implements a very simplistic caching model that shields your database and web servers from traffic spikes: after a document has been requested X times in Y seconds, the document is cached for Z seconds and all new users are served the cached copy.

New users are defined as anybody who hasn’t interacted with your domain. Once they’ve left a comment or logged in, their cookies will ensure they get fresh pages. People arriving from Digg won’t notice that the comments are a minute or two behind but they’ll appreciate your site being up.

You don’t need PHP skills to install Batcache but you do have to get Memcached working first. That can be easy or hard. We use Memcached because it’s awesome. Once you know how to install it you can create the same kind of distributed, persistent cache that underpins web giants like WordPress.com and Facebook.

What Batcache does

The first thing Batcache does is decide whether the visitor is eligible to receive cached documents. If their cookies don’t show evidence of previous interaction on that domain they are eligible. Next it decides whether the request is eligible for caching. For example, Batcache won’t interfere when a comment is being posted.
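The two eligibility checks can be sketched roughly as follows. This is an illustration in Python, not Batcache's PHP code: the cookie-name prefixes and the rule that only GET and HEAD requests are cacheable are assumptions made for the example.

```python
# Hypothetical cookie-name prefixes that signal prior interaction
# (assumed for illustration; not taken from Batcache's source).
INTERACTION_COOKIE_PREFIXES = ("wordpress_logged_in", "comment_author")


def visitor_is_eligible(cookies):
    """A visitor is eligible if no cookie shows previous interaction."""
    return not any(
        name.startswith(INTERACTION_COOKIE_PREFIXES) for name in cookies
    )


def request_is_eligible(method):
    """Assumption: only read-only requests are cacheable, so a comment
    being posted (a POST) always passes straight through."""
    return method.upper() in ("GET", "HEAD")


def may_serve_from_cache(method, cookies):
    """Both the request and the visitor must be eligible."""
    return request_is_eligible(method) and visitor_is_eligible(cookies)
```

With these assumed rules, a first-time reader fetching a post is eligible, while the same reader is not once a comment cookie has been set.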

If the visitor and the request are eligible, Batcache enters its traffic metering routine. By default it looks for URLs that receive more than two hits from unrecognized users in two minutes. When a URL’s traffic crosses that threshold, Batcache caches the document for five minutes. You can configure these numbers any way you like, or turn off traffic metering and send documents right to the cache.
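To make the defaults concrete, here is a minimal Python sketch of the metering idea: more than two hits in two minutes caches the document for five minutes. The class name, parameter names and the in-process dictionaries standing in for Memcached are all assumptions for illustration; Batcache's real implementation is PHP and may count hits differently.

```python
import time


class MeteredCache:
    """Cache a URL once it gets more than `times` hits within `seconds`.

    Dictionaries stand in for Memcached here (an assumption for the sketch).
    Setting times=0 turns metering off and caches on the first request.
    """

    def __init__(self, times=2, seconds=120, max_age=300, clock=time.time):
        self.times = times
        self.seconds = seconds
        self.max_age = max_age
        self.clock = clock
        self._hits = {}  # url -> (window_start, hit_count)
        self._docs = {}  # url -> (expires_at, document)

    def fetch(self, url, generate):
        now = self.clock()
        cached = self._docs.get(url)
        if cached and cached[0] > now:
            return cached[1]  # still fresh: serve the cached copy

        document = generate()  # otherwise build the page as usual

        if self.times == 0:
            hot = True  # metering disabled
        else:
            start, count = self._hits.get(url, (now, 0))
            if now - start >= self.seconds:
                start, count = now, 0  # window elapsed: start counting again
            count += 1
            self._hits[url] = (start, count)
            hot = count > self.times

        if hot:
            self._docs[url] = (now + self.max_age, document)
            self._hits.pop(url, None)
        return document
```

Note that nothing in the sketch invalidates a cached document early. It is simply served until `max_age` seconds have passed, which matches the behavior described next.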

Once a document has been cached, it is served to eligible visitors until it expires. This is one place where Batcache is different. Most other caches delete cached documents as soon as the underlying data changes. Batcache doesn’t care if it’s serving old data because “old” is relative (and configurable).

What Batcache doesn’t do

It doesn’t guarantee a current document. I repeat this because reliable cache invalidation is a typical feature that was purposefully omitted from Batcache. There is a routine in the included plugin that tries to trigger regeneration of updated and commented posts but in some situations a document will still live in the cache until it expires. This routine will be improved over time but it is only an afterthought.

Batcache doesn’t automatically know the difference between document variants. Variants exist when two requests for the same URL can yield two different documents. Common examples are user agent-dependent variants formatted for mobile devices and referrer-dependent variants with Google search terms highlighted. In these cases you MUST take extra steps to inform Batcache about variants to avoid serving a variant to the wrong audience. The source code includes examples of how to turn off caching of uncommon variants (search term highlighting) or cache common variants separately (mobile versions).
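One way to picture the two strategies is as a cache-key function: uncommon variants opt out of caching, and common variants get their own key. The Python sketch below is only an illustration. The detection helpers are deliberately crude and hypothetical, and Batcache's own PHP configuration expresses this differently.

```python
import hashlib


def is_mobile(user_agent):
    """Crude, hypothetical mobile check used only for this example."""
    return "Mobile" in user_agent


def has_search_referrer(referrer):
    """Hypothetical check for a referrer that carries search terms."""
    return "google." in referrer and "q=" in referrer


def cache_key(url, user_agent="", referrer=""):
    """Return a cache key, or None when the response should not be cached."""
    if has_search_referrer(referrer):
        return None  # uncommon variant: skip the cache entirely
    # Common variant: fold it into the key so each audience gets its own copy.
    variant = "mobile" if is_mobile(user_agent) else "desktop"
    raw = url + "|" + variant
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()
```

Two requests for the same URL share a cached document only when they produce the same key, so a mobile page is never handed to a desktop browser.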

Where Batcache is going

I want to make Batcache easier to configure by adding a configuration page and storing the main settings in Memcached as well as the database. This way you won’t have to deploy a code change to update the configuration. However, conditional configurations (e.g. “never cache URLs matching some pattern”) and variant detection will probably always live in PHP.

I want to have Batcache serve correct headers more reliably. On some servers it can detect the headers that were sent with a newly generated page and serve them again from the cache. But when that doesn’t work you will have to take extra steps to serve certain headers. For example you must specify the Content-Encoding header in the Batcache configuration or add it to php.ini. I want this sort of thing to be done automatically for all server setups.

I know that Batcache is not ideal for most WordPress installations. It saves us a lot of headaches and expense at WordPress.com, so maybe it can help other large installations. If you try it, I want to hear from you whether it worked and how well. I am also keen to see what new configurations and modifications you use.

As always, this software is provided without claims or warranties. It’s so experimental that it doesn’t even have a version number! Until the project grows to need its own blog, keep an eye on the Trac browser for updates.

Austin WordPress Professional Office

We’re thinking of opening an Automattic office in Austin. Being the only local employee, I imagine sharing the space with independent/satellite WordPress professionals. Are you interested?

Candidates should be earning all or part of their income working on WordPress (developing, designing, or servicing) and be able to defend their choice of editor. Benefits may include collaboration, networking, social opportunities, fortune, fame, romance, and French pressed Ruta Maya coffee.

We don’t have any locations in mind yet. Please include your home zip code for geographical tabulation. If you can recommend a cool location with flexible space, good bandwidth, and no long-term commitments, please do.