Erlang is a hoarder

One day you set aside a shoebox to store newspaper clippings. Suddenly you are trapped under an avalanche of whole newspapers and wondering how long your body will lie there before anyone misses you.

That is what kept happening to my Erlang apps. They would store obsolete binary data in memory until memory filled up. Then they would go into swap and become unresponsive and unrecoverable. Eventually somebody would notice the smell and restart the server.

The problem seems to be related to Erlang’s memory management optimizations. Sometimes an optimization becomes pathological. If you store a piece of binary data for a while (a newspaper clipping) Erlang “optimizes” by remembering the whole binary (the newspaper). When you remove all references to that data (toss the clipping) Erlang sometimes fails to purge the data (lets the newspapers pile up everywhere). If nobody shows up to collect the garbage, Erlang dies an embarrassing death.

The first step to recovery is to monitor the app’s memory footprint and log in every so often to sweep out the detritus. It can be tricky to find the PIDs that need attention and tragic if you arrive too late. The permanent solution is to build periodic garbage collection into the app. It’s not hard to do. The only hazard is doing it too often since it incurs some CPU overhead.
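Finding the PIDs that need attention can be partly automated. Here is a minimal sketch, assuming that ranking processes by the memory item of erlang:process_info/2 is a good enough first pass. The module name mem_top and its functions are hypothetical. Note that the memory item covers the process's own heap and stack; large binaries live outside the process heap, so a process pinning a lot of binary data may rank lower than expected.

```erlang
-module(mem_top).
-export([top/1, sweep/1]).

%% Return up to N {Pid, Bytes} pairs, largest memory footprint first.
%% Processes that exit while we are looking return undefined from
%% process_info/2 and are filtered out by the generator pattern.
top(N) when is_integer(N), N > 0 ->
    Sizes = [{Pid, Bytes}
             || Pid <- erlang:processes(),
                {memory, Bytes} <- [erlang:process_info(Pid, memory)]],
    Sorted = lists:reverse(lists:keysort(2, Sizes)),
    lists:sublist(Sorted, N).

%% Force a garbage collection on each of the N largest processes.
sweep(N) ->
    [erlang:garbage_collect(Pid) || {Pid, _Bytes} <- top(N)],
    ok.
```

From a shell attached to the running node, mem_top:top(10) lists the candidates and mem_top:sweep(10) collects them by hand.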

Each time I have found an app doing this, I’ve had to locate the offending module and install explicit garbage collection. If there is a periodic event, such as a timeout that happens every second, I’ll use it to call something like this:

%% Force a garbage collection of the calling process once every 60 ticks.
gc(Tick) ->
    case Tick rem 60 of
        0 -> erlang:garbage_collect(self());
        _ -> ok
    end.

Today I installed this simple code and here is the result:

[Graph: Memory footprint reduced drastically]

[Graph: CPU utilization raised slightly]

For the cost of 5% of one CPU core I stopped the cycle of swap and restart. I would like to learn why my binaries are not being garbage collected automatically. The processes involved queue the binaries in lists for a short time, then send them to socket loops which dispose of them via gen_tcp:send/2. Setting fullsweep_after to 0 had no effect. I’ll be interested in any theories. However, I’m not looking for a new solution since mine is satisfactory. I hope other Erlang hackers find it useful.
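For a process that has no periodic event of its own, the same idea can be driven by a timer. The following is a sketch under stated assumptions, not the code running in my apps: the module gc_tick, the one-second interval, and the plain spawned process standing in for a real worker are all hypothetical.

```erlang
-module(gc_tick).
-export([start/1, stop/1, due/2]).

%% Start a process that collects its own garbage every Every seconds.
%% ASSUMPTION: a bare process stands in for the real worker.
start(Every) when is_integer(Every), Every > 0 ->
    spawn(fun() -> schedule(), loop(1, Every) end).

stop(Pid) ->
    Pid ! stop,
    ok.

%% True when Tick is a multiple of Every.
due(Tick, Every) ->
    Tick rem Every =:= 0.

%% Ask the runtime to send us a tick message one second from now.
schedule() ->
    erlang:send_after(1000, self(), tick).

loop(Tick, Every) ->
    receive
        tick ->
            case due(Tick, Every) of
                true -> erlang:garbage_collect();
                false -> ok
            end,
            schedule(),
            loop(Tick + 1, Every);
        stop ->
            ok;
        _Other ->
            %% A real worker would handle its own messages here.
            loop(Tick, Every)
    end.
```

Calling gc_tick:start(60) gives roughly one collection per minute, the same rate as the gc/1 function above. A smaller value trades more CPU for a flatter memory graph.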

Persistent PHP processes in Erlang/OTP

Running PHP code from within Erlang is easy: os:cmd("php -r 'echo \"Hello, World!\";'"). This is fine when you need to run simple commands. When you demand more from PHP, this approach becomes awkward, wasteful, and eventually unusable. If your Erlang-to-PHP calls require large PHP applications, open connections to databases, or somehow incur significant initialization overhead, you should maintain a pool of reusable PHP processes.
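To make the difference between a one-shot os:cmd/1 call and a reusable process concrete, here is a small sketch of a persistent external process held open through an Erlang port. It assumes a Unix system and uses cat as a stand-in for a long-lived interpreter, since cat simply echoes each line back. The module name persistent_port is hypothetical, and this is not how php_app is implemented.

```erlang
-module(persistent_port).
-export([start/0, echo/2, stop/1]).

%% Open a long-lived external process.
%% ASSUMPTION: cat stands in for an interpreter that answers line by line.
start() ->
    open_port({spawn, "cat"}, [binary, {line, 4096}]).

%% Send one line and wait for the echoed reply.
%% Must be called from the process that opened the port, because that
%% process is the one that receives the port's messages.
echo(Port, Line) when is_binary(Line) ->
    true = port_command(Port, [Line, $\n]),
    receive
        {Port, {data, {eol, Reply}}} -> {ok, Reply}
    after 5000 ->
        {error, timeout}
    end.

%% Close the port, which also ends the external process's input.
stop(Port) ->
    true = port_close(Port),
    ok.
```

The external process is started once and then answers any number of requests, so whatever startup cost it has is paid a single time rather than on every call.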

My first complete application for Erlang/OTP is php_app. It manages a pool of persistent PHP processes and provides a simple API to evaluate PHP code. I designed php_app to be robust and easy to use. It’s so easy, in fact, that I now use it to debug WordPress functions from within Erlang. Here is a sample session using start/0 and eval/1:

$ erl
Eshell V5.6.4  (abort with ^G)
1> php:start().
ok
2> php:eval("echo 'Hello, World!';
2>           trigger_error('Uh-oh!');
2>           return array(true, true);").
{ok,<<"Hello, World!">>,
    [{0,true},{1,true}],
    <<"Uh-oh!">>,continue}
3> 

In the resulting tuple we have the output, the return value, and the last error. The atom continue indicates that the PHP process is eligible for reuse, determined by its size in memory after evaluating my code. In the next example I’ll reserve a PHP process to demonstrate persistence and what happens when we hit the memory limit.

3> Ref = php:reserve().
#Ref<0.0.0.52>
4> php:eval("$a = array_fill(0, 200000, rand());
4>           return count($a);", Ref).
{ok,<<>>,200000,<<>>,continue}
5> php:eval("$a = array_merge($a, array_fill(0, 200000, rand()));
5>           return count($a);", Ref).
{ok,<<>>,400000,<<>>,break}
6> php:eval("return count($a);", Ref).
{ok,<<>>,0,<<"Undefined variable:  a">>,continue}
7> php:release(Ref).
ok
8>

The function reserve/0 removes a PHP process from the pool and returns a key that is used in eval/2. Without a key, we can’t be sure that the same PHP process will evaluate our next string of code. Notice the correct return value of 400000 and the atom break which indicates that the PHP process has been restarted because it exceeded the memory usage limit. Our Ref now points to a fresh PHP process. The reservation remains valid.
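The five-element tuples in the sessions above are easy to take apart with a pattern match. This sketch assumes only the {ok, Output, Return, LastError, Status} shape shown in those sessions; the module php_result and the labels it produces are hypothetical, and the other return tuples are passed through untouched because their shapes are not shown here.

```erlang
-module(php_result).
-export([summarize/1]).

%% Turn a successful eval result into a labelled property list.
summarize({ok, Output, Return, LastError, Status}) ->
    [{output, Output},
     {return, Return},
     {last_error, LastError},
     {process, process_state(Status)}];
%% Anything else (timeouts, parse errors, exits) is handed back as-is.
summarize(Other) ->
    {unrecognized, Other}.

%% continue: the PHP process stays in service.
%% break: the PHP process was restarted after exceeding the memory limit.
process_state(continue) -> kept;
process_state(break) -> restarted.
```

For example, php_result:summarize(php:eval("return 1;")) would label the return value and report whether the PHP process was kept or restarted.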

There are a few other return tuples: one for timeouts, one for parse errors, and one for exits. That last one includes fatal errors. They can’t be trapped. You’ll just have to refer to your error logs. (You do write code with a terminal tailing all your error logs, don’t you?) Here are some more bullet points to keep in mind:

  • Never define a PHP function without first testing function_exists. The PHP processes are persistent, so defining the same function a second time is a fatal error every time. This is by design.
  • Be mindful of escaping quotes and control characters. User input is the enemy. Test.
  • This app was written by an Erlang novice. Do not underestimate its potential destructive power.
  • Even so, it’s in use on a production system that makes lots of PHP calls.
  • The php module has EDoc for all of the API functions.  The HTML version is included for completeness.
  • If you modify the PHPLOOP, you should restart the PHP processes. Try php:restart_all().
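On the point about escaping: one way to embed an untrusted Erlang string in PHP code is to wrap it in a PHP single-quoted literal, where only the backslash and the single quote need escaping. This is a minimal sketch, assuming the input is a flat list of characters; the module php_quote is hypothetical and is not part of php_app. Test it against your own inputs before trusting it.

```erlang
-module(php_quote).
-export([single/1]).

%% Wrap an Erlang string in a PHP single-quoted literal.
%% Inside single quotes PHP treats only \\ and \' as escape sequences,
%% so those are the only two characters rewritten here.
single(Str) when is_list(Str) ->
    "'" ++ lists:flatmap(fun escape/1, Str) ++ "'".

escape($\\) -> "\\\\";
escape($') -> "\\'";
escape(Char) -> [Char].
```

A call such as php:eval("return strlen(" ++ php_quote:single(Input) ++ ");") then passes Input to PHP as data rather than as code.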

The code is here. All you have to do is compile it. I like to make:all(). The configuration is in php.app.

If you would like to contribute changes to the code or documentation, I will be happy to hear from you. My OTP stuff could benefit from a more experienced set of hands.

Hybrid nodetree plugin for ejabberd

A current project of mine involves the Jabber extension PubSub (XEP-0060) and I chose ejabberd to power it. Ejabberd, the software that powers jabber.org, is written in Erlang. I am having a fantastic adventure learning Erlang. The beginner mind is where I feel most comfortable. But that’s another post.

Recently I wanted to make part of a nodetree virtual. A virtual nodetree is one that doesn’t store any nodes. Thus you can have infinitely many “virtual” nodes with zero records in the pubsub_node table. For any large set of identically configured, transient nodes, this is an ideal setup.

Ejabberd only supports one nodetree plugin per virtual host, so I created a nodetree plugin that lets me use as many different nodetree plugins as I like. A simple pattern match determines which nodetree plugin to call and calls it. Here is the pattern matching function:

nodetree(["dogs",_|_]) ->
    nodetree_dogs;
nodetree(["cats",_|_]) ->
    nodetree_cats;
nodetree(_) ->
    nodetree_default.

And this is how the rest of the plugin looks:

create_node(Key, Node, Type, Owner, Options) ->
    Nodetree = nodetree(Node),
    Nodetree:create_node(Key, Node, Type, Owner, Options).

Now I can have as many nodetree plugins as I want on a single virtual host. As a bonus, I don’t have to touch mod_pubsub when I add a new nodetree plugin. All I do is reload the nodetree_hybrid module and mod_pubsub never misses a beat. Erlang, you’re my hero.
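The dispatch can be tried in isolation. The sketch below repeats the pattern-matching function and adds a generic forwarder. It assumes node names are lists of path segments, as the patterns above suggest; the module nodetree_dispatch and the helper forward/3 are hypothetical and are not part of ejabberd or of my plugin.

```erlang
-module(nodetree_dispatch).
-export([nodetree/1, forward/3]).

%% Pick a plugin module from the first path segment. The _|_ tail means
%% at least one segment must follow "dogs" or "cats", so the top-level
%% nodes themselves fall through to the default plugin.
nodetree(["dogs", _ | _]) -> nodetree_dogs;
nodetree(["cats", _ | _]) -> nodetree_cats;
nodetree(_) -> nodetree_default.

%% Forward any callback to whichever plugin owns the node.
%% ASSUMPTION: the selected module exports Function with matching arity.
forward(Node, Function, Args) ->
    apply(nodetree(Node), Function, Args).
```

With this helper, create_node/5 could be written as forward(Node, create_node, [Key, Node, Type, Owner, Options]), and the other callbacks would follow the same shape.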