Simple maintenance mode scripts for lighttpd

June 10th, 2008 by Ryan

We recently switched to using lighttpd 1.5. Under lighttpd 1.4, we had a custom 500 page configured for our “maintenance mode”. We’d just take down the fastcgi daemon and if there were any requests while it was down, lighttpd would stop trying to talk to it for 5 seconds and instead serve out our maintenance page. Seemed ok.

Well, lighttpd 1.5 doesn’t try to talk to the fastcgi backend again for 60 seconds, and instead of serving back an error it just leaves the socket open, so the user’s browser essentially hangs. Not good. As a result, we wrote a script to swap out our live fastcgi process without dropping any requests (for hot swapping), and we also came up with a real “maintenance mode” for lighttpd (for real downtime like a complicated DB schema upgrade). I’ll share the fastcgi hot-swap script in a future post. Today I’ll discuss our lighttpd maintenance mode.

Our scheme doesn’t require lighttpd 1.5, but it does require that lighttpd be built with LUA support. If you do `lighttpd -V` you should see a line like ‘+ LUA support’. Kevin Worthington has built lighttpd 1.5.0 r1992 RPMs that have LUA/mod_magnet support compiled in.
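If you want to script that check, here is a minimal sketch. The function name `has_lua_support` is hypothetical, and it assumes the feature line reads ‘+ LUA support’ as described above. It reads the version output on stdin, so you can try it without lighttpd installed.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the text on stdin contains "+ LUA support".
# -F treats the pattern as a fixed string, so the "+" is matched literally.
has_lua_support() {
    grep -qF -- '+ LUA support'
}

# Usage (assumes lighttpd is on the PATH):
#   lighttpd -V | has_lua_support && echo "LUA support is compiled in"
```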

We got the idea to use mod_magnet like this from John Leach’s blog post on maintenance pages, status codes, and lighttpd, but we removed all logic from the LUA script and added a workaround for a lighttpd bug.

The interface to our maintenance functionality is two shell scripts. Turn on maintenance mode with /etc/lighttpd/down and go back live with /etc/lighttpd/up. This interface is easy to use, and it abstracts away the exact method of turning things on and off. If we decide to change things later (say, touch and rm a special file or something), we can make the changes in one place.

In lighttpd.conf, we need to load mod_magnet and make sure that our maint.lua script runs for all requests.

# /etc/lighttpd/lighttpd.conf (Sample Code)
server.modules += ( "mod_magnet" )
magnet.attract-raw-url-to = ( "/etc/lighttpd/maint.lua" )

Now I make the maint.lua.up file. It does nothing. We could instead have our script run logic to determine whether or not to serve out the maintenance page, but I don’t really want LUA code running on every request if I can help it. And since I can’t claim to properly know LUA anyway, I want to keep things as simple as possible.

-- /etc/lighttpd/maint.lua.up (Sample Code)
-- This file is deliberately empty.

Now for maint.lua.down. It just serves out /etc/lighttpd/maint.html. Simple, huh? Well, actually, we need to work around lighttpd ticket #1420, so we have that hacky div in there. Oh, and I throw in a header to make it easy to distinguish this planned 503 from other kinds of wild unexpected 503s.

-- /etc/lighttpd/maint.lua.down (Sample Code)
lighty.header["X-Maintenance-Mode"] = "1"
lighty.content = {
    { filename = "/etc/lighttpd/maint.html" },
    "<div style=\"display:none\">",
    "<!-- work around lighttpd ticket 1420 -->"
}
return 503

As for /etc/lighttpd/maint.html, put whatever you want in there. That’s your maintenance page.
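To tell the planned 503 apart from an unexpected one in a monitoring script, you could check for the X-Maintenance-Mode header. This is only a sketch: the function name `is_planned_maintenance` is hypothetical, and the URL in the usage comment is a placeholder. It reads raw response headers on stdin so the parsing can be tried without a running server.

```shell
#!/bin/bash
# Hypothetical helper: succeeds if the response headers on stdin include
# the "X-Maintenance-Mode: 1" header set by maint.lua.down.
is_planned_maintenance() {
    # Header names are case-insensitive and header lines end in CRLF,
    # so strip the carriage returns and match without regard to case.
    tr -d '\r' | grep -qi '^x-maintenance-mode:[[:space:]]*1$'
}

# Usage (placeholder URL; -D - dumps the response headers to stdout):
#   curl -s -o /dev/null -D - http://localhost/ | is_planned_maintenance \
#       && echo "planned maintenance"
```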

Now we just need two super simple scripts to swap things out:

#!/bin/bash
# /etc/lighttpd/up (Sample Code)
cp /etc/lighttpd/maint.lua.up /etc/lighttpd/maint.lua
sleep 8

#!/bin/bash
# /etc/lighttpd/down (Sample Code)
cp /etc/lighttpd/maint.lua.down /etc/lighttpd/maint.lua
sleep 8

Why ‘sleep 8’? Well, since we’re going to want to do things like call /etc/lighttpd/down just before bringing down the backend, we want some kind of real guarantee that nobody is going to get the unresponsive browser behavior. We did some tests and it seemed like it took 8 seconds for all the relevant caches to flush so that we consistently got back a 503 from the server. I imagine that would be different for other people.
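If you want to find your own number, something like the following could help. It is a rough sketch and not the test we actually ran: the function name `count_probes_until_steady_503` and the `PROBE_INTERVAL` variable are hypothetical, and the URL in the usage comment is a placeholder. It keeps probing until it has seen a run of consecutive 503s, and it loops forever if that never happens.

```shell
#!/bin/bash
# Hypothetical helper: run a probe command repeatedly until it has printed
# "503" for $1 consecutive runs, then print how many probes that took.
# Everything after the first argument is the probe command itself.
count_probes_until_steady_503() {
    local needed=$1
    shift
    local streak=0 probes=0
    while [ "$streak" -lt "$needed" ]; do
        probes=$((probes + 1))
        if [ "$("$@")" = "503" ]; then
            streak=$((streak + 1))
        else
            streak=0   # any other status resets the run
        fi
        # Pause between probes (1 second unless PROBE_INTERVAL says otherwise).
        if [ "$streak" -lt "$needed" ]; then
            sleep "${PROBE_INTERVAL:-1}"
        fi
    done
    echo "$probes"
}

# Usage (placeholder URL): start this right after switching maint.lua over.
#   count_probes_until_steady_503 5 \
#       curl -s -o /dev/null -w '%{http_code}' http://localhost/
```

With the default one-second interval, the printed count is roughly the number of seconds it took, including the final run of 503s.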

So that’s it. I hope it’s disappointingly (or perhaps refreshingly) simple. Two simple shell scripts, an exceedingly simple LUA script, and no LUA code to run unless you’re in maintenance mode. The only overhead here is that lighttpd will stat the LUA file occasionally, but it’s good at doing that unobtrusively.