Web proxy in node.js for high availability

written by paul on January 31st, 2010 @ 11:20 PM

Update (3/8/10): Updated code to work with version 0.1.30 of node.js

I’ve been thinking about high availability websites lately. In particular, I want sites that can be upgraded (including database migrations or even infrastructure changes) without downtime.

I’ve also been playing with node.js lately, and I decided to spike out a web proxy that would sit between users and the actual website (e.g., a rails app). When performing upgrades, the proxy would hold users’ connections and wait. Once the upgrade was done, the proxy would forward requests as usual. Users would see an extra long request, but as long as the upgrade was short (e.g., less than a minute), the user should not know the site was down.

This type of proxy server seems like a good fit with node. Node’s event model means that there will be very little overhead when holding connections. There are no threads stacking up and waiting. Since everything is non-blocking, this server should scale well.

Here is a very simple version of the code:


var fs = require('fs'),
   sys = require('sys'),
  http = require('http');

http.createServer(function (req, res) {
  checkBalanceFile(req, res);
}).listen(8000);

function checkBalanceFile(req, res) {
  fs.stat("balance", function(err) {
    if (err) {
      setTimeout(function() {checkBalanceFile(req, res)}, 1000);
    } else {
      passThroughOriginalRequest(req, res);
    }
  });
}

function passThroughOriginalRequest(req, res) {
  var request = http.createClient(2000, "localhost").request("GET", req.url, {});
  request.addListener("response", function (response) {
    res.writeHeader(response.statusCode, response.headers);
    response.addListener("data", function (chunk) {
      res.write(chunk);
    });
    response.addListener("end", function () {
      res.close();
    });
  });
  request.close();
}

sys.puts('Server running at http://127.0.0.1:8000/');

Here is a gist if anyone would like to fork.

Basically, I use http.createServer to create a web server on port 8000. On incoming requests, I call checkBalanceFile. This method will try to stat a local file called balance. If it finds it, it will call passThroughOriginalRequest, which forwards the request to another web server on port 2000. If the balance file does not exist, I use setTimeout to call checkBalanceFile again in one second.

With a proxy server like this, the main application can be upgraded by removing the balance file. While the file is missing, the node web server will hold all of the connections and check every second for the reappearance of the balance file. Once it comes back, all requests will be forwarded along and then streamed back to the user.
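
The remove-then-restore step could be scripted along these lines. This is only a sketch, assuming a current Node.js release rather than 0.1.30; the helper names and the runUpgrade callback are hypothetical, and only the balance file itself comes from the spike.

```javascript
// Sketch only: assumes a current Node.js release. The helper names and the
// runUpgrade callback are hypothetical; only the "balance" file is from the post.
const fs = require('fs');

// Remove the balance file so the proxy starts holding requests.
function holdRequests(file) {
  try {
    fs.unlinkSync(file);
  } catch (err) {
    if (err.code !== 'ENOENT') throw err; // an already-missing file is fine
  }
}

// Recreate the (empty) balance file so the proxy resumes forwarding.
function resumeRequests(file) {
  fs.closeSync(fs.openSync(file, 'w'));
}

// Hold traffic, run the upgrade, and restore the file afterwards.
function upgradeWithHold(file, runUpgrade) {
  holdRequests(file);
  try {
    runUpgrade();
  } finally {
    resumeRequests(file);
  }
}
```

Whether to restore the file after a failed upgrade is a judgment call. This sketch restores it so that held requests are not left waiting indefinitely.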

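Currently, this spike only works with GET requests and does not pass the original request headers (or any request body) through, since I wanted to keep the code simple. Response headers from the backend are passed back to the user.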
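
A minimal sketch of how the method, headers, and body might be passed through, assuming a current Node.js release (where http.request and stream piping take the place of the 0.1.30 calls above). buildUpstreamOptions is a hypothetical helper; the backend on localhost port 2000 is the one from the spike.

```javascript
// Sketch only: assumes a current Node.js release, not the 0.1.30 API above.
const http = require('http');

// Hypothetical helper: build the options for the upstream request,
// copying the caller's method and headers instead of hard-coding GET.
function buildUpstreamOptions(req) {
  return {
    host: 'localhost',
    port: 2000, // same backend port as the spike
    method: req.method,
    path: req.url,
    headers: req.headers,
  };
}

// Forward the request and stream bodies in both directions.
function passThroughOriginalRequest(req, res) {
  const upstream = http.request(buildUpstreamOptions(req), function (response) {
    res.writeHead(response.statusCode, response.headers);
    response.pipe(res); // stream the backend's response to the user
  });
  upstream.on('error', function () {
    // Backend unreachable: answer with a gateway error if nothing was sent yet.
    if (!res.headersSent) res.writeHead(502);
    res.end();
  });
  req.pipe(upstream); // stream any request body, then end the upstream request
}
```

A real proxy would likely also need to filter hop-by-hop headers such as Connection before forwarding; that is left out here to keep the sketch short.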

Comments

  • Adam on 01 Feb 00:43

    I'd suggest changing it to require the file to set it in maintenance mode. That way if you forget the file when setting up a new site you don't have to dig through your stack trying to figure out why requests are hanging.
  • Paul Gross on 01 Feb 10:35

    Adam, that's a good idea, although I wanted to keep the spike simple and server agnostic.
  • jtarchie on 01 Feb 22:46

    Isn't stat a costly operation on a file? I honestly don't know, but it would seem that having to check for the existence of the file on each request would add a delay.
  • Paul Gross on 01 Feb 23:43

    jtarchie, I'm not sure if stat is costly. You could always change it so it polls for the balance file once a second and stores the result. Something like:
    
    var isBalanced = false;
    
    function checkBalanceFile() {
      var promise = posix.stat("balance");
      promise.addCallback(function () {
        isBalanced = true;
      });
      promise.addErrback(function () {
        isBalanced = false;
      });
      setTimeout(checkBalanceFile, 1000);
    }
    checkBalanceFile();
    
    Then, you could just check isBalanced on each request.
  • Brian on 28 Feb 20:27

    Paul, yes indeed. That's a typical optimization for any sort of expensive and cacheable system call.
  • Paul Gross on 08 Mar 00:26

    The code in the comment above looks like this in node.js version 0.1.30:
    
    var isBalanced = false;
    
    function checkBalanceFile() {
      fs.stat("balance", function(err) {
        isBalanced = !err;
      });
      setTimeout(checkBalanceFile, 1000);
    }
    checkBalanceFile();
    
  • Zhmai on 16 Mar 05:57

    Paul: Nice idea, and shows how clean this is to implement in node. As a newcomer to node, thanks for posting the revised code for node v 0.1.30+ -- it has been a bit of a challenge learning node by example when, alas, so many of the examples out on the Web suffer from being < 0.1.30 syntax -- very thoughtful of you to amend! Curious though, in your proxy, isn't passThroughOriginalRequest assuming a GET request? Shouldn't you map through the caller's HTTP method? Or is that happening, and I am missing something?
  • Zhami on 16 Mar 08:19

    Ooops - sorry, missed your last sentence about HTTP method and headers...
  • mobz on 06 Jun 16:33

    var isBalanced = false;
    setInterval(function() {
      fs.stat("balance", function(err) { isBalanced = !err; });
    }, 1000);
  • vicgtor9011 on 19 Jun 08:35

    Now it is all correct.
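
The cached-flag idea from the comments leaves open how each request would use isBalanced. One possible arrangement, sketched here for a current Node.js release rather than 0.1.30, holds requests in a queue and releases them when a poll finds the file again. The names setBalanced, whenBalanced, startPolling, and waiting are hypothetical; isBalanced and the balance file come from the thread.

```javascript
// Sketch only: assumes a current Node.js release; names other than
// isBalanced and the "balance" file are hypothetical.
const fs = require('fs');

let isBalanced = false;
const waiting = []; // callbacks for requests held while the file is missing

// Record a poll result and release held requests once the file is back.
function setBalanced(value) {
  isBalanced = value;
  while (isBalanced && waiting.length > 0) {
    waiting.shift()();
  }
}

// Run `forward` now if balanced, otherwise hold it until the file reappears.
function whenBalanced(forward) {
  if (isBalanced) {
    forward();
  } else {
    waiting.push(forward);
  }
}

// Poll once a second, as in the comments; unref() keeps this timer from
// holding the process open on its own.
function startPolling(file) {
  return setInterval(function () {
    fs.stat(file, function (err) { setBalanced(!err); });
  }, 1000).unref();
}
```

In the server callback, each request would then be wrapped as whenBalanced(function () { passThroughOriginalRequest(req, res); }), so there is one stat per second in total instead of one per held request.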
