<?xml version="1.0" encoding="UTF-8"?>
<feed xml:lang="en-US" xmlns="http://www.w3.org/2005/Atom">
  <title>Paul Gross's Blog - Home</title>
  <id>tag:www.pgrs.net,2010:mephisto/</id>
  <generator uri="http://mephistoblog.com" version="0.8.0">Mephisto Drax</generator>
  <link href="http://www.pgrs.net/feed/atom.xml" rel="self" type="application/atom+xml"/>
  <link href="http://www.pgrs.net/" rel="alternate" type="text/html"/>
  <updated>2010-03-28T21:20:47Z</updated>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2010-03-28:15398</id>
    <published>2010-03-28T21:17:00Z</published>
    <updated>2010-03-28T21:20:47Z</updated>
    <link href="http://www.pgrs.net/2010/3/28/select_or_label-with-custom-form-builder" rel="alternate" type="text/html"/>
    <title>select_or_label with custom form builder</title>
<content type="html">
            &lt;p&gt;In our web app, we have a common UI pattern: replacing select lists (eg. drop downs) with a label if there is only one item in the list.  For example, when creating a subscription, the user must choose a plan.  Normally, there is a list of plans to choose from.  However, if there is only one plan that the user can choose, we show a label specifying the one plan they&#8217;re getting instead of showing them a list of one.&lt;/p&gt;


	&lt;p&gt;We implemented this with a custom form builder and a method called select_or_label.  It takes the same arguments as select, but only creates the select list if the list of choices has more than one element.&lt;/p&gt;


	&lt;p&gt;The view looks like:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
&amp;lt;% form_for :subscription, :builder =&amp;gt; SelectFormBuilder, :url =&amp;gt; { :controller =&amp;gt; :subscriptions, :action =&amp;gt; :create } do |form| -%&amp;gt;
  &amp;lt;%= form.label :plan %&amp;gt;
  &amp;lt;%= form.select_or_label :plan_id, Plan.all.collect {|p| [p.name, p.id]} %&amp;gt;
&amp;lt;% end -%&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;The form builder creates a span and a hidden_field if the size of the choices is 1:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
class SelectFormBuilder &amp;lt; ActionView::Helpers::FormBuilder
  include ActionView::Helpers::TagHelper

  def select_or_label(method, choices, options = {})
    if choices.size == 1
      content_tag(:span, choices.first.first) +
        hidden_field(method, :value =&amp;gt; choices.first.last)
    else
      select(method, choices, options)
    end
  end
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;And the spec:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
require File.dirname(__FILE__) + '/../spec_helper'

describe SelectFormBuilder, :type =&amp;gt; :helper do
  describe &quot;select_or_label&quot; do
    before do
      helper = Class.new { include ActionView::Helpers }.new
      @builder = SelectFormBuilder.new(:subscription, Subscription.new, helper, {}, nil)
    end

    it &quot;returns a span and hidden field if the size of the choices array is only one&quot; do
      html = @builder.select_or_label(:plan_id, [[&quot;name&quot;, &quot;id&quot;]])
      html.should_not have_tag(&quot;select&quot;)

      html.should have_tag(&quot;span&quot;, &quot;name&quot;)
      html.should have_tag(&quot;input[type=hidden][name=?][value=?]&quot;, &quot;subscription[plan_id]&quot;, &quot;id&quot;)
    end

    it &quot;returns a select if the size of the choices array is greater than one&quot; do
      html = @builder.select_or_label(:plan_id, [[&quot;name&quot;, &quot;id&quot;], [&quot;other name&quot;, &quot;other_id&quot;]])
      html.should_not have_tag(&quot;input&quot;)

      html.should have_tag(&quot;select[name=?]&quot;, &quot;subscription[plan_id]&quot;) do
        with_tag(&quot;option[value=?]&quot;, &quot;id&quot;, &quot;name&quot;)
        with_tag(&quot;option[value=?]&quot;, &quot;other_id&quot;, &quot;other name&quot;)
      end
    end
  end
end
&lt;/code&gt;&lt;/pre&gt;
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2010-02-28:14647</id>
    <published>2010-02-28T18:24:00Z</published>
    <updated>2010-03-03T02:31:53Z</updated>
    <link href="http://www.pgrs.net/2010/2/28/node-js-redis-and-resque" rel="alternate" type="text/html"/>
    <title>Node.js, redis, and resque</title>
<content type="html">
            &lt;p&gt;&lt;strong&gt;Update (3/2/10):&lt;/strong&gt; Updated code to work with version 0.1.30 of node.js&lt;/p&gt;


	&lt;p&gt;I&#8217;ve continued to play with &lt;a href=&quot;http://nodejs.org/&quot;&gt;node.js&lt;/a&gt;, and I&#8217;ve decided to do a follow up spike to my previous one: &lt;a href=&quot;http://www.pgrs.net/2010/2/1/web-proxy-in-node-js-for-high-availability&quot;&gt;Web proxy in node.js for high availability&lt;/a&gt;&lt;/p&gt;


	&lt;p&gt;The previous spike used node to proxy requests directly to a web server.  This spike uses node to put messages into a (&lt;a href=&quot;http://code.google.com/p/redis&quot;&gt;redis&lt;/a&gt;) queue.  Ruby background workers read from the queue, process the requests, and respond on a different queue.   When node receives the response from the background worker, it sends the response back to the waiting user.&lt;/p&gt;


	&lt;p&gt;Just like my first spike, this type of architecture can be used for high availability web sites.  Since all messages go into a queue and node holds the connections from the users, the site can be upgraded (including database migrations or infrastructure changes) as long as node and redis stay the same.  Once the upgrade is finished, the workers can resume working from the queue.  Users would see an extra long request, but as long as the upgrade was short (eg, less than a minute), the user should not know the site was down.&lt;/p&gt;


	&lt;p&gt;A queue has a lot of advantages over a straight proxy:&lt;/p&gt;
	&lt;ol&gt;
	&lt;li&gt;Easy to scale up and down by adding or removing workers&lt;/li&gt;
		&lt;li&gt;Can use priority queues to prioritize more important web requests&lt;/li&gt;
		&lt;li&gt;Easy to monitor (eg, how many messages are in the queue, how fast are they being added)&lt;/li&gt;
	&lt;/ol&gt;


	&lt;p&gt;Here is a very simple version of the code.  First, the node webserver (using &lt;a href=&quot;http://github.com/fictorial/redis-node-client/&quot;&gt;redis-node-client&lt;/a&gt;):&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;javascript&quot;&gt;
var sys = require('sys'),
   http = require('http'),
  redis = require(&quot;./redisclient&quot;);

var queuedRes = {};
var counter = 1;

http.createServer(function (req, res) {
  pushOnQueue(req, res);
}).listen(8000);

function pushOnQueue(req, res) {
  var requestNumber = counter++;

  var message = JSON.stringify({
    &quot;class&quot;: &quot;RequestProcessor&quot;,
    &quot;args&quot;: [ {&quot;node_id&quot;: requestNumber, &quot;url&quot;: req.url} ]
  });

  client.rpush('resque:queue:requests', message);
  queuedRes[requestNumber] = res;
}

var client = new redis.Client();
client.connect(function() {
  popFromQueue();
});

function popFromQueue() {
  client.lpop('responses', handleResponse);
}

function handleResponse(err, result) {
  if (result == null) {
    setTimeout(function() { popFromQueue(); }, 100);
  } else {
    var json = JSON.parse(result);
    var requestNumber = json.node_id;
    var body = unescape(json.body);
    var res = queuedRes[requestNumber];
    res.writeHeader(200, {'Content-Type': 'text/plain'});
    res.write(body);
    res.close();
    delete queuedRes[requestNumber];
    popFromQueue();
  }
}

sys.puts('Server running at http://127.0.0.1:8000/');
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Also available as a &lt;a href=&quot;http://gist.github.com/317707&quot;&gt;gist&lt;/a&gt;.&lt;/p&gt;


	&lt;p&gt;pushOnQueue() is called on incoming web requests.  This creates a &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt; message and pushes it on the resque:queue:requests queue.  It also puts the res object into a hash so it can be retrieved again on the way back.&lt;/p&gt;


	&lt;p&gt;At the same time, a queue listener is set up using redis.Client().  On
connect, popFromQueue() is called.  This method pops messages from the
responses queue and calls handleResponse().  If the pop did not find a
message, it is scheduled to call again in 100 milliseconds.  If it did find a
message, the message is parsed with &lt;span class=&quot;caps&quot;&gt;JSON&lt;/span&gt;, the requestNumber is pulled out, and
the original res object is pulled out of the queuedRes hash.  The res object is then
sent the body of the message from the queue, which makes it back to the user.&lt;/p&gt;


	&lt;p&gt;On the other side, I have a ruby worker using
&lt;a href=&quot;http://github.com/defunkt/resque&quot;&gt;resque&lt;/a&gt;:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
class RequestProcessor
  @queue = :requests

  APP = Rack::Builder.new do
    use Rails::Rack::Static
    use Rack::CommonLogger
    run ActionController::Dispatcher.new
  end

  RACK_BASE_REQUEST = {
    &quot;PATH_INFO&quot; =&amp;gt; &quot;/things&quot;,
    &quot;QUERY_STRING&quot; =&amp;gt; &quot;&quot;,
    &quot;REQUEST_METHOD&quot; =&amp;gt; &quot;GET&quot;,
    &quot;SERVER_NAME&quot; =&amp;gt; &quot;localhost&quot;,
    &quot;SERVER_PORT&quot; =&amp;gt; &quot;3000&quot;,
    &quot;rack.errors&quot; =&amp;gt; STDERR,
    &quot;rack.input&quot; =&amp;gt; StringIO.new(&quot;&quot;),
    &quot;rack.multiprocess&quot; =&amp;gt; true,
    &quot;rack.multithread&quot; =&amp;gt; false,
    &quot;rack.run_once&quot; =&amp;gt; false,
    &quot;rack.url_scheme&quot; =&amp;gt; &quot;http&quot;,
    &quot;rack.version&quot; =&amp;gt; [1, 0],
  }

  def self.perform(hash)
    url = hash.delete(&quot;url&quot;)

    request = RACK_BASE_REQUEST.clone
    request[&quot;PATH_INFO&quot;] = url
    response = APP.call(request)

    body = &quot;&quot; 
    response.last.each { |part| body &amp;lt;&amp;lt; part }

    hash[&quot;body&quot;] = URI.escape(body)
    cmd = &quot;redis-cli rpush responses #{hash.to_json.inspect}&quot; 
    system cmd
  end
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Also available as a &lt;a href=&quot;http://gist.github.com/317709&quot;&gt;gist&lt;/a&gt;.&lt;/p&gt;


	&lt;p&gt;The worker can be started with:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
env QUEUE=requests INTERVAL=1 rake environment resque:work
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;This worker uses resque, which polls the queue and calls perform when a
message is received.  The perform method builds the Rack request and runs the &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt; from
the message through Rails.  It then pushes the response body onto the
responses queue using redis-cli.&lt;/p&gt;


	&lt;p&gt;As before, to keep the code simple, this spike only works with &lt;span class=&quot;caps&quot;&gt;GET&lt;/span&gt; requests and does not pass any headers through.  Comments and forks are welcome.&lt;/p&gt;
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2010-02-01:13677</id>
    <published>2010-02-01T05:20:00Z</published>
    <updated>2010-03-08T06:19:30Z</updated>
    <link href="http://www.pgrs.net/2010/2/1/web-proxy-in-node-js-for-high-availability" rel="alternate" type="text/html"/>
    <title>Web proxy in node.js for high availability</title>
<content type="html">
            &lt;p&gt;&lt;strong&gt;Update (3/8/10):&lt;/strong&gt; Updated code to work with version 0.1.30 of node.js&lt;/p&gt;


	&lt;p&gt;I&#8217;ve been thinking about high availability websites lately.  In particular, I want sites that can be upgraded (including database migrations or even infrastructure changes) without downtime.&lt;/p&gt;


	&lt;p&gt;I&#8217;ve also been playing with &lt;a href=&quot;http://nodejs.org&quot;&gt;node.js&lt;/a&gt; lately, and I decided to spike out a web proxy that would sit between users and the actual website (eg, a rails app).  When performing upgrades, the proxy would hold users&#8217; connections and wait.  Once the upgrade was done, the proxy would forward requests as usual.  Users would see an extra long request, but as long as the upgrade was short (eg, less than a minute), the user should not know the site was down.&lt;/p&gt;


	&lt;p&gt;This type of proxy server seems like a good fit with node.  Node&#8217;s event model means that there will be very little overhead when holding connections.  There are no threads stacking up and waiting.  Since everything is non-blocking, this server should scale well.&lt;/p&gt;


	&lt;p&gt;Here is a very simple version of the code:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;javascript&quot;&gt;
var fs = require('fs'),
   sys = require('sys'),
  http = require('http');

http.createServer(function (req, res) {
  checkBalanceFile(req, res);
}).listen(8000);

function checkBalanceFile(req, res) {
  fs.stat(&quot;balance&quot;, function(err) {
    if (err) {
      setTimeout(function() {checkBalanceFile(req, res)}, 1000);
    } else {
      passThroughOriginalRequest(req, res);
    }
  });
}

function passThroughOriginalRequest(req, res) {
  var request = http.createClient(2000, &quot;localhost&quot;).request(&quot;GET&quot;, req.url, {});
  request.addListener(&quot;response&quot;, function (response) {
    res.writeHeader(response.statusCode, response.headers);
    response.addListener(&quot;data&quot;, function (chunk) {
      res.write(chunk);
    });
    response.addListener(&quot;end&quot;, function () {
      res.close();
    });
  });
  request.close();
}

sys.puts('Server running at http://127.0.0.1:8000/');
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Here is a &lt;a href=&quot;http://gist.github.com/291468&quot;&gt;gist&lt;/a&gt; if anyone would like to fork.&lt;/p&gt;


	&lt;p&gt;Basically, I use http.createServer to create a web server on port 8000.  On incoming requests, I call checkBalanceFile.  This method will try to stat a local file called balance.  If it finds it, it will call passThroughOriginalRequest, which forwards the request to another web server on port 2000.  If the balance file does not exist, I use setTimeout to call checkBalanceFile again in one second.&lt;/p&gt;


	&lt;p&gt;With a proxy server like this, the main application can be upgraded by removing the balance file.  While the file is missing, the node web server will hold all of the connections and check every second for the reappearance of the balance file.  Once it comes back, all requests will be forwarded along and then streamed back to the user.&lt;/p&gt;
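

	&lt;p&gt;As a rough sketch, the upgrade procedure could be scripted along these lines.  The run_upgrade function is a hypothetical placeholder for the real upgrade steps, and a temporary directory stands in for the directory where the node proxy looks for the balance file:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
#!/bin/sh
# Sketch only: a temporary directory stands in for the proxy's working directory.
cd "$(mktemp -d)" || exit 1
touch balance                 # normal operation: the proxy forwards requests

# Hypothetical placeholder for the real upgrade (migrations, restarts, ...).
run_upgrade() {
  echo 'upgrading...'
}

rm -f balance                 # the proxy now holds incoming connections
run_upgrade
touch balance                 # the proxy resumes forwarding the held requests
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Note that this sketch puts the balance file back even if the upgrade step fails.  A real script would probably want to check the result first.&lt;/p&gt;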


	&lt;p&gt;Currently, this spike only works with &lt;span class=&quot;caps&quot;&gt;GET&lt;/span&gt; requests and does not pass any request headers through to the backend, since I wanted to keep the code simple.&lt;/p&gt;
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2010-01-15:13250</id>
    <published>2010-01-15T06:43:00Z</published>
    <updated>2010-01-15T06:44:46Z</updated>
    <link href="http://www.pgrs.net/2010/1/15/rake_commit_tasks-now-supports-git" rel="alternate" type="text/html"/>
    <title>rake_commit_tasks now supports git</title>
<content type="html">
            &lt;p&gt;The &lt;a href=&quot;http://github.com/pgr0ss/rake_commit_tasks&quot;&gt;rake_commit_tasks&lt;/a&gt; plugin now has preliminary support for git.  &lt;a href=&quot;http://github.com/pgr0ss/rake_commit_tasks&quot;&gt;rake_commit_tasks&lt;/a&gt; is a rails plugin which contains a set of rake tasks for checking your project into source control (git or subversion).&lt;/p&gt;


	&lt;p&gt;The workflow for committing and pushing with git is slightly different from subversion.  The current steps of &#8220;rake commit&#8221; with git are roughly:&lt;/p&gt;


	&lt;ol&gt;
	&lt;li&gt;Resets soft back to origin/branch_name (git reset --soft origin/branch_name)&lt;/li&gt;
		&lt;li&gt;Adds new files to git and removes deleted files (git add -A .)&lt;/li&gt;
		&lt;li&gt;Prompts for a commit message&lt;/li&gt;
		&lt;li&gt;Commits to git (git commit -m '...')&lt;/li&gt;
		&lt;li&gt;Pulls changes from origin and does a rebase to keep a linear history (git pull --rebase)&lt;/li&gt;
		&lt;li&gt;Runs the default rake task (rake default)&lt;/li&gt;
		&lt;li&gt;Checks cruisecontrol.rb to see if the build is passing&lt;/li&gt;
		&lt;li&gt;Pushes the commit to origin (git push origin branch_name)&lt;/li&gt;
	&lt;/ol&gt;


	&lt;p&gt;The &#8220;git reset --soft&#8221; in #1 is used to collapse unpushed commits.  Each time &#8220;rake commit&#8221; is run, any commits that have not been pushed are undone and their changes are left in the index.  Then, the &#8220;git add -A .&#8221; adds the new changes.  Now, the &#8220;git commit&#8221; command will create one commit with all of the unpushed changes.&lt;/p&gt;


	&lt;p&gt;This collapsing comes in handy when &#8220;rake commit&#8221; fails (for example, a broken test).  Once the test is fixed, the fix should go into the same commit as the original work.  Without the &#8220;git reset&#8221; command, there will be two commits (the original, and the one with the fix).&lt;/p&gt;
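

	&lt;p&gt;To see the collapsing in isolation, here is a small self-contained sketch.  It is not the plugin&#8217;s code: it builds a throwaway origin and clone in a temporary directory (the file names, commit messages, and demo identity are all made up) and then collapses two unpushed commits into one:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
#!/bin/sh
set -e
# Made-up identity so the commits work on a machine without git config.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

cd "$(mktemp -d)"
git init -q --bare origin.git
git clone -q origin.git work
cd work
branch=$(git symbolic-ref --short HEAD)

# One commit that has already been pushed.
touch pushed.txt
git add -A .
git commit -q -m 'already pushed'
git push -q origin "$branch"

# Two unpushed commits: the original work and a later fix.
touch feature.txt
git add -A .
git commit -q -m 'original work'
touch fix.txt
git add -A .
git commit -q -m 'fix broken test'

# Collapse: undo the unpushed commits but keep their changes in the index.
git reset -q --soft "origin/$branch"
git add -A .
git commit -q -m 'original work, including the fix'

unpushed=$(git rev-list --count "origin/$branch..HEAD")
echo "unpushed commits: $unpushed"
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;If the sketch behaves as intended, the last line should report a single unpushed commit that contains both feature.txt and fix.txt.&lt;/p&gt;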


	&lt;p&gt;The &#8220;--rebase&#8221; flag is used in #5 when running &#8220;git pull&#8221; to keep a linear history without merge commits.  If someone else has committed and pushed, a normal &#8220;git pull&#8221; will create a merge commit merging the other person&#8217;s work with your own.  The &#8220;git pull --rebase&#8221; sets the local commit aside, fetches the changes from origin, and then replays the local commit on top of them.  Merge commits are useful when there are multiple streams of work, such as a release branch.  However, when everyone is working in master, they merely clutter the history.&lt;/p&gt;


	&lt;p&gt;Comments and patches are welcome.&lt;/p&gt;
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2009-06-27:9076</id>
    <published>2009-06-27T20:17:00Z</published>
    <updated>2009-06-27T20:18:43Z</updated>
    <link href="http://www.pgrs.net/2009/6/27/railsconf-presentation" rel="alternate" type="text/html"/>
    <title>RailsConf Presentation</title>
<content type="html">
            &lt;p&gt;This post is late, but the slides from our RailsConf presentation are online:&lt;/p&gt;


	&lt;p&gt;&lt;a href=&quot;http://en.oreilly.com/rails2009/public/schedule/detail/8706&quot;&gt;Rails in the Large: How We&#8217;re Developing the Largest Rails Project in the World&lt;/a&gt;&lt;/p&gt;
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2009-02-20:5413</id>
    <published>2009-02-20T20:04:00Z</published>
    <updated>2009-02-20T21:17:54Z</updated>
    <link href="http://www.pgrs.net/2009/2/20/useful-unix-tricks-part-3" rel="alternate" type="text/html"/>
    <title>Useful unix tricks - part 3</title>
<content type="html">
            &lt;p&gt;Here is part 3 of &lt;a href=&quot;http://www.pgrs.net/2007/9/6/useful-unix-tricks&quot;&gt;Useful unix tricks&lt;/a&gt; and &lt;a href=&quot;http://www.pgrs.net/2007/10/8/useful-unix-tricks-part-2&quot;&gt;Useful unix tricks &#8211; part 2&lt;/a&gt;.&lt;/p&gt;


	&lt;h3&gt;!! is the previous command in the shell history&lt;/h3&gt;


	&lt;p&gt;It is pretty common to want to rerun the previous command, possibly with something new on the beginning or end.  !! is that command in the history.  For example:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% tail foo                          
tail: cannot open `foo' for reading: Permission denied

% sudo !!                           
sudo tail foo
hello world
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;As you can see, I forgot to sudo the first command.  Now, I want to rerun it with a sudo at the front, so I can just do &#8220;sudo !!&#8221; and press enter.  The shell will print out the command it is running, followed by whatever it would print normally.&lt;/p&gt;


	&lt;h3&gt;Tail multiple files at once&lt;/h3&gt;


	&lt;p&gt;The tail command can take multiple files, and it will show the output of each one.  You can combine this with the -f flag, and tail will intersperse the output of each file in real time.  This is incredibly handy for looking at log files.  For example, we can tail both the apache and rails logs to see the requests:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% tail -f log/production.log /var/log/apache2/access.log
==&amp;gt; log/production.log &amp;lt;==

Processing MephistoController#dispatch (for 127.0.0.1 at 2009-02-20 13:33:31) [GET]
  Parameters: {&quot;action&quot;=&amp;gt;&quot;dispatch&quot;, &quot;path&quot;=&amp;gt;[&quot;2008&quot;, &quot;7&quot;, &quot;19&quot;, &quot;capistrano-with-pairing-stations&quot;], &quot;controller&quot;=&amp;gt;&quot;mephisto&quot;}
Completed in 784ms (View: 0, DB: 260) | 200 OK [http://www.pgrs.net/2008/7/19/capistrano-with-pairing-stations]

==&amp;gt; /var/log/apache2/access.log &amp;lt;==
127.0.0.1 - - [20/Feb/2009:13:33:31 -0600] &quot;GET /2008/7/19/capistrano-with-pairing-stations HTTP/1.1&quot; 200 16049 &quot;http://www.pgrs.net/&quot; &quot;Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.0.6) Gecko/2009011912 Firefox/3.0.6 Ubiquity/0.1.5&quot; 
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;As you can see, tail prints a ==&amp;gt; filename &amp;lt;== header to show which file the output is for.&lt;/p&gt;


	&lt;h3&gt;Use vim -b to show nonprintable characters&lt;/h3&gt;


	&lt;p&gt;Sometimes a file will have nonprintable characters, such as the carriage returns in windows line endings.  Most editors won&#8217;t show them, but you can use &#8220;vim -b&#8221; to see and edit them.  The -b flag tells vim to use binary mode.  For example, here is a file with windows line endings:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% cat foo.txt 
Hello
World

% vim -b foo.txt
Hello^M
World^M
^M
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;As you can see, the vim binary mode can see the line endings whereas cat cannot.&lt;/p&gt;


	&lt;h3&gt;** is a recursive wildcard in zsh&lt;/h3&gt;


	&lt;p&gt;You can use the recursive wildcard ** in zsh to do complex matching.  For example, let&#8217;s say that you want to search all ruby files in the current project for the string RAILS_ENV.  Normally, you would do something like:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% find . -name '*.rb' | xargs grep RAILS_ENV
./config/environment.rb:# ENV['RAILS_ENV'] ||= 'production'
./test/test_helper.rb:ENV[&quot;RAILS_ENV&quot;] = &quot;test&quot; 
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;In zsh, you can accomplish the same with a much simpler command:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% grep RAILS_ENV **/*.rb
config/environment.rb:# ENV['RAILS_ENV'] ||= 'production'
test/test_helper.rb:ENV[&quot;RAILS_ENV&quot;] = &quot;test&quot; 
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;The wildcard **/*.rb recursively matches any files that end in .rb, so there is no need for a find command.&lt;/p&gt;


	&lt;p&gt;If there are a lot of files, you will occasionally get the error:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% grep RAILS_ENV **/*.rb
zsh: argument list too long: grep
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;This means that the **/*.rb match returned too many arguments to handle.  In this case, you can use echo and xargs to get the job done, which is still simpler than the find command:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% echo **/*.rb | xargs grep RAILS_ENV
&lt;/code&gt;&lt;/pre&gt;

	&lt;h3&gt;find -X will show bad filenames&lt;/h3&gt;


	&lt;p&gt;It is pretty common to run find and then pass the results into xargs.  However, if any filenames contain spaces or quotes, xargs will split or misparse them.  With BSD find (the version that ships on macs), you can use find -X to spot any paths that will cause trouble.  find will warn on these paths and then skip them:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% find -X .
.
find: ./filename with spaces: illegal path
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;If you want to use xargs with these files, use the -print0 option to tell find to terminate each filename with a &lt;span class=&quot;caps&quot;&gt;NUL&lt;/span&gt; character instead of a newline, and xargs -0 to tell xargs to split on &lt;span class=&quot;caps&quot;&gt;NUL&lt;/span&gt; instead of whitespace:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% find . -print0 | xargs -0 echo
. ./filename with spaces
&lt;/code&gt;&lt;/pre&gt;

	&lt;h3&gt;cd - will return to the previous folder&lt;/h3&gt;


	&lt;p&gt;Passing - to the cd command will return you to the last folder you were in:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
/tmp% pwd
/tmp

/tmp% cd ~

~% cd -
/tmp

/tmp% 
&lt;/code&gt;&lt;/pre&gt;

	&lt;h3&gt;Use ctrl+z and kill %1 to kill a process that will not die&lt;/h3&gt;


	&lt;p&gt;Sometimes, you run a command and pressing ctrl+c will not kill it.  When that happened, I used to open up another terminal window to kill -9 the process until someone showed me the following trick:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% sleep 1000
^Z
zsh: suspended  sleep 1000
% kill -9 %1
% 
[1]  + killed     sleep 1000
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Pressing ctrl+z suspends the process and returns you to a terminal prompt.  Then, kill -9 %1 sends signal 9 (SIGKILL) to job #1, which is our suspended process.&lt;/p&gt;


	&lt;h3&gt;pwdx shows the working directory of a process&lt;/h3&gt;


	&lt;p&gt;It can be really useful to see the working directory of a running process.  For example, you can see which release a ruby process is running:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% sudo pwdx 23961
23961: /var/www/myapp/releases/20081231200733
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Unfortunately, I haven&#8217;t found pwdx for the mac.  If anyone knows how I can install it, please let me know.&lt;/p&gt;
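

	&lt;p&gt;One possible workaround, sketched here under the assumption that lsof is available on machines without /proc: the process_cwd function below is a hypothetical helper, not a real command.  It reads the cwd symlink under /proc where that exists and otherwise asks lsof for the process&#8217;s cwd entry:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
# Hypothetical helper (not a real command): print a process's working directory.
process_cwd() {
  if [ -e "/proc/$1/cwd" ]; then
    # Linux: the working directory is exposed as a symlink under /proc.
    readlink "/proc/$1/cwd"
  else
    # Elsewhere: ask lsof for the cwd entry and keep only the name field.
    lsof -a -d cwd -p "$1" -Fn | sed -n 's/^n//p'
  fi
}

process_cwd $$   # working directory of the current shell
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;As with pwdx, looking at another user&#8217;s process will probably need sudo.&lt;/p&gt;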


	&lt;h3&gt;Use sh -x to debug shell scripts&lt;/h3&gt;


	&lt;p&gt;If you want to know what commands a shell script runs, run it with the -x flag.  For example, say we have a shell script with two echos.  Compare the output with and without the -x flag:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% sh foo.sh 
hello
world

% sh -x foo.sh
+ echo hello
hello
+ echo world
world

% zsh -x foo.sh
+foo.sh:1&amp;gt; echo hello
hello
+foo.sh:2&amp;gt; echo world
world
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;As you can see, the -x flag shows which command is being run.  zsh takes it a step further and shows the script and line number as well.&lt;/p&gt;
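

	&lt;p&gt;The same tracing can also be switched on for just part of a script with the shell&#8217;s set -x and set +x builtins.  A minimal sketch (the script contents are made up for illustration):&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
#!/bin/sh
echo 'not traced'
set -x            # start printing each command (to stderr) before it runs
echo hello
set +x            # stop tracing again
echo world
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Only the commands between the two set calls are traced, which helps when a long script has one misbehaving section.&lt;/p&gt;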


	&lt;h3&gt;sysctl replaces /proc on macs&lt;/h3&gt;


	&lt;p&gt;The /proc filesystem is a great way to find out about a linux machine.  For example, you can &#8220;cat /proc/cpuinfo&#8221; to find out how many processors are on the box.  However, macs don&#8217;t have /proc.  You can use sysctl instead.  The -a flag prints out all keys and values:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% sysctl -a
kern.ostype = Darwin
kern.osrelease = 9.6.0
kern.osrevision = 199506
kern.version = Darwin Kernel Version 9.6.0: Mon Nov 24 17:37:00 PST 2008; root:xnu-1228.9.59~1/RELEASE_I386
...
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;You can also look up a single key by name.  Adding the -n flag prints only the value, without the key.  For example, this command will print out the number of cpu cores:&lt;/p&gt;


&lt;pre&gt;&lt;code&gt;
% sysctl -n hw.ncpu
2
&lt;/code&gt;&lt;/pre&gt;
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2009-02-20:5411</id>
    <published>2009-02-20T19:15:00Z</published>
    <updated>2010-01-15T05:35:21Z</updated>
    <link href="http://www.pgrs.net/2009/2/20/automerging-now-in-rake_commit_tasks" rel="alternate" type="text/html"/>
    <title>Automerging now in rake_commit_tasks</title>
<content type="html">
            &lt;p&gt;The &lt;a href=&quot;http://github.com/pgr0ss/rake_commit_tasks&quot;&gt;rake_commit_tasks&lt;/a&gt; plugin now supports automatically merging changes from branch to trunk.  I describe the feature and the use case at &lt;a href=&quot;http://www.pgrs.net/2007/10/16/automatically-merge-changes-from-branch-to-trunk&quot;&gt;Automatically merge changes from branch to trunk&lt;/a&gt;, although the merging code now uses &#8220;svn merge&#8221; instead of &#8220;svn diff&#8221; in order to keep svn mergeinfo.&lt;/p&gt;


	&lt;p&gt;Basically, if you branch to release code and then fix a bug on the branch, the change will automatically be merged over to the trunk when you run a &#8220;rake commit.&#8221;  Just set PATH_TO_TRUNK_WORKING_COPY to the location of the trunk checkout in your Rakefile.&lt;/p&gt;


	&lt;p&gt;If you are curious, you can check out the &lt;a href=&quot;http://github.com/pgr0ss/rake_commit_tasks/commit/66a3a4867e8141345c4490d7725d7b82067d9c43&quot;&gt;commit&lt;/a&gt; at github.&lt;/p&gt;
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2009-01-10:4624</id>
    <published>2009-01-10T23:43:00Z</published>
    <updated>2009-01-10T23:44:22Z</updated>
    <link href="http://www.pgrs.net/2009/1/10/flight-delays-application-overhaul" rel="alternate" type="text/html"/>
    <title>Flight delays application overhaul</title>
<content type="html">
            &lt;p&gt;In &lt;a href=&quot;http://www.pgrs.net/2007/7/29/flight-delay-information-for-united-flights&quot;&gt;Flight delay information for United flights&lt;/a&gt;, I talked about an application I wrote to show United flight delays over time.  I have now completely rewritten that application to allow comparison of multiple flights on one graph.&lt;/p&gt;


	&lt;p&gt;People that travel for a living know that early morning flights tend to be less delayed than evening flights.  In the morning, the planes are usually already at the airport, so there is no chance of an incoming flight delay.  There are no lines of planes waiting to take off yet, so the time between leaving the gate and getting into the air tends to be a lot less.&lt;/p&gt;


	&lt;p&gt;The difference in these times can be dramatic.  Here is a report comparing an early morning flight with an evening flight from Newark to Chicago (two heavily delayed airports): &lt;a href=&quot;http://delays.pgrs.net/delay_report?flight[0][day_of_week]=4&amp;amp;flight[0][dest]=ORD&amp;amp;flight[0][flight_num]=655&amp;amp;flight[0][origin]=EWR&amp;amp;flight[1][day_of_week]=4&amp;amp;flight[1][dest]=ORD&amp;amp;flight[1][flight_num]=635&amp;amp;flight[1][origin]=EWR&quot;&gt;Flight Delays&lt;/a&gt;&lt;/p&gt;


	&lt;p&gt;The report shows a table of min, median, and max delays in minutes:&lt;/p&gt;


	&lt;table&gt;
		&lt;tr&gt;
			&lt;th&gt;Flight&lt;/th&gt;
			&lt;th&gt;Min Delay&lt;/th&gt;
			&lt;th&gt;Median Delay&lt;/th&gt;
			&lt;th&gt;Max Delay&lt;/th&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;Flight 655 (EWR -&amp;gt; &lt;span class=&quot;caps&quot;&gt;ORD&lt;/span&gt;) on Thursday at 07:38 PM&lt;/td&gt;
			&lt;td&gt;-14&lt;/td&gt;
			&lt;td&gt;40&lt;/td&gt;
			&lt;td&gt;239&lt;/td&gt;
		&lt;/tr&gt;
		&lt;tr&gt;
			&lt;td&gt;Flight 635 (EWR -&amp;gt; &lt;span class=&quot;caps&quot;&gt;ORD&lt;/span&gt;) on Thursday at 05:58 AM&lt;/td&gt;
			&lt;td&gt;-26&lt;/td&gt;
			&lt;td&gt;0&lt;/td&gt;
			&lt;td&gt;157&lt;/td&gt;
		&lt;/tr&gt;
	&lt;/table&gt;




	&lt;p&gt;The report above also includes two graphs.&lt;/p&gt;


	&lt;p&gt;As you can see from the first graph, day by day, flight 655 (departing around 7:38 PM) is almost always more delayed than flight 635 (departing around 5:58 AM).&lt;/p&gt;


	&lt;p&gt;The second graph shows a histogram.  You can see that flight 635 is clustered more heavily to the left (-40 to 20) which shows that it is generally between 40 minutes early and 20 minutes late.  Flight 655 is much more spread out to the right, which shows that it has far more delays.  On one day, it was over 220 minutes late!&lt;/p&gt;
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2008-12-31:4488</id>
    <published>2008-12-31T20:15:00Z</published>
    <updated>2008-12-31T20:16:07Z</updated>
    <link href="http://www.pgrs.net/2008/12/31/strange-behavior-with-define_method-and-the-wrong-number-of-arguments" rel="alternate" type="text/html"/>
    <title>Strange behavior with define_method and the wrong number of arguments</title>
<content type="html">
            &lt;p&gt;I noticed the other day that methods defined using define_method have very strange behavior when given the wrong number of arguments.  For example, here is a class with a bunch of methods defined using define_method:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
class Foo
  define_method :no_args do
    p &quot;no args&quot; 
  end

  define_method :one_arg do |one|
    p one
  end

  define_method :two_args do |one, two|
    p one
    p two
  end
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Now, if we call no_args with an argument, it will silently ignore the argument:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
&amp;gt;&amp;gt; Foo.new.no_args(1)
&quot;no args&quot; 
=&amp;gt; nil
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;However, if we have a method that expects one argument but receives either none or more than one, we get a warning:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
&amp;gt;&amp;gt; Foo.new.one_arg
./foo.rb:6: warning: multiple values for a block parameter (0 for 1)
    from (irb):3
nil
=&amp;gt; nil

&amp;gt;&amp;gt; Foo.new.one_arg(1,2,3)
./foo.rb:6: warning: multiple values for a block parameter (3 for 1)
    from (irb):2
[1, 2, 3]
=&amp;gt; nil
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;In the second case, it took all three arguments and passed them as an array into the method expecting one argument.&lt;/p&gt;


	&lt;p&gt;It gets even stranger with a method that expects two arguments.  Now, we actually get errors:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
&amp;gt;&amp;gt; Foo.new.two_args
ArgumentError: wrong number of arguments (0 for 2)
    from (irb):2:in 'two_args'
    from (irb):2

&amp;gt;&amp;gt; Foo.new.two_args(1,2,3)
ArgumentError: wrong number of arguments (3 for 2)
    from ./foo.rb:10:in 'two_args'
    from (irb):3
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;I&#8217;m not sure why a one-argument method gives a warning while a two-argument method gives an error.  Clearly, define_method behaves very differently from def.&lt;/p&gt;
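

	&lt;p&gt;For comparison, newer Ruby versions (1.9 and later, as far as I know) treat methods created with define_method much more like methods created with def: a wrong argument count raises an ArgumentError in every case.  Here is a minimal sketch that checks this.  The arity_error? helper is made up for this example and is not part of Ruby:&lt;/p&gt;

```ruby
# Same shape as the class in the post, assumed to run on Ruby 1.9 or later.
class Foo
  define_method(:no_args) { 'no args' }
  define_method(:one_arg) { |one| one }
  define_method(:two_args) { |one, two| [one, two] }
end

# Hypothetical helper: true if calling the method with these arguments
# raises ArgumentError, false if the call goes through.
def arity_error?(object, name, *args)
  object.public_send(name, *args)
  false
rescue ArgumentError
  true
end

foo = Foo.new
p arity_error?(foo, :no_args, 1)
p arity_error?(foo, :one_arg)
p arity_error?(foo, :one_arg, 1, 2, 3)
p arity_error?(foo, :two_args)
p arity_error?(foo, :two_args, 1, 2)
```

	&lt;p&gt;On such a Ruby, every mismatched call should report an error rather than a warning or silence, and only the last call (two_args with exactly two arguments) should go through.&lt;/p&gt;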
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2008-12-22:4299</id>
    <published>2008-12-22T17:43:00Z</published>
    <updated>2008-12-22T17:43:51Z</updated>
    <link href="http://www.pgrs.net/2008/12/22/mephisto-with-phusion-passenger" rel="alternate" type="text/html"/>
    <title>Mephisto with Phusion Passenger</title>
<content type="html">
            &lt;p&gt;I recently upgraded my blogging software, &lt;a href=&quot;http://mephistoblog.com&quot;&gt;Mephisto&lt;/a&gt;, from 0.7.3 to 0.8.1.  One thing I noticed is that they moved the cached files from public to a cache subfolder containing the site.  For example, on a new installation, the cached index page is in public/cache/unusedfornow.com/index.html.&lt;/p&gt;


	&lt;p&gt;Mephisto writes a cached page for every page visited.  This means that any subsequent request for that page can be served directly by Apache from the cached file rather than going through the whole Rails stack (all the way down to the database).  This is much faster and uses less memory.&lt;/p&gt;


	&lt;p&gt;I run my blog in Apache with &lt;a href=&quot;http://www.modrails.com&quot;&gt;Phusion Passenger&lt;/a&gt;.  The problem with this new cache location is that Passenger only looks in public for cached files.  This means that the cached pages are ignored and every request is served by Rails.  After searching Google and working some mod_rewrite magic, I came up with the following solution.  Here is the Apache virtual host configuration for my blog:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;apache&quot;&gt;
&amp;lt;VirtualHost *:80&amp;gt;
    ServerName pgrs.net
    ServerAlias www.pgrs.net

    DocumentRoot /var/www/mephisto-0.8.1/public

    RailsAllowModRewrite on
    RewriteEngine On

    # Rewrite / to index.html
    RewriteRule ^/$ /index.html [QSA] 

    # Rewrite /some_page to /some_page.html
    RewriteRule ^([^.]+?)/?$ $1.html [QSA]

    # If cached file exists, serve it and stop processing
    RewriteCond %{DOCUMENT_ROOT}/cache/unusedfornow.com%{REQUEST_FILENAME} -f
    RewriteRule ^(.*)$ /cache/unusedfornow.com$1 [L]

    ErrorLog /var/log/apache2/pgrs-error.log
    CustomLog /var/log/apache2/pgrs-access.log combined
&amp;lt;/VirtualHost&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;The first 3 lines are standard Phusion Passenger configuration: &lt;a href=&quot;http://www.modrails.com/documentation/Users%20guide.html#_deploying_a_ruby_on_rails_application&quot;&gt;Deploying a Ruby on Rails application&lt;/a&gt;.  Then, I turn on mod_rewrite.  The first two sets of mod_rewrite configuration cascade and turn the request into what the filename will look like.  So / becomes /index.html, and /2008/10/29/deploying-trunk-or-tags-with-capistrano becomes /2008/10/29/deploying-trunk-or-tags-with-capistrano.html.&lt;/p&gt;


	&lt;p&gt;The final set checks if this file exists under /var/www/mephisto-0.8.1/public/cache/unusedfornow.com (the -f flag), and if it does, tells Apache to serve this file.  The [L] tells mod_rewrite that this is the last rule, so it should stop processing now.  If the file does not exist, the request falls through mod_rewrite and Passenger picks it up and serves it through Rails.&lt;/p&gt;
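

	&lt;p&gt;To make the cascade easier to follow, here is a rough Ruby sketch of what the three rules do to a request path.  It only models the path rewriting (not the file-existence check or query strings), and cache_candidate is a made-up name, not something Mephisto, Passenger, or Apache provides:&lt;/p&gt;

```ruby
# Hypothetical helper mirroring the rewrite rules above; illustration only.
def cache_candidate(request_path, site = 'unusedfornow.com')
  # Rule 1: / becomes /index.html
  path = request_path == '/' ? '/index.html' : request_path
  # Rule 2: a path without a dot gets .html appended (a trailing slash is dropped)
  path = path.sub(%r{\A([^.]+?)/?\z}, '\1.html')
  # Rule 3: the file to look for, relative to the document root
  "/cache/#{site}#{path}"
end

puts cache_candidate('/')
puts cache_candidate('/2008/10/29/deploying-trunk-or-tags-with-capistrano')
puts cache_candidate('/feed/atom.xml')
```

	&lt;p&gt;If the resulting file exists under public, Apache serves it directly.  Otherwise the request goes on to Rails.&lt;/p&gt;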


	&lt;p&gt;I verified that this works by looking at the response headers in Firefox (Tools -&amp;gt; Page Info -&amp;gt; Headers) of any given blog page.  The first time, there is a &#8220;X-Powered-By: Phusion Passenger (mod_rails/mod_rack) 2.0.3&#8221; header.  Once I refresh, the X-Powered-By header is gone since the request never makes it to Passenger.  Apache is once again doing the hard work, and Rails is only used when the request is new or dynamic (such as searching).&lt;/p&gt;
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2008-10-29:3944</id>
    <published>2008-10-29T01:34:00Z</published>
    <updated>2008-10-29T01:34:24Z</updated>
    <link href="http://www.pgrs.net/2008/10/29/deploying-trunk-or-tags-with-capistrano" rel="alternate" type="text/html"/>
    <title>Deploying trunk or tags with capistrano</title>
<content type="html">
            &lt;p&gt;On my current project, we use capistrano for all of our deployments.  In the simplest case, you tell capistrano the &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt; of your repository, and then you deploy by performing a checkout from this repository:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
set :repository,  &quot;http://www.example.com/svn/myproject/trunk&quot; 
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;However, putting this line in the capistrano recipe only lets you deploy from trunk.  We needed the ability to deploy either the trunk or a tag of our choice.  We generally deploy the trunk to development servers and the latest tag to staging and production servers.&lt;/p&gt;


	&lt;p&gt;We started out with something more complicated, but with the help of &lt;a href=&quot;http://weblog.jamisbuck.org/&quot;&gt;Jamis Buck&lt;/a&gt; on the &lt;a href=&quot;http://groups.google.com/group/capistrano?pli=1&quot;&gt;capistrano mailing list&lt;/a&gt;, we came up with the following solution:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
set :repository_root, &quot;http://www.example.com/svn/myproject&quot; 
set(:tag) { Capistrano::CLI.ui.ask(&quot;Tag to deploy (or type 'trunk' to deploy from trunk): &quot;) }
set(:repository) { (tag == &quot;trunk&quot;) ? &quot;#{repository_root}/trunk&quot; : &quot;#{repository_root}/tags/#{tag}&quot; }
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;This deploy script will prompt the user to enter either a tag name or the word trunk.  It will then use that variable to set the repository to the correct path.  The output of a deploy will look like:&lt;/p&gt;


&lt;pre&gt;
% cap deploy
  * executing `deploy'
...
  * executing `deploy:update'
 ** transaction: start
  * executing `deploy:update_code'
Tag to deploy (or type 'trunk' to deploy from trunk): trunk
  * executing &quot;svn checkout -q  -r2210 http://www.example.com/svn/myproject/trunk /var/www/myproject/releases/20081029012754 &#38;&#38; (echo 2210 &amp;gt; /var/www/myproject/releases/20081029012754/REVISION)&quot; 
...
&lt;/pre&gt;

	&lt;p&gt;Capistrano evaluates variables lazily.  It will only fetch the repository variable if it needs it, which will then fetch the tag variable, which will then prompt the user.  Therefore, if you run a command that does not require the repository, it will not prompt.  For example, running the following command will not prompt the user:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
cap deploy:restart
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Next, we created a convenience rake task to deploy the trunk without prompting:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
namespace :deploy do
  task :trunk do
    sh &quot;cap -s tag=trunk deploy&quot; 
  end
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;This rake task sets the tag variable on the command line.  Therefore, capistrano will not need to evaluate the set(:tag) command and will deploy the trunk without prompting.&lt;/p&gt;
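

	&lt;p&gt;The lazy evaluation described above can be sketched in a few lines of plain Ruby.  This is not Capistrano&#8217;s real implementation: LazySettings is a hypothetical class, and it takes lambdas instead of blocks to keep the example short.  It only shows the idea that a value is computed the first time it is fetched, and that a plain value set beforehand means the prompt never runs:&lt;/p&gt;

```ruby
# Hypothetical miniature of lazily evaluated settings; illustration only.
class LazySettings
  def initialize
    @values = {}
  end

  # Store either a plain value or a callable to be evaluated on first fetch.
  def set(name, value)
    @values[name] = value
  end

  def fetch(name)
    value = @values.fetch(name)
    # Evaluate a callable once and remember the result.
    @values[name] = value.call if value.respond_to?(:call)
    @values[name]
  end
end

prompts = 0
settings = LazySettings.new
settings.set(:repository_root, 'http://www.example.com/svn/myproject')
# Stands in for the interactive prompt.
settings.set(:tag, lambda { prompts += 1; 'trunk' })
settings.set(:repository, lambda {
  tag = settings.fetch(:tag)
  root = settings.fetch(:repository_root)
  tag == 'trunk' ? "#{root}/trunk" : "#{root}/tags/#{tag}"
})

puts prompts                      # nothing has been asked yet
puts settings.fetch(:repository)  # triggers the stand-in prompt
puts prompts
```

	&lt;p&gt;Calling set(:tag, &#8216;trunk&#8217;) with a plain string before the first fetch plays the role of -s tag=trunk on the command line: the lambda is replaced and never called.&lt;/p&gt;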
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2008-10-01:3447</id>
    <published>2008-10-01T00:26:00Z</published>
    <updated>2008-10-01T00:28:59Z</updated>
    <link href="http://www.pgrs.net/2008/10/1/finding-nonprintable-characters-with-a-test" rel="alternate" type="text/html"/>
    <title>Finding nonprintable characters with a test</title>
<content type="html">
            &lt;p&gt;Our current application includes a lot of static content created by content editors.  They check in static &lt;span class=&quot;caps&quot;&gt;HTML&lt;/span&gt; files, and we include these files in various parts of the application.  The problem is that they sometimes copy and paste from applications such as Outlook or Word, which can introduce unprintable characters into the application.  These characters show up strangely on the website.&lt;/p&gt;


	&lt;p&gt;After this happened a couple of times, we decided to write a test to ensure that we would always catch the unprintable characters:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
class NonPrintableCharactersTest &amp;lt; Test::Unit::TestCase
  def test_for_non_printable_characters_in_content
    assert_equal &quot;&quot;, `find #{RAILS_ROOT}/content -name '*.html' | xargs grep -n '[^[:space:][:print:]]'`
  end
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;We use find to get a list of all of the HTML files in the content folder.  Then, we pipe this to grep, using the regular expression &lt;code&gt;'[^[:space:][:print:]]'&lt;/code&gt;, which matches anything except spaces or printable characters.  The output of this test looks like:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
Loaded suite test/non_printable_characters_test
Started
F
Finished in 0.86005 seconds.

  1) Failure:
test_for_non_printable_characters_in_content(NonPrintableCharactersTest) [test/non_printable_characters_test.rb:5]:
&amp;lt;&quot;&quot;&amp;gt; expected but was
&amp;lt;&quot;/some/path/to/content/tmp.html:48:character �&amp;lt;/span&amp;gt;&amp;lt;/p&amp;gt;\n&quot;&amp;gt;.

1 tests, 1 assertions, 1 failures, 0 errors
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;The failure message shows the file and line with the character, so it is easy to fix.&lt;/p&gt;
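

	&lt;p&gt;The same character classes also work in Ruby regular expressions, so the check can be sketched without shelling out.  This is only an illustration (non_printable_line_numbers is a made-up name), and it is not guaranteed to flag exactly what grep flags: Ruby treats valid UTF-8 characters such as curly quotes as printable, while grep&#8217;s answer depends on the locale.&lt;/p&gt;

```ruby
# Hypothetical pure-Ruby counterpart to the grep pipeline; illustration only.
NON_PRINTABLE = /[^[:space:][:print:]]/

# Return the 1-based numbers of the lines that contain a non-printable character.
def non_printable_line_numbers(text)
  numbers = []
  text.each_line.with_index(1) do |line, number|
    numbers.push(number) if line.match(NON_PRINTABLE)
  end
  numbers
end

# The second line contains a bell character (\a), which is neither space nor printable.
sample = "fine line\nbell \a here\nalso fine\n"
p non_printable_line_numbers(sample)
```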
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2008-09-12:3170</id>
    <published>2008-09-12T20:10:00Z</published>
    <updated>2008-09-12T20:14:11Z</updated>
    <link href="http://www.pgrs.net/2008/9/12/testing-page-caching-with-spidertest" rel="alternate" type="text/html"/>
    <title>Testing page caching with SpiderTest</title>
<content type="html">
            &lt;p&gt;The website I&#8217;m currently working on is similar to an online brochure.  The data on the site changes hourly, but every user sees the same thing.  As a result, we decided to use page caching to dramatically speed up the site.  Once a page is visited, the HTML is written out to disk and all subsequent requests are served by Apache.  The setup of this approach is detailed elsewhere (for example, &lt;a href=&quot;http://www.railsenvy.com/2007/2/28/rails-caching-tutorial&quot;&gt;Rails Envy: Ruby on Rails Caching Tutorial&lt;/a&gt;).&lt;/p&gt;


	&lt;p&gt;Setting up caching was easy, but we wanted to ensure that we did not make any mistakes.  All pages should be cached, since any miss will result in a much higher load on our Rails application.  I&#8217;ve written previously about our internationalization test (&lt;a href=&quot;http://www.pgrs.net/2008/8/29/improved-internationalization-test&quot;&gt;Improved internationalization test&lt;/a&gt;), which spiders the site (using &lt;a href=&quot;http://caboose.org/articles/2007/2/21/the-fabulous-spider-fuzz-plugin&quot;&gt;SpiderTest&lt;/a&gt;) looking for non-localized text.  Since we were already visiting every page, it seemed like a good place to add a check for page caching.  Spidering the site a second time would make our test suite take too long.&lt;/p&gt;


	&lt;p&gt;The consume_page method is called for every page that is visited by the spider.  We expanded the implementation by adding a call to assert_page_is_cached:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
def consume_page(html, url)
  html.gsub!(&quot;http://www.example.com&quot;, &quot;&quot;)
  unless redirect?(html) || asset?(url)
    assert_page_has_been_moved_to_language_file(html, url)
    assert_page_is_cached(url)
  end
  super
end

def assert_page_is_cached(url)
  path = ActionController::Routing.normalize_paths([ActionController::Base.page_cache_directory + url])[0]
  page = path.ends_with?(&quot;.html&quot;) ? path : &quot;#{path}.html&quot; 
  assert_true File.exists?(page), &quot;Page NOT cached: #{url} (looking in #{page})&quot; 
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;We also had to add new lines to our setup to turn on caching (since it is normally off in test mode):&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
def setup
  FileUtils.rm_rf ActionController::Base.page_cache_directory
  ActionController::Base.perform_caching = true
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Since we run this test as its own suite, the test is totally isolated from other tests.  There is no need to implement a teardown.&lt;/p&gt;


	&lt;p&gt;The full test, including the internationalization testing from before, looks like:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
require 'hpricot'

class InternationalizationTest &amp;lt; ActionController::IntegrationTest
  include Caboose::SpiderIntegrator

  def setup
    FileUtils.rm_rf ActionController::Base.page_cache_directory
    ActionController::Base.perform_caching = true
    blank_out_localization
    blank_out_html_escape
  end

  def blank_out_localization
    GLoc::InstanceMethods.class_eval do
      alias :old_l :l
      def l(symbol, *arguments)
        &quot;&quot; 
      end
    end
  end

  def blank_out_html_escape
    ERB::Util.class_eval do
      alias :old_html_escape :html_escape
      def html_escape(s)
        &quot;&quot; 
      end

      alias :h :html_escape
    end
  end

  def test_all_text_has_been_moved_to_language_file
    get '/'
    assert_response :success
    spider(@response.body, '/', :verbose =&amp;gt; true)
  end

  def consume_page(html, url)
    html.gsub!(&quot;http://www.example.com&quot;, &quot;&quot;)
    unless redirect?(html) || asset?(url)
      assert_page_has_been_moved_to_language_file(html, url)
      assert_page_is_cached(url)
    end
    super
  end

  def redirect?(html)
    html.include?(&quot;&amp;lt;body&amp;gt;You are being&quot;)
  end

  def asset?(url)
    File.exist?(File.expand_path(&quot;#{RAILS_ROOT}/public/#{url}&quot;))
  end

  def assert_page_has_been_moved_to_language_file(page_text, url)
    doc = Hpricot.parse(page_text)
    assert_does_not_contain_words doc.at(&quot;title&quot;).inner_text, url
    body = doc.at('body')
    (body.search(&quot;//script[@type='text/javascript']&quot;)).remove
    assert_does_not_contain_words(body.inner_text, url)
    assert_attribute_does_not_contain_words body, url, 'title'
    assert_attribute_does_not_contain_words body, url, 'alt'
  end

  def assert_attribute_does_not_contain_words body, url, attribute
    body.search(&quot;//*[@#{attribute}]&quot;) do |element|
      assert_does_not_contain_words element.get_attribute(attribute), url
    end
  end

  def assert_does_not_contain_words text, url
    match = text.match(/[A-Za-z]([A-Za-z]| )*/)
    fail &quot;Found text that was not in the language file: #{match[0].inspect} on #{url}&quot; if match
  end  

  def assert_page_is_cached(url)
    path = ActionController::Routing.normalize_paths([ActionController::Base.page_cache_directory + url])[0]
    page = path.ends_with?(&quot;.html&quot;) ? path : &quot;#{path}.html&quot; 
    assert_true File.exists?(page), &quot;Page NOT cached: #{url} (looking in #{page})&quot; 
  end
end
&lt;/code&gt;&lt;/pre&gt;
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2008-09-06:3108</id>
    <published>2008-09-06T00:35:00Z</published>
    <updated>2010-06-23T03:19:17Z</updated>
    <link href="http://www.pgrs.net/2008/9/6/capistrano-dry-run" rel="alternate" type="text/html"/>
    <title>Capistrano dry run</title>
<content type="html">
            &lt;p&gt;I submitted a patch to Capistrano to add a &#8220;--dry-run&#8221; option (or -n for short).  This flag causes Capistrano to print out all of the commands it would run without actually running them.  It is an easy way to see what a cap task will do to your servers before you run it.&lt;/p&gt;


	&lt;p&gt;My patch was accepted and released as part of Capistrano 2.5.0.  You can read more about the new features at:&lt;/p&gt;


	&lt;p&gt;&lt;a href=&quot;http://capify.org/2008/8/29/capistrano-2-5-0&quot;&gt;http://capify.org/2008/8/29/capistrano-2-5-0&lt;/a&gt;&lt;/p&gt;


	&lt;p&gt;and see the details of my commit at github:&lt;/p&gt;


	&lt;p&gt;&lt;a href=&quot;http://github.com/capistrano/capistrano/commit/7279a3858e2bcebe84735223d5f8b4397c4ad85b&quot;&gt;http://github.com/capistrano/capistrano/commit/7279a3858e2bcebe84735223d5f8b4397c4ad85b&lt;/a&gt;&lt;/p&gt;
          </content>  </entry>
  <entry xml:base="http://www.pgrs.net/">
    <author>
      <name>paul</name>
    </author>
    <id>tag:www.pgrs.net,2008-08-29:2863</id>
    <published>2008-08-29T05:31:00Z</published>
    <updated>2008-08-29T05:33:19Z</updated>
    <link href="http://www.pgrs.net/2008/8/29/improved-internationalization-test" rel="alternate" type="text/html"/>
    <title>Improved internationalization test</title>
<content type="html">
            &lt;p&gt;I wrote previously about how we test the internationalization of our website in &lt;a href=&quot;http://www.pgrs.net/2008/7/11/testing-internationalization-language-files&quot;&gt;Testing internationalization language files&lt;/a&gt;.  Basically, we generate a blank language file with the values of all of the labels set to blank.  We switch the site to this language, and then we spider the site looking for text.&lt;/p&gt;


	&lt;p&gt;Over the past couple of months, we have improved our internationalization test and removed some of the existing limitations.&lt;/p&gt;


	&lt;h3&gt;Manually marking nonlocalizable content&lt;/h3&gt;


	&lt;p&gt;One of the limitations of the approach detailed in the previous article is that we had to manually mark content on the page that should not be internationalized by adding a class to the html:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;html&quot;&gt;
&amp;lt;span class=&quot;nonlocalizable&quot;&amp;gt;&amp;lt;%= @building.address %&amp;gt;&amp;lt;/span&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;The basis of our new test is the idea that all text on the page is one of two types:&lt;/p&gt;
	&lt;ol&gt;
	&lt;li&gt;Labels and static text that live in the language files, which are inserted into the page using the &lt;a href=&quot;http://wiki.rubyonrails.org/rails/pages/GLoc&quot;&gt;GLoc&lt;/a&gt; method l()&lt;/li&gt;
		&lt;li&gt;Text that the application produces, which should be html escaped using the h() method in the views or helpers&lt;/li&gt;
	&lt;/ol&gt;


	&lt;p&gt;Therefore, if we intercept both of these types of text, we can find anything that is not localized or escaped.&lt;/p&gt;


	&lt;p&gt;Our new test setup looks like:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
def setup
  blank_out_localization
  blank_out_html_escape
end

def blank_out_localization
  GLoc::InstanceMethods.class_eval do
    alias :old_l :l
    def l(symbol, *arguments)
      &quot;&quot; 
    end
  end
end

def blank_out_html_escape
  ERB::Util.class_eval do
    alias :old_html_escape :html_escape
    def html_escape(s)
      &quot;&quot; 
    end

    alias :h :html_escape
  end
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;We redefine the l() method to return an empty string, so anything that is localized will no longer show up on the page.&lt;/p&gt;


	&lt;p&gt;The h() or html_escape() methods are used to escape strings for the web (for example, converting &#8216;&amp;lt;&#8217; into &#8216;&amp;amp;lt;&#8217;).  We also redefine these methods to return empty strings.  Now, all text on the webpage should be blanked out.&lt;/p&gt;


	&lt;p&gt;We then spider the site as before, which walks every page and checks for non blank text.&lt;/p&gt;


	&lt;p&gt;It is possible to restore the l() and h() methods in the teardown:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
def teardown
  restore_html_escape
  restore_localization
end

def restore_html_escape
  ERB::Util.class_eval do
    alias :html_escape :old_html_escape
    # h was aliased to the blanked-out method, so point it back as well
    alias :h :html_escape
  end
end

def restore_localization
  GLoc::InstanceMethods.class_eval do
    alias :l :old_l
  end
end
&lt;/code&gt;&lt;/pre&gt;
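
	&lt;p&gt;The alias-and-restore pattern can be tried in isolation with a small stand-in module.  In this sketch, Escaper, View, blank_out_escape, and restore_escape are hypothetical names invented for the example (they are not ERB::Util or GLoc).  It shows that an alias is a copy of the method as it was when the alias was made, so a short name has to be re-pointed both when blanking out and when restoring:&lt;/p&gt;

```ruby
# Hypothetical stand-in for a helper module; names are illustrative only.
module Escaper
  def escape(text)
    text.upcase
  end
  alias_method :e, :escape
end

class View
  include Escaper
end

def blank_out_escape
  Escaper.class_eval do
    alias_method :old_escape, :escape          # keep a copy of the original
    define_method(:escape) { |_text| '' }      # replace it with a blank version
    alias_method :e, :escape                   # re-point the short name at the blank version
  end
end

def restore_escape
  Escaper.class_eval do
    alias_method :escape, :old_escape          # bring back the original
    alias_method :e, :escape                   # the short name is a copy, so restore it too
  end
end

view = View.new
blank_out_escape
p view.e('text')
restore_escape
p view.e('text')
```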

	&lt;p&gt;However, I think it is safer to run this test in its own test suite in a separate ruby process.  That way, the l() and h() monkey patching cannot accidentally affect other tests:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
namespace :test do
  Rake::TestTask.new(:'internationalization' =&amp;gt; [&quot;environment&quot;, &quot;load_test_data&quot;]) do |t|
    t.libs &amp;lt;&amp;lt; &quot;test&quot; 
    t.pattern = &quot;test/acceptance/internationalization_test.rb&quot; 
    t.verbose = true
  end

  Rake::TestTask.new(:'acceptance' =&amp;gt; [&quot;environment&quot;, &quot;load_test_data&quot;]) do |t|
    t.libs &amp;lt;&amp;lt; &quot;test&quot; 
    t.pattern = FileList[&quot;test/acceptance/**/*_test.rb&quot;].exclude(&quot;test/acceptance/internationalization_test.rb&quot;)
    t.verbose = true
  end
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;Now, we no longer need to mark any content as nonlocalizable.  If the test fails, we either forgot to add a label to the language file, or we forgot to escape the text in the page:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
&amp;lt;%= l(:name_label) %&amp;gt;

or

&amp;lt;%= h(@building.address) %&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

	&lt;h3&gt;Redirects&lt;/h3&gt;


	&lt;p&gt;We noticed that Rails would send redirects as:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;html&quot;&gt;
&amp;lt;html&amp;gt;&amp;lt;body&amp;gt;You are being &amp;lt;a href=&quot;http://www.example.com/some/new/location&quot;&amp;gt;redirected&amp;lt;/a&amp;gt;.&amp;lt;/body&amp;gt;&amp;lt;/html&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;The http://www.example.com &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt; was tripping up SpiderTest, so we removed that part of each &lt;span class=&quot;caps&quot;&gt;URL&lt;/span&gt;.  Furthermore, we skip our page checking on redirect pages and assets:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
def consume_page(html, url)
  html.gsub!(&quot;http://www.example.com&quot;, &quot;&quot;)
  unless redirect?(html) || asset?(url)
    assert_page_has_been_moved_to_language_file(html, url)
  end
  super
end

def redirect?(html)
  html.include?(&quot;&amp;lt;body&amp;gt;You are being&quot;)
end

def asset?(url)
  File.exist?(File.expand_path(&quot;#{RAILS_ROOT}/public/#{url}&quot;))
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;h3&gt;Alt and title attributes&lt;/h3&gt;


	&lt;p&gt;We discovered with the original test that we were not testing alt and title attributes on the page.  For example, if you hover over a link, it will show the title.  We also want these strings internationalized, so we added them to the test with the following code:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
assert_attribute_does_not_contain_words body, url, 'title'
assert_attribute_does_not_contain_words body, url, 'alt'

def assert_attribute_does_not_contain_words body, url, attribute
  body.search(&quot;//*[@#{attribute}]&quot;) do |element|
    assert_does_not_contain_words element.get_attribute(attribute), url
  end
end
&lt;/code&gt;&lt;/pre&gt;

	&lt;h3&gt;Better error messages&lt;/h3&gt;


	&lt;p&gt;We noticed that if you accidentally forget to internationalize a string like &#8220;Please enter your username,&#8221; the test would fail with a message of &#8220;Found text that was not in the language file: Please.&#8221;  We thought it would be better to show the full string, so we replaced the regex:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
/\w+/
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;with&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
/[A-Za-z]([A-Za-z]| )*/
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;The second one matches a letter followed by any run of letters or spaces, so it picks up the entire phrase (up to the first digit or punctuation mark).&lt;/p&gt;
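

	&lt;p&gt;The difference is easy to see by running both patterns against the same string.  This is a small standalone sketch: reported_text is a made-up helper that returns what each pattern would put into the failure message.&lt;/p&gt;

```ruby
OLD_PATTERN = /\w+/
NEW_PATTERN = /[A-Za-z]([A-Za-z]| )*/

# Hypothetical helper: the text a pattern would report, or nil if nothing matches.
def reported_text(pattern, text)
  match = text.match(pattern)
  match ? match[0] : nil
end

text = 'Please enter your username, then continue'
p reported_text(OLD_PATTERN, text)  # stops at the first space
p reported_text(NEW_PATTERN, text)  # continues until the comma
```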


	&lt;h3&gt;Final result&lt;/h3&gt;


	&lt;p&gt;The final test looks like:&lt;/p&gt;


&lt;pre&gt;&lt;code class=&quot;ruby&quot;&gt;
require 'hpricot'

class InternationalizationTest &amp;lt; ActionController::IntegrationTest
  include Caboose::SpiderIntegrator

  def setup
    blank_out_localization
    blank_out_html_escape
  end

  def blank_out_localization
    GLoc::InstanceMethods.class_eval do
      alias :old_l :l
      def l(symbol, *arguments)
        &quot;&quot; 
      end
    end
  end

  def blank_out_html_escape
    ERB::Util.class_eval do
      alias :old_html_escape :html_escape
      def html_escape(s)
        &quot;&quot; 
      end

      alias :h :html_escape
    end
  end

  def test_all_text_has_been_moved_to_language_file
    get '/'
    assert_response :success
    spider(@response.body, '/', :verbose =&amp;gt; true)
  end

  def consume_page(html, url)
    html.gsub!(&quot;http://www.example.com&quot;, &quot;&quot;)
    unless redirect?(html) || asset?(url)
      assert_page_has_been_moved_to_language_file(html, url)
    end
    super
  end

  def redirect?(html)
    html.include?(&quot;&amp;lt;body&amp;gt;You are being&quot;)
  end

  def asset?(url)
    File.exist?(File.expand_path(&quot;#{RAILS_ROOT}/public/#{url}&quot;))
  end

  def assert_page_has_been_moved_to_language_file(page_text, url)
    doc = Hpricot.parse(page_text)
    assert_does_not_contain_words doc.at(&quot;title&quot;).inner_text, url
    body = doc.at('body')
    (body.search(&quot;//script[@type='text/javascript']&quot;)).remove
    assert_does_not_contain_words(body.inner_text, url)
    assert_attribute_does_not_contain_words body, url, 'title'
    assert_attribute_does_not_contain_words body, url, 'alt'
  end

  def assert_attribute_does_not_contain_words body, url, attribute
    body.search(&quot;//*[@#{attribute}]&quot;) do |element|
      assert_does_not_contain_words element.get_attribute(attribute), url
    end
  end

  def assert_does_not_contain_words text, url
    match = text.match(/[A-Za-z]([A-Za-z]| )*/)
    fail &quot;Found text that was not in the language file: #{match[0].inspect} on #{url}&quot; if match
  end  

end
&lt;/code&gt;&lt;/pre&gt;

	&lt;p&gt;These modifications have improved the quality of the internationalization test, and this test has been very useful at catching text that we forget to internationalize.&lt;/p&gt;
          </content>  </entry>
</feed>
