Oct 16 2012
 

This post is cross-posted at Scaling PostgreSQL at Braintree: Four Years of Evolution.

We love PostgreSQL at Braintree. Although we use many different data stores (such as Riak, MongoDB, Redis, and Memcached), most of our core data is stored in PostgreSQL. It’s not as sexy as the new NoSQL databases, but PostgreSQL is consistent and incredibly reliable, two properties we value when storing payment information.

We also love the ad-hoc querying that we get from a relational database. For example, if our traffic looks fishy, we can answer questions like “What is the percentage of Visa declines coming from Europe?” without having to pre-compute views or write complex map/reduce queries.

Our PostgreSQL setup has changed a lot over the last few years. In this post, I’m going to walk you through the evolution of how we host and use PostgreSQL. We’ve had a lot of help along the way from the very knowledgeable people at Command Prompt.

2008: The beginning

Like most Ruby on Rails apps in 2008, our gateway started out on MySQL. We ran a couple of app servers and two database servers replicated using DRBD. DRBD uses block level replication to mirror partitions between servers. This setup was fine at first, but as our traffic started growing, we began to see problems.

2010: The problems with MySQL

The biggest problem we faced was that schema migrations on large tables took a long time with MySQL. As our dataset grew, our deploys started taking longer and longer. We were iterating quickly, and our schema was evolving. We couldn’t keep affording to take downtime while we upgraded or even added a new index to a large table.

We explored various options with MySQL (such as oak-online-alter-table), but decided that we would rather move to a database that supported low-downtime schema changes directly. We were also starting to see deadlocks in MySQL on operations we felt shouldn’t deadlock. PostgreSQL solved this problem as well.

We migrated from MySQL to PostgreSQL in the fall of 2010. You can read more about the migration in the slides from my PgEast talk. PostgreSQL 9.0 had recently been released, but we chose version 8.4 because it had been out longer and was better known.

2010 – 2011: Initial PostgreSQL

We ran PostgreSQL on modest hardware, and we kept DRBD for replication. This worked fine at first, but as our traffic continued to grow, we needed some upgrades. Unlike most applications, we are much heavier on writes than reads. For every credit card that we charge, we store a lot of data (such as customer information, raw responses from the processing networks, and table audits).

Over the next year, we performed the following upgrades:

  • Tweaked our configs around checkpoints, shared buffers, work_mem and more (this is a great start: Tuning Your PostgreSQL Server)
  • Moved the Write Ahead Log (WAL) to its own partition (so fsyncs of the WAL don’t flush all of the dirty data files)
  • Moved the WAL to its own pair of disks (so the sequential writes of the WAL are not slowed down by the random read/write of the data files)
  • Added more RAM
  • Moved to better servers (24 cores, 16 disks, even more RAM)
  • Added more RAM again (kept adding to keep the working set in RAM)

Fall 2011: Sharding

These incremental improvements worked great for a long time, and our database was able to keep up with our ever increasing volume. In the summer of 2011, we started to feel like our traffic was going to outgrow a single server. We could keep buying better hardware, but we knew there was a limit.

We talked about a lot of different solutions, and in the end, we decided to horizontally shard our database by merchant. A merchant’s traffic would all live on one shard to make querying easier, but different merchants would live on different shards.

We used data_fabric to introduce sharding into our Rails app. data_fabric lets you specify which models are sharded, and gives you methods for activating a specific shard. In conjunction with data_fabric, we also wrote a fair amount of custom code for sharding. We sharded every table except for a handful of global tables, such as merchants and users. Since almost every URL has the merchant id in it, we were able to activate shards in application_controller.rb for 99% of our traffic with code that looked roughly like:

class ApplicationController < ActionController::Base
  around_filter :activate_shard
 
  def activate_shard(&block)
    merchant = Merchant.find_by_public_id(params[:merchant_id])
    DataFabric.activate_shard(:shard => merchant.shard, &block)
  end
end
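
To illustrate what activating a shard around a request involves, here is a minimal sketch in plain Ruby. It is not data_fabric's implementation: ShardContext, its methods, the lookup table, and the shard names are all hypothetical. The sketch stores the active shard in Thread.current and restores the previous value when the block exits, even if the block raises.

```ruby
# Hypothetical sketch: keep the active shard in Thread.current.
# This is NOT data_fabric's code, only an illustration of the idea.
module ShardContext
  KEY = :shard_context_current_shard

  # Run the block with the given shard active, then restore the previous one.
  def self.activate(shard)
    previous = Thread.current[KEY]
    Thread.current[KEY] = shard
    yield
  ensure
    Thread.current[KEY] = previous
  end

  def self.current
    Thread.current[KEY]
  end
end

# Assumed lookup table standing in for the global merchants table.
MERCHANT_SHARDS = { "merchant_a" => "shard_0", "merchant_b" => "shard_1" }.freeze

def shard_for_request(merchant_id)
  ShardContext.activate(MERCHANT_SHARDS.fetch(merchant_id)) do
    ShardContext.current # a real app would run its sharded queries here
  end
end
```

Restoring the previous shard in an ensure clause means a failed request cannot leave a stale shard active for the next request handled by the same thread.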

Making our code work with sharding was only half the battle. We still had to migrate merchants to a different shard (without downtime). We did this with londiste, a trigger-based replication tool. We set up the new database servers and used londiste to mirror the entire database between the current cluster (which we renamed to shard 0) and the new cluster (shard 1).

Then, we paused traffic[1], stopped replication, updated the shard column in the global database, and resumed traffic. The whole process was automated using capistrano. At this point, some requests went to the new database servers, and some to the old. Once we were sure everything was working, we removed the shard 0 data from shard 1 and vice versa.
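
The ordering of those steps matters, so here is a rough sketch of it in plain Ruby. This is not our capistrano code: cut_over_shard and the step names are hypothetical, and each step is just a callable supplied by the caller. The one design choice it shows is resuming traffic in an ensure clause, on the assumption that a failed step should not leave traffic paused.

```ruby
# Hypothetical sketch of the cutover order described above.
# None of these names come from capistrano or londiste.
def cut_over_shard(steps)
  steps.fetch(:pause_traffic).call
  begin
    steps.fetch(:stop_replication).call
    steps.fetch(:update_shard_column).call
  ensure
    # Always resume, so a failed step does not leave traffic paused.
    steps.fetch(:resume_traffic).call
  end
end

log = []
cut_over_shard(
  :pause_traffic       => lambda { log << "pause traffic" },
  :stop_replication    => lambda { log << "stop replication" },
  :update_shard_column => lambda { log << "update shard column" },
  :resume_traffic      => lambda { log << "resume traffic" }
)
puts log
```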

The final cutover was completed in the fall of 2011.

Spring 2012: DRBD Problems

Sharding took care of our performance problems, but in the spring of 2012, we started running into issues with our DRBD replication:

  • DRBD made replicating between two servers very easy, but more than two required complex stacked resources that were harder to orchestrate. It also required more moving pieces, like DRBD Proxy to prevent blocking writes between data centers.
  • DRBD is block level replication, so the filesystem is shared between servers. This means it can never be unmounted and checked (fsck) without taking downtime. We became increasingly concerned that filesystem corruption would go unnoticed and corrupt all servers in the cluster.
  • The filesystem can only be mounted on the primary server, so the standby servers sit idle. It is not possible to run read-only queries on them.
  • Failover required unmounting and remounting filesystems, so it was slower than desired. Also, since the filesystem was unmounted on the target server, once mounted, the filesystem cache was empty. This meant that our backup PostgreSQL was slow after failover, and we would see slow requests and sometimes timeouts.
  • We saw a couple of issues in our sandbox environment where DRBD issues on the secondary prevented writes on the primary node. Thankfully, these never occurred in production, but we had a lot of trouble tracking down the issue.
  • We were still using manual failover because we were scared of the horror stories with Pacemaker and DRBD causing split brain scenarios and data corruption. We wanted to get to automated failover, however.
  • DRBD required a kernel module, so we had to build and test a new module every time we upgraded the kernel.
  • One upgrade of DRBD caused a huge degradation of write performance. Thankfully, we discovered the issue in our test environment, but it was another reason to be wary of kernel level replication.

Given all of these concerns, we decided to leave DRBD replication and move to PostgreSQL streaming replication (which was new in PostgreSQL 9). We felt like it was a better fit for what we wanted to do. We could replicate to many servers easily, standby servers were queryable, which let us offload some expensive queries, and failover was very quick.

We made the switch during the summer of 2012.

Summer 2012: PostgreSQL 9.1

We updated our code to support PostgreSQL 9.1 (which involved very few code changes). Along with the upgrade, we wanted to move to fully automated failover. We decided to use Pacemaker and these great open source scripts for managing PostgreSQL streaming replication: https://github.com/t-matsuo/resource-agents/wiki. These scripts handle promotion, moving the database IPs, and even switching from sync to async mode if there are no more standby servers.

We set up our new database clusters (one per shard). We used two servers per datacenter, with synchronous replication within the datacenter and asynchronous replication between our datacenters. We configured Pacemaker and had the clusters ready to go (but empty). We performed extensive testing on this setup to fully understand the failover scenarios and exactly how Pacemaker would react.

We used londiste again to copy the data. Once the clusters were up to date, we did a similar cutover: we paused traffic, stopped londiste, updated our database.yml, and then resumed traffic. We did this one shard at a time, and the entire procedure was automated with capistrano. Again, we took no downtime.

Fall 2012: Today

Today, we’re in a good state with PostgreSQL. We have fully automated failover between servers (within a datacenter). Our cross datacenter failover is still manual since we want to be sure before we give up on an entire datacenter. We have automated capistrano tasks to orchestrate controlled failover using Pacemaker and traffic pausing. This means we can perform database maintenance with zero downtime.

One of our big lessons learned is that we need to continually invest in our PostgreSQL setup. We’re always watching our PostgreSQL performance and making adjustments where needed (new indexes, restructuring our data, config tuning, etc). Since our traffic continues to grow and we record more and more data, we know that our PostgreSQL setup will continue to evolve over the coming years.

[1] For more info on how we pause traffic, check out How We Moved Our Data Center 25 Miles Without Downtime and High Availability at Braintree

Jul 07 2012
 

There are two main types of unit tests: state based and interaction based. State based tests rely on the verification of state. These tests typically perform some operation and then check the state of the resulting object:

user = User.new :title => 'manager'
user.promote!
user.title.should == 'senior manager'

In contrast, interaction based tests rely on verification of the interaction between objects. This is generally done with mocks or stubs:

user = User.new
Mailer.should_receive(:send_email).with(user)
user.activate!

This test ensures that the activate! method interacts with the Mailer, without actually sending an email.

Interaction based testing has its uses, but in general, I prefer state based testing. I think that tests full of mocks are brittle and hard to read. Furthermore, when they fail, it can be difficult to understand why.

In my testing, I try to replace interaction based testing with state based testing when possible. I’m going to walk through an example where a typical test would use mocks, but I’m going to use a state based approach instead.

Say I have a Rails 3 application that accepts social security numbers. These are sensitive, so I want to make sure I don’t log them to the application log. A simple testing approach (using rspec) might look like:

require 'spec_helper'
 
describe MyController do
  describe "create" do
    it "filters sensitive data from the log" do
      Rails.logger.should_receive(:info).with { |message| message.include?("123-45-6789") }.never 
      post :create, :myobj => {:social_security_number => "123-45-6789"}
    end
  end
end

One problem with this test is that it’s a negative test. It can pass for the wrong reasons. For example, if the controller action blows up before it would normally log, the test may pass even though it would log under normal circumstances. It’s generally better to write a positive test:

require 'spec_helper'
 
describe MyController do
  describe "create" do
    it "filters sensitive data from the log" do
      Rails.logger.should_receive(:info).with(include('"social_security_number"=>"[FILTERED]"'))
      post :create, :myobj => {:social_security_number => "123-45-6789"}
    end
  end
end

This test is better, but still not great. One problem is that Rails.logger.info is now mocked out, so nothing is logged, which can make debugging a failing test more difficult. Another problem is that if the test fails, the output only says that the expectation was not met. It does not show you what was actually logged:

  1) MyController create filters sensitive data from the log
     Failure/Error: Rails.logger.should_receive(:info).with(include('"social_security_number"=>"[FILTERED]"'))
       (#<ActiveSupport::TaggedLogging:0x007fcf39201a10>).info(include "\"social_security_number\"=>\"[FILTERED]\"")
           expected: 1 time
           received: 0 times
     # ./spec/controllers/my_controller_spec.rb:7:in `block (3 levels) in <top (required)>'

Here is the state based test I prefer:

require 'spec_helper'
 
describe MyController do
  describe "create" do
    it "filters sensitive data from the log" do
      post :create, :myobj => {:social_security_number => "123-45-6789"}
 
      log_line = Rails.logger.log_history.grep(/Parameters:/).first
      log_line.should include('"social_security_number"=>"[FILTERED]"')
    end
  end
end

This test actually grabs the log line from a log history. If we don’t find the log line, it will be obvious. If we do find the line, we can compare the expected output with the actual one.

Now, the test failure contains both the expected and actual:

  1) MyController create filters sensitive data from the log
     Failure/Error: log_line.should include('"social_security_number"=>"[FILTERED]"')
       expected "  Parameters: {\"myobj\"=>{\"social_security_number\"=>\"123-45-6789\"}}" to include "\"social_security_number\"=>\"[FILTERED]\""
     # ./spec/controllers/my_controller_spec.rb:9:in `block (3 levels) in <top (required)>'

We can monkey patch Rails.logger to record the log lines in spec_helper.rb:

module LogHistory
  def add(_, _, message = nil, &block)
    log_history << message
    super
  end
 
  def log_history
    @log_history ||= []
  end
 
  def clear_log_history
    log_history.clear
  end
end
 
RSpec.configure do |config|
  config.before(:suite) do
    Rails.logger.extend(LogHistory)
  end
 
  config.before(:each) do
    Rails.logger.clear_log_history
  end
end

Now, every log line is recorded per test and cleared out in between. This pattern also works well for other external dependencies, such as database statements or queries sent to a search tool. They can be recorded and asserted against directly.
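
As a sketch of how the same recording idea could carry over to another dependency, here is a plain Ruby version for a search client. SearchClient and QueryHistory are hypothetical names made up for this example. The module records each query and then defers to the original method with super, just like LogHistory does for log lines.

```ruby
# Hypothetical search client; stands in for any external dependency.
class SearchClient
  def search(query)
    "results for #{query}" # a real client would call the search service
  end
end

# Same idea as LogHistory: record each call, then defer to the original.
module QueryHistory
  def search(query)
    query_history << query
    super
  end

  def query_history
    @query_history ||= []
  end
end

client = SearchClient.new
client.extend(QueryHistory)
client.search("visa declines")
client.query_history # the recorded queries, available for assertions
```

Because the real method still runs, the test exercises the actual behavior and can also assert on what was sent.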

Jun 03 2012
 

If you’re like me, you’ve switched all of your websites to paperless statements. No more bank, credit card, or utility statements arrive in the mail. But you would still like to have copies of these important documents, so you log in to each website and download the PDF statements.

Downloading these statements is tedious and time consuming. That’s where statement_hoarder comes in. It automates this process across many different sites.

To get started, follow the instructions on GitHub: statement_hoarder

If you are interested in contributing (for example, adding support for new websites), please help out on GitHub. Feel free to open issues, submit pull requests, and help make it better.

Mar 08 2012
 

This post is cross-posted at Data Migrations for NoSQL with Curator.

The NoSQL movement has brought us a wave of new data stores beyond the traditional relational databases. These data stores come with their own tradeoffs, but they provide some incredible benefits. At Braintree, we are moving in the direction of using Riak as our next generation data store. We love its focus on scalability and availability. Servers can fail without causing any downtime, and we can add more capacity by simply adding more servers to the cluster.

One great feature of relational databases, however, is the consistency in the shape of the data. You know if you have a people table, every row has the same columns. Some fields might be null, but there won’t be any surprises. Furthermore, if you want to rename or modify a column, it’s a simple operation. In the case of PostgreSQL and other databases, a rename is nearly instantaneous. We lose this ability with Riak and most NoSQL databases. We can easily add attributes (columns), but we cannot easily rename them or change the data within each document (row).

Since our apps are always evolving at Braintree, we needed a way for our data to keep up with our code. Our solution is something we’re calling lazy data migrations, and we’ve built it into our repository and model framework, curator. You can read more about curator on our blog at Untangle Domain and Persistence Logic with Curator.

The problem

Say we have a collection of people in Riak. This is analogous to a people table in a relational database. When we first built the app, we added fields for first_name and last_name:

person = Person.new(:first_name => "Joe", :last_name => "Smith")

Some time has passed, our app has data, and we now realize that names are a pain. What do we do with middle names? What about people with multiple first or last names? We want to simplify the system and collect only a name. We no longer care about separate first and last names. The problem is that we have a ton of data in the old format. How do we handle old records that have a first_name and last_name when, going forward, we want just name?

In a relational database, we would simply write a database migration that looks like:

ALTER TABLE people ADD COLUMN name VARCHAR;
UPDATE people SET name = first_name || ' ' || last_name;
ALTER TABLE people DROP COLUMN first_name, DROP COLUMN last_name;

This migration might take a while to run, but once it’s done, we know that all data has been migrated. We can then change all of our code to only deal with name, knowing we no longer have first_name or last_name.

In a NoSQL database like Riak, we cannot simply change the schema. We have to come up with a different solution. Here are the steps we went through in trying to come up with the solution that made its way into curator:

Solution attempt 1: Scattered conditionals

The first solution is to make the Person class smart enough to handle both cases.

class Person
  attr_accessor :first_name, :last_name, :name
end

We can populate whatever fields we get back from the data store. Then, when we want to do something with the name, we have to use code like:

if person.name
  puts "Name is #{person.name}"
else
  puts "Name is #{person.first_name} #{person.last_name}"
end

The problem with this approach is that we have to use branching code like this whenever we want to use the name. It quickly gets messy.

Solution attempt 2: Gathered conditionals

The second solution is to move this logic to the place where we read the Person out of the data store:

attributes = fetch_from_riak
if attributes[:name]
  person = Person.new(:name => attributes[:name])
else
  person = Person.new(:name => "#{attributes[:first_name]} #{attributes[:last_name]}")
end

Now, we only have to do it once and we can change our Person class to only know about name.

This solution works well, but what happens a year down the road when we’ve made lots of data changes to many different models? We don’t want a bunch of conditionals all over our persistence code.

Our solution: Lazy data migrations

We pulled the idea from solution 2 into the idea of a migration (similar to ActiveRecord migrations). Migrations target a given collection at a given version. They look like this:

class ConsolidateName < Curator::Migration
  def migrate(attributes)
    first_name = attributes.delete(:first_name)
    last_name = attributes.delete(:last_name)
    attributes.merge(:name => "#{first_name} #{last_name}")
  end
end

This migration is stored in db/migrate/people/0001_consolidate_name.rb. We’ve also added the concept of a version to each Model. By default, models start at version 0. When they are read from the Repository, the attributes are run through any migrations with a greater version (based on the version in the filename):

person = PersonRepository.find_by_key("person_id")
person.version #=> 1

Now, the migration logic is isolated from the rest of the application. The rest of the app can safely assume that all Person objects have only a name:

class Person
  current_version 1
  attr_accessor :name
end

We mark the Person class with current_version 1 to signify that new instances start at version 1, since they have a name attribute rather than first_name/last_name.

These migrations run when models are read, so they are lazy. Data will migrate as it’s used, and update when saved. This means that, unlike with relational databases, the website can be up and serving requests while the data is migrated.

If you want to force the data to migrate (and not wait for all data to be used), you can simply find models that haven’t been migrated and save them. The version attribute is indexed by default:

PersonRepository.find_by_version(0).each do |person|
  PersonRepository.save(person)
end

Testing

Unlike ActiveRecord migrations, curator migrations have no side effects. They simply accept a hash and return a new hash. This makes them easy to call from a unit test:

require 'spec_helper'
require 'db/migrate/people/0001_consolidate_name'
 
describe ConsolidateName do
  describe "migrate" do
    it "concatenates first_name and last_name" do
      attributes = {:first_name => "Joe", :last_name => "Smith"}
      ConsolidateName.new(1).migrate(attributes)[:name].should == "Joe Smith"
    end
  end
end

Limitations

Curator migrations are lazy, so at any given time you might have documents with different versions in the data store. This is not normally a problem since the migrations will run as soon as the objects are read. However, if you add a migration that changes an indexed field, you cannot rely on that index to return all of the correct values until you migrate them all. In this case, you might want to force migration by reading and saving all of the documents.

Next Steps

You can see these migrations in action in the curator_rails_example.

Let us know what you think about lazy data migrations in curator. Feel free to open issues on GitHub, submit pull requests, and help us make it better.

Feb 21 2012
 

This post is cross-posted at http://www.braintreepayments.com/devblog/untangle-domain-and-persistence-logic-with-curator.

The problem

Ruby on Rails is a great web framework, and we use it extensively at Braintree to build web apps. One criticism of Rails, however, is that it encourages tight coupling of domain and persistence layers. The convention in Ruby on Rails is for domain objects to extend ActiveRecord::Base and directly gain persistence. This makes building small apps easy, but as applications grow, the tight coupling starts to make things more difficult.

Domain objects in more complicated systems no longer have a simple mapping into a database row. For example, money might be represented as cents and currency in the database, but as a Money object in the domain layer. Now, the domain objects have to handle both the money logic and the money persistence. Even if the code is abstracted away by a gem, it still lives in the domain object. Take a look at the methods on a brand new model in Rails:

>> class Bar < ActiveRecord::Base
>> end
 
>> (Bar.methods.sort - Object.methods).count
=> 391

These methods are in addition to anything you add to the model.

The solution

There’s nothing in Rails that says you have to tie your domain objects to persistence. It’s merely a convention. At Braintree, we’re using a new convention for new projects. Our domain objects do not contain any persistence logic. Instead, we’ve introduced a repository layer which is in charge of the persistence of these domain objects. This pattern is well-documented in Domain Driven Design.

Introducing a Repository layer into our applications has a number of benefits. The main benefit is that it separates out different kinds of logic. Domain logic goes into the domain objects, and can be tested in isolation without any persistence. Persistence logic goes into the repository objects, which only deal with saving and retrieving objects.

The second big benefit is that abstracting persistence logic into a repository allows us to swap persistence back-ends without much trouble. The most notable case is that we can swap in an in-memory data store for testing (or for most tests), but use Riak for the real application. Since the repository interacts with the back-end data store, the application code is the same regardless of back-end. It also allows us to use different back-ends for different kinds of data but still have a consistent pattern around persistence.
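
Here is a small sketch of what swapping back-ends can look like. It does not show curator's actual API: InMemoryStore, NoteStoreSketch, and the put/get interface are assumptions made for this example. The point is only that the repository talks to whatever back-end it is given, so tests can hand it an in-memory one.

```ruby
# Hypothetical sketch of a swappable back-end; not curator's actual API.
class InMemoryStore
  def initialize
    @data = {}
  end

  def put(key, attributes)
    @data[key] = attributes.dup
  end

  def get(key)
    @data[key]
  end
end

class NoteStoreSketch
  def initialize(backend)
    @backend = backend # anything that responds to put and get
  end

  def save(id, attributes)
    @backend.put(id, attributes)
  end

  def find_by_id(id)
    @backend.get(id)
  end
end

repository = NoteStoreSketch.new(InMemoryStore.new)
repository.save("note_1", { :title => "My Note" })
repository.find_by_id("note_1")
```

A Riak-backed store would implement the same two methods, and the calling code would not change.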

We’ve extracted this model and repository framework from our current applications and released it as curator. Curator currently supports Riak as the persistence back-end, but more back-ends are coming soon.

Domain objects include Curator::Model, which just gives helper methods like an initialize that sets instance variables. It also adds some helper methods for Rails.

class Note
  include Curator::Model
  attr_accessor :id, :title, :description, :user_id
end
 
note = Note.new(:title => "My Note", :description => "My description")
puts note.description

These model classes don’t have a lot of extra stuff:

>> class Bar
>>   include Curator::Model
>> end
 
>> Bar.methods.sort - Object.methods
=> [:_to_partial_path, :current_version, :model_name, :version]

Repositories include the Curator::Repository module:

class NoteRepository
  include Curator::Repository
  indexed_fields :user_id
end

Repositories have save, find_by_id, and find_by methods for indexed fields:

note = Note.new(:user_id => "my_user")
NoteRepository.save(note)
 
NoteRepository.find_by_id(note.id)
NoteRepository.find_by_user_id("my_user")

As persistence gets more complicated, repositories can implement their own serialize and deserialize methods to handle any case. For example, suppose our note object contains a PDF:

class NoteRepository
  include Curator::Repository
 
  def self.serialize(note)
    attributes = super(note)
    attributes[:pdf] = Base64.encode64(note.pdf) if note.pdf
    attributes
  end
 
  def self.deserialize(attributes)
    note = super(attributes)
    note.pdf = Base64.decode64(attributes[:pdf]) if attributes[:pdf]
    note
  end
end

As you can see, all persistence logic around PDFs lives in the Repository. The Note class does not care how PDFs are stored.
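
The money example from earlier fits the same shape. The sketch below is plain Ruby and does not use curator: Money, PaymentMapping, and the attribute names are hypothetical. It shows a domain value being flattened into cents and currency for storage and rebuilt on the way out.

```ruby
# Hypothetical Money value object and mapping functions;
# these names are illustrative and not part of curator.
Money = Struct.new(:cents, :currency)

module PaymentMapping
  # Domain object -> flat attributes suitable for storage.
  def self.serialize(amount)
    { :amount_cents => amount.cents, :amount_currency => amount.currency }
  end

  # Stored attributes -> domain object.
  def self.deserialize(attributes)
    Money.new(attributes[:amount_cents], attributes[:amount_currency])
  end
end

stored = PaymentMapping.serialize(Money.new(1999, "USD"))
restored = PaymentMapping.deserialize(stored)
```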

Curator in action

If you want to see a simple, fully functional Rails application using curator, check out curator_rails_example. I will detail the relevant bits below.

Thanks to Rails 3.x, you can include ActiveModel::Validations in any class to get validations. So if you want your note class to validate, it would look like:

class Note
  include Curator::Model
  include ActiveModel::Validations
  attr_accessor :id, :title, :description
  validates :title, :presence => true
end

You can also build forms from these objects. new.html.erb would look like:

<%= form_for @note do |f| %>
  <%= f.error_messages %>
  <dl>
    <dt><%= f.label :title %></dt>
    <dd><%= f.text_field :title %></dd>
    <dt><%= f.label :description %></dt>
    <dd><%= f.text_area :description, :size => "60x12" %></dd>
  </dl>
  <%= f.submit "Create" %>
<% end %>

And the controller looks like:

class NotesController < ActionController::Base
  def new
    @note = Note.new
  end
 
  def create
    @note = Note.new(request.POST[:note])
    if @note.valid?
      NoteRepository.save(@note)
      redirect_to notes_path
    else
      render :new
    end
  end
end

That’s it. Rails makes it really easy to build forms and validate against our models.

Next Steps

Please check out the curator code and let us know what you think. Like all software, curator is a work in progress, so feel free to open issues on GitHub, submit pull requests, and help us make it better.

Dec 23 2011
 

I have wanted a home theater PC for a while to play movies, music, Hulu, and more through my TV. I thought briefly about buying a prebuilt device (like a Roku), but I decided to hack one together myself. My main motivation was flexibility and the fact that I like to tinker. Furthermore, I wanted a device with a hard drive that I could load up with files and play directly to a TV.

I asked around for recommendations, and several of my coworkers recommended Zotac. After looking through their options, I decided on a Zotac ZBOX ID41 (here it is on Amazon).

Setting up Ubuntu

I decided to run Ubuntu 11.10 Oneiric Ocelot for the operating system. The Zotac ZBOX has no CD drive, so I downloaded the Ubuntu ISO from the website, and I used UNetbootin to make a bootable USB drive.

I plugged an HDMI cable from the Zotac into my TV, and I plugged a wired USB keyboard and mouse into the Zotac. I inserted the USB drive and booted it up. It automatically booted from the drive, and I selected Install.

Once the installation was finished, I had to go to Settings -> Sound and choose the HDMI output. Then, sound correctly played through my TV.

Setting up XBMC

There is a lot of software for home theater PCs. I chose XBMC since it seems very polished and has a lot of plugins for different video sources (such as YouTube). It’s also optimized for a remote instead of requiring a full keyboard.

I had to add an extra apt repository in order to install the latest version of XBMC (11.0 beta1). Here are the commands I ran:

sudo add-apt-repository ppa:nathan-renniewaldock/xbmc-stable
sudo apt-get update
sudo apt-get install xbmc

Then, I launched XBMC from Ubuntu, which brought up its home screen.

I went to Settings and changed my region to where I live. I also told XBMC to update my library on start. Then, I went to Weather and entered my zip code to get local weather. Next, I went to Videos -> Add ons and installed a bunch of video add ons (such as YouTube, HGTV, etc).

At this point, I had to manually run XBMC whenever I restarted the computer. There are many strategies to boot directly into XBMC. I decided to just add XBMC as a Startup Item in Ubuntu. I also allowed Ubuntu to boot without a password. So Ubuntu boots up and then runs XBMC when it’s finished. If I need to go back to Ubuntu, I can just exit from XBMC.

Uploading media

XBMC uses filenames to figure out which movie or TV show a file represents. For example, with default settings, episodes of a TV show need to be named in one of the following formats:

foo.s01e01.*
foo.s01.e01.*
foo.s01_e01.*
foo_[s01]_[e01]_*
foo.1x01.*
foo.101.*

These files should be in a folder with the name of the show. The full set of instructions can be found on the XBMC website at TV Shows (Video Library).
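
If you want to check your filenames before uploading them, a short Ruby script can approximate the formats listed above. This is my own rough sketch, not XBMC's actual matching rules: the patterns and the parse_episode helper are assumptions based only on that list.

```ruby
# Rough approximations of the filename formats listed above.
# These are NOT XBMC's own regular expressions.
EPISODE_PATTERNS = [
  /s(\d+)[._]?e(\d+)/i,        # foo.s01e01, foo.s01.e01, foo.s01_e01
  /\[s(\d+)\]_\[e(\d+)\]/i,    # foo_[s01]_[e01]_
  /[._](\d+)x(\d+)[._]/i,      # foo.1x01.
  /[._](\d)(\d{2})[._]/        # foo.101.
].freeze

# Returns the season and episode numbers, or nil if nothing matches.
def parse_episode(filename)
  EPISODE_PATTERNS.each do |pattern|
    match = pattern.match(filename)
    return { :season => match[1].to_i, :episode => match[2].to_i } if match
  end
  nil
end

parse_episode("foo.s01e02.avi")
```

A file that returns nil here is worth renaming before adding it to the library.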

I created a Movies directory and a TV Shows directory under /home/paul/Videos and uploaded my media. Then, I added these directories to XBMC and set the content type to Movies and TV Shows respectively (see Media Sources).

Finally, I quit XBMC and restarted it. It scanned my directories and added the movies and TV shows to the library.

Hulu

Unfortunately, there are no officially supported Hulu plugins for XBMC. I managed to get an unofficial plugin working with some extra effort using Bluecop’s beta video plugin repository.

I downloaded the zip file from the site above and saved it to the Zotac. Then, in XBMC, I went to Settings -> Add ons and chose to install add ons from a zip file. At this point, the Hulu plugin appeared in the list of available plugins to install. I installed the plugin, but it still didn’t work.

By looking in the XBMC log (~/.xbmc/temp/xbmc.log), I discovered that Hulu requires a newer version of rtmpdump than Ubuntu ships with.

I built the latest rtmpdump from source with the following commands:

# OpenSSL headers, needed to build librtmp
sudo apt-get install libssl-dev
git clone git://git.ffmpeg.org/rtmpdump
cd rtmpdump
make SYS=posix
# Overwrites the librtmp that Ubuntu installed
sudo cp librtmp/librtmp.so.0 /usr/lib/i386-linux-gnu/librtmp.so.0

The last step copies the new librtmp on top of the existing one, which isn’t great. Unfortunately, XBMC runs with a specified library path, so I couldn’t find an easy way to add a different directory to the path.

Now, Hulu works. Unfortunately, I have to browse for videos to play. If I search for a video, and then play it, I get the following error in the xbmc.log:

ERROR: Error Type: <type 'exceptions.AttributeError'>
ERROR: Error Contents: _Info instance has no attribute 'videoid'
ERROR: Traceback (most recent call last):
                      File "/home/paul/.xbmc/addons/plugin.video.hulu/default.py", line 56, in <module>
                        modes ( )
                      File "/home/paul/.xbmc/addons/plugin.video.hulu/default.py", line 36, in modes
                        stream_media.Main()
                      File "/home/paul/.xbmc/addons/plugin.video.hulu/resources/lib/stream_hulu.py", line 148, in __init__
                        self.queueAD(video_id,2,1)
                      File "/home/paul/.xbmc/addons/plugin.video.hulu/resources/lib/stream_hulu.py", line 420, in queueAD
                        u += '&videoid="'+urllib.quote_plus(common.args.videoid)+'"'
                    AttributeError: _Info instance has no attribute 'videoid'

Browsing works just fine, though, so I only browse for videos.
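
The traceback shows that the args object has no videoid attribute on the search code path. As a rough illustration of how that kind of failure arises, and one defensive pattern for it, here is a small Python sketch. The Info class and ad_query function are hypothetical stand-ins, not the plugin's code, and this is not a tested fix for the Hulu plugin. It uses Python 3's urllib.parse.quote_plus where the plugin uses Python 2's urllib.quote_plus.

```python
from urllib.parse import quote_plus  # Python 3 home of Python 2's urllib.quote_plus


class Info:
    """Hypothetical stand-in for the plugin's _Info: keyword arguments become attributes."""

    def __init__(self, **kwargs):
        self.__dict__.update(kwargs)


def ad_query(args):
    """Build the videoid query fragment, tolerating a missing attribute."""
    # getattr with a default avoids the AttributeError seen in the log.
    video_id = getattr(args, "videoid", None)
    if video_id is None:
        return ""
    return '&videoid="' + quote_plus(video_id) + '"'


if __name__ == "__main__":
    browsed = Info(mode="play", videoid="12345")  # attribute present
    searched = Info(mode="play")                  # attribute missing, as in the log
    print(ad_query(browsed))
    print(repr(ad_query(searched)))
```

Whether skipping the parameter is acceptable for the real plugin is an open question. The sketch only shows why one path fails while the other works.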

Next Steps

I’m still using a wired USB keyboard and mouse, but I’m planning to get a home-theater-friendly wireless keyboard/mouse combo, something along the lines of a Rii Mini Wireless Keyboard.

There are also hundreds of XBMC plugins I haven’t checked out yet, as well as other ways to customize XBMC.

Dec 15 2011

We have a suite of Ruby applications at Braintree. We often need to run more than one of these apps at the same time. For example, we might need to run our login app in addition to whichever app we are working on.

We use foreman as our process launcher, but foreman does not handle running other projects in other rvm gemsets (and possibly other Ruby versions). One of my fellow developers, Tony Pitluga, released a gem called subcontractor that handles these cases.

For example, our Procfile looks like:

app: rails server
login: subcontract --rvm "--with-rubies rvmrc" --chdir ../login --signal INT -- rails server -p 4000

This runs the current app on port 3000 and the login app on port 4000 with a different working directory, Ruby, and gemset. The login line is roughly equivalent to running cd ../login && rvm --with-rubies rvmrc exec rails server -p 4000. The option --with-rubies rvmrc tells rvm to use the .rvmrc file to load the correct Ruby and gemset. If a project doesn’t have an .rvmrc file, you can specify the Ruby version and gemset directly:

another_app: subcontract --rvm ruby-1.8.7-p249@another_app --chdir ../another_app --signal INT -- rails server -p 4000
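
To spell out the rough equivalence described above, here is a small Python sketch that rewrites a subcontract command as the matching cd ... && rvm ... exec ... one-liner. The equivalent_command helper is hypothetical and only mirrors the mapping given in this post. It is not how subcontractor is implemented, and it assumes the options come as flag/value pairs before the bare --, as in the two Procfile lines.

```python
import shlex


def equivalent_command(subcontract_line):
    """Turn a `subcontract ...` command into its rough shell equivalent."""
    tokens = shlex.split(subcontract_line)
    if not tokens or tokens[0] != "subcontract":
        raise ValueError("expected a command starting with 'subcontract'")

    # Everything after the bare "--" is the command to run in the other project.
    separator = tokens.index("--")
    command = " ".join(tokens[separator + 1:])

    # Options before "--" are assumed to be flag/value pairs.
    flags = tokens[1:separator]
    options = dict(zip(flags[0::2], flags[1::2]))

    # --signal has no counterpart in the one-line equivalent, so it is not used here.
    return "cd {chdir} && rvm {rvm} exec {command}".format(
        chdir=options["--chdir"], rvm=options["--rvm"], command=command
    )


if __name__ == "__main__":
    line = 'subcontract --rvm "--with-rubies rvmrc" --chdir ../login --signal INT -- rails server -p 4000'
    print(equivalent_command(line))
```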

Dec 06 2011

I just released version 1.0 of speclj-growl. This is a plugin for the speclj test framework for Clojure that adds Growl popups for each spec run. For those unfamiliar with Growl, it is a notification system for Mac OS X that allows applications to show custom messages in system popups.

speclj comes with autotest built in. You can start an autotest process that monitors the filesystem for changes. When tests or code change, the tests are automatically run again. speclj-growl adds to this process by showing the test results in a Growl popup, so you can stay within your editor, glance at the corner of the screen, and see whether your tests passed or failed.

For example, say you have a Clojure project using both speclj and speclj-growl (in my example, the speclj-growl project itself). The command lein spec -a -f growl will start autotest (-a) and add the growl formatter (-f growl):

% lein spec -a -f growl

----- Tue Dec 06 23:03:32 CST 2011 -------------------------------------------------------------------
took 0.89041 seconds to determine file statuses.
reloading files:
  /Users/paul/speclj-growl/spec/speclj/report/growl_spec.clj

Growl Reporter
  report-runs
  - growls summary information for no test runs
  - growls a successful run
  - growls an unsuccessful run

Finished in 0.02542 seconds
3 examples, 0 failures

A Growl message also appears on screen when the specs finish.

Now, if I change a spec to intentionally break it, the terminal automatically updates with the new spec output and a new growl message appears:

----- Tue Dec 06 23:04:58 CST 2011 -------------------------------------------------------------------
took 0.00414 seconds to determine file statuses.
reloading files:
  /Users/paul/speclj-growl/spec/speclj/report/growl_spec.clj

Growl Reporter
  report-runs
  - growls summary information for no test runs (FAILED)
  - growls a successful run
  - growls an unsuccessful run

Failures:

  1) Growl Reporter report-runs growls summary information for no test runs
     Expected: <"Failure">
          got: <"Success"> (using =)
     /Users/paul/speclj-growl/spec/speclj/report/growl_spec.clj:28

Finished in 0.00705 seconds
3 examples, 1 failures

This cycle continues every time the source or spec files change.

Check out the code and installation instructions on GitHub: https://github.com/pgr0ss/speclj-growl

Thanks to Micah Martin for both speclj and help with speclj-growl.