This post is cross-posted at Safe Operations For High Volume
We use PostgreSQL extensively at
Braintree, and it backs many of our highly available services (including
our main payments API).
We are constantly building and refining our products, and this often
means evolving our database schema. In general, PostgreSQL is great at
this, and we can make many different types of schema changes without
downtime. There are some gotchas, however, that this post will cover.
We almost never take scheduled downtime of our payments API. This means
we run our database schema migrations while the gateway is up and
serving requests. We have to be very careful about what database
operations we run. If we run a bad command, it can lock out updates to a
table for a long time.
For example, if we create a new index on our customers table, we cannot
create new customers while that index is building. Anyone who tries to
perform a customer create will block, and possibly time out, causing a
In general, we are ok with database operations taking a long time.
However, any operation that locks a table for updates for more than a
few seconds means downtime for us.
You can learn more about our high availability approaches: Ruby Conf
Australia: High Availability at
We derived the lists below through extensive testing, trial and error.
Here’s what we can safely do in a migration without downtime:
|Can do this|
|Add a new column|
|Drop a column|
|Add an index concurrently|
|Drop a constraint (for example, non-nullable)|
|Add a default value to an existing column|
Here’s the stuff we cannot do, and our current workarounds:
|Cannot do this on big tables||Our workaround|
|Add an index||Add the index using the CONCURRENTLY keyword|
|Change the type of a column||Add a new column, change the code to write to both columns, and backfill the new column|
|Add a column with a default||Add column, add default as a separate command, and backfill the column with the default value|
|Add a column that is non-nullable||Create a new table with the addition of the non-nullable column, write to both tables, backfill, and then switch to the new table [^1]|
|Add a column with a unique constraint||Add column, add unique index concurrently, and then add the constraint onto the table [^2]|
|VACUUM FULL [^3]||We use pg_repack instead|
If you have other workarounds, please share them in the comments.
CREATE UNIQUE INDEX CONCURRENTLY token_is_unique ON large_table(token); ALTER TABLE large_table ADD CONSTRAINT token UNIQUE USING INDEX token_is_unique;