My current project has a CruiseControl.rb build dedicated to testing our database rollback scripts. Since this does not seem to be a common practice, I thought that I would share our experience.
We have one DDL file and one DML file per release. These are written using rails migration syntax, but we hand over only SQL to the client for releasing. The client wants a rollback script for each release to rollback the database schema and data (as much as possible) to the previous release.
Rollback scripts are hand created and prone to bugs, so we wanted to test them.
We set up a cruise loop specifically designed to test the rollback script. Our criteria for determining if the rollback was successful is:
- The schema of the old version matches the schema of the rolled back database
- The content tables are identical to the old versions. These are tables full of static data that we show on the page. They change with each release as the web pages evolve.
- We can run the old version of the tests against the rolled back database.
Say the team is currently working on release 3. Here is what the cruise build loop does:
- Builds the database to release 2 by running each version of the DDL and DML in succession (eg, DDL v1, DML v1, DDL v2, DML v2)
- Dumps the database schema to a file using “rake db:schema:dump”
- Copies the static data tables to backup tables, eg: content -> content_last_release
- Builds the database to release 3 (DDL v3, DML v3)
- Runs the rollback script
- Dumps the database schema again and compares it with the saved file from step #2
- Checks that the static tables match the backups (content and content_last_release are identical)
- Runs the test suite from the release 2 branch on the current database
When comparing the content tables, it is tempting to write code that pulls the data from both tables as arrays and then does an assert_equal:
content = [..] content_last_release = [..] assert_equal content_last_release, content
However, when this assertion fails, the output will be two huge arrays that are mostly similar. It is very hard to look at the output and figure out what is different. We used code like this instead:
content = [..] content_last_release = [..] assert_equal , content - content_last_release assert_equal , content_last_release - content
Now, only the differences are displayed when it fails.
The cruise loop has been incredibly helpful. We found numerous bugs in the rollback script. Additionally, we no longer forget to update it when we make database changes.
Obviously, there is always some data that cannot be rolled back. The rolled back database will not be identical to the database before the upgrade. However, this cruise loop ensures that it is as similar as possible.