Zero downtime database migrations
Last updated in October 2020.
All developers will at some time or other have to deploy code that changes the structure of a database. You might have downtime during these deploys due to database changes being required by the new code that you just deployed. There are easy ways to work around those downtimes and provide you a real zero downtime deployment process.
How does that work? I found out that to explain the basics of how to handle database migrations properly, it works better with some examples.
Before jumping to the examples, let me tell you that we, at Rainforest, are using Rails and its built-in database migration system, so this might be slightly different for you but the idea is the same. We also use Postgres which doesn't create the obstacles that some other database systems have, for example with locking tables on schema changes. If you use MySQL, you should definitely take a look at the Large Hadron Migrator which is used by a bunch of people including GitHub and Shopify.
Before we get started, if you remember one thing from this blog post it should be plan your database migrations carefully. It takes more time to do migrations with no downtime but it is worth it.
Case #1: I want to add a new column
This is an easy one.
Before, you would have added a new column, let's call it 'status' and deployed the code at the same time. For a brief moment, your could have tried to insert data in a column that is not existing (yet) and experience errors which could make your application unusable. Let's see how we can do this safely without impacting our clients.
- Add a migration to create that new column.
- Deploy that new code with this migration, after getting through your usual process (feature branch, pull request, code review, QA, etc.).
- Run the migration to create the new column.
- Get back to your code editor and add your new feature using that new column.
- Ship it.
As you can see, we did not add code to use it in the first place but just added the new column, and then in a second phase we actually started to use it. There was no downtime and no client was harmed in the making of this example.
Case #2: I want to change a column type
This is already way more complex.
Let's say you have a column for which you want to change the type. There could be a couple of reasons for that, but let's say you have a "text" column you are using to store serialized JSON. Hey, this could be a native Postgres JSON column, right? Let's do the change.
We'll have to go through a few phases here
- Add a migration to create the new column called "my_column_JSON"
- Ship it!
- Add a migration to migrate all the data from "my_column" to the new column. You should probably do it in chunks. What we do at Rainforest is usually kick off a background job that will do the migration in batches of 100 or 1000. Once that background job is done, it's kicking off another one, and so on. The idea is to prevent any transactions taking too long. Again, the idea is to have little to no impact on clients and overall performance.
- As part of this step, you also need to add a trigger of some sort updating the JSON version when a row is added or updated. In Rails, this could be as simple as adding this to your model: ruby class MyModel serialize :my_column before_save do my_column_JSON = my_column end end
- Ship it!
- Rename the columns: the old one to "my_column_deprecated_27_06_2014" and the new one to "my_column".
- Ship it!
- Remove all the hacks from the phase 3.
- Ship it!
We're done, you can move on to your next feature!
Case #3: I want to rename a column
You probably got the point now, right? So let's make this one more straightforward.
- Create the column with the new name you want
- Add some sort of trigger to fill the new column as the new rows are created (as shown in the previous example)
- Migrate all the current data
- Change the code to use that new column
- Rename the old column to a "my_old_column_deprecated"
Bonus: I want to delete a column
"Hey JP, there is no downtime usually when I do this!". Correct. I just wanted to tell you a little trick we use at Rainforest. Don't you like free tips?
If you just remove the column, well, this is not reversible and you can't rollback as the data will have been deleted.
It's simple: don't delete it just yet - just rename the column to "ZZZ_my_column_deprecated_20201007" and boom, you can rollback! ZZZ also means it won't show first, and is easy to find.
Some more tips
I hope that will help you to get your deployment process as frictionless as possible.
Thanks for reading.