Implementing Continuous Delivery: How We Ship Code at Rainforest
Our basic process and tools for shipping code at Rainforest haven't changed much in the last year or so (we use GitHub for hosting our code, CircleCI for CI and continuous deployment, and Heroku for our infrastructure), but we've been tweaking and improving things as needed. Most notably, we've recently automated a few more parts of our deployment process and open-sourced the results as a gem called Circlemator (pull requests welcome!).
Continuous Delivery at Rainforest
It should come as no surprise that we're big fans of continuous delivery here at Rainforest and try to adopt it as the basis for our deployment strategy. But what does continuous delivery mean from a practical perspective? For us, continuous delivery has two equally important aspects:
- Eliminating “bad shipping roadblocks”.
- Streamlining and automating “good shipping roadblocks”.
What makes a roadblock good or bad? "Good roadblocks" are hurdles that prevent bugs and low-quality code from reaching production, such as code review, unit tests, and of course QA 😉. Bad roadblocks, on the other hand, are all the "button-pushing" and bureaucratic activities that traditional deployment processes are rife with: cutting releases, copying code, manually restarting servers, tweaking configurations, and the like.
A Code Commit's Incredible Journey
A Rainforest code commit has to jump through quite a few hoops before it makes it to production:
1. The code starts out on a feature branch (we never commit directly to the develop or master branches). Once the author is convinced that it’s ready to ship, they open a pull request to the develop branch.
2. At least one developer reviews the code; if there are any issues, the original author makes revisions until the reviewer gives a 👍.
3. Our unit test suite is run against the feature branch on CircleCI.
4. Once unit tests pass and the reviewer is satisfied, the code is merged to the develop branch.
5. A developer (typically the author or the merger) opens up a release pull request from develop to master, with a brief description of the changes.
6. The unit tests are run again (in case of issues introduced by integration with other pull requests); the code is deployed to our QA environment; and our Rainforest QA suite is run against the QA environment.
7. If the unit tests and QA suite pass, the code is merged to the master branch (more on that in a bit).
8. Once merged to the master branch, the unit tests are run a final time and the new code is deployed to our production environment on Heroku.
(For more details on our branching strategy and how we do code reviews, see our blog post on how we use version control.)
It's worth noting that these steps involve a mixture of human intervention and automation: steps 1 and 2 are necessarily manual, while steps 3, 6, and 8 are easily automatable with a CI server such as CircleCI. But what about 4 and 7? Should code be merged automatically? And is step 5 even necessary at all?
Eliminating a Bad Roadblock: Auto-Merging to Master
How automatic your deployment process should be depends on a number of factors, such as your risk tolerance and your confidence in your testing and monitoring tools. We've kept step 4 (merging to develop) a manual process for now, since automating it has limited benefits and invites potential mishaps.
Merging to master (step 7), on the other hand, was until recently a "bad roadblock": it generally involved a developer keeping tabs on the build process and hitting the merge button once it finished. Even worse, sometimes the build would succeed without anyone merging, leading to lost opportunities to ship.
Since code review takes place before code hits develop, the final merge wasn't really adding any assurances; it seemed like a good candidate for automation. To eliminate the roadblock, we wrote a small utility (part of Circlemator) to self-merge the release pull request at the end of the build. This may seem like a small change, but it's had a surprisingly positive impact on our release cadence.
There are still some situations where releasing automatically could be a bad idea (for instance, at the end of a day when no one will be around to monitor production). We kept step 5 as a compromise: if no release pull request is open, Circlemator won't merge to master. This keeps our development cadence running smoothly while still letting us hold back a production release under special circumstances, simply by not opening the release pull request.
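The merge rule described above can be sketched in a few lines of Ruby. This is only an illustration under our own assumptions, not Circlemator's actual code: the `PullRequest` struct, `find_release_pr`, and `release_decision` are hypothetical names, and a real implementation would fetch pull requests from the GitHub API rather than from an in-memory list.

```ruby
# Hypothetical sketch: none of these names come from Circlemator itself.
PullRequest = Struct.new(:number, :head, :base, :open, keyword_init: true)

# Returns the open release pull request (develop -> master), or nil.
def find_release_pr(pull_requests)
  pull_requests.find { |pr| pr.open && pr.head == 'develop' && pr.base == 'master' }
end

# Decides what to do at the end of a develop build.
# Returns [:merge, pr] or [:skip, reason].
def release_decision(build_passed:, pull_requests:)
  return [:skip, 'build failed'] unless build_passed

  pr = find_release_pr(pull_requests)
  return [:skip, 'no release pull request open'] if pr.nil?

  [:merge, pr]
end

# Made-up pull requests for illustration.
prs = [
  PullRequest.new(number: 101, head: 'feature/login', base: 'develop', open: true),
  PullRequest.new(number: 102, head: 'develop', base: 'master', open: true)
]

action, detail = release_decision(build_passed: true, pull_requests: prs)
summary = detail.is_a?(PullRequest) ? "pull request #{detail.number}" : detail
puts "#{action}: #{summary}"
```

Keeping the decision separate from the API calls makes the "no release pull request, no merge" rule easy to test on its own.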
Streamlining Good Roadblocks
A Dash of Automated Code Review
Code review is an inherently manual process, since it involves human judgment. But that doesn't mean it can't be streamlined with a healthy dose of automation.
Typical reviews often include a fair amount of bikeshedding about code style and formatting, as well as checking for common "gotchas". To minimize trivialities and keep code reviews focused on important issues, we introduced automatic style-checking through RuboCop for our Ruby projects. We use a standardized style configuration (open sourced as rf-stylez) and included a task in Circlemator to make comments on pull requests when there are style violations. (There are commercial products that do similar things, but we found the open source Pronto gem to be sufficient for our needs.)
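To give a feel for what turning style violations into review comments involves, here is a rough sketch that converts a report in the shape produced by `rubocop --format json` into one comment per offense. It is not how Pronto or Circlemator do it internally; the `review_comments` helper and the sample file and offense are made up for illustration, and actually posting the comments to a pull request is left out.

```ruby
require 'json'

# Sample report in the shape produced by `rubocop --format json`
# (trimmed to the fields used here; the file and offense are made up).
SAMPLE_REPORT = <<~JSON
  {
    "files": [
      {
        "path": "app/models/user.rb",
        "offenses": [
          {
            "severity": "convention",
            "message": "Prefer single-quoted strings.",
            "cop_name": "Style/StringLiterals",
            "location": { "line": 12, "column": 5 }
          }
        ]
      },
      { "path": "app/models/account.rb", "offenses": [] }
    ]
  }
JSON

# Converts a report into one comment hash per offense.
def review_comments(report_json)
  report = JSON.parse(report_json)
  report.fetch('files').flat_map do |file|
    file.fetch('offenses').map do |offense|
      {
        path: file.fetch('path'),
        line: offense.dig('location', 'line'),
        body: "#{offense['cop_name']}: #{offense['message']}"
      }
    end
  end
end

review_comments(SAMPLE_REPORT).each do |comment|
  puts "#{comment[:path]}:#{comment[:line]} #{comment[:body]}"
end
```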
Finding the Right Reviewer
Code reviews, while awesome, can quickly turn into a major bottleneck if they're "hierarchical"—i.e. each review has to come from a "more senior" developer than the code author. To avoid this, our code review process (like the rest of our development process) is based almost entirely on peer feedback.
Still, any codebase will have a few areas that require extra scrutiny; a peer code review may not be enough for security-related code, for instance. To help make sure changes to safety-critical code don't accidentally slip past scrutiny, we wrote a small utility called Commentator (naturally included in Circlemator) that we use to add checklists to pull requests when particular files change. For instance, any changes to our payment code have to be looked at by two developers, including our CTO.
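The idea behind file-triggered checklists is simple enough to sketch. The snippet below is not Commentator's real configuration or code: the path patterns, checklist wording, and helper names are all assumptions chosen for illustration. It matches changed files against glob patterns with Ruby's `File.fnmatch?` and builds a task-list comment from the rules that fire.

```ruby
# Hypothetical rules: glob pattern => checklist items for the pull request.
# Without File::FNM_PATHNAME, "*" also matches "/" and so covers subdirectories.
CHECKLIST_RULES = {
  'app/payments/*' => ['Reviewed by a second developer', 'Reviewed by the CTO'],
  'lib/auth/*' => ['Security review completed']
}.freeze

# Returns the checklist items triggered by the changed files, without duplicates.
def checklist_for(changed_files, rules = CHECKLIST_RULES)
  rules.flat_map do |pattern, items|
    matched = changed_files.any? { |path| File.fnmatch?(pattern, path) }
    matched ? items : []
  end.uniq
end

# Formats the items as a task list for a pull request comment.
def checklist_comment(items)
  items.map { |item| "- [ ] #{item}" }.join("\n")
end

changed = ['app/payments/stripe/charge.rb', 'README.md']
puts checklist_comment(checklist_for(changed))
```

Because the rules are plain data, adding a new sensitive area only means adding one more pattern and its checklist.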
Our deployment strategy works well overall, but there's still plenty of room for improvement. Here are some improvements we're planning to implement:
Our unit test suite is pretty extensive, and is getting a bit slow as it expands: even heavily parallelized through CircleCI, it takes about 20 minutes. On top of that, our Rainforest test suite takes another 20 minutes on average. That means that the total time for a commit to get to production once pushed is about an hour and a half in the best case—acceptable for most features, but a bit on the slow side for bug fixes. That increases the temptation to break process in "special circumstances", something we try to avoid whenever possible.
There are a few steps we're planning to take in the near future to speed things up:
- The obvious first step is to run our unit tests and our Rainforest tests at the same time (right now they’re run sequentially), which would shave about 15 minutes from our develop build.
- Our unit test suite could do with a lot of optimization work; we haven’t yet put in the time to speed things up, but there’s probably some low-hanging fruit.
- If all else fails, we can always increase parallelism on CircleCI.
I'd ideally like to get our feature branch builds under 10 minutes and our develop builds to 20 minutes.
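As a rough illustration of the first step (running suites side by side instead of one after the other), here is a small Ruby sketch using threads. The two lambdas are placeholders that just sleep; in a real build they would start the unit tests and the Rainforest run, and the Rainforest run would still have to wait for the QA deployment, so the actual saving depends on how the steps are arranged.

```ruby
# Runs each named task in its own thread and returns { name => result }.
# Each task is a callable returning true (pass) or false (fail).
def run_concurrently(tasks)
  threads = tasks.map do |name, task|
    Thread.new { [name, task.call] }
  end
  # Thread#value waits for the thread to finish and returns its result.
  threads.map(&:value).to_h
end

# Placeholder suites: the sleeps stand in for real test runs.
suites = {
  unit_tests: -> { sleep 0.2; true },
  rainforest_tests: -> { sleep 0.2; true }
}

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
results = run_concurrently(suites)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts "all passed: #{results.values.all?}"
puts format('elapsed: %.2fs (run sequentially, the sleeps alone would take 0.40s)', elapsed)
```

With both suites running at once, the wall-clock time is governed by the slower suite rather than by the sum of the two.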
More Automated Code Review
Integrating Benchmarks Into the Build
So far, all of our automated "code hurdles" involve quality and bug prevention; we don't have any checks for runtime speed. Integrating an automated benchmark suite into the build would be a great way to make sure we don't inadvertently introduce performance regressions, as well as making sure we're hitting our overall production performance targets.
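One simple shape such a check could take is a timing gate that compares a measurement against a stored baseline. The sketch below uses Ruby's standard `Benchmark` module; the baseline value, the 10% tolerance, and the helper names are assumptions for illustration, and the summed range is just a stand-in workload.

```ruby
require 'benchmark'

# Returns the best (lowest) wall-clock time in seconds over several runs,
# which reduces noise from one-off slow runs.
def best_time(runs: 5)
  Array.new(runs) { Benchmark.realtime { yield } }.min
end

# True when the measured time exceeds the baseline by more than the tolerance.
def regression?(measured, baseline, tolerance: 0.10)
  measured > baseline * (1 + tolerance)
end

# Hypothetical baseline in seconds, e.g. recorded from a previous build.
BASELINE_SECONDS = 0.05

measured = best_time { (1..50_000).reduce(:+) }
if regression?(measured, BASELINE_SECONDS)
  puts format('regression: %.4fs vs baseline %.4fs', measured, BASELINE_SECONDS)
else
  puts format('ok: %.4fs is within budget', measured)
end
```

In a build, the regression branch would exit with a non-zero status so the CI server marks the build as failed. Timings on shared CI machines are noisy, so the tolerance would likely need tuning.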
Long Term: Moving to a "Train"
Right now our code is released in batches from our develop branch, so each release will typically involve a number of pull requests (and merging a new pull request to develop will cancel previous develop builds). We've considered moving to a "train" model instead, where multiple develop builds can be run in parallel and released sequentially. There are a number of technical challenges involved, however, so we're not sure whether it's worth the development effort at the moment (improving build speed seems like lower hanging fruit).
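The core rule of a train (build in parallel, release in order) can be sketched as follows. This is a toy model under our own assumptions, not a design we have built: the `Build` struct and `releasable` helper are hypothetical, and the hard parts mentioned above, such as rebuilding later builds when an earlier one fails, are ignored.

```ruby
# Hypothetical build record: status is :pending, :passed or :failed.
Build = Struct.new(:id, :status)

# Returns the builds that can be released now, in queue order: the leading
# run of passed builds. A pending or failed build holds back everything behind it.
def releasable(queue)
  queue.take_while { |build| build.status == :passed }
end

queue = [
  Build.new(1, :passed),
  Build.new(2, :passed),
  Build.new(3, :pending), # still running, so build 4 has to wait
  Build.new(4, :passed)
]

puts releasable(queue).map(&:id).inspect # prints [1, 2]
```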
Tell Us How You Do Continuous Delivery!
We're pretty happy with our deployment strategy: it strikes a balance between shipping speed and quality control that we're comfortable with for now. Every development team is different, though, and we're always curious to know how other teams deploy code. In particular, we're always looking for more things to automate (assuming it's worth the effort). If you can think of anything, let us know!