Optimal Environment Setup
In this post we're going to look at optimal environments for web apps. This is part two in a series; the first post looks at what environments are for.
To start with, let's run over an optimal version of the most basic environments that everyone should have.
Production is where your website runs and serves customers. It's sometimes also referred to as 'live'. An optimal Production environment should have the following qualities:
- Stable – doesn’t break repeatedly
- Performant – fast enough not to annoy your users
- Backed up regularly – you can recover in case of a disaster / hack
- Easy to manage – simple, doesn’t require constant maintenance
- Repeatable – as in, repeatably creatable (avoid hand-made unicorns)
To achieve these, you'll need to pick everything from your hosting provider to software stack with care - much like baking a cake - to end up with a great result.
Of these, regular backups are the key. Getting this wrong (and being unlucky) can be lethal for your company.
Whatever your production environment ends up being (software- and hardware-wise), it will become a de facto template for later environments. Therefore, making Production choices that fit within your other constraints (e.g. are your developers using Mac / Windows machines?) will make adopting the optimal set of non-production environments easier.
Let's look at those non-prod environments.
Development & Test
Assuming you're no longer editing code in Production, the next two most common and essential environments are test and development:
- Test – for running tests against
- Development – for a working local version of your project
For best results, you should run the same software versions as in Production.
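One way to keep an eye on this is to compare version listings between environments. The following is a minimal sketch rather than a prescribed tool: the `version_drift` helper and the version manifests are hypothetical, and in practice the manifests might be generated from lock files or package listings on each machine.

```python
def version_drift(production, other):
    """Return {name: (production_version, other_version)} for every mismatch."""
    drift = {}
    for name in sorted(set(production) | set(other)):
        prod_version = production.get(name)
        other_version = other.get(name)
        if prod_version != other_version:
            # A missing entry shows up as None on that side.
            drift[name] = (prod_version, other_version)
    return drift


# Hypothetical manifests, written by hand for illustration only.
production = {"ruby": "2.1.2", "postgresql": "9.3", "redis": "2.8"}
development = {"ruby": "2.1.2", "postgresql": "9.1"}

for name, (prod, dev) in version_drift(production, development).items():
    print(f"{name}: production={prod} development={dev}")
```

An empty result means the two environments agree on every listed component; anything else is a candidate for a "works on my machine" bug.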
How far should you go?
It's easy to get carried away and create environments for every possible use-case. However, setting them up takes time away from developing your actual product and so should be balanced against any potential upside. Since this is highly context-dependent, I'm going to outline the optimal set of environments for individuals, small teams and more advanced or complex projects separately.
Optimal environment setup for Individuals
Assuming you've got the basics of test, development and production, what should you do for the simplest of projects? For smaller projects, generally those with a single developer, it's all too easy to go overboard with environments. I suggest you keep it simple, and just add a Staging environment to the mix.
Staging is simple: it should be as close a copy to Production as you can afford. Staging's aim is to allow you to test larger and / or more dangerous changes in as similar a setting to Production as possible.
An optimal Staging setup:
- Is running the code which you are planning to release
- Has a recent (optionally sanitized [1]) version of Production data
- Includes the same services and versions as Production (except when testing upgrades of those services, e.g. does Apache 2.2 -> 2.4 really work?)
- Has the same OS as Production
- Is running on the same hardware as Production
If you can't do all of the above, due to time or money, start from the top and work your way down.
The second item, having a recent copy of production data, is surprisingly effective at catching missed errors fast. Without this, it's common to find migrations which work locally but fail in Production, causing all kinds of issues. Having a database that's very similar to production (a few days to a week old) makes it more likely that migrations will work, as they have to run over much of the same data.
Note: this can become impractical at larger scale; moving the data, let alone sanitizing it, can take too long. In this case, taking a subset of your production data can be a good halfway house.
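To make sanitizing and subsetting concrete, here is a small sketch that works on rows already loaded into memory. Everything in it is an assumption for illustration: the `id`, `email` and `name` fields, the helper names, and the choice of a truncated hash as a placeholder. A hash of a real email is not strong anonymization, so treat this as a starting point only.

```python
import hashlib


def sanitize_email(email):
    """Replace an email with a stable placeholder address."""
    digest = hashlib.sha256(email.lower().encode("utf-8")).hexdigest()[:12]
    return f"user-{digest}@example.com"


def sanitize_row(row):
    """Return a copy of a user row with personal fields scrubbed."""
    clean = dict(row)
    clean["email"] = sanitize_email(row["email"])
    clean["name"] = f"User {row['id']}"
    return clean


def take_subset(rows, keep_one_in):
    """Keep a deterministic subset: rows whose id is divisible by keep_one_in."""
    return [row for row in rows if row["id"] % keep_one_in == 0]


# Hypothetical production rows.
rows = [
    {"id": i, "email": f"person{i}@customer.test", "name": f"Person {i}"}
    for i in range(1, 11)
]
staging_rows = [sanitize_row(row) for row in take_subset(rows, keep_one_in=5)]
print(staging_rows)
```

The placeholder is derived from the original value so that the same input always maps to the same output, which keeps unique constraints and cross-table references consistent after sanitizing.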
For Small Teams
For smaller teams, adding an extra environment for QA is essential. It's different to staging in that it should have a known and consistent state; having this allows members of your team to manually (or hopefully for you, automatically via Rainforest) test your software in a rigorous way.
A great QA environment setup should have the following properties in order of importance:
- Be running the code which you are planning to release
- Have a known consistent state that is set or reset during the deploy process
- Run the same services and versions as production (unless you’re testing a service upgrade)
- Have the same OS and hardware as Production
Number 2 is something that catches people out. For any software that has state, it's essential in order to avoid confusion and missed bugs. The easiest way to achieve this is to reset your database and seed it with known data after each deploy.
Getting this set up can be as easy as using basic tools like rake db:seed. If you have a lot of data to seed (e.g. Rainforest has > 1000 accounts with data for QA testing, which takes ~25 minutes to regenerate), then restoring a database backup is faster, though less flexible.
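The reset-and-seed idea can be sketched in a few lines. This example uses Python's built-in sqlite3 module as a stand-in for a real QA database; the `accounts` table and the seed rows are hypothetical.

```python
import sqlite3

# Hypothetical, fixed seed data: every deploy ends with exactly these rows.
SEED_ACCOUNTS = [
    (1, "qa-admin@example.com", "admin"),
    (2, "qa-user@example.com", "member"),
]


def reset_and_seed(conn):
    """Drop and recreate the accounts table, then load the fixed seed rows."""
    with conn:  # commits on success, rolls back on error
        conn.execute("DROP TABLE IF EXISTS accounts")
        conn.execute(
            "CREATE TABLE accounts ("
            "id INTEGER PRIMARY KEY, email TEXT NOT NULL, role TEXT NOT NULL)"
        )
        conn.executemany(
            "INSERT INTO accounts (id, email, role) VALUES (?, ?, ?)",
            SEED_ACCOUNTS,
        )


conn = sqlite3.connect(":memory:")
reset_and_seed(conn)

# Simulate state left behind by a previous round of testing.
conn.execute(
    "INSERT INTO accounts (email, role) VALUES ('leftover@example.com', 'member')"
)
conn.commit()

# The next deploy resets everything back to the known state.
reset_and_seed(conn)
print(conn.execute("SELECT COUNT(*) FROM accounts").fetchone()[0])
```

Because the function always starts from an empty table, whatever earlier test runs left behind cannot leak into the next one.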
We use the awesome CircleCI to run our tests and manage our deployments. This is a cut-down version of what we run live. It does the following:
- Runs our tests (including Jasmine)
- The development branch is deployed to two Heroku apps: herokuapp-stg and herokuapp-qa
- Staging has its DB migrated, and is then restarted to make sure all the changes are noticed
- For QA we additionally un-gzip a seeded dump file, which provides ~1000 consistent accounts for our testing.
- Finally, Rainforest is triggered (we test Rainforest’s QA env with Rainforest. Meta++)
This is our circle.yml file:
- git fetch --unshallow
- git push -f email@example.com:herokuapp-stg.git $CIRCLE_SHA1:master
- heroku run rake db:migrate --app herokuapp-stg: timeout: 1800
- heroku restart --app herokuapp-stg
- git push -f firstname.lastname@example.org:herokuapp-qa.git $CIRCLE_SHA1:master
- heroku pg:reset DATABASE --app herokuapp-qa --confirm herokuapp-qa
- gzip -dc db/seeded.dump.gz | heroku pg:psql --app herokuapp-qa
- heroku run rake db:migrate --app herokuapp-qa: timeout: 1800
- heroku restart --app herokuapp-qa
- "curl https://herokuapp-qa.herokuapp.com/ > /dev/null"
- bundle exec rainforest run --token $RF_TOKEN --tag run-me --conflict abort --fail-fast -fg
- git fetch --unshallow
- git push -f email@example.com:herokuapp-prd.git $CIRCLE_SHA1:master
- heroku run rake db:migrate --app herokuapp-prd: timeout: 1800
- heroku restart --app herokuapp-prd
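The property that matters in a listing like this is that steps run in order and a failure stops everything after it. The sketch below illustrates that fail-fast idea only; it is not how CircleCI itself is implemented. The `run_steps` helper and the step names are hypothetical, and the commands are stand-ins so the example runs anywhere.

```python
import subprocess
import sys


def run_steps(steps, runner=subprocess.call):
    """Run each (name, command) in order; stop at the first non-zero exit.

    Returns the names of the steps that completed successfully.
    """
    completed = []
    for name, command in steps:
        exit_code = runner(command)
        if exit_code != 0:
            print(f"step '{name}' failed with exit code {exit_code}; stopping")
            break
        completed.append(name)
    return completed


# Stand-in commands; a real pipeline would call the git, heroku and
# rainforest commands from the listing above instead.
ok = [sys.executable, "-c", "raise SystemExit(0)"]
fail = [sys.executable, "-c", "raise SystemExit(1)"]

steps = [
    ("push to QA", ok),
    ("reset and seed QA database", ok),
    ("migrate QA database", fail),
    ("run QA tests", ok),
]
completed = run_steps(steps)
print(completed)
```

Here the simulated migration failure means the QA tests never run against a half-prepared database.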
In addition to automating deployment and resetting databases, an oft-missed essential is documentation. Having a super-fancy deployment process is great, but it's a hindrance if (when?) it fails when you're on vacation and your whole team is blocked figuring things out. Document it.
For Complex Projects
More complicated projects, or teams with more developers, have some interesting additional requirements to make them optimal.
Some important questions to be considered when designing an environment setup for complex projects and / or larger teams:
- Who really needs prod access?
- Key storage: where are keys stored? Does every developer need access?
- Have you automated your backups? Do you test them regularly? Is the testing automated?
Optimal things, in order of importance:
- Test each Pull Request like you would with Staging. With a larger team this provides an extra layer on top of code review and is faster [3]. This results in fewer things being missed before getting to Staging / QA, which makes everyone else more efficient.
- Make separate accounts (not just keys) for each environment. Specifically for services such as AWS, Stripe, Mixpanel; things that matter, that affect your business if they go wrong. Splitting accounts is partly for isolation, mainly for security – this mitigates many types of accidents as well as malicious changes to Production environments.
- Automate your backups AND automate the testing of them. The most basic way to test a backup is to restore it to Staging and have a culture of using Staging in your team.
- Document all the things. The bare minimum is:
How do new team members get started? Cover:
- What’s expected of them to contribute (e.g. tested code -> pull request tagging @rainforestapp/dev)
- Who’s on call
- SEV levels: what they are and what they mean [4]
- How to deal with incidents: who to call, what the SEV levels are, and playbooks. If you’re interested in this topic, watch this talk by Blake Gentry: Every Minute Counts: Coordinating Heroku’s Incident Response.
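For the backup-testing item above, the shape of an automated check is: take a backup, restore it somewhere disposable, and confirm the result is usable. The sketch below uses Python's built-in sqlite3 module as a stand-in for a real database and a scratch in-memory database as a stand-in for Staging; the function names and the `accounts` table are hypothetical.

```python
import os
import sqlite3
import tempfile


def backup_database(source, backup_path):
    """Copy the live database into a backup file."""
    target = sqlite3.connect(backup_path)
    try:
        source.backup(target)
    finally:
        target.close()


def verify_backup(backup_path, expected_tables):
    """Restore the backup into a scratch database and run basic checks."""
    backup = sqlite3.connect(backup_path)
    scratch = sqlite3.connect(":memory:")
    try:
        backup.backup(scratch)
        integrity = scratch.execute("PRAGMA integrity_check").fetchone()[0]
        tables = {
            row[0]
            for row in scratch.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table'"
            )
        }
        return integrity == "ok" and set(expected_tables) <= tables
    finally:
        scratch.close()
        backup.close()


# Hypothetical live database with a single table.
live = sqlite3.connect(":memory:")
live.execute("CREATE TABLE accounts (id INTEGER PRIMARY KEY, email TEXT)")
live.execute("INSERT INTO accounts (email) VALUES ('someone@example.com')")
live.commit()

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "backup.db")
    backup_database(live, path)
    backup_ok = verify_backup(path, ["accounts"])
    missing_table_ok = verify_backup(path, ["accounts", "invoices"])

print(backup_ok, missing_table_ok)
```

A check like this only proves the backup restores and contains the tables you expect; actually using the restored copy, as suggested above, is what catches subtler problems.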
Regarding #1: If you're lucky enough to use Heroku, Fourchette is a life changer; once you try it, you won't go back. We've got an upcoming post on this subject, so subscribe at the bottom to get emailed when it comes out.
We'll be expanding on each type of project in more detail over the coming months. Let us know via Twitter which you'd like first, and subscribe to Deployment Academy to get our latest content first!
So now that you are convinced, where should you start? It depends on your needs and budget, but some great resources to read are attached below.
[1] I say optionally as some projects don’t have data needing sanitization, whereas some will require it. If you’re storing data that isn’t “internal”, you should probably sanitize it.
[2] The answer is no.
[3] No more pulling a branch, migrating, checking something and having to rollback. No Sir.
[4] We have SEV1-4, with SEV1 being the highest (everyone gets woken up). We’ll do a post on this later.