In this blog post we are going to take a glance at Rainforest's virtual machine (VM) infrastructure. We will explain why we use VMs in the first place, what requirements we have for them, how the build process works and how we decide which VMs should be running at a given time.
Let’s start with a question: "Why do we need virtual machines?". One might wonder why we bother maintaining an infrastructure of thousands of concurrent VMs — it's a considerable effort. The main reason is that with VMs we can have the tests up and running on an environment that has a clean state. If we relied on our testers' own workstations, factors including settings, plugins installed and additional software would have an impact on the test results.
In order to run the tests on a clean state, we use a fresh virtual machine for each test. After the test is finished, the VM is destroyed and another one is launched. The number of VMs we create daily is a bit short of 200k!
Using VMs allows us to have more control over the scenarios we want to test. We may want to run tests on an environment with a combination of software installed on it. Using VMs gives us a perfectly consistent experience between each test, since they are all running the same (virtual) hardware. We can then rule out failures that might happen in one test but not another due to, for example, a bug that is specific to a graphics card. The same applies to the device drivers.
There is another benefit of using the VMs: as long as there’s a VM available, you can start testing right away. There’s no need to change any settings, open or close applications, or clear cookies. VMs make the whole testing experience faster.
To cover all of the test cases our customers need while keeping the experience consistent, we have many different VM templates. These templates usually consist of a specific version of an operating system and a browser (or application, e.g. on mobile tests). We do this to make sure that when a test is started, the test only has what’s necessary to run through the steps. The browser or application is already launched, so they don’t have to wait for it to spin up to get started. Think of it as a combination of a particular software stack where you want to test your application: Windows 7 Chrome 57, Windows 10 Edge, Android Phone, Android Tablet etc. This leaves us with about 50 different VM templates, each one for a particular scenario.
Every Rainforest customer has access to the templates for the most common software configurations that are found "in the wild," but we also build custom VMs to cover more specific needs from our clients. Those can include additional software, different settings, etc.
We use a system to create the VM templates that is built on top of Packer. This allow us to create a new template, or update an existing one, in a completely programmatic way in just a few minutes. No human intervention required besides testing the template at the end. Additionally, we can reuse parts of the build when they are required by multiple templates. For example, there are quite a few similarities between our Windows 7 Chrome and Firefox virtual machines, so we have common parts of the build scripts that are used on both. This helps us keep the configuration updated with less effort.
We want our VMs to be launched before a test is actually requested to shorten the time it takes for the overall test runs to be completed. So, after a test is finished and the VM is destroyed, we immediately launch a new one.
How do we know what VM template we should launch? We have a scheduler that knows which VMs should be running at a given time. It works by analyzing the trends from the last day, plus looking at the queue of work, and then starts the right VMs accordingly.
This approach has some drawbacks, as you can have VMs launched that are not the ones that are actually required for the requested tests. To prevent this, the scheduler knows monitors what is needed and will destroy VMs that are standing by and launch whatever is currently needed.
The effort of maintaining our own virtual machine infrastructure is well worth it for us. We can have more control over each individual virtual machine as well as the whole fleet, knowing exactly what is installed in the VMs and what is running at any given time. It makes it easier for us to reproduce bugs, which are easier to troubleshoot because we have a better understanding of the environment in which they were discovered. We can also predict and reduce the overall running time of tests because we control all of the parts.
We hope you enjoyed this introduction to the Rainforest VM infrastructure! We’d love your feedback on how to make our processes even better. Make sure to subscribe to get blog updates if you like the post — we’ll be providing more detailed and technical information about our VMs in future posts.