Software testing: how we test our platform on a daily basis
When developing a new feature, you often check whether the feature actually works: does it do what you expect, and does it break when you do something unexpected? Developers repeat this process over and over until the feature is ready and can be deployed.
But this way of testing has drawbacks: what if the developer thought of scenario A, but forgot about scenario B (or C, or D)? What if a feature breaks functionality somewhere else on the site? Should you check every scenario that can happen on your website? If we did, we would probably spend our entire day testing instead of developing, so we usually opt for local testing only (checking only what the feature affects). As a result, developers cannot guarantee that their feature doesn't break other parts of the site.
To be clear, there is no way to achieve a 100% guarantee, but the next best thing is a test strategy that consists of multiple forms of automated testing, each serving a different purpose.
At Seams-CMS, we primarily focus on the following tests:
- Unit tests
- Regression tests
- End-to-end tests
A unit test is a test that focuses on the smallest parts of the code. This could be as small as a single function. These tests are usually written by the developer who created the function or feature. The sole purpose of these unit tests is to check that when we pass a function a specific value, we get the expected value back. We also check what happens when we pass unexpected values (or even no values at all): does the function behave as expected?
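As a minimal sketch of what such a test looks like (the `slugify` helper and its behavior are hypothetical, not actual Seams-CMS code):

```python
def slugify(title):
    """Hypothetical helper: turn a page title into a URL slug."""
    return "-".join(title.lower().split())

# pytest-style unit tests: each checks one behavior of the function.
def test_slugify_expected_value():
    # Passing a specific value should return the expected slug.
    assert slugify("Hello World") == "hello-world"

def test_slugify_unexpected_value():
    # An empty title should not crash; it simply yields an empty slug.
    assert slugify("") == ""
```

Each test exercises the function in isolation, so a failure points directly at the function under test.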
Usually, these unit tests run fast and without external dependencies. For instance, when we have a function that saves some data to a database, we don't check whether the data actually ends up in the database; instead, we check whether the function calls the correct database functions by "mocking" the database. With mocking, we replace components such as the database (but also other parts of the code) with dummy code that we control. So when checking a function that sends a confirmation email to a user, we check whether that function creates an email with the expected content, whether the subject is correct, whether it is sent to the right person, etc. Unit tests are therefore by definition fast and contained, meaning we can run them over and over again during development (in some cases, tests can even run simultaneously with development, so you get instant feedback on your code).
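The email example above can be sketched with Python's `unittest.mock` (the `send_confirmation` function and the mailer interface are made up for illustration):

```python
from unittest.mock import Mock

def send_confirmation(mailer, user_email):
    """Hypothetical function under test: composes and sends a confirmation email."""
    mailer.send(
        to=user_email,
        subject="Please confirm your account",
        body=f"Hi! Click the link to confirm your account ({user_email}).",
    )

# Replace the real mailer with a mock, so no email actually goes out.
mailer = Mock()
send_confirmation(mailer, "alice@example.com")

# Check that the function called the mailer exactly once,
# with the right recipient and subject.
mailer.send.assert_called_once()
_, kwargs = mailer.send.call_args
assert kwargs["to"] == "alice@example.com"
assert kwargs["subject"] == "Please confirm your account"
```

Because the mailer is a mock, the test stays fast and never touches a real mail server.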
At Seams-CMS, every time new code is deployed to our platform, we run our unit tests. They must pass 100% before the code gets deployed.
Sometimes, when you discover a bug, you need to fix the code. But how do we know the fix actually resolves the bug? To know for sure, we usually create a regression test (which is simply a unit test) that encapsulates the bug: the test triggers the bug, so by definition it will fail at first.
At this point, we can fix the bug and possibly write some more tests if needed; once the bug is fixed, the regression test will pass.
An extra benefit is that we now have a test that checks for that specific bug and makes sure it does not return (which could otherwise happen for any number of reasons).
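A small, hypothetical example of the pattern (the discount function and the bug are invented for illustration):

```python
# Suppose a bug was reported: apply_discount crashed on a 100% discount,
# because the original (buggy) version divided by (100 - percent) and
# raised ZeroDivisionError when percent == 100.

def apply_discount(price, percent):
    """Fixed implementation: a percentage discount applied to a price."""
    return price * (100 - percent) / 100

def test_regression_full_discount():
    # Encapsulates the reported bug: a 100% discount must yield 0, not crash.
    # Before the fix this test failed; now it guards against the bug returning.
    assert apply_discount(50.0, 100) == 0.0
```

The test stays in the suite forever, so any future change that reintroduces the bug fails immediately.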
End-to-end tests are tests that exercise a whole system from start to finish (or end to end). Such a test sits in the user's chair and behaves the same way an end-user would, expecting to see everything an end-user would see. So if there is an end-to-end test where we register as a new user, we expect to be able to log in as that user, to receive an email with information for that user, etc.
Often, end-to-end tests for web applications are run through so-called headless browsers: applications that can control a browser programmatically. So you could create a test that opens your website in a browser and programmatically checks whether there is a menu present, whether there is a "welcome" text, whether there are buttons to go to other pages, etc.
At Seams-CMS, we use the Cypress Suite to test our platform.
Our tests run every night (or whenever a developer wants) and consist of roughly the following steps:
- Create a completely isolated platform environment: an automatically created duplicate of our platform, for the sole purpose of testing. For this, we use Docker containerization.
- Initialize our test platform with data. This data is partially random content and partially fixed content so we can create tests that expect certain users and content to be present.
- Split all the tests into 20 pieces and spin up 20 test platforms on 20 different virtual machines, one piece per machine.
- Collect the results from all 20 machines, and aggregate them into a big report.
- Store the report and notify developers about the results.
The reason we test on 20 different machines is that end-to-end tests can take a long time. Since we don't want to wait that long, we run the tests in parallel. This means we have to write tests in such a way that they don't interfere with one another (which is how tests should be written anyway, but which is often overlooked). Thanks to this parallelization, we can run our tests in less than 15 minutes, instead of the hours we would otherwise have to wait. If, after writing more and more tests, we pass our 15-minute threshold, we can increase the number of machines (30 instead of 20, for instance) to make sure we stay below 15 minutes.
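The splitting step can be sketched as round-robin chunking of test files over the available machines (the file names and the count of 45 specs are made up for the example):

```python
def split_tests(test_files, machines=20):
    """Distribute test files over N machines round-robin,
    so each machine runs a roughly equal share in parallel."""
    chunks = [[] for _ in range(machines)]
    for i, test_file in enumerate(test_files):
        chunks[i % machines].append(test_file)
    return chunks

# Illustrative spec file names; a real suite would list its own spec files.
files = [f"spec_{n}.js" for n in range(45)]
chunks = split_tests(files, machines=20)
```

With independent tests, each chunk can run on its own machine, and total wall-clock time shrinks roughly by the number of machines.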