In our previous article, Testing in the DevOps Pipeline, we discussed release pipelines and the automated test cases that help make them a reality. We started with these topics because they touch on some of the newer, more exciting ways that adopting DevOps methodologies impacts the entire software development lifecycle. Being able to automatically build, test, and deploy code changes at the touch of a button has transformed many historically manual processes, like testing.
In this brave new world of automated tests and deployment, is there still room for more “traditional” testing practices? And does having more automated testing lead to more frequent and higher-quality releases?
Not a magic bullet
While the ability to rapidly repeat tests for each new build — or on a schedule — provides a high level of confidence in system stability, automated testing should not be seen as a magic bullet.
Automated tests provide their greatest return on investment when they are reserved primarily for regression and smoke test suites, while most new feature testing remains manual.
When determining which test cases should be automated and which should be left manual, the upfront cost of automation can’t be ignored. In the time it takes to go through the entire process of creating a new automated test for a single new feature, a manual tester could have verified multiple new features.
With the upfront cost of time in mind, we recommend automating test cases that are mission-critical or that will be repeated many times, and keeping one-off feature testing manual.
A hybrid approach
While many within the industry believe more automation leads to faster releases, we find that a more “hybrid approach” often leads to faster and higher-quality releases.
In fact, some of our clients that have managed to deploy multiple times a day have done so by committing the resources needed to fully adopt the hybrid testing approach. These clients established a separate Scrum team solely dedicated to developing the testing frameworks and the automated regression and smoke test suites.
Having a team dedicated to this work increases confidence in overall system stability with each new build, freeing the feature teams to concentrate on new features. The testers on the feature teams get the system-level confidence they need to push out new builds without worrying about automating every incremental release. As a result, the teams are able to get new builds out much more quickly, deploying multiple production changes during work hours each day. It can also lower the cost and the technical background required of testers on the feature teams: instead of an SDET on each team, we can have a few SDETs on the framework team and primarily manual testers on the feature teams.
The test automation pyramid
As part of evaluating the cost of automation, another model to keep in mind is the test automation pyramid. This model is a way of visualizing which aspects of the system under test should be automated the most.
Unit tests constitute the base of the pyramid and receive the most automation, while User Interface (UI) tests receive the least. There are three main metrics that collectively determine how much automation should be used for a given area of the system:
- How long it takes to write the test
- How long it takes to run the test
- How much effort is needed to maintain the test
Unit tests are relatively quick to write, extremely fast to run, and reliable; they should be updated as part of the development process.
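To make the contrast concrete, here is a minimal sketch of a unit test in Python with pytest. The function under test and its behavior are hypothetical examples, not drawn from any particular system:

```python
# A minimal unit test sketch with pytest; the function and test names
# are hypothetical illustrations.
import pytest

def apply_discount(price: float, percent: float) -> float:
    """Return the price after applying a percentage discount."""
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    return round(price * (1 - percent / 100), 2)

def test_apply_discount_happy_path():
    assert apply_discount(100.0, 25) == 75.0

def test_apply_discount_rejects_invalid_percent():
    with pytest.raises(ValueError):
        apply_discount(100.0, 150)
```

Tests like these run in milliseconds, need no browser or network, and break only when the underlying logic changes.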
On the other end of the spectrum, UI automation can take quite a while to write. Running tests that require WebDriver instances to open browsers and navigate web pages is slow, and UI tests are particularly brittle and require a lot of maintenance.
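Compare the unit test above with a minimal Selenium WebDriver sketch. The URL, element locators, and credentials below are placeholders, and a real suite would also need driver management, page objects, and retry logic:

```python
# A minimal UI test sketch using Selenium's Python bindings.
# The URL, locators, and credentials are hypothetical placeholders.
from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

def test_login_shows_dashboard():
    driver = webdriver.Chrome()  # requires a local Chrome/ChromeDriver setup
    try:
        driver.get("https://staging.example.com/login")
        driver.find_element(By.ID, "username").send_keys("test-user")
        driver.find_element(By.ID, "password").send_keys("not-a-real-password")
        driver.find_element(By.ID, "submit").click()
        # Explicit wait: timing sensitivity like this is a major source
        # of flakiness and maintenance cost in UI suites.
        WebDriverWait(driver, 10).until(
            EC.presence_of_element_located((By.ID, "dashboard"))
        )
    finally:
        driver.quit()
```

Even this toy example takes seconds to run and breaks whenever an ID or page flow changes, which illustrates why the pyramid puts UI tests at the narrow top.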
For these reasons, we encourage an abundance of automation at the unit and service layers, while UI automation should be reserved for the rarer occasions that justify its upfront and upkeep costs.
However, while we discourage being overly zealous with automation at the UI level, that doesn’t mean it shouldn’t be tested; this is where our traditional testers once again have a crucial role to play.
Where do testers test? Answer: everywhere
In our previous article, we talked about the benefits of release pipelines and the automated testing that gives us the confidence needed to use them. While these practices have been incredibly helpful in automating and streamlining the deployment process, there’s still often a need for manual verification.
As a result, most of these pipelines do not push directly to a production environment. The typical flow of a build from its creation to final deployment in production might look something like this: Dev -> Testing -> Staging -> Prod. Each of these stops serves as a manual gate where someone (usually a tester) must sign off on the build before allowing it to flow through to the higher environments.
Let’s take a look at how a new build might flow through this process.
A build is created by a developer and tested locally in a dev environment. At this stage, the developer is expected to execute unit tests and perform additional cursory testing before pushing to a testing environment. This deployment should be automated with the help of a build server (e.g., Jenkins) and include automated smoke tests that ensure the basic stability of each new build. Once the build is deployed to the test environment, the tester can run any automation that is not part of the pipeline and begin manual verification.
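A pipeline-stage smoke test does not have to be elaborate; it can be as small as a handful of HTTP checks against the freshly deployed environment. Here is a sketch with a hypothetical base URL and endpoints:

```python
# Sketch of a post-deployment smoke test a build server could run.
# The base URL and endpoint paths are hypothetical.
import sys
import requests

BASE_URL = "https://testing.example.com"
SMOKE_ENDPOINTS = ["/health", "/api/version", "/login"]

def main() -> int:
    for path in SMOKE_ENDPOINTS:
        resp = requests.get(BASE_URL + path, timeout=10)
        if resp.status_code != 200:
            print(f"SMOKE FAIL: {path} returned {resp.status_code}")
            return 1  # a non-zero exit code fails the pipeline stage
        print(f"SMOKE OK: {path}")
    return 0

if __name__ == "__main__":
    sys.exit(main())
```

Because the script’s exit code gates the pipeline, a failing endpoint stops the build from promoting any further, without a human having to watch it.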
These intermediate environments between the developer’s local environment and production are crucial to the pragmatic approach of blending manual and automated testing. The automated smoke tests run during deployment to the intermediate environments give confidence in general build stability, while the feature-specific manual testing gives confidence in the new feature itself. Should any of the automated tests fail during deployment, or should the tester find a bug in an intermediate environment, an issue is raised with the developer and the process starts again from the beginning.
Once the feature team tester has signed off on the release, the build can continue to a staging environment. Staging environments mirror production and, as the name implies, are used to stage changes before they finally go live. In blue/green deployment environments, the staging environment is the currently inactive server cluster. The release build is deployed to the inactive cluster, providing a final opportunity for manual testing before going live.
If everything continues to go smoothly in staging, the clusters are flipped, the inactive servers go live, and the previously-active production servers become inactive and ready for use as the next release’s staging environment.
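In a simple setup, the “flip” can be as small as retargeting a load balancer or DNS alias from one cluster to the other. The sketch below is purely conceptual: `set_live_cluster` is a hypothetical stand-in for whatever your platform actually provides (a DNS update, a load-balancer target swap, and so on):

```python
# Conceptual sketch of a blue/green cutover. `set_live_cluster` is a
# hypothetical placeholder for your platform's real routing mechanism.
CLUSTERS = ("blue", "green")

def set_live_cluster(name: str) -> None:
    # Hypothetical: in practice, call your load balancer or DNS
    # provider's API to route production traffic to `name`.
    print(f"Routing production traffic to the {name} cluster")

def flip(live: str) -> str:
    """Promote the staging cluster to live; the old live cluster
    becomes the staging environment for the next release."""
    staging = "green" if live == "blue" else "blue"
    set_live_cluster(staging)
    return staging

if __name__ == "__main__":
    current_live = flip("blue")  # green goes live; blue becomes staging
```

The appeal of this pattern is that rollback is the same operation in reverse: if production monitoring flags a problem, flipping back restores the previous release in seconds.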
Even after the build is live in production, the job of the tester is not yet finished. Final manual testing of the build and monitoring for any errors or outages should be standard post-release procedure.
Isn’t this risky?
Monitoring your system in production for outages and increased error rates is a critical component of deploying more frequently. While most of this article has explored the benefits of manually testing new features, the risk of not automating those tests should be acknowledged: any feature not covered by the automated regression suite could break with a future build. Production monitoring mitigates the risk of such issues going unnoticed.
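A basic version of this monitoring is a scheduled check that compares the current error rate against a threshold and raises an alert. Here is a sketch, assuming a hypothetical metrics endpoint and threshold; a real system would use its existing monitoring stack:

```python
# Sketch of a post-release error-rate check; the metrics endpoint and
# threshold are hypothetical stand-ins for a real monitoring stack.
import requests

METRICS_URL = "https://prod.example.com/metrics/summary"
ERROR_RATE_THRESHOLD = 0.02  # alert if more than 2% of requests fail

def check_error_rate() -> None:
    stats = requests.get(METRICS_URL, timeout=10).json()
    error_rate = stats["errors"] / max(stats["requests"], 1)
    if error_rate > ERROR_RATE_THRESHOLD:
        # In practice, page the on-call engineer via your alerting tool.
        print(f"ALERT: error rate {error_rate:.1%} exceeds threshold")
    else:
        print(f"OK: error rate {error_rate:.1%}")

if __name__ == "__main__":
    check_error_rate()
```

Run on a schedule (or triggered by each deployment), a check like this turns an unnoticed regression into a prompt alert, which is what makes shipping without exhaustive automation tolerable.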
This idea of increased risk might make some uncomfortable, but the infeasibility of full test coverage gets at the heart of what makes a good tester.
Testers have long understood that it’s impossible to exhaustively test a non-trivial system, and therefore the role of the tester is to analyze risk and come up with the most pragmatic test plan. There is inevitably a tradeoff between the risks of missing a bug and the time-to-market needs of the organization. Expecting to automate every test will cause unnecessary delays in the process and may even decrease the overall quality of the system.
Now more than ever
The skills testers have cultivated for decades regarding risk analysis and test coverage are needed now more than ever and should not be ignored in an attempt to automate everything. By extending those skills to the decisions about which tests to automate, it’s possible to create a test plan that pragmatically blends the best of both automated and manual testing and that mitigates potential risk with post-release monitoring.