All Things Perf Episode 2: Performance engineering and testing for micro services
6:35 mins
This is the second in a series of podcasts aimed at sharing the tools, technologies, and techniques adopted by the Predix Application Services Engineering Performance Engineering team with a wider audience. If you missed the first episode, you can find it on the Predix resources page as an audio file and blog post. This second episode discusses how we use performance testing as a checkpoint in the CI/CD pipeline. Please listen to or read it and provide your feedback on the content and on future topics of interest.
Introduction
Hello and welcome to the second edition of the “All Things Perf” podcast. I am Siva Balan, Sr. Staff Performance Engineer with the Predix Application Services Engineering team based out of the Software CoE in San Ramon, CA. This podcast series will take a stab at explaining how we approach performance engineering and testing of the various micro services being developed as part of Predix.
In the first edition of the podcast, we discussed technology choices for performance engineering and testing of Predix micro services. If you have not heard that yet, I would highly encourage you to listen to it. This second episode in the series will focus on how we use performance testing as a checkpoint for a particular micro service in the Predix Application Services suite. We will discuss how we introduce non-functional testing as a requirement in the CI/CD pipeline for a build to be promoted to Production deployment.
The steps in CI/CD pipeline
The CI/CD process consists of two main components: Continuous Integration and Continuous Deployment. First comes the CI process, which kicks in as soon as a developer checks in a functional piece of code to a branch that a CI tool like Jenkins or Atlassian Bamboo is monitoring for new check-ins. In our case, we use Jenkins as the CI tool of choice. As a first step, Jenkins checks out the code from the source control repository, which in our case is GitHub, and builds it. It then runs static code analysis, unit tests, and code coverage reports, and if everything passes, it marks the build as successful and pushes it to Artifactory.
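To make that concrete, here is a minimal sketch of the shell steps such a Jenkins job might run. The Maven build, JaCoCo coverage, and SonarQube analysis shown here are assumptions chosen for the example, not confirmed tooling:

    # Sketch of a Jenkins CI build step, assuming a Maven-based Java
    # micro service with the JaCoCo and SonarQube plugins configured.
    set -e                   # abort the job on the first failing step

    mvn clean verify         # compile and run unit tests
    mvn jacoco:report        # generate the code coverage report
    mvn sonar:sonar          # run static code analysis via SonarQube
    mvn deploy -DskipTests   # publish the successful build to Artifactory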
The next step in the process is the integration tests. Jenkins runs the integration tests against the artifact that was pushed to Artifactory in the previous step, making sure all the tests are successful and the build is ready to go through non-functional testing. Performance testing acts as a bridge between CI and CD: only if the non-functional tests are successful will the build be promoted to the CD pipeline that delivers it to Production. We will talk about the CD pipeline in another episode of the podcast; here we will focus on how we use performance testing as the deciding factor for a build to be deployed to Production.
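As a hedged sketch, fetching the artifact under test from Artifactory could be as simple as the following; the repository URL and artifact coordinates are hypothetical placeholders:

    # Download the build that passed unit and integration tests.
    # URL, app name, and coordinates are hypothetical, for illustration only.
    ARTIFACTORY_URL="https://artifactory.example.com/libs-release-local"
    APP="my-microservice"            # hypothetical app name
    VERSION="${BUILD_VERSION}"       # assumed to be injected by the Jenkins job

    curl -fSL -o "${APP}-${VERSION}.jar" \
      "${ARTIFACTORY_URL}/com/example/${APP}/${VERSION}/${APP}-${VERSION}.jar"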
Performance testing process
Once the build has passed the integration tests, it is ready to go through non-functional testing. This involves deploying the micro service to the Cloud Foundry environment, binding it to any database, messaging, and caching services as well as to logging and monitoring services, and then testing it against the non-functional requirements. We also need to load test data before the actual tests can be run. Once the environment is ready for performance testing, we typically run 3 types of tests. The first is a Capacity test to determine that the application performs optimally under a given workload on a single JVM instance. The second is a Scalability test to determine how the application scales across multiple JVM instances as the workload increases. The third and last is an Endurance test that can run for several hours or more to identify any resource leaks. Ideally, we want to run all three tests for every build and make sure all SLAs are satisfied before it is deployed to Production. So let's now see how we can automate this process of performance testing a build and marking it as Production ready.
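Before we get into the automation, here is a sketch of what kicking off those three test types might look like, assuming JMeter as the load driver (substitute whichever tool you settled on in Episode 1); the test plans and property names are hypothetical:

    # Hypothetical non-GUI JMeter runs for the three test types.
    jmeter -n -t capacity.jmx    -l capacity.jtl                      # single JVM instance
    jmeter -n -t scalability.jmx -l scalability.jtl -Jinstances=4     # scaled out
    jmeter -n -t endurance.jmx   -l endurance.jtl   -Jduration=28800  # 8-hour soak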
Test automation
With a combination of Jenkins and some shell scripting, it is possible to automate the performance testing process. Once the integration tests are done and a successful build is pushed to Artifactory, that artifact will be used for performance testing. A Jenkins job will first download that build from Artifactory and push it to a performance testing space in Cloud Foundry with a custom manifest.yml file tailored specifically for non-functional testing. Then comes the process of binding this pushed micro service to specific database, messaging, and caching services and loading performance test data as necessary. The micro service will also be bound to performance-test-specific logging and monitoring services. As discussed in the previous episode, we will be using the ELK stack for the logging service and New Relic for the monitoring service.
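Using the standard Cloud Foundry CLI, those deployment steps might look roughly like this; the org, space, app, and service instance names below are assumptions for illustration:

    # Target the performance testing space (org/space names assumed).
    cf target -o predix-org -s perf-testing

    # Push with a perf-specific manifest, but don't start yet.
    cf push my-microservice -f manifest-perf.yml --no-start

    # Bind the data and observability services (instance names assumed).
    cf bind-service my-microservice perf-postgres     # database
    cf bind-service my-microservice perf-rabbitmq     # messaging
    cf bind-service my-microservice perf-redis        # caching
    cf bind-service my-microservice perf-elk          # logging (ELK)
    cf bind-service my-microservice perf-newrelic     # monitoring (New Relic)

    cf start my-microservice
    ./load_test_data.sh    # hypothetical script that loads the test data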
One thing to keep in mind is that at the start of every performance test, the database, messaging, and caching services will not be re-used from previous tests, as we will be loading fresh test data for every test. The logging and monitoring services, however, will be re-used, as we want to maintain a history of logs and resource utilization metrics for comparison purposes. So when we bind services at the start of a test, we need to pay attention to which services need to be recreated and which need to be re-used from previous tests.
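A hedged sketch of that recreate-versus-reuse step, again with assumed service instance names and plans:

    # Recreate the data services so every run starts from fresh test data;
    # service offerings, plans, and instance names are hypothetical.
    for svc in perf-postgres perf-rabbitmq perf-redis; do
      cf unbind-service my-microservice "$svc" || true   # ignore if not bound
      cf delete-service -f "$svc"
    done
    cf create-service postgresql shared   perf-postgres
    cf create-service rabbitmq   standard perf-rabbitmq
    cf create-service redis      shared   perf-redis

    # perf-elk and perf-newrelic are deliberately left untouched so the
    # log and metric history carries over between runs.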
Once the environment is prepared, we are ready to start the tests. We can either run all 3 types of tests in parallel or one after the other. Given that we are running in a highly scalable Cloud Foundry environment, we should be able to run all 3 types of tests in parallel. We will use Jenkins to kick off the tests and monitor them. Once the tests are done, custom scripts will evaluate the SLAs of each test; if all SLAs are satisfied, we mark the tests as passed and the build as ready for Production deployment. If even one of the 3 tests fails its SLA, the build will be marked as failed and will require further evaluation.
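Those custom scripts could be as simple as the following sketch, where the p95 latency threshold and the extract_p95.sh helper are hypothetical stand-ins for your own SLAs and result parsing:

    # Hypothetical SLA gate: exit non-zero so Jenkins fails the job
    # if any of the three tests violates its SLA.
    P95_SLA_MS=500    # assumed 95th-percentile response time limit
    status=0
    for results in capacity.jtl scalability.jtl endurance.jtl; do
      # hypothetical helper that computes p95 latency in ms from a result file
      p95=$(./extract_p95.sh "$results")
      if [ "$p95" -gt "$P95_SLA_MS" ]; then
        echo "SLA FAILED: $results p95=${p95}ms (limit ${P95_SLA_MS}ms)"
        status=1
      fi
    done
    exit $status    # non-zero marks the build as not Production ready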
This process ensures that only well-tested builds are pushed to Production; a build that fails any of the non-functional tests will not be deployed. The CD process will not deploy a build to Production unless the non-functional test Jenkins job marks the build as successful.
Conclusion
I hope that gave you some ideas on how non-functional testing can be used as a checkpoint in your CI/CD pipeline when deploying your apps or micro services in Cloud Foundry. There are different ways to do this, but the key takeaway is that your apps or micro services should never be deployed to Production unless they have gone through rigorous non-functional testing. Making non-functional testing part of your CI/CD pipeline will save you a lot of nights and weekends on call and on pager duty.
Thanks for listening and I hope you all enjoyed this podcast episode. We would love to have your feedback on this podcast and your suggestions on topics in performance engineering you would like to hear more about. You can reach me at balan@ge.com with your comments and suggestions. Until next time, this is Siva Balan, signing off from San Ramon, CA. Thank you.