My earlier blog, “From Laboratories to DevOps: Factories for Continuous Testing,” discussed the importance of, and suggested practices for, a fully automated, orchestrated software DevOps laboratory infrastructure, akin to a modern factory staffed by robots churning out products at a dizzying pace. That blog applies to the development of software for networks, and its practices are increasingly important as networks become more software-based thanks to software-defined networking (SDN), network functions virtualization infrastructure (NFVI) and cloud-based networking. The edges of mobile networks, even the radio access networks, are becoming more software-dependent due to the need to position computing resources closer to mobile customers to resolve latency, bandwidth and spectrum concerns.
However, the need for highly automated software processes does not stop within the development lab.
At a recent DevOps Networking event in Santa Clara, California, Google Network Operations Manager Anees Shaikh indicated that Google’s live network has many thousands of nodes and management elements, supported by millions of lines of software-based configuration files that are updated tens of thousands of times per month. Software updates at that volume and speed cannot be handled with manual methods on large networks. Automation tools for live network software updates are vital.
Typical live network DevOps environments use continuous tool chains such as Git for software version management, YAML for abstract configuration definitions, Jinja2 templates to translate those definitions into concrete configurations that real devices can consume, and Ansible to automate the distribution and activation of config files on the network nodes. But what about testing? How can a network operator be confident that the many updates will perform as expected once deployed across the live network? This requires much more than simple syntax and semantics checking of the config file entries.
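To make the tool chain concrete, here is a minimal sketch of the "abstract definition to concrete config" step, followed by the kind of simple check that the paragraph above says is not enough on its own. It is an illustration under stated assumptions, not the tooling any particular operator uses: the dictionary stands in for data loaded from a YAML file, Python's standard `string.Template` stands in for a Jinja2 template, and the hostname, interface names, addresses and function names are all hypothetical.

```python
import ipaddress
from string import Template

# Hypothetical abstract definition. In the tool chain described above this
# would be loaded from a YAML file kept under Git version control.
ABSTRACT_CONFIG = {
    "hostname": "edge-router-1",
    "interfaces": [
        {"name": "eth0", "address": "192.0.2.1/30"},
        {"name": "eth1", "address": "198.51.100.1/24"},
    ],
}

# string.Template is a standard-library stand-in for a Jinja2 template.
INTERFACE_TEMPLATE = Template("interface $name\n ip address $address\n")


def render_config(abstract):
    """Turn the abstract definition into concrete device config text."""
    parts = ["hostname {}\n".format(abstract["hostname"])]
    for iface in abstract["interfaces"]:
        parts.append(INTERFACE_TEMPLATE.substitute(iface))
    return "".join(parts)


def check_config(abstract):
    """Return a list of problems: invalid addresses or overlapping subnets."""
    problems = []
    seen = []  # (interface name, network) pairs that parsed cleanly
    for iface in abstract["interfaces"]:
        try:
            network = ipaddress.ip_interface(iface["address"]).network
        except ValueError:
            problems.append("{}: invalid address".format(iface["name"]))
            continue
        for other_name, other_network in seen:
            if network.overlaps(other_network):
                problems.append(
                    "{} overlaps {}".format(iface["name"], other_name)
                )
        seen.append((iface["name"], network))
    return problems


if __name__ == "__main__":
    print(render_config(ABSTRACT_CONFIG))
    print(check_config(ABSTRACT_CONFIG))
```

A check like this catches malformed entries before they reach a device, but it says nothing about how the network behaves once the config is active, which is where the practices below come in.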
Recommended Practices for Continuous Testing of Live Network DevOps Environments
For continuous testing of live network DevOps environments to work, consider the following best practices:
- Lab-to-live transition: Before any software updates are deployed to the live network, they should be tested in a lab environment using production-equivalent equipment and topologies.
- What to test? The actual tests to be executed depend on the risk of changes that are being deployed. If a small, low-risk configuration change is all that is being deployed, then a simple configuration check may be sufficient. On the other hand, if a change has the potential to disrupt the network performance significantly, then a much more comprehensive performance test is needed.
- Test orchestration tools: The test environment prior to deployment will vary depending on the tests that are selected. Test environment orchestration tools are needed to rapidly set up network test topologies that match the selected tests, and to release them just as rapidly so the test resources can be used for the next tests.
- Elastic on-demand topologies: The ability to scale test resources horizontally and vertically, on demand, is essential to keep testing in step with short-duration deployment cycles.
- Automate live network tests during deployment: To verify software updates do not interrupt or degrade network performance, functional and performance tests between pairs of nodes across the network mesh are required as the updates are deployed in stages across the live network nodes, clusters, centers and regions.
- Automate live network testing after deployment: After verifying that the deployments are successful, network tests need to be scheduled periodically to ensure latent problems with the new software do not impact the network.
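Two of the practices above, choosing tests by risk and testing between pairs of nodes as updates roll out in stages, can be sketched as follows. This is only an outline under assumptions: the risk levels, suite names, and the `deploy` and `probe` callables are hypothetical placeholders for whatever deployment and measurement tooling an operator actually has, and probing every pair after every stage is the simplest possible policy rather than a recommendation for a large mesh.

```python
from itertools import combinations

# Hypothetical mapping from the risk of a change to the test suites to run.
TESTS_BY_RISK = {
    "low": ["config_check"],
    "high": ["config_check", "functional", "performance"],
}


def select_tests(risk):
    """Pick the test suites for a change based on its risk level."""
    return TESTS_BY_RISK[risk]


def staged_rollout(stages, deploy, probe):
    """Deploy stage by stage, probing every node pair after each stage.

    stages: list of lists of node names, in deployment order.
    deploy: callable(node) that applies the update to one node.
    probe:  callable(node_a, node_b) returning True if the pair is healthy.

    Returns (stages_passed, failed_pair). failed_pair is None when every
    stage passed; otherwise the rollout stops at the first failing pair.
    """
    all_nodes = [node for stage in stages for node in stage]
    for index, stage in enumerate(stages):
        for node in stage:
            deploy(node)
        # Check the whole mesh, since an update can break paths between
        # updated nodes and nodes that have not been updated yet.
        for node_a, node_b in combinations(all_nodes, 2):
            if not probe(node_a, node_b):
                return index, (node_a, node_b)
    return len(stages), None


if __name__ == "__main__":
    deployed = []
    result = staged_rollout(
        stages=[["node-a"], ["node-b", "node-c"]],
        deploy=deployed.append,
        probe=lambda a, b: True,  # stand-in for a real measurement
    )
    print(select_tests("high"), deployed, result)
```

Stopping at the first failing pair keeps a bad update from spreading beyond the stage that introduced it. The same `probe` could be rerun on a schedule after deployment to look for latent problems.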
This is just a partial list of the essentials for continuous testing in live network DevOps.