As we close out 2019, we at staging-devopsy.kinsta.cloud wanted to highlight the five most popular articles of the year. Following is the second in our weeklong series of the Best of 2019.
Burnout is a recurring theme at DevOpsDays events around the world. The modern developer faces rapidly evolving challenges: the agile movement shortening development cycles, the DevOps movement shifting production operational responsibilities into development processes, and the complexity of the cloud offering new architectures from SaaS through to FaaS, as well as microservices.
As these shifts occur, the operational skills needed to support an application running in production are often ignored. Developers are expected to support production and be on call 24/7, but are IT leaders giving them the necessary tools and training? More and more, we need developers to be prepared and able to oscillate seamlessly between focused specialist and broad generalist. This article highlights a few key areas where software engineers need to bolster their skill sets to remain agile and effective in building applications.
DevOps and SRE Assume Broader Skills
Non-developers may interpret agile computing as developers creating a world where they get to code rather than document, follow process or deliver to plan. In reality, agile took the lengthy cycle of waterfall releases and replaced it with the frenetic continuous delivery of valuable software. DevOps was born from the need to deliver application updates rapidly and regularly.
The monitor phase of the DevOps infinite loop originally focused on ensuring that application problems could be detected and diagnosed. Moving from detection to prevention, DevOps practitioners had to shift their testing and quality processes. Developers were asked to code the application while also creating the testing systems to validate their code and building the production executables. Progress here led to additional responsibilities. As development moved to agile and DevOps, infrastructure transformed into software-defined infrastructure (SDI).
The separation of the operating system from the server hardware through virtual machines started the journey to SDI. Application containers removed the OS requirement and added cross-server mobility. Kubernetes and Docker Swarm allowed developers to encode application requirements in the recursively titled YAML. After containers, microservices are a further refinement, isolating each action into a separate piece of code, which is well suited to the workload-based pricing of cloud environments. SDI’s capabilities bring us to where infrastructure definitions require a software development lifecycle (SDLC) of their own.
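To make this concrete, here is a minimal sketch, with entirely hypothetical names and values, of how a Kubernetes Deployment encodes an application's runtime requirements in YAML:

```yaml
# Hypothetical Kubernetes Deployment fragment: the application's runtime
# requirements are declared in YAML rather than configured by hand on a server.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: orders-service            # illustrative name
spec:
  replicas: 3                     # desired instance count
  selector:
    matchLabels:
      app: orders-service
  template:
    metadata:
      labels:
        app: orders-service
    spec:
      containers:
      - name: orders
        image: example.com/orders:1.4.2   # placeholder image
        resources:
          requests:
            cpu: "250m"           # scheduling requirement
            memory: "256Mi"
          limits:
            cpu: "500m"           # hard cap enforced at runtime
            memory: "512Mi"
```

Definitions like this are exactly the kind of artifact that benefits from its own SDLC: version control, peer review and testing before they reach production.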
The outcome of all this is a massive increase in complexity that developers must consider as they code components of an application. It requires additional consideration of the infrastructure and business environment the application will run in. Developers can write the best YAML definition of their new application’s needs, but that does not mean the production infrastructure will be able to provide a matching environment.
Skills Developers Must Enhance
Here are some key areas where developers would benefit from honing their skills in order to stay relevant amid the growing complexity described above.
Networking
The single purpose of computer networking is to move data from the memory of one computer into the memory of a different computer. With TCP/IP as the default network stack, the flexibility it was designed for comes with a lot of variability in configuration. At the most basic level, network components must be configured to identify how many bits of an address to use when searching for another network entity via Classless Inter-Domain Routing (CIDR).
Software engineers need networking skills to understand how their software definitions may conflict with the eventual production network infrastructure. For instance, network definitions in the YAML descriptors for Kubernetes/Docker are generally set with a /nn configuration (e.g. /16). Conflicts arise when the pod addresses overlap with physical server addresses, or when development pods and production pods are mismatched. If your pods-per-node count differs between pre-production and production environments, the CIDR settings may differ as well. A colleague learned this the hard way when a /8 was used where a /16 was intended, leading to the largest production outage of their career.
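The difference between a /8 and a /16 is easy to underestimate. A short sketch using Python's standard `ipaddress` module (the address ranges are hypothetical) shows the size gap and how a too-broad pod range can swallow a server subnet:

```python
import ipaddress

# Pod network as it might be declared in a cluster definition (hypothetical).
pod_cidr_intended = ipaddress.ip_network("10.0.0.0/16")  # what was meant
pod_cidr_mistake = ipaddress.ip_network("10.0.0.0/8")    # what was typed

# Physical server subnet from a hypothetical addressing plan.
node_subnet = ipaddress.ip_network("10.200.0.0/24")

# A /16 provides 2^16 addresses; a /8 provides 2^24.
print(pod_cidr_intended.num_addresses)          # 65536
print(pod_cidr_mistake.num_addresses)           # 16777216

# The /8 pod range overlaps the node subnet; the /16 does not.
print(pod_cidr_mistake.overlaps(node_subnet))   # True
print(pod_cidr_intended.overlaps(node_subnet))  # False
```

An overlap check like this in a CI pipeline is a cheap guard against exactly the class of outage described above.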
In addition to these basics of network addressing, there are concepts such as micro-segmentation needing definitions of performance and connection requirements that span across containers, virtual machines and bare metal servers.
Security and Segmentation
Historically, applications have relied on production infrastructures with their firewalls and DMZs for protection from the outside world. That castle moat approach is insufficient for today’s threat landscape where applications live everywhere. Not only do enterprises need to protect against outside connections and attacks, they also need to prevent accidental connections from production applications to development, QA and testing environments. Crossing the streams can cause exposure of production data in test environments and, even worse, corruption of production data.
Ensuring that application components only communicate with specific services in specific environments is addressed with micro-segmentation and its ability to manage and monitor secure code-to-code communication. Control at this level of detail comes with its own challenges, as incorrect or insufficient definitions could leave a production application unable to communicate with needed APIs or microservices. Diagnosis in micro-segmented environments is made harder by the possibility that traffic for other transactions or applications may be crossing the same environment successfully. This is when you may hear the classic “it works fine in test” response.
Micro-segmentation offers significant security benefits yet requires developers to clearly identify the required connections and components of any communication. The best time to note this is while coding, making this an important skill for the modern developer.
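As one illustration of micro-segmentation in practice, a Kubernetes NetworkPolicy (a sketch, with hypothetical names throughout) can declare exactly which components may talk to a service, and on which port:

```yaml
# Hypothetical NetworkPolicy: only pods labelled app=orders may reach the
# payments service, and only on TCP port 8443. All other ingress is denied.
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: payments-allow-orders     # illustrative name
  namespace: prod
spec:
  podSelector:
    matchLabels:
      app: payments               # the protected service
  policyTypes:
  - Ingress
  ingress:
  - from:
    - podSelector:
        matchLabels:
          app: orders             # the only permitted caller
    ports:
    - protocol: TCP
      port: 8443
```

Omit the caller selector or the port and the payments service becomes unreachable from its legitimate clients, which is precisely the "works fine in test" failure mode described above. Capturing these connection requirements while the code is being written is far cheaper than reverse-engineering them later.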
Dev and Ops Synchronicity
The synchronous arrival of DevOps and SDI provides a mutual opportunity for developer and operations teams. While developers need to learn production on-call skills and operations teams need to learn SDLC skills, both teams also need to remain current in security concepts. Some organizations tackle DevOps by creating merged teams, though members may continue to stay in their lane. Whether merging teams, or maintaining separation, it is beneficial to hold brown bag sessions to share information and help educate each other. Ask your operations peers to talk about their three most common production issues and the three most popular tools they use.
An important aspect of supporting the application on-call generalist is having the right tools to diagnose production outages and slowdowns effectively. Whether developed as an in-house solution (with the attendant technical debt) or selected as a product, you will need something that can see end to end and allow the code-centric software engineer to drill into databases, storage, networks and code components from an application point of view, without having to learn siloed management and monitoring tools for each aspect of the infrastructure.
While individuals may enter a DevOps transformation as I-shaped specialists, they should leave as T-shaped engineers, enhancing their value to their current employers and advancing their careers.