In my recent blog post, Revolutionizing the Nine Pillars of SRE with AI-Engineered Tools, I indicated that AI could assist in shifting the wisdom of production to the left. In this post, I’ll explain in more detail how AI-engineered tools can be used to improve the pillar that I call SRE culture.
AI Use Cases for Shifting SRE Culture Left
Shifting the “wisdom of production” to the left is all about bringing production considerations into the development process as early as possible. Here are some use cases where AI can help facilitate this shift, along with specific challenges and AI tools that can assist.
Predicting System Behavior: Predicting system behavior under various conditions can be difficult, especially as systems grow more complex. AI can help by analyzing historical data and predicting system behavior. Tools like Splunk can use machine learning to provide these predictions and facilitate earlier decision-making.
Automated Code Reviews: Manually reviewing code for potential production issues can be time-consuming and error-prone. AI can automate part of this process, flagging potential issues more quickly and consistently. Tools like DeepCode (now part of Snyk) or Codota (now Tabnine) use AI to analyze code and identify potential issues.
Security Vulnerability Detection: Identifying security vulnerabilities early in the development process can prevent costly and damaging breaches in production. AI can analyze code to identify potential vulnerabilities that might be missed by human reviewers. Tools like Snyk or WhiteSource (now Mend) can help with this.
Performance Optimization: Predicting how code changes will impact performance can be challenging. AI can analyze changes and predict their impact on system performance. Tools like Akamas leverage AI to automate the performance optimization process.
Dependency Analysis: Identifying how changes in one part of a system will impact other parts can be difficult. AI can analyze system dependencies and predict the impact of changes. Tools like FOSSA can help automate dependency analysis and identify potential issues.
Resource Allocation: Predicting resource needs for new features or changes can help prevent resource shortages in production. AI can analyze resource usage patterns and predict future needs. Tools like Turbonomic or Densify use AI to optimize resource allocation.
User Experience Optimization: Predicting how changes will impact the user experience can help prevent negative impacts in production. AI can analyze user behavior and predict the impact of changes on user experience. Tools like FullStory or Crazy Egg can provide AI-driven insights into user behavior.
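To make the prediction idea behind several of these use cases concrete, here is a minimal sketch of forecasting resource needs from historical usage. It is an illustration only and does not represent how Splunk, Turbonomic, Densify or any other tool named above works: it assumes evenly spaced usage samples and a simple linear trend, and the function names fit_linear_trend and periods_until_limit are hypothetical.

```python
from typing import Optional, Sequence, Tuple


def fit_linear_trend(values: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of evenly spaced samples against their index.

    Returns (slope, intercept). Assumption: one sample per period.
    """
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples to fit a trend")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    sxx = sum((x - mean_x) ** 2 for x in range(n))
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def periods_until_limit(values: Sequence[float], limit: float) -> Optional[float]:
    """Estimate how many future periods remain before the trend reaches limit.

    Returns None when usage is flat or falling, and 0.0 when the fitted
    trend has already reached the limit at the latest sample.
    """
    slope, intercept = fit_linear_trend(values)
    if slope <= 0:
        return None
    last_index = len(values) - 1
    crossing_index = (limit - intercept) / slope
    return max(0.0, crossing_index - last_index)


if __name__ == "__main__":
    # Hypothetical weekly disk usage in percent for one service.
    weekly_usage = [50, 52, 54, 56, 58]
    print(periods_until_limit(weekly_usage, limit=70))
```

A real tool would use far richer models and data, but even a trend line like this can surface a capacity question during design review rather than after an incident in production.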
Roadmap to SRE Wisdom
A practical implementation roadmap for introducing AI tools to help shift wisdom to the left may include the following steps.
Identify Needs and Goals: Start by identifying the areas where shifting wisdom to the left could have the most impact. This might include areas where issues are frequently discovered late in the development process or where manual processes are time-consuming or error-prone. Define clear goals for what you hope to achieve by introducing AI in these areas.
Research Tools: Once you’ve identified your needs and goals, research the AI tools available that could help meet those needs. Consider factors like the tool’s capabilities, integration with your existing systems, cost and support options.
Evaluate Tools: Based on your research, select a few tools to evaluate more thoroughly. This could involve setting up trials or demos, speaking with the tool’s developers or other users and assessing how well the tool meets your needs.
Pilot Implementation: Once you’ve selected a tool, start with a pilot implementation on a small scale. This will allow you to learn how to use the tool, identify any potential issues or obstacles and understand the tool’s impact before rolling it out more broadly.
Training: Provide training for your team on the new tool. This could include formal training sessions as well as ongoing support as they start using the tool.
Monitor and Adjust: After the tool has been implemented, monitor its use and impact. This could involve tracking metrics related to your initial goals, as well as soliciting feedback from your team. Use this information to make any necessary adjustments to your use of the tool.
Expand Use: Once you’re satisfied with the results of your pilot implementation, expand the use of the tool to other areas of your organization. Continue to monitor its use and impact and make any necessary adjustments.
Repeat: Technology is constantly evolving, and there may be new tools or features that could further improve your ability to shift wisdom to the left. Periodically revisit your needs and goals, and repeat this process to continuously improve your processes.
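The pilot and monitoring steps above depend on comparing a metric against the goal defined in the first step. As one possible sketch, the code below uses a defect escape rate (the share of defects first found in production) as that metric. The metric choice, the DefectCounts class, the pilot_met_goal function and the sample numbers are all assumptions made for illustration, not something prescribed by the roadmap.

```python
from dataclasses import dataclass


@dataclass
class DefectCounts:
    """Defects found in one measurement window (hypothetical metric)."""
    found_before_release: int
    found_in_production: int

    @property
    def escape_rate(self) -> float:
        """Share of all defects that were first found in production."""
        total = self.found_before_release + self.found_in_production
        return self.found_in_production / total if total else 0.0


def pilot_met_goal(baseline: DefectCounts, pilot: DefectCounts,
                   target_reduction: float) -> bool:
    """True if the pilot cut the escape rate by at least target_reduction.

    target_reduction is relative, e.g. 0.3 means a 30% lower escape rate.
    """
    if baseline.escape_rate == 0:
        # Nothing escaped before the pilot; the goal holds if that is still true.
        return pilot.escape_rate == 0
    reduction = (baseline.escape_rate - pilot.escape_rate) / baseline.escape_rate
    return reduction >= target_reduction


if __name__ == "__main__":
    # Made-up numbers for one team, before and during the pilot.
    baseline = DefectCounts(found_before_release=60, found_in_production=40)
    pilot = DefectCounts(found_before_release=80, found_in_production=20)
    print(pilot_met_goal(baseline, pilot, target_reduction=0.3))
```

Whatever metric you choose, recording the baseline before the pilot starts is what makes the later decision to adjust or expand defensible.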
Summary
Shifting the wisdom of production to the left is an essential strategy for modern IT organizations, but it’s a journey that requires thoughtful planning, the right tools and a commitment to continuous learning and improvement. As we’ve explored, there are several compelling use cases for leveraging AI in this process. Whether it’s predicting system behavior, automating code reviews, detecting security vulnerabilities, optimizing performance, analyzing dependencies, managing resources or enhancing the user experience, AI tools are helping organizations tackle these challenges more efficiently and effectively.
While embarking on this journey may seem daunting, a well-crafted implementation roadmap can help make the process manageable and rewarding. By identifying specific needs and goals, researching and evaluating the right tools, piloting their implementation, providing comprehensive training and consistently monitoring and adjusting their usage, organizations can successfully adopt AI tools to shift the wisdom of production to the left. With continued effort and commitment, this transformation can lead to more informed decision-making, more efficient processes and, ultimately, better outcomes in production.