AI has completely changed how we write code. A few years ago, people were still trying to type faster; there were even competitions about it. Now typing doesn’t matter. You can just ask an AI to write the code for you, and it will do it in seconds. The bottleneck is gone. We’re in a world of superhuman velocity.
But with that velocity comes superhuman risk. You can generate a ton of code that looks right without really knowing whether it does what you think it does, or whether it fits the architecture you wanted. So the question is no longer “how do we write faster?” It’s “how do we verify what’s written?”
What Code Review Means Now
At groundcover, we use AI all the time, both to write code and to review it. One of our engineers even built a utility that uses AI to create pull requests automatically. It reads the ticket, checks the Figma file, and builds a PR that already follows our conventions. The engineer just reviews the last few lines.
But you still have to be accountable for what gets shipped. And that’s the hard part. You can’t always read all the code anymore. Sometimes it’s not even practical. So you start thinking: Does this code matter? Is it on a critical path? Can it break something important?
That’s how we started treating code reviews. For simple libraries or utilities, maybe you don’t read the code at all. For something that runs inside the sensor, where performance and memory really matter, you have to check every line.
The Problem with Trusting AI
AI is great, but it’s not deterministic. It’s statistical. It can make things up. It can write logs that never existed or forget to rename a variable. So you can’t just trust it.
I don’t rely on logs written by AI, or even on AI-generated markdown docs or comments. If you want to use AI, you have to accept that it will fail sometimes. That means you need guardrails that don’t rely on the AI’s own output. Guardrails that are deterministic.
Tests are good for that. They don’t lie, most of the time. Linters are good too, but I think we need smarter ones: linters that can check for complexity, conventions, and even architectural patterns. Those are the things that matter when you have a lot of generated code.
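To make “smarter linter” a little less abstract, here’s a minimal sketch of a custom architectural check written with Go’s golang.org/x/tools/go/analysis framework. The rule it enforces (packages under internal/sensor must not import internal/ui) and the package paths are invented for illustration; they’re not our actual layout. The point is that a rule like this is deterministic: it doesn’t care whether a human or a model wrote the import.

```go
// archlint flags imports that cross a hypothetical architectural boundary:
// packages under internal/sensor must never depend on internal/ui.
package main

import (
	"strconv"
	"strings"

	"golang.org/x/tools/go/analysis"
	"golang.org/x/tools/go/analysis/singlechecker"
)

var Analyzer = &analysis.Analyzer{
	Name: "archlint",
	Doc:  "reports sensor packages that import UI packages",
	Run:  run,
}

func run(pass *analysis.Pass) (interface{}, error) {
	// Only apply the rule to packages on the "sensor" side of the boundary.
	if !strings.Contains(pass.Pkg.Path(), "internal/sensor") {
		return nil, nil
	}
	for _, file := range pass.Files {
		for _, imp := range file.Imports {
			path, err := strconv.Unquote(imp.Path.Value)
			if err != nil {
				continue
			}
			if strings.Contains(path, "internal/ui") {
				pass.Reportf(imp.Pos(), "sensor code must not import UI package %q", path)
			}
		}
	}
	return nil, nil
}

func main() { singlechecker.Main(Analyzer) }
```

Built with singlechecker, this compiles to a standalone binary you can run over a module (or wire into go vet with -vettool), so it slots into CI like any other deterministic check.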
Observability Tells the Truth
This is where observability becomes part of code review. We use eBPF, which runs inside the kernel, to see what the system actually does. It’s the ground truth. eBPF doesn’t depend on your code. It doesn’t care what logs you wrote. It just tells you what happened.
We even started instrumenting our test environments with it. You can run tests, collect the traces, and feed them back to the AI. It creates a loop based on reality. That’s how you make sure your system behaves as intended, not as described.
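As a sketch of what one turn of that loop could look like (the trace format, file name, and service names here are invented for the example, not our actual pipeline), assume the eBPF-based collector dumps the network calls it observed during the test run into traces.json. A plain Go test can then assert behavioral properties against what actually happened:

```go
package behavior

import (
	"encoding/json"
	"os"
	"testing"
)

// observedCall mirrors one record in a hypothetical trace dump produced by
// an eBPF-based collector during the test run. Field names are assumptions.
type observedCall struct {
	Source      string `json:"source"`      // workload that made the call
	Destination string `json:"destination"` // service it talked to
	Protocol    string `json:"protocol"`
}

// TestNoUnexpectedDependencies fails if the service talked to anything
// outside its declared dependency list while the test suite ran.
func TestNoUnexpectedDependencies(t *testing.T) {
	allowed := map[string]bool{
		"postgres": true,
		"redis":    true,
	}

	data, err := os.ReadFile("traces.json") // written by the collector; path is an assumption
	if err != nil {
		t.Fatalf("reading trace dump: %v", err)
	}

	var calls []observedCall
	if err := json.Unmarshal(data, &calls); err != nil {
		t.Fatalf("parsing trace dump: %v", err)
	}

	for _, c := range calls {
		if c.Source == "checkout-service" && !allowed[c.Destination] {
			t.Errorf("observed %s call from %s to %s, which is not a declared dependency",
				c.Protocol, c.Source, c.Destination)
		}
	}
}
```

The same dump is what you can hand back to the model as context: it describes what the system actually did, not what the code or its comments claim it did.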
Deterministic Guardrails for the AI Era
I think the next step for engineering is adding new stages to continuous integration (CI). We need verification that’s not only static but behavioral. Smarter linters. Deterministic truth checks. Things that make sure what we ship actually behaves as expected.
At the end of the day, when something breaks at 3 a.m., the AI won’t fix it for you. It’s still your responsibility. That’s why everything we build is focused on getting to that deterministic truth. eBPF gives us the foundation for it.
AI can help us move faster than ever. But if we want to trust what we ship, we have to make verification deterministic.

