Human Code Reviews Are Dead
How Top Engineers Review Code in the Age of AI
🚀 Announcement: I go all-in on YouTube
Big news! I’m launching my YouTube channel.
I’ll share practical content about AI, software engineering, testing, and building high-quality software.
The goal is simple: help developers become better engineers in the AI era.
I’m already wrapping up my first video about Claude Code best practices.
Subscribe by clicking here if you want to become pro in AI and coding.
Motivation
Most teams struggle with code reviews. Before AI, most engineering teams I worked with could barely keep up with the big numbers of PRs generated by developers.
Now AI makes it 10x worse. Code is being generated by AI faster than humans can realistically review it line by line.
The rise of AI is forcing us to build more resilient engineering systems instead of relying on humans manually catching problems in PRs.
In this article, I’ll break down why AI changes everything, and how teams can handle code reviews at massive scale without slowing down development.
AI Changes The Game

The benefit of AI is that it allows us to write more code, but it increases the bottleneck with reviewing code.
GitHub recently shared they originally planned for 10x growth during the vibe-coding era. A few months later, they realised they actually needed to prepare for 30x growth.
AI agents are massively increasing the number of PRs, commits, and overall code output across companies. Humans simply can’t keep up with it. Neither in terms of time nor cognitive load.
Human code reviews are becoming broken and no longer a viable option in the age of AI.
What can we do then?!
Review Intent, Not Code
Code reasoning becomes more valuable than code generation.
As AI takes over the HOW, engineers need to focus on the WHAT and, more importantly, the WHY.
After years of using AI in real engineering, one thing is clear:
AI is excellent for 80% of the work.
But it’s terrible for the rest 20%.
Why? Because agents don’t have taste.
They’re faster than us. But they need navigation. You have to tell them what to build. You need to ask the right questions.
Raw capability without direction produces mediocre software.
AI won’t replace great engineers. Great AI-assisted work is defined by the human guiding agents with intents and strong critical judgement.
How to Do Code Reviews
Layer 1: Compare Solutions
The biggest mistake developers make with AI agents is treating them like junior engineers, telling them exactly how to build software. We’re so used to controlling every detail that we forget AI often finds better solutions.
A better approach: ask AI for multiple options before coding. Push it to compare tradeoffs, explain pros and cons, or even test different approaches and rank them by verification results, testability, smallest diff, or fewer dependencies.
Writing code is becoming 100x cheaper, which gives us the freedom to explore multiple solutions before choosing the best one.
Layer 2: Acceptance Test Driven Development
Test-Driven Development was one of the best ways to build software before AI.
In the age of AI, Acceptance Test-Driven Development (ATDD) becomes the gold standard.
Why? Because software development is shifting from implementation to intent.
ATDD captures requirements and business behaviour upfront through executable tests. Instead of writing massive specs and reinventing waterfall, teams define specs in well-defined automated tests, then let AI implement incrementally, and validate behaviour continuously.
The future isn’t humans reviewing code diffs. It’s humans reviewing behaviour through automated tests while AI handles the implementation.
Layer 3: Deterministic Guardrails
Deterministic guardrails are automated mechanisms that enforce engineering standards consistently across a codebase.
The goal is simple: reduce reliance on AI & human discipline by encoding standards directly into the system.
Here are all the guardrails you can use in your app:
Architecture tests verify structural rules like dependency direction, layering, coupling, and naming conventions.
Build-time boundaries go one step further by making invalid dependencies impossible to compile through module/package separation and visibility rules.
Static code analysis detects code smells, complexity, security risks, dead code, and dangerous patterns automatically.
Formatting & style enforcement removes subjective code review discussions through automatic formatting and consistent conventions.
Linting rules prevent problematic coding patterns such as hidden side effects, unsafe async usage, or unused dependencies.
Type-system constraints reduce entire categories of bugs using strong typing, non-nullability, immutability, and exhaustive matching.
Test guardrails enforce standards like minimum coverage thresholds, mutation testing requirements and invariant tests.
Snapshot, approval, and contract testing help protect system behavior and integration boundaries over time.
Dependency guardrails prevent vulnerable, unapproved, or outdated libraries from entering the system.
CI/CD enforcement ensures every change passes automated validation before merge or deployment.
Pro tip: If you don’t know how to start, just copy this section, give it to your LLM, and ask it to provide an implementation strategy. It will make a hell of a difference.
Layer 4: AI Review Tools - CodeRabbit
As code volume explodes, manual reviews no longer scale.
Tools like CodeRabbit automatically review pull requests, detect issues, and provide instant feedback at massive scale.
Unlike humans, AI reviewers never get tired, review every PR consistently and scale with code volume
CodeRabbit has already become one of the dominant AI review tools, used in over 1M repositories and trusted by organizations like the Linux Foundation and MIT.
Their latest Slack Native Agent integration makes AI reviewers more like engineering teammates that developers can directly collaborate with, ask questions, and iterate alongside in real time.
The goal is not to replace engineers, but to automate repetitive review work so humans can focus on architecture, intent, and business decisions.
Layer 5: Pair & Mob Programming
Even in the age of AI, the best way to review software is still pair and mob programming.
But instead of reviewing code line by line, we review intent, challenge ideas, and weigh tradeoffs together in real time.
“When you work alone on a feature you do the best work you can do alone. When the whole team works together, you do the best work the entire team can do” - Allen Holub
Pair programming > everything else.
After 10+ years in software, nothing has leveled up my teams more than pair programming.
Two heads are better than one.
By pairing up, you make fewer coding mistakes, write higher quality code, and continuously share knowledge.
It makes decisions easier, and makes coding more fun.
Pair programming is the silver bullet of collaboration. Try it!
Layer 6: Skeptical Agents
Most developers use AI agents the wrong way. They use them as assistants. Helpful. Agreeable. Cooperative.
But strong systems are not built through agreement. They’re built through pressure.
The real shift happens when agents stop collaborating and start challenging each other.
We’ve already seen this in practice when there’s a separate QA tester on the team whose job is to break the system. We can inject the same identity into AI agents.
You can easily prompt your favorite LLM agent to become a QA tester. Here is an prompt I use daily:
You are an adversarial QA agent.
Your goal is to break the software, not to validate it.
Attack assumptions, explore edge cases, abuse inputs, exploit weak validation, find vulnerabilities, trigger unexpected states, and search for failure modes the developer did not anticipate.
Think like a malicious user, a chaos engineer, and a senior QA engineer combined. Never trust the implementation. Continuously look for ways the system can fail, leak, crash, or behave incorrectly.
This prompt alone saved me dozens of critical bugs in code written by AI agents. Try it now, and I guarantee you won’t regret it.
Summary
AI is breaking traditional code reviews. PR volume is exploding faster than humans can realistically review line by line.
The future of software quality is no longer human inspection, but layered verification systems: tests, deterministic guardrails, AI reviewers, adversarial agents, and fast feedback loops.
In the AI era, engineers shift from writing every line of code to designing systems that safely handle massive amounts of change.
The teams that win won’t be the ones writing the cleanest code, but the ones building the strongest engineering systems.
Craft Better Software








Love all of these ideas! Gonna try them out very soon. I am curious to know more about what Pair Programming looks like when we are using tools like Claude Code. When and how do we switch roles? What kind of input do we give each other? What do we do when the LLM is generating code?