AI Code Review: what to look for and why we built our own
The amount of AI-generated code on our projects is growing every sprint. Developers are shipping faster. But here’s the uncomfortable question: who’s reviewing all of that code?
Manual code review hasn’t scaled for years — and now the volume is even higher. AI code review isn’t a nice-to-have anymore. It’s a quality control necessity.
What we actually need from an AI review tool
After months of experimenting, we landed on a clear set of non-negotiables:
- Code quality, security, complexity, performance — the basics, but the tool must catch them consistently, not occasionally
- Project-specific standards — generic rules aren’t enough. The tool must know your code conventions, UI patterns, and architectural decisions. “This violates your naming convention” is useful. “Consider renaming this variable” is noise
- Requirement verification — the tool should read the Jira ticket, check every acceptance criterion, and flag what’s missing or deviating. This is where most tools fall short
- Token budget control — LLM calls cost money. A daily token limit keeps costs predictable without killing review quality
- Inline comments, not summary reports — a 500-word summary at the end of a PR is easy to skip. An inline comment pointing at the exact line? That gets read and fixed
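The token budget point above can be sketched as a small gate that every review call passes through. This is an illustrative sketch, not our production code: the class name, the limit, and the in-memory counter are assumptions (a real implementation would persist usage and count actual prompt + completion tokens from the LLM response).

```python
import threading
from datetime import date

class TokenBudget:
    """Shared daily token budget across all AI reviews (illustrative sketch)."""

    def __init__(self, daily_limit: int):
        self.daily_limit = daily_limit
        self._used = 0
        self._day = date.today()
        self._lock = threading.Lock()

    def try_reserve(self, tokens: int) -> bool:
        """Reserve tokens for one review call; refuse once today's budget is spent."""
        with self._lock:
            today = date.today()
            if today != self._day:  # new day: reset the counter
                self._day, self._used = today, 0
            if self._used + tokens > self.daily_limit:
                return False  # defer or skip this review rather than overspend
            self._used += tokens
            return True
```

The key design choice is that the gate degrades gracefully: when the budget runs out, reviews are skipped or deferred instead of failing the pipeline, which keeps costs predictable without blocking merges.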
What we tried
We tested CodeRabbit, Qodo Merge (formerly PR-Agent), and GitHub Copilot code review. All are solid products that handle general code quality well, but none of them could verify requirements against our Jira tickets, enforce project-specific rulesets, or give us the token budget control we needed.
What we built
So we built our own AI reviewer inside our internal DevOps automation tool — with help from generative AI itself. On every merge request it:
- triggers automatically, fetches the diff, and resolves referenced symbols across the codebase for full context
- pulls the linked Jira ticket with its acceptance criteria
- splits large MRs into logical file groups and reviews them in parallel
- posts inline comments on specific lines in GitLab — not a wall of text at the bottom
- tracks a daily token budget across all reviews
- learns from developer feedback: every 👍 or 👎 on an AI comment shapes the next review
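For the inline comments, GitLab's merge request discussions API accepts a `position` object that pins a comment to a specific line of the diff. A hedged sketch of what posting a finding could look like (the instance URL, environment variable, and function names are illustrative assumptions, not our actual tool):

```python
import json
import os
import urllib.request

GITLAB_API = "https://gitlab.example.com/api/v4"  # hypothetical instance URL

def build_position(diff_refs: dict, path: str, line: int) -> dict:
    """Build the GitLab 'position' object that pins a discussion to a diff line.

    `diff_refs` is the {base_sha, head_sha, start_sha} mapping returned in the
    merge request's `diff_refs` field.
    """
    return {
        "position_type": "text",
        "base_sha": diff_refs["base_sha"],
        "head_sha": diff_refs["head_sha"],
        "start_sha": diff_refs["start_sha"],
        "new_path": path,
        "new_line": line,
    }

def post_inline_comment(project_id, mr_iid, diff_refs, path, line, body):
    """POST one AI finding as an inline discussion on the MR diff (sketch)."""
    payload = json.dumps(
        {"body": body, "position": build_position(diff_refs, path, line)}
    ).encode()
    req = urllib.request.Request(
        f"{GITLAB_API}/projects/{project_id}/merge_requests/{mr_iid}/discussions",
        data=payload,
        headers={
            "PRIVATE-TOKEN": os.environ["GITLAB_TOKEN"],
            "Content-Type": "application/json",
        },
        method="POST",
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)
```

Pinning each finding to `new_path` and `new_line` is what makes the comment land on the exact line in the diff view, rather than as a summary at the bottom of the MR.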
The irony? We used AI to build the tool that reviews AI-generated code. And it works.