The LLM code review problem
After a while of working with people who use AI to write code, I’ve noticed something: what should be making developers more productive is actually making my job way harder.
I like to think of myself as pretty good at code reviews. I can scan a piece of code and understand not just what’s broken but how the developer got there and what they were trying to do. With AI-generated code it’s completely different. The code looks clean on the surface, but it’s often broken in ways you don’t expect, and all the normal signs of how someone thought through the problem are gone.
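To make that concrete, here’s a contrived example of the pattern (invented for this post, not lifted from any real PR). The function reads cleanly: docstring, type hints, sensible name. It’s also wrong for anyone whose timestamps are stored in UTC.

```python
from datetime import datetime, timedelta

def is_token_fresh(issued_at: datetime, max_age_hours: int = 24) -> bool:
    """Check whether an auth token is still inside its validity window."""
    # Subtle bug: datetime.now() returns local time. If issued_at
    # timestamps are stored in UTC (as they usually are), freshness is
    # off by up to a full UTC offset, and nothing here looks wrong.
    return datetime.now() - issued_at < timedelta(hours=max_age_hours)
```

Nothing in it looks sloppy, which is exactly why the review instincts that catch human mistakes don’t fire.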
The back-and-forth of code review is dead too. I leave a comment, they say “good catch”, and they come back with a completely different solution from the AI. They can’t discuss the fix because they didn’t write the original code. Instead of slowly iterating from “a bit broken” to “actually good”, every review comment triggers a complete rewrite.
Here’s what really gets me: some devs now spend less time on their code than I spend reviewing it. They generate a complex solution in an hour that takes me half a day to properly understand and fix. I worked with one developer who completely rewrote their approach after every review. Each version looked polished, but each had different subtle bugs.
The tests don’t help either. When both your implementation and your tests come from the same AI, you’re not getting the safety net you think you are. The tests might look thorough while actually testing the wrong things or missing obvious edge cases.
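Sticking with the invented token example above, here’s the kind of test an AI plausibly generates alongside that implementation. It reads like coverage, but it bakes in the same naive-datetime assumption, so the one case that matters is never exercised.

```python
from datetime import datetime, timedelta

def test_is_token_fresh():
    # Both cases build timestamps with datetime.now(), mirroring the
    # implementation's assumption, so they pass trivially.
    assert is_token_fresh(datetime.now())
    assert not is_token_fresh(datetime.now() - timedelta(hours=25))
    # Missing: a token issued with a UTC timestamp, which is exactly
    # where the implementation breaks.
```

The suite is green, the diff looks tested, and the bug ships anyway.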
The worst part is the pressure this creates. When developers appear highly productive because they’re shipping lots of code, pushing back on quality becomes difficult. You hear things like “80% good enough is fine” or “you’re being a perfectionist”.
I’m not saying we should abandon AI assistance. But the productivity gains everyone talks about might be an illusion if other developers are drowning in review work.