May 15, 2025 · 7 min read

What AI-Generated Code Gets Wrong (And How I Catch It Before It Ships)

AI writes code fast. It also makes specific, repeatable mistakes. After building three live SaaS products with AI assistance, here's my actual review checklist.

After building saasdb.app, briefstock.ai, and feedalyze.net with heavy AI assistance, I've noticed that the mistakes AI makes aren't random. They're patterned. The same categories of errors appear across different codebases, different prompts, different model versions.

Once you know the patterns, you can catch them fast. Here's my working checklist.

The core problem: AI optimizes for plausibility, not correctness

AI code generation produces code that looks correct and often runs without errors on the happy path. The failures surface at edges: concurrent users, empty states, large inputs, auth edge cases, stale closures. The model has seen millions of code examples and learned to produce code that resembles working code — but resemblance is not the same as correctness.

This isn't a criticism of AI tools. It's a constraint to design around. You don't blame a calculator for not knowing context — you learn which operations require human judgment.

Mistake 1: Missing loading and empty states

AI almost always generates the happy path. Give it a prompt like "build a dashboard that shows user metrics" and it will produce beautiful code for the case where:

The API returns data
The data is non-empty
The user is authenticated
The request completes in under 200ms

What it won't generate unprompted:

Loading skeleton while data fetches
Empty state when no data exists yet
Error state when the API fails
Timeout handling
Partial data states (some metrics loaded, others pending)

What I look for: Any component that fetches data. I scan for isLoading, error, and empty array checks. If they're missing, I add them before merging.

The quick fix: After AI generates a data-fetching component, I follow up with a specific prompt: "Add loading, empty, and error states to this component." This usually produces the missing branches in one pass.
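As a rough illustration of what "all the branches" means, here is a minimal, framework-free sketch. The FetchState and View types and the resolveView helper are names I made up for this example, not part of any library. The idea is to decide which state to render before touching the data, so a missing branch shows up as a type error instead of a blank screen.

```typescript
// The states a data-fetching component can be in.
type FetchState<T> =
  | { status: 'loading' }
  | { status: 'error'; message: string }
  | { status: 'success'; data: T[] };

// What the component should render for each state.
type View<T> =
  | { kind: 'skeleton' }
  | { kind: 'error'; message: string }
  | { kind: 'empty' }
  | { kind: 'list'; items: T[] };

// Hypothetical helper: map the fetch state to exactly one view.
function resolveView<T>(state: FetchState<T>): View<T> {
  switch (state.status) {
    case 'loading':
      return { kind: 'skeleton' };
    case 'error':
      return { kind: 'error', message: state.message };
    case 'success':
      // An empty result is its own state, not a list with zero rows.
      return state.data.length === 0
        ? { kind: 'empty' }
        : { kind: 'list', items: state.data };
  }
}
```

A component can then switch on view.kind and render a skeleton, an error message, an empty-state prompt, or the list. Timeouts and partial data are not covered here and would need their own states.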

Mistake 2: useEffect with stale closures

This is the single most common React bug in AI-generated code. The pattern looks like this:

useEffect(() => {
  fetchData(userId); // userId might be stale
}, []); // empty dep array = runs once, captures initial value

The AI generates a clean-looking effect with an empty dependency array because it's seen this pattern countless times in tutorials. What it misses is that userId inside the effect captures the value from the initial render. If userId changes — because the user navigated to a different profile, because auth refreshed, because the parent re-rendered — the effect doesn't re-run.

What I look for: Every useEffect with an empty [] dependency array. I check each variable referenced inside the effect body and ask: can this value change over the component's lifetime? If yes, it belongs in the dependency array.

The same problem applies to useCallback and useMemo: AI-generated memoization often misses dependencies, producing callbacks and values that never update.
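To see why the dependency array matters, here is a simplified model in plain TypeScript, with no React involved. createEffect and simulateRenders are hypothetical names for this sketch. The only thing borrowed from React is the rule that an effect re-runs when some dependency differs from the previous render under Object.is comparison.

```typescript
type Deps = readonly unknown[];

// Simplified stand-in for effect scheduling: run on the first render,
// then again only when some dependency changed (compared with Object.is).
function createEffect() {
  let prevDeps: Deps | undefined;
  return (effect: () => void, deps: Deps): void => {
    const prev = prevDeps;
    const changed =
      prev === undefined ||
      deps.length !== prev.length ||
      deps.some((dep, i) => !Object.is(dep, prev[i]));
    if (changed) effect();
    prevDeps = deps;
  };
}

// "Render" a component once per userId and record which ids were fetched.
function simulateRenders(
  userIds: string[],
  depsFor: (userId: string) => Deps,
): string[] {
  const fetched: string[] = [];
  const useEffectLike = createEffect();
  for (const userId of userIds) {
    useEffectLike(() => {
      fetched.push(userId); // stands in for fetchData(userId)
    }, depsFor(userId));
  }
  return fetched;
}

// Empty deps: the effect only ever sees the first userId.
const staleFetches = simulateRenders(['alice', 'bob'], () => []);
// userId in deps: the effect re-runs when the id changes.
const freshFetches = simulateRenders(['alice', 'bob'], (id) => [id]);
```

In the real component the fix is the same one-line change: list userId in the dependency array so the effect re-runs when it changes.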

ESLint catches most of these

If you're not running eslint-plugin-react-hooks with the exhaustive-deps rule set to error, you're leaving the most reliable AI bug detector on the table. It doesn't catch all stale closure bugs, but it catches the mechanical ones. Add it to your config and fix every violation it reports before merging.
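For reference, a minimal flat-config sketch might look like the following. It assumes ESLint's flat config format and that eslint-plugin-react-hooks is installed; parser setup for TypeScript or JSX is assumed to live elsewhere in your config, and your project may already wire the plugin in through a framework preset.

```typescript
// Flat config module (for example eslint.config.js).
// Assumes eslint-plugin-react-hooks is installed as a dev dependency.
import reactHooks from 'eslint-plugin-react-hooks';

export default [
  {
    plugins: { 'react-hooks': reactHooks },
    rules: {
      // Hooks must be called unconditionally, at the top level.
      'react-hooks/rules-of-hooks': 'error',
      // Fail the lint run on missing or extra dependencies.
      'react-hooks/exhaustive-deps': 'error',
    },
  },
];
```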

Mistake 3: N+1 queries

AI generates database code that works correctly for small datasets and catastrophically for real ones. The classic pattern:

const users = await db.select().from(usersTable);
const results = await Promise.all(
  users.map(user => db.select().from(postsTable).where(eq(postsTable.userId, user.id)))
);

This executes one query per user. For 10 users it's fine. For 1,000 users it's 1,001 database round trips. I've seen AI generate this pattern for Supabase, Drizzle, Prisma, and raw SQL. The model knows JOIN syntax — it just doesn't default to it when generating list-with-related-data patterns.

What I look for: Any .map() or for loop that contains a database query or API call. If the loop body awaits one query or request per item, it's almost certainly an N+1.

The fix prompt: "Rewrite this to use a JOIN instead of a per-row query." Works reliably.
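A JOIN is one fix; another is a single batched lookup (WHERE user_id IN (...)) followed by grouping in memory. The sketch below shows the difference in round trips using a made-up in-memory database, so createFakeDb and its methods are purely illustrative. With Drizzle, the batched query would typically use inArray in place of the per-user eq.

```typescript
interface User { id: number; name: string }
interface Post { id: number; userId: number; title: string }

// Hypothetical in-memory "database" that counts round trips.
function createFakeDb(users: User[], posts: Post[]) {
  let queryCount = 0;
  return {
    getQueryCount: () => queryCount,
    async allUsers(): Promise<User[]> {
      queryCount++;
      return users;
    },
    async postsByUser(userId: number): Promise<Post[]> {
      queryCount++;
      return posts.filter((p) => p.userId === userId);
    },
    // Stand-in for: SELECT * FROM posts WHERE user_id IN (...)
    async postsByUsers(userIds: number[]): Promise<Post[]> {
      queryCount++;
      const wanted = new Set(userIds);
      return posts.filter((p) => wanted.has(p.userId));
    },
  };
}
type FakeDb = ReturnType<typeof createFakeDb>;

// N+1: one query for users, then one more per user.
async function loadPostsNPlusOne(db: FakeDb): Promise<Map<number, Post[]>> {
  const users = await db.allUsers();
  const lists = await Promise.all(users.map((u) => db.postsByUser(u.id)));
  const byUser = new Map<number, Post[]>();
  users.forEach((u, i) => byUser.set(u.id, lists[i] ?? []));
  return byUser;
}

// Batched: two queries total, regardless of how many users there are.
async function loadPostsBatched(db: FakeDb): Promise<Map<number, Post[]>> {
  const users = await db.allUsers();
  const posts = await db.postsByUsers(users.map((u) => u.id));
  const byUser = new Map<number, Post[]>();
  for (const user of users) byUser.set(user.id, []);
  for (const post of posts) byUser.get(post.userId)?.push(post);
  return byUser;
}
```

Both functions return the same grouping; only the number of round trips differs, which is exactly what stays invisible on a ten-row development database.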

Mistake 4: Authentication checks that don't actually protect routes

AI writes auth middleware that's syntactically correct but semantically incomplete. The most common variant:

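import { NextResponse, type NextRequest } from 'next/server';

export async function middleware(request: NextRequest) {
  const token = request.cookies.get('session');
  // Bug being illustrated: this only checks that the cookie exists
  if (!token) {
    // Middleware redirects need an absolute URL
    return NextResponse.redirect(new URL('/login', request.url));
  }
  return NextResponse.next();
}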

This checks that a cookie exists. It does not verify the cookie is valid, unexpired, or belongs to a real user. A cookie with any string value passes this check.

What I look for: Every auth check. I verify that it actually validates the token (JWT signature verification, a session lookup in the database, or framework-provided session validation) rather than just checking that it is present.

Deeper concern: AI-generated Row Level Security policies for Supabase are often present but wrong. They pass auth.uid() checks but miss the case where auth.uid() is null for unauthenticated requests, which can allow reads on tables that should be fully private.
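To make the presence-versus-validity distinction concrete, here is a small sketch of signed session tokens using Node's built-in crypto module. The token format (base64url payload, a dot, then an HMAC-SHA256 signature) and the function names are assumptions for illustration. In a real project I'd reach for the framework's session validation or a maintained JWT library, and note that this sketch targets the Node runtime and does not check that the user still exists.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

interface SessionPayload {
  userId: string;
  expiresAt: number; // epoch milliseconds
}

function sign(data: string, secret: string): string {
  return createHmac('sha256', secret).update(data).digest('base64url');
}

// Hypothetical token format: base64url(JSON payload) + "." + signature.
function createSessionToken(payload: SessionPayload, secret: string): string {
  const body = Buffer.from(JSON.stringify(payload)).toString('base64url');
  return `${body}.${sign(body, secret)}`;
}

// Returns the payload only if the signature matches and the session
// has not expired; returns null for anything else.
function verifySessionToken(
  token: string,
  secret: string,
  now: number = Date.now(),
): SessionPayload | null {
  const parts = token.split('.');
  if (parts.length !== 2) return null;
  const [body, signature] = parts;
  if (!body || !signature) return null;

  const expected = Buffer.from(sign(body, secret));
  const actual = Buffer.from(signature);
  // timingSafeEqual throws on length mismatch, so check lengths first.
  if (expected.length !== actual.length || !timingSafeEqual(expected, actual)) {
    return null;
  }

  try {
    const payload = JSON.parse(
      Buffer.from(body, 'base64url').toString('utf8'),
    ) as SessionPayload;
    if (typeof payload.userId !== 'string' || typeof payload.expiresAt !== 'number') {
      return null;
    }
    return payload.expiresAt > now ? payload : null;
  } catch {
    return null; // payload was not valid JSON
  }
}
```

With a check like this, a cookie holding an arbitrary string is rejected, whereas the presence-only middleware would have let it through.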

Mistake 5: Error handling that swallows failures silently

try {
  await sendEmail(user.email, template);
} catch (error) {
  console.log('Email failed');
}

AI generates catch blocks that log and continue. In development this seems fine — the function "handles" the error and moves on. In production, users silently don't receive emails, payments silently fail to process, webhooks silently drop — and you find out a week later from a support ticket.

What I look for: Any catch block that doesn't re-throw, return an error response, or explicitly alert on failure. A console.log in a catch block is a smell; a bare catch (error) {} is a bug.

The design principle this enforces: Decide upfront whether a failure is recoverable (retry/fallback) or fatal (propagate to the caller). AI code defaults to "log and continue", which is almost never the right answer for anything in a critical path.
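One way to express the recoverable-versus-fatal decision in code is a small retry wrapper that reports every failure and rethrows once it runs out of attempts. withRetry and its onFailure hook are hypothetical names for this sketch. It deliberately has no backoff or delay, which a production version would likely want.

```typescript
// Retry a recoverable operation a few times, then propagate the last error
// so the caller (and your alerting) actually sees the failure.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts: number,
  onFailure: (error: unknown, attempt: number) => void,
): Promise<T> {
  if (!Number.isInteger(maxAttempts) || maxAttempts < 1) {
    throw new RangeError('maxAttempts must be a positive integer');
  }
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      onFailure(error, attempt); // hook for structured logging or alerting
    }
  }
  // Out of attempts: this is now a fatal failure, so do not swallow it.
  throw lastError;
}
```

Wrapping the sendEmail call from the snippet above in something like this means a transient failure is retried, while a persistent one surfaces to the caller instead of disappearing into a log line.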

Mistake 6: Missing input validation at API boundaries

AI generates API route handlers that trust request bodies. Given a prompt like "create a POST handler that saves a user profile," it will generate something that directly passes req.body to the database without validation:

export async function POST(request: Request) {
  const body = await request.json();
  // Bug being illustrated: body is untrusted input, inserted as-is
  await db.insert(usersTable).values(body);
  return Response.json({ ok: true });
}

This is a mass-assignment hole. body can contain fields that shouldn't be writable (role, isAdmin, createdAt), values that violate schema constraints, or simply garbage that causes a database error that is surfaced to the user.

What I look for: Every API handler that reads from request.json(), req.body, or URL params. I verify that the data is parsed through a validation schema (Zod, Valibot, or similar) before it touches the database.

One Zod schema per endpoint, no exceptions

The discipline of writing a Zod schema for every inbound API payload takes about 5 minutes per endpoint and has caught real issues in every product I've built. AI will generate the schema when asked — you just have to ask.
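To show what the schema buys you, here is a dependency-free stand-in for a profile schema. The field names and limits are invented for the example, and the hand-rolled checks are only there to keep the sketch self-contained. With Zod, the equivalent would be roughly a z.object({...}).strict() schema and a safeParse call.

```typescript
interface ProfileInput {
  displayName: string;
  bio?: string;
}

type ParseResult<T> =
  | { ok: true; value: T }
  | { ok: false; errors: string[] };

// Hand-rolled stand-in for a schema library: allow-list the fields,
// then check types and bounds. Field names and limits are illustrative.
function parseProfileInput(body: unknown): ParseResult<ProfileInput> {
  if (typeof body !== 'object' || body === null || Array.isArray(body)) {
    return { ok: false, errors: ['body must be a JSON object'] };
  }
  const record = body as Record<string, unknown>;
  const errors: string[] = [];

  // Reject fields the client is not allowed to set (role, isAdmin, ...).
  const allowed = new Set(['displayName', 'bio']);
  for (const key of Object.keys(record)) {
    if (!allowed.has(key)) errors.push(`unexpected field: ${key}`);
  }

  const displayName = record['displayName'];
  const bio = record['bio'];
  if (typeof displayName !== 'string' || displayName.length < 1 || displayName.length > 80) {
    errors.push('displayName must be a string of 1-80 characters');
  }
  if (bio !== undefined && (typeof bio !== 'string' || bio.length > 500)) {
    errors.push('bio must be a string of at most 500 characters');
  }

  if (errors.length > 0) return { ok: false, errors };
  const value: ProfileInput = { displayName: displayName as string };
  if (typeof bio === 'string') value.bio = bio;
  return { ok: true, value };
}
```

The handler then inserts result.value, never the raw body, and returns a 400 with result.errors when parsing fails.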

Mistake 7: Hardcoded values that should be config

const MAX_RETRIES = 3;
const API_TIMEOUT = 5000;
const FREE_TIER_LIMIT = 100;

These constants appear inline in the files that use them. When you need to change them — and you will need to change them — you're grepping through the codebase instead of updating a config file. For limits that affect billing, this is particularly dangerous: the enforcement logic and the pricing page can drift out of sync.

What I look for: Numeric literals in business logic. I check whether they're genuinely local constants (CSS values, mathematical coefficients) or operational parameters that might need to change without a code deployment.
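A minimal sketch of pulling those three constants into one place, assuming they are supplied through environment variables. The variable names (MAX_RETRIES, API_TIMEOUT_MS, FREE_TIER_LIMIT) and the loadConfig helper are my own choices for the example; the defaults mirror the inline values above.

```typescript
interface AppConfig {
  maxRetries: number;
  apiTimeoutMs: number;
  freeTierLimit: number;
}

// Defaults match the inline constants they replace.
const DEFAULTS: AppConfig = {
  maxRetries: 3,
  apiTimeoutMs: 5000,
  freeTierLimit: 100,
};

// Parse an optional positive integer, failing loudly on bad input
// instead of silently falling back.
function readPositiveInt(raw: string | undefined, fallback: number, name: string): number {
  if (raw === undefined || raw === '') return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value <= 0) {
    throw new Error(`${name} must be a positive integer, got "${raw}"`);
  }
  return value;
}

// Hypothetical loader: takes the environment as an argument so it can be
// tested without touching the real process environment.
function loadConfig(env: Record<string, string | undefined>): AppConfig {
  return {
    maxRetries: readPositiveInt(env['MAX_RETRIES'], DEFAULTS.maxRetries, 'MAX_RETRIES'),
    apiTimeoutMs: readPositiveInt(env['API_TIMEOUT_MS'], DEFAULTS.apiTimeoutMs, 'API_TIMEOUT_MS'),
    freeTierLimit: readPositiveInt(env['FREE_TIER_LIMIT'], DEFAULTS.freeTierLimit, 'FREE_TIER_LIMIT'),
  };
}
```

At startup this would be called once with process.env, and both the enforcement logic and the pricing page would read freeTierLimit from the same object.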

My actual review flow

I don't run through this checklist mechanically for every file. I've internalized it into a pattern-matching scan:

Data fetching components: loading/empty/error states?
useEffect hooks: dependency arrays correct?
Loops with async operations: N+1?
Auth middleware and RLS: actually validates, not just presence-checks?
Catch blocks: re-throws or alerts, doesn't just log?
API handlers: Zod validation before DB writes?
Business-logic constants: in config, not inline?

Total review time for an AI-generated feature: 10–20 minutes. The alternative is finding the bug in production, which is measured in hours to days.

The goal isn't to distrust AI — it's to use AI for what it's fast at (boilerplate, repetitive patterns, scaffolding) while applying human judgment where the model's pattern-matching reliably falls short.

Araho Digital

We build what we write about.

Every technique in this post was used on a real client project. If you're building a SaaS product or internal tool and want it done in weeks, not months — that's what we do.
