How AI Agents Would Read Code Like A Senior Developer
When you review code as a senior engineer, you don't just look at the diff. You consider:
- What functions are being called? What do they return?
- Are there similar patterns elsewhere in the codebase?
- What types are involved? Are they nullable?
- What's the dependency chain?
Most AI code review tools just throw raw diffs at an LLM and get back a half-baked review. What gets missed is the most important thing: context.
The Problem with Raw Diffs
Consider this code change:
// user-service.ts
const user = getUserById(id);
console.log(user.email);

An LLM looking at just this diff might flag:
- "Potential null reference - user might be undefined"
But with context, we know:
- getUserById returns User | null (from import analysis)
- Similar code uses the if (!user) return pattern (from pattern matching)
- The function signature shows it's async but the call is missing await (from type analysis)
Context changes everything.
Context Enrichment Pipeline
I built a ContextEnricher that analyzes code before the LLM sees it:
class ContextEnricher {
  async enrich(input: {
    fileName: string;
    changedCode: string;
    language: string;
    fullFilePath: string;
  }): Promise<EnrichedContext> {
    // 1. Parse imports and extract definitions
    const imports = await this.extractImports(input);
    // 2. Find similar patterns in codebase
    const patterns = await this.findSimilarPatterns(input);
    // 3. Build dependency graph
    const dependencies = await this.buildDependencyGraph(input);
    // 4. Extract type definitions
    const types = await this.extractTypeDefinitions(input);
    return {
      imports,
      similarPatterns: patterns,
      dependencies,
      typeDefinitions: types,
    };
  }
}
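Calling it is a single await per changed file. Here's a minimal usage sketch (the paths and snippet are illustrative, not from a real repo):

// Hypothetical usage: enrich one changed file before prompting the LLM
const enricher = new ContextEnricher();

const context = await enricher.enrich({
  fileName: "user-service.ts",
  changedCode: "const user = getUserById(id);\nconsole.log(user.email);",
  language: "typescript",
  fullFilePath: "src/services/user-service.ts",
});
// context now holds imports, similarPatterns, dependencies and typeDefinitions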
1. Import Analysis: Understanding Function Signatures

When you see getUserById(id), you need to know:
- Is it async? (needs await)
- What does it return? (User | null means you need null checks)
- What are the parameters? (type, count, optional?)
So we have to parse imports and extract their definitions:
// What the LLM sees:
IMPORTS:

getUserById(id: string): Promise<User | null>
  - Returns: User object or null
  - Async: Yes (returns Promise)
  - Parameters: id (string, required)

validateEmail(email: string): boolean
  - Returns: boolean
  - Async: No
  - Parameters: email (string, required)

Now the LLM can catch:
- Missing await on async functions
- Missing null checks on nullable returns
- Wrong parameter types or counts
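For TypeScript, the compiler API does the heavy lifting here. Here's a simplified sketch of what extractImports could look like (the real version would also handle default imports, re-exports, and other languages):

import * as ts from "typescript";

// Simplified sketch: resolve each named import to its type signature
function extractImports(fullFilePath: string): { name: string; signature: string }[] {
  const program = ts.createProgram([fullFilePath], {});
  const checker = program.getTypeChecker();
  const source = program.getSourceFile(fullFilePath)!;

  const results: { name: string; signature: string }[] = [];
  for (const stmt of source.statements) {
    if (!ts.isImportDeclaration(stmt)) continue;
    const bindings = stmt.importClause?.namedBindings;
    if (!bindings || !ts.isNamedImports(bindings)) continue;

    for (const element of bindings.elements) {
      const symbol = checker.getSymbolAtLocation(element.name);
      if (!symbol) continue;
      // Follow the import alias back to the original declaration
      const target =
        symbol.flags & ts.SymbolFlags.Alias ? checker.getAliasedSymbol(symbol) : symbol;
      const type = checker.getTypeOfSymbolAtLocation(target, element.name);
      results.push({
        name: element.name.text,
        // e.g. "(id: string) => Promise<User | null>"
        signature: checker.typeToString(type),
      });
    }
  }
  return results;
}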
2. Similar Pattern Matching: Learning from the Codebase
Codebases have patterns. If getUserById is used 20 times elsewhere, and 19 of those check for null, the 20th one should too.
I found Moss to be the best local-first solution for semantic search over a codebase.
You should seriously give it a try if you want fast semantic search (I can't emphasize this enough!)
Here's how I did it:
// Find similar code patterns
const similarPatterns = await mossClient.search({
  query: "getUserById null check pattern",
  file_types: ["ts", "tsx"],
  max_results: 5,
});

// Returns:
// - 15 instances use: if (!user) return null;
// - 3 instances use: if (!user) throw new Error(...);
// - 2 instances use: user?.email (optional chaining)

The LLM can now say:
"This pattern is used 15 times elsewhere, and all check for null. This instance is missing the check."
3. Dependency Graph: Understanding Relationships
Code doesn't exist in isolation: it builds on its imports and is itself imported by other modules across the codebase. That chain makes the dependency graph one of the most important parts of the context.
class DependencyGraphBuilder {
  async buildGraph(filePath: string): Promise<DependencyGraph> {
    // Upstream: What this file depends on
    const upstream = await this.findImports(filePath);
    // Downstream: What depends on this file
    const downstream = await this.findUsages(filePath);
    // Related: Files that import the same things
    const related = await this.findRelatedFiles(filePath);
    return {
      upstream: [...], // What you're calling
      downstream: [...], // What calls you
      related: [...] // Similar files
    };
  }
}

This helps catch:
- Breaking changes (downstream dependencies affected)
- Inconsistent patterns (related files do it differently)
- Missing error handling (upstream functions can throw)
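Upstream is easy (just read the import statements); downstream is the interesting direction, because nothing in a file points at its callers. A naive sketch of findUsages that scans the repo for imports resolving to the file under review (real resolution also has to handle path aliases and index files):

import { promises as fs } from "fs";
import * as path from "path";

// Naive downstream search: which files import targetFile?
async function findUsages(targetFile: string, allFiles: string[]): Promise<string[]> {
  const target = path.resolve(targetFile).replace(/\.(ts|tsx)$/, "");
  const downstream: string[] = [];

  for (const file of allFiles) {
    const source = await fs.readFile(file, "utf8");
    // Match the specifier in: import ... from "./some/module"
    for (const match of source.matchAll(/from\s+["']([^"']+)["']/g)) {
      const specifier = match[1];
      if (!specifier.startsWith(".")) continue; // skip package imports
      const resolved = path.resolve(path.dirname(file), specifier);
      if (resolved === target) {
        downstream.push(file);
        break;
      }
    }
  }
  return downstream;
}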
4. Type Definition Extraction: Understanding Data Structures
Rich type systems have become a necessity in any serious codebase. Extracting the type definitions around this graph lets us catch type mismatches early and keeps silly drift from bringing the whole flow down.
// Extracted type definitions
TYPE_DEFINITIONS:

interface User {
  id: string;
  email: string | null; // ← nullable!
  role: "admin" | "user";
}

type ApiResponse<T> = {
  data: T | null;
  error?: string;
};

Now the LLM knows:
- user.email might be null (needs a null check)
- ApiResponse<User> has a data field that's nullable
- Type mismatches don't stand a chance of breaking anything
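Pulling these out is the same compiler-API walk as the import analysis, just over interface and type-alias declarations. A minimal sketch that only looks at a single file (in practice you'd also follow its imports):

import * as ts from "typescript";

// Collect interface and type-alias declarations verbatim for the prompt
function extractTypeDefinitions(filePath: string): string[] {
  const program = ts.createProgram([filePath], {});
  const source = program.getSourceFile(filePath)!;

  const definitions: string[] = [];
  const visit = (node: ts.Node) => {
    if (ts.isInterfaceDeclaration(node) || ts.isTypeAliasDeclaration(node)) {
      // getText() returns the declaration exactly as written in the source
      definitions.push(node.getText(source));
    }
    ts.forEachChild(node, visit);
  };
  visit(source);
  return definitions;
}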
Putting It All Together
Here's what the LLM receives for a simple code change:
// Changed code:
const user = getUserById(id);
console.log(user.email);

// Enriched context:
{
  imports: {
    getUserById: {
      signature: "(id: string): Promise<User | null>",
      async: true,
      nullable: true
    }
  },
  similarPatterns: [
    { pattern: "if (!user) return null", count: 15 },
    { pattern: "user?.email", count: 3 }
  ],
  typeDefinitions: {
    User: {
      email: "string | null" // nullable!
    }
  },
  dependencies: {
    upstream: ["database", "cache"],
    downstream: ["auth-service", "profile-service"]
  }
}
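Getting that blob in front of the model is plain string assembly. A sketch of how I'd lay out the sections (the labels are just a convention, nothing the model requires):

// Serialize the enriched context into labelled prompt sections
function buildPrompt(diff: string, context: EnrichedContext): string {
  return [
    "Review the following code change.",
    "",
    "CHANGED CODE:",
    diff,
    "",
    "IMPORTS:",
    JSON.stringify(context.imports, null, 2),
    "",
    "SIMILAR PATTERNS IN THIS CODEBASE:",
    JSON.stringify(context.similarPatterns, null, 2),
    "",
    "TYPE DEFINITIONS:",
    JSON.stringify(context.typeDefinitions, null, 2),
    "",
    "DEPENDENCIES:",
    JSON.stringify(context.dependencies, null, 2),
  ].join("\n");
}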
Now the LLM can provide a context-aware review:

Issue: Missing null check and await
- getUserById returns Promise<User | null> but is called without await
- Even after awaiting, user could be null, but .email is accessed directly
- user.email is also nullable (string | null), so it needs an additional check
- A similar pattern is used 15 times elsewhere, and all instances include null checks
Fix:
const user = await getUserById(id);
if (!user) return null;
if (!user.email) return null;
console.log(user.email);

The Performance Challenge
There's no free lunch. Enriching context costs tokens, and with these new models that cost adds up unreasonably fast. I had to think hard about a few approaches to bring it down as much as I could without hurting review quality.
1. Lazy Loading
Only enrich when needed:
// Don't enrich if file is just formatting changes
if (changeAnalyzer.isFormattingOnly(diff)) {
  return minimalContext;
}
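The check itself can be as dumb as comparing both sides of the change with whitespace stripped. A sketch (the real changeAnalyzer works on the diff directly and also ignores comment-only edits):

// Formatting-only heuristic: identical code once whitespace is removed
function isFormattingOnly(oldCode: string, newCode: string): boolean {
  const strip = (code: string) => code.replace(/\s+/g, "");
  return strip(oldCode) === strip(newCode);
}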
2. Depth Limits

Control how deep the scouting goes:
const enricher = new ContextEnricher({
  maxImportDepth: 1, // Don't go deeper than direct imports
  maxSimilarPatterns: 5, // Limit pattern matches
  includeTests: true, // Include test files for patterns
});

3. Caching
Cache enriched context per file (this has to be done carefully: a stale cache entry must never be allowed to degrade review quality):
const cacheKey = `${filePath}:${fileHash}`;
if (cache.has(cacheKey)) {
  return cache.get(cacheKey);
}
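Hashing the file content keeps the cache honest: any edit changes the key, so a stale entry can never be served. A sketch using Node's built-in crypto:

import { createHash } from "crypto";
import { promises as fs } from "fs";

// Content-addressed key: editing the file automatically invalidates it
async function cacheKeyFor(filePath: string): Promise<string> {
  const content = await fs.readFile(filePath);
  const fileHash = createHash("sha256").update(content).digest("hex");
  return `${filePath}:${fileHash}`;
}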
Results: Better Reviews and Fewer False Positives

Before context enrichment:
- "Potential null reference" (but function never returns null)
- "Missing await" (but function isn't async)
- "Type mismatch" (but types are compatible)
After context enrichment:
- "Missing await on async function
getUserById" - "Null check needed - function returns
User | nulland similar patterns (15 instances) all check for null" - "Type error -
user.emailisstring | nullbut accessed without null check"
The Trade-off
Context enrichment adds quite a bit of latency, but:
- Better accuracy and fewer false positives
- Catches real, fixable issues with actionable feedback
- Understands codebase patterns as well as you do
I felt the latency was worth it since a review that catches real bugs is better than a fast review that misses them.
Conclusion
Throwing raw diffs at LLMs is like asking someone to review code without access to your codebase. By enriching the context first with imports, patterns, dependencies, and types, I give the model the same information a real developer would have.
The result? Reviews that actually catch bugs.