Most developers who've tried to wire AI into a large codebase hit the same wall: the model confidently does something plausible, and completely wrong. The instinct is to blame the model — to say it hallucinated, to say AI "isn't ready yet." But one developer's two-year journey building an AI-assisted code-review system makes a more uncomfortable argument: the failure isn't the model's. It's the design's.
The insight is deceptively simple — AI has no idea what context wasn't handed to it — and following it to its conclusion reshapes how you should build any AI system that touches real production code.
The Context Window Problem Nobody Wants to Admit
The naive solution to "AI doesn't understand our codebase" is to give it more context. Feed it more files. Use a bigger context window. With models supporting 200K+ tokens in 2025 and beyond, it feels like the problem should be solved.
It isn't. Two well-documented phenomena undercut the brute-force approach:
Lost in the middle — models demonstrably underweight information positioned in the middle of long prompts. Critical facts buried in file 47 of 80 receive less attention than facts in files 1 and 80.
Attention dilution — the more tokens you stuff into a context window, the more each individual token has to compete for the model's attention. Signal-to-noise collapses.
Learning-based alternatives (fine-tuning, continual training) run into different walls: machine unlearning is expensive and imprecise, and catastrophic forgetting means new training can destructively interfere with what the model already knows.
The framing shifts once you accept both paths have hard limits. The question stops being "how do I give the model more context?" and becomes "how do I give it only the context it actually needs?"
GraphRAG + MCP: Precision Over Volume
GraphRAG — retrieval-augmented generation over a knowledge graph rather than a flat vector store — is the architectural answer to that question. Instead of retrieving the nearest chunks to a query, a graph-structured retrieval system can traverse relationships: this function calls that service, which depends on this schema, which has this constraint. The retrieval is surgical rather than statistical.
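To make the contrast with nearest-chunk retrieval concrete, here is a minimal sketch of relationship traversal. The edge list, node names, and the `retrieve_facts` helper are all hypothetical, and a real system would sit on a graph database rather than an in-memory dict. The point is only that retrieval follows edges outward from the code under review and never touches unrelated nodes.

```python
from collections import deque

# Hypothetical edges, each stored as (source, relation, target).
EDGES = [
    ("create_invoice", "calls", "billing_service"),
    ("billing_service", "depends_on", "invoice_schema"),
    ("invoice_schema", "has_constraint", "total_must_be_non_negative"),
    ("create_invoice", "calls", "audit_logger"),
    ("send_newsletter", "calls", "email_service"),
]


def build_adjacency(edges):
    """Index edges by source node so traversal is a dict lookup."""
    adjacency = {}
    for source, relation, target in edges:
        adjacency.setdefault(source, []).append((relation, target))
    return adjacency


def retrieve_facts(adjacency, start, max_hops):
    """Breadth-first traversal: collect every edge within max_hops of start."""
    facts = []
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, depth = queue.popleft()
        if depth == max_hops:
            continue  # do not expand nodes at the hop limit
        for relation, target in adjacency.get(node, []):
            facts.append(f"{node} {relation} {target}")
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return facts


ADJACENCY = build_adjacency(EDGES)
FACTS = retrieve_facts(ADJACENCY, "create_invoice", max_hops=3)
```

Starting from `create_invoice`, the traversal reaches the schema constraint three hops away and skips the newsletter edge entirely, which is the "surgical" property the paragraph describes.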
MCP (Model Context Protocol) is the delivery mechanism. Rather than pasting retrieved facts into a static prompt, MCP allows a model to request specific facts at inference time — pulling nodes from the graph only when and because they're needed. The model drives the retrieval instead of having a retrieval system guess what to pre-load.
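The pull-based pattern can be illustrated without any SDK. The sketch below is not the MCP wire protocol or a real MCP library: `get_node`, `handle_tool_call`, the call format, and the `scripted_model` stand-in are all assumptions made for illustration. It shows only the control flow, in which context is assembled from the requests the model issues rather than pre-loaded.

```python
# Hypothetical graph nodes; names and invariants are illustrative only.
GRAPH_NODES = {
    "Invoice": {"invariants": ["total >= 0", "currency is set"], "related": ["Payment"]},
    "Payment": {"invariants": ["invoice is approved"], "related": ["Invoice"]},
    "Newsletter": {"invariants": [], "related": []},
}


def get_node(name):
    """Tool handler: return one node, or an explicit error for unknown names."""
    node = GRAPH_NODES.get(name)
    if node is None:
        return {"error": f"unknown node: {name}"}
    return {"name": name, **node}


# Registry of tools the model is allowed to call (assumed call format below).
TOOLS = {"get_node": get_node}


def handle_tool_call(call):
    """Dispatch a call shaped like {'tool': str, 'arguments': dict}."""
    handler = TOOLS.get(call["tool"])
    if handler is None:
        return {"error": f"unknown tool: {call['tool']}"}
    return handler(**call["arguments"])


def scripted_model(task):
    """Stand-in for a model: yields the tool calls a real model might issue."""
    yield {"tool": "get_node", "arguments": {"name": "Invoice"}}
    yield {"tool": "get_node", "arguments": {"name": "Payment"}}


def run(task):
    """Answer each request as it arrives; nothing is loaded ahead of time."""
    return [handle_tool_call(call) for call in scripted_model(task)]


CONTEXT = run("review a change to payment processing")
```

Only the two requested nodes end up in `CONTEXT`. The `Newsletter` node exists in the graph but never enters the prompt because nothing asked for it.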
The combination — GraphRAG as the knowledge store, MCP as the access layer — means the model is working from a curated slice of your codebase's semantic structure, not a lossy compression of everything.
The path to that architecture wasn't clean. An early implementation built a code-graph: nodes are functions, edges are call relationships. Two months of work. It got thrown away. The problem was granularity — function-level nodes don't carry enough semantic weight to answer the questions an AI code reviewer actually needs to ask. The replacement is a product-graph (abbreviated cpg): nodes are product concepts and business entities, edges encode the relationships between them. The AI doesn't need to know how validateInvoice() is implemented; it needs to know that invoices have a specific set of invariants that must hold before payment is processed.
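A rough sketch of what a semantic node might look like, assuming a very small data model: the `Concept` and `ProductGraph` classes, the `must_hold_before` relation name, and the example invariants are all hypothetical and not taken from the original system. What matters is that the node carries invariants, not implementation detail.

```python
from dataclasses import dataclass, field


@dataclass
class Concept:
    """A product concept or business entity (hypothetical shape)."""
    name: str
    invariants: list = field(default_factory=list)


@dataclass
class ProductGraph:
    concepts: dict = field(default_factory=dict)
    edges: list = field(default_factory=list)  # (source, relation, target)

    def add_concept(self, name, invariants=()):
        self.concepts[name] = Concept(name, list(invariants))

    def relate(self, source, relation, target):
        # Refuse edges to concepts the graph does not know about.
        for name in (source, target):
            if name not in self.concepts:
                raise KeyError(f"unknown concept: {name}")
        self.edges.append((source, relation, target))

    def preconditions_for(self, target):
        """Invariants of every concept that must hold before `target`."""
        found = []
        for source, relation, dest in self.edges:
            if relation == "must_hold_before" and dest == target:
                found.extend(self.concepts[source].invariants)
        return found


def build_example_graph():
    graph = ProductGraph()
    # Illustrative invariants only; a real graph would be curated by the team.
    graph.add_concept("Invoice", ["total is non-negative", "customer is identified"])
    graph.add_concept("Payment")
    graph.add_concept("Newsletter", ["recipient has opted in"])
    graph.relate("Invoice", "must_hold_before", "Payment")
    return graph


GRAPH = build_example_graph()
REVIEW_CONTEXT = GRAPH.preconditions_for("Payment")
```

A reviewer looking at a payment change would receive the two invoice invariants in `REVIEW_CONTEXT` and nothing about how `validateInvoice()` is written.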
That shift — from structural graph to semantic graph — is where most GraphRAG implementations go wrong.
Harnesses, Not Guardrails
Once you have surgical context delivery, the next design question is: where are hallucinations allowed to happen?
This framing, harnesses rather than guardrails, is the sharpest conceptual contribution of the architecture. Guardrails try to catch bad outputs after the model produces them. Harnesses constrain the space of possible outputs by controlling the inputs and the task definition so precisely that the range of plausible-but-wrong outputs shrinks to near zero.
Four mechanisms implement this in practice:
Knowledge Graph — the cpg described above. The model cannot hallucinate facts about your codebase that aren't in the graph, because the graph is the only source of codebase facts it receives.
Auto Review — structured output validation that runs immediately after generation. Not semantic validation ("does this make sense?") but structural validation ("does this output conform to the contract the downstream step requires?"). Fast, cheap, and catches the majority of failure modes at the boundary.
Self-Healing — when Auto Review fails, the system generates a correction prompt from the failure mode rather than discarding the output entirely. This isn't the same as asking the model to "try again" — it's constructing a targeted re-prompt that provides the specific missing context the first attempt lacked.
Recurrence Prevention — persistent logging of failure patterns with indexed retrieval. When a failure occurs, the system checks whether a similar failure has occurred before and, if so, includes the previous resolution as context in the correction prompt. The system gets better at its own failure modes over time without retraining.
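Auto Review and Self-Healing from the list above can be sketched together as a validate-then-re-prompt loop. Everything here is an assumption for illustration: the output contract (`summary`, `risk`, `affected_concepts`), the function names, and the fake model. The correction prompt in this sketch only feeds back the contract violations. The system described would also supply the missing context, which is omitted here.

```python
import json

# Hypothetical output contract for a review step.
REQUIRED_FIELDS = {"summary": str, "risk": str, "affected_concepts": list}
ALLOWED_RISK = {"low", "medium", "high"}


def auto_review(raw_output):
    """Structural check only: return contract violations (empty list = pass)."""
    try:
        data = json.loads(raw_output)
    except json.JSONDecodeError as exc:
        return [f"output is not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level value must be a JSON object"]
    problems = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            problems.append(f"missing field '{name}'")
        elif not isinstance(data[name], expected):
            problems.append(f"field '{name}' must be {expected.__name__}")
    if isinstance(data.get("risk"), str) and data["risk"] not in ALLOWED_RISK:
        problems.append(f"field 'risk' must be one of {sorted(ALLOWED_RISK)}")
    return problems


def build_correction_prompt(original_prompt, raw_output, problems):
    """Targeted re-prompt built from the specific failures, not a bare retry."""
    bullet_list = "\n".join(f"- {p}" for p in problems)
    return (
        f"{original_prompt}\n\n"
        f"Your previous answer was:\n{raw_output}\n\n"
        f"It violated the output contract:\n{bullet_list}\n"
        "Return a corrected JSON object that fixes exactly these problems."
    )


def generate_with_healing(generate, prompt, max_attempts=3):
    """Call `generate` until its output passes auto_review or attempts run out."""
    current_prompt = prompt
    for attempt in range(1, max_attempts + 1):
        output = generate(current_prompt)
        problems = auto_review(output)
        if not problems:
            return output, attempt
        current_prompt = build_correction_prompt(prompt, output, problems)
    raise RuntimeError(f"output still invalid after {max_attempts} attempts: {problems}")


def make_fake_model():
    """Stand-in for a model call: first reply breaks the contract, second passes."""
    replies = iter([
        '{"summary": "Adds retry to payment capture", "risk": "severe"}',
        '{"summary": "Adds retry to payment capture", "risk": "high", '
        '"affected_concepts": ["Payment"]}',
    ])
    return lambda prompt: next(replies)
```

Calling `generate_with_healing(make_fake_model(), "Review this diff")` fails the first structural check, builds a correction prompt from the two violations, and accepts the second reply.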
Stacked together, these four mechanisms don't prevent hallucinations. They contain them — restrict them to surfaces where they're detectable and recoverable, and push them out of surfaces where they'd be silent and dangerous.
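The Recurrence Prevention mechanism could look something like the following. This is a sketch under loud assumptions: the `FailureLog` class is hypothetical, it lives in memory instead of persistent storage, and "similar" is reduced to an exact match on the sorted set of violations. A production version would likely need fuzzier matching and a durable store.

```python
from dataclasses import dataclass, field


@dataclass
class FailureLog:
    """In-memory stand-in for a persistent, indexed failure log."""
    # Maps a failure signature to the resolution that fixed it last time.
    resolutions: dict = field(default_factory=dict)

    @staticmethod
    def signature(problems):
        """Order-independent key for a set of contract violations."""
        return tuple(sorted(problems))

    def record(self, problems, resolution):
        self.resolutions[self.signature(problems)] = resolution

    def lookup(self, problems):
        """Return the earlier resolution, or None if this failure is new."""
        return self.resolutions.get(self.signature(problems))


def correction_prompt(problems, log):
    """Build a correction prompt, adding the prior resolution when one exists."""
    lines = ["The previous output violated the contract:"]
    lines += [f"- {p}" for p in problems]
    previous = log.lookup(problems)
    if previous is not None:
        lines.append(f"A similar failure was resolved before by: {previous}")
    return "\n".join(lines)


LOG = FailureLog()
LOG.record(
    ["missing field 'risk'", "missing field 'summary'"],
    "restating the required fields at the end of the prompt",
)
```

The second time the same pair of violations appears, in any order, the correction prompt carries the earlier fix. A failure with no history produces the plain prompt.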
A non-engineer-facing PR application sits on top of this stack: automatic pull request summaries written in product terms rather than diff terms, generated from the product-graph context rather than raw code. The same harness architecture applies — the output is constrained by what the graph knows, validated structurally, and corrected when it fails.
The Design Principle That Actually Transfers
The philosophical throughline is this: ideal behavior with no spec given is a fantasy. Every AI integration that feels unreliable is, underneath, an integration with an underspecified task definition. The model is filling in blanks the designer didn't realize were blank.
The actionable version: before you wire AI into any step of a workflow, write down — explicitly — what the inputs are, what the outputs must satisfy, and what constitutes a recoverable failure. That document is your harness specification. If you can't write it, the AI integration isn't ready to be built.
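One way to make that document executable is to write it as data. The sketch below is an assumption about what such a specification could contain, not a format from the original system: `HarnessSpec`, the check descriptions, and the accept/retry/escalate outcomes are all hypothetical names.

```python
from dataclasses import dataclass

# Hypothetical set of concepts the product graph knows about.
KNOWN_CONCEPTS = {"Invoice", "Payment"}


@dataclass(frozen=True)
class HarnessSpec:
    step_name: str
    inputs: tuple           # names of the inputs this step receives
    output_checks: tuple    # (description, predicate) pairs the output must pass
    recoverable: frozenset  # descriptions of checks that may be retried

    def evaluate(self, output):
        """Return ('accept' | 'retry' | 'escalate', list of failed checks)."""
        failed = [desc for desc, check in self.output_checks if not check(output)]
        if not failed:
            return "accept", failed
        if all(desc in self.recoverable for desc in failed):
            return "retry", failed
        return "escalate", failed


PR_SUMMARY_SPEC = HarnessSpec(
    step_name="pr_summary",
    inputs=("product_graph_context", "diff"),
    output_checks=(
        ("summary is non-empty", lambda o: bool(o.get("summary"))),
        (
            "names a known product concept",
            lambda o: bool(set(o.get("concepts", [])) & KNOWN_CONCEPTS),
        ),
    ),
    # An empty summary is worth a retry; an unknown concept goes to a human.
    recoverable=frozenset({"summary is non-empty"}),
)
```

If any of the three fields is hard to fill in for a given step, that difficulty is the signal the paragraph above describes: the task is not yet specified well enough to automate.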
One additional data point worth internalizing: targeting 90% test coverage as a solo developer broke the implementation. The coverage target itself became the work, crowding out the product. The lesson generalizes — don't let a metric become the goal when the goal is a working system. Design the system; the metrics follow.
The headline principle holds: AI isn't something to trust. It's something to design. The design work is harder than prompt engineering and more durable than fine-tuning. It's also, increasingly, the actual job.