Stop Feeding Your AI Agent the Whole Repo: Build a Project Brain That Retrieves What It Needs

Published June 26, 2026.

A lot of developers are using AI agents now, but many are still managing them like a giant chat box: paste everything, hope the agent understands the project, then watch it repeatedly read random files because it does not know where the real knowledge lives.

That is not a smart agent workflow. That is an expensive guessing loop.

The better pattern is to give your AI agent a project brain: a small, reliable knowledge system that tells the agent what the project is, how to work in it, and where to retrieve the exact docs, files, commands, and decisions needed for the current task.

The goal is not to put the entire repository into the model context. The goal is to make the agent pull the right context at the right time.

The problem: more context is not always better

Anthropic describes context as a finite resource. Their context engineering guidance explains that agent work is no longer only about writing a good prompt; it is about curating the whole state available to the model: instructions, tools, MCP servers, external data, history, and retrieved documents. They also warn about context degradation as context grows, where models can lose focus or fail to use information correctly.

This is exactly what happens in coding agents. If the agent has no project brain, it tries to discover everything from scratch:

It reads file after file.
It opens package files repeatedly.
It forgets architecture decisions from the previous session.
It uses generic framework advice instead of your project conventions.
It modifies code before understanding how the project is tested or deployed.

A good AI-agent setup should answer these questions before work begins:

What is this project?
What are the most important directories?
How do I run, test, lint, build, and deploy it?
What conventions must I follow?
Where are the deeper docs?
How do I retrieve only the files relevant to this task?
What should never be touched without permission?

The project brain pattern

I like to split the agent brain into five layers:

Core identity: a short project overview, product purpose, tech stack, and safety boundaries.
Operating instructions: commands, test rules, code style, PR rules, and common workflows.
Knowledge index: a map of docs, architecture notes, API contracts, database schemas, and decision records.
Retrieval system: search, embeddings, BM25, repo maps, MCP tools, or a local RAG service that fetches relevant chunks on demand.
Feedback memory: corrections and lessons learned after the agent makes mistakes, kept short and maintained like documentation.

Think of it like onboarding a new developer. You would not throw the entire repository at them and say, “Read everything.” You would give them a README, architecture guide, test commands, key docs, and a way to search for what they need.

What belongs in always-loaded context?

Keep always-loaded instructions small. Put only the high-signal details the agent needs on almost every task:

Project purpose and current app/product boundaries.
Tech stack and package manager.
Commands for install, dev, test, lint, typecheck, and build.
Critical directories and what they contain.
Non-negotiable rules: security, migrations, generated files, secrets, deployment rules.
How to find deeper docs.

Do not put large API references, entire schemas, old discussions, or full architecture docs directly in the top-level instruction file. Link to them. Make them retrievable.

What belongs in retrieval?

Retrieval is for information that might be important but is not needed on every turn:

Long architecture documents.
Database schema details.
API endpoint references.
Feature specs.
Error catalogs.
Design-system docs.
Historical decisions.
Examples of similar implementations.

Anthropic’s contextual retrieval article explains the classic RAG flow: chunk documents, embed chunks, store them in a vector database, retrieve the most relevant chunks at runtime, and add those chunks to the model prompt. It also highlights a common improvement: combine semantic embeddings with lexical search such as BM25, especially for exact identifiers and technical terms.
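To make the lexical half of that flow concrete, here is a minimal from-scratch sketch of BM25 scoring in Python. It uses the standard Okapi BM25 formula with a smoothed IDF. The function names, the tokenizer, and the sample chunks are my own illustration, not code from Anthropic's article, and a real setup would more likely use an existing search library.

```python
import math
import re
from collections import Counter


def tokenize(text: str) -> list[str]:
    # Lowercase and keep identifier-like tokens such as get_user_by_id intact.
    return re.findall(r"[a-z0-9_]+", text.lower())


def bm25_scores(query: str, docs: list[str], k1: float = 1.5, b: float = 0.75) -> list[float]:
    """Score every document against the query with Okapi BM25."""
    if not docs:
        return []
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many chunks each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    query_terms = tokenize(query)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Length normalization: long chunks need more hits to score the same.
            norm = tf[term] + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


if __name__ == "__main__":
    chunks = [
        "The payments service retries failed webhooks.",
        "Call get_user_by_id in packages/db to load a user.",
        "Users can reset their password from settings.",
    ]
    for chunk, score in zip(chunks, bm25_scores("get_user_by_id", chunks)):
        print(f"{score:.3f}  {chunk}")
```

Only the chunk that contains the exact identifier gets a non-zero score. That is the behavior that makes lexical search a useful partner to embeddings for symbol names and error strings.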

Codex: use AGENTS.md as the front door

For OpenAI Codex, start with AGENTS.md. OpenAI’s Codex documentation says Codex reads AGENTS.md before doing work, and it supports layered guidance: global instructions in the Codex home directory, project instructions from the repository root, and nested instructions closer to the current directory. The docs also mention a default project instruction size limit of 32 KiB, a useful reminder that this file should stay concise.

A practical AGENTS.md should include:

# AGENTS.md

## Project overview
This is a SaaS app for ...

## Commands
- Install: pnpm install
- Dev: pnpm dev
- Test: pnpm test
- Lint: pnpm lint
- Typecheck: pnpm typecheck

## Directory map
- apps/web: frontend
- apps/api: backend
- packages/db: database schema and queries
- docs/architecture: deeper architecture notes

## Retrieval rules
- Start with docs/ai/index.md before broad file search.
- Search exact symbols with ripgrep before opening many files.
- Read only files related to the current task unless the task requires broader discovery.

## Safety
- Do not modify migrations or production config without explicit approval.

Then create a deeper index:

docs/ai/index.md
- Product overview: docs/product/overview.md
- Architecture: docs/architecture/system.md
- Database: docs/database/schema.md
- API routes: docs/api/routes.md
- Testing: docs/testing.md
- Deployment: docs/deployment.md

This gives Codex a map, not a mountain.
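A map is only useful if its entries still exist. The following is a small sketch of a check you could run in CI. It assumes the index uses lines shaped like "- Label: path.md" with paths relative to the repository root, as in the example above. The function name is hypothetical.

```python
import re
from pathlib import Path


def missing_index_targets(index_file: Path, repo_root: Path) -> list[str]:
    """Return the paths listed in the index that do not exist in the repo."""
    missing = []
    for line in index_file.read_text(encoding="utf-8").splitlines():
        # Matches list lines shaped like "- Architecture: docs/architecture/system.md".
        match = re.match(r"^\s*-\s+[^:]+:\s+(\S+\.md)\s*$", line)
        if match and not (repo_root / match.group(1)).is_file():
            missing.append(match.group(1))
    return missing


if __name__ == "__main__":
    root = Path(".")
    index = root / "docs" / "ai" / "index.md"
    if index.is_file():
        for target in missing_index_targets(index, root):
            print(f"index entry points to a missing file: {target}")
```

If your index uses relative links or Markdown link syntax instead, the pattern would need to change accordingly.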

Claude Code: use CLAUDE.md, rules, imports, and MCP

Claude Code’s docs explain that Claude remembers projects through CLAUDE.md files and auto memory. CLAUDE.md is for persistent instructions you write; auto memory is for notes Claude accumulates from corrections and preferences. The docs recommend adding information when Claude makes the same mistake twice, when a code review catches something it should have known, or when a teammate would need the same context to be productive.

Claude Code can also import additional files using @ references. That means your CLAUDE.md can stay short while pointing to deeper docs:

# CLAUDE.md

## Project instructions
Read this first: @docs/ai/index.md

## Required workflow
- Before editing, identify the smallest relevant file set.
- Prefer docs and symbol search before broad repository scans.
- Run the tests listed in @docs/testing.md for changed areas.

## Private local notes
Personal machine setup may live in CLAUDE.local.md and should not be committed.

Claude Code also supports MCP, which is important because the best project brain is often not a static file. MCP can connect the agent to issue trackers, databases, design tools, documentation systems, search tools, or a custom codebase retriever.

Codex + Claude Code together: make one source of truth

If you use both Codex and Claude Code, do not maintain two separate brains that drift apart. Use a shared source of truth and thin wrappers for each agent.

Recommended structure:

AGENTS.md
CLAUDE.md
docs/ai/index.md
docs/ai/project-overview.md
docs/ai/commands.md
docs/ai/architecture-map.md
docs/ai/testing.md
docs/ai/retrieval-rules.md
docs/ai/known-pitfalls.md

AGENTS.md should summarize and point Codex to docs/ai/index.md. CLAUDE.md should summarize and import the same docs where supported. The canonical knowledge lives in docs/ai/.

This prevents the common problem where Codex follows one workflow, Claude Code follows another workflow, and the developer has to clean up the confusion.
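One way to keep the wrappers honest is a small drift check. The sketch below assumes both wrapper files live at the repository root and that "pointing to the index" means the path docs/ai/index.md appears somewhere in the file. The 32 KiB budget mirrors the Codex default mentioned earlier. Applying the same budget to CLAUDE.md is my assumption, not a documented limit.

```python
from pathlib import Path

CANONICAL_INDEX = "docs/ai/index.md"
MAX_BYTES = 32 * 1024  # Codex default noted above; reused for CLAUDE.md as an assumption


def check_wrappers(repo_root: Path, wrappers: tuple[str, ...] = ("AGENTS.md", "CLAUDE.md")) -> list[str]:
    """Return a list of human-readable problems; an empty list means the wrappers look consistent."""
    problems = []
    for name in wrappers:
        path = repo_root / name
        if not path.is_file():
            problems.append(f"{name}: missing")
            continue
        data = path.read_bytes()
        if CANONICAL_INDEX not in data.decode("utf-8", errors="replace"):
            problems.append(f"{name}: does not reference {CANONICAL_INDEX}")
        if len(data) > MAX_BYTES:
            problems.append(f"{name}: {len(data)} bytes exceeds the {MAX_BYTES}-byte budget")
    return problems


if __name__ == "__main__":
    for problem in check_wrappers(Path(".")):
        print(problem)
```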

Ollama and local coding agents: use local RAG carefully

For local agents using Ollama, Continue-style setups, Open WebUI, Aider, or custom scripts, retrieval matters even more because local models often have smaller context windows and weaker long-context behavior.

Open WebUI’s RAG documentation specifically warns that Ollama may default to a 2048-token context length, which can severely limit RAG performance. Their docs recommend increasing context length for Ollama models when using web or document retrieval.
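If you call Ollama from your own script rather than through Open WebUI, the context length can be set per request through the num_ctx option of the REST API. Here is a minimal sketch using only the Python standard library. The model name, the 8192 value, and the helper names are placeholders I chose for illustration. Pick values that fit your model and hardware.

```python
import json
import urllib.request


def build_generate_request(
    prompt: str,
    model: str = "llama3.1",  # placeholder model name
    num_ctx: int = 8192,  # illustrative context length, not a recommendation
    host: str = "http://localhost:11434",
) -> urllib.request.Request:
    """Build a POST request for Ollama's /api/generate endpoint with an explicit context length."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "options": {"num_ctx": num_ctx},
    }
    return urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, **kwargs) -> str:
    # Requires a running Ollama server; returns the generated text.
    with urllib.request.urlopen(build_generate_request(prompt, **kwargs)) as response:
        return json.loads(response.read())["response"]
```

A larger context uses more memory, so the retrieval budget and num_ctx are worth tuning together.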

A good local setup looks like this:

Run a capable local model through Ollama.
Create a local document/code index using embeddings plus keyword search.
Chunk code by symbols, classes, functions, markdown sections, and config files rather than by fixed-size text windows alone.
Store metadata: file path, symbol name, package/module, last modified date, and doc type.
Retrieve top results with hybrid search: embeddings + BM25 or exact match.
Optionally rerank results before adding them to context.
Keep the final context budget strict: only include what the task needs.
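As an illustration of symbol-based chunking with metadata, here is a minimal sketch for Python source files using the standard ast module. It covers only top-level functions and classes. The Chunk fields and function name are my own choices, and other languages would need a different parser.

```python
import ast
from dataclasses import dataclass


@dataclass
class Chunk:
    path: str
    symbol: str
    kind: str
    start_line: int
    end_line: int
    text: str


def chunk_python_source(source: str, path: str) -> list[Chunk]:
    """Split a Python file into one chunk per top-level function or class."""
    lines = source.splitlines()
    chunks = []
    for node in ast.parse(source).body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            kind = "class" if isinstance(node, ast.ClassDef) else "function"
            # lineno and end_lineno are 1-based and inclusive.
            text = "\n".join(lines[node.lineno - 1 : node.end_lineno])
            chunks.append(Chunk(path, node.name, kind, node.lineno, node.end_lineno, text))
    return chunks


if __name__ == "__main__":
    sample = "def load_user(user_id):\n    return user_id\n\n\nclass UserRepo:\n    pass\n"
    for chunk in chunk_python_source(sample, "apps/api/users.py"):
        print(chunk.path, chunk.kind, chunk.symbol, chunk.start_line, chunk.end_line)
```

Each chunk carries its file path and symbol name, so the retriever can filter or boost by metadata instead of relying on text similarity alone.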

For codebases, retrieval should prioritize exact symbols, file paths, imports, package names, and error messages. Pure vector search can miss exact technical identifiers. Hybrid retrieval is usually better.
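One simple way to combine the two result lists is reciprocal rank fusion. None of the sources above prescribe this specific method. It is a common technique that I am using here as an assumption, and the constant k=60 is a conventional default rather than a tuned value. The file paths in the example are made up.

```python
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    """Merge several ranked lists of document IDs into one list, best first."""
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            # Items ranked high in any list gain more; items in several lists accumulate.
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Sort by score descending, then by ID so ties are deterministic.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


if __name__ == "__main__":
    vector_hits = ["docs/api/routes.md", "apps/api/users.py", "docs/testing.md"]
    keyword_hits = ["docs/testing.md", "docs/api/routes.md", "packages/db/queries.py"]
    for path in reciprocal_rank_fusion([vector_hits, keyword_hits]):
        print(path)
```

Results that both retrievers agree on rise to the top, while a hit found by only one retriever still survives further down the list.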

Aider and repo maps: another useful pattern

Aider’s repository map is a great example of not feeding the full repo to the model. Aider builds a concise map of important classes, functions, types, and call signatures across the repository. Its docs explain that the map helps the model understand how code relates across the codebase, and that for large repositories Aider selects the most relevant portions to fit the active token budget.

This is the right idea: give the agent a compressed map first, then let it request specific files when needed.

Any local AI agent: the minimum brain I recommend

If you are building or configuring any coding agent, start with this minimum setup:

docs/ai/index.md
AGENTS.md, CLAUDE.md, or equivalent agent instructions
scripts/ai/context-search.sh or an MCP search server
README.md kept human-friendly
docs/architecture/system.md
docs/testing.md
docs/deployment.md
docs/decisions/

Your docs/ai/index.md should be boring and direct:

# AI Project Index

## Start here
- Project overview: ./project-overview.md
- Current architecture: ../architecture/system.md
- Commands: ./commands.md
- Testing: ../testing.md

## Search strategy
1. Search docs first.
2. Search exact symbol names with ripgrep.
3. Open the smallest relevant files.
4. Broaden the search only if tests or references show more context is needed.

## Do not read everything
Avoid full-repo scans unless the task is explicitly architectural or migration-related.
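The first two steps of that search strategy can be sketched as a small helper. This is a pure-Python stand-in for ripgrep so the example stays self-contained. The function names, the docs/ location, and the list of source file extensions are assumptions. On a large repository you would want to skip directories such as node_modules and .git, or simply call ripgrep.

```python
import re
from pathlib import Path


def find_matches(repo_root: Path, subdir: str, pattern: str, suffixes: tuple[str, ...]) -> list[tuple[str, int, str]]:
    """Return (path, line number, line) for every matching line under repo_root/subdir."""
    start = repo_root / subdir
    if not start.is_dir():
        return []
    regex = re.compile(pattern)
    hits = []
    for path in sorted(start.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip binary or unreadable files
        for number, line in enumerate(text.splitlines(), start=1):
            if regex.search(line):
                hits.append((path.relative_to(repo_root).as_posix(), number, line.strip()))
    return hits


def search_context(repo_root: Path, symbol: str, max_hits: int = 20) -> dict[str, list[tuple[str, int, str]]]:
    # Step 1: search docs first.
    doc_hits = find_matches(repo_root, "docs", re.escape(symbol), (".md",))
    # Step 2: search the exact symbol as a whole word in source files.
    code_hits = find_matches(repo_root, ".", rf"\b{re.escape(symbol)}\b", (".py", ".ts", ".tsx", ".js"))
    return {"docs": doc_hits[:max_hits], "code": code_hits[:max_hits]}
```

Capping the number of hits keeps the result small enough to hand back to the agent without flooding its context.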

The agent workflow I want more teams to use

When assigning a task to an AI coding agent, tell it to follow this loop:

Classify the task. Is it a bug fix, feature, refactor, test, documentation, deployment, or research?
Load the small brain. Read the top-level instruction file and docs/ai/index.md.
Retrieve targeted context. Use docs, symbol search, repo map, or MCP retrieval.
State the file plan. Name the files it believes are relevant.
Edit minimally. Change only what the task requires.
Verify. Run the correct tests, lint, typecheck, or build.
Update the brain. If the agent learned a durable project rule, add it to the right doc.

The last step is important. Your project brain should improve over time. If the agent repeatedly makes the same mistake, do not keep correcting it in chat. Put the lesson into AGENTS.md, CLAUDE.md, or the relevant project doc.
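Updating the brain can be as small as appending one line to a pitfalls file. Below is a minimal sketch, assuming lessons are stored as "- ..." list entries in a file such as docs/ai/known-pitfalls.md. The helper name and the exact-match duplicate check are my own simplifications, and a human should still review what gets added.

```python
from pathlib import Path


def record_lesson(pitfalls_file: Path, lesson: str) -> bool:
    """Append a one-line lesson unless the same entry already exists. Returns True if added."""
    # Collapse whitespace so the lesson always fits on one list line.
    entry = "- " + " ".join(lesson.split())
    text = pitfalls_file.read_text(encoding="utf-8") if pitfalls_file.exists() else ""
    if entry in (line.strip() for line in text.splitlines()):
        return False
    pitfalls_file.parent.mkdir(parents=True, exist_ok=True)
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if text == "" or text.endswith("\n") else "\n"
    with pitfalls_file.open("a", encoding="utf-8") as handle:
        handle.write(prefix + entry + "\n")
    return True


if __name__ == "__main__":
    added = record_lesson(
        Path("docs/ai/known-pitfalls.md"),
        "Do not edit generated files; change the source and regenerate.",
    )
    print("added" if added else "already recorded")
```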

How to know your agent brain is working

You know your setup is improving when:

The agent stops reading the same setup files every session.
It can explain the project structure before editing.
It asks for or retrieves specific files instead of scanning randomly.
It runs the correct tests without being reminded.
It follows project conventions from the first attempt.
It cites docs or code paths when making architectural claims.
New teammates can use the same docs to become productive faster.

Common mistakes

Putting too much in AGENTS.md or CLAUDE.md. These should be maps and rules, not giant manuals.
Only using embeddings. Code retrieval needs exact search too.
Not maintaining docs. Stale instructions are worse than missing instructions.
Letting each agent have a separate truth. Codex, Claude Code, and local agents should point to the same canonical docs.
No verification loop. A knowledgeable agent still needs tests.
No boundaries. Tell agents what not to edit without approval.

Final recommendation

Do not try to make your AI agent smart by stuffing the entire project into context. Make it smart by giving it a project brain:

A small always-loaded instruction file.
A clear documentation index.
A retrieval layer for deep knowledge.
A repo map or symbol search for code understanding.
MCP tools for live external systems.
A habit of updating durable knowledge when the agent learns something important.

The best AI coding agents are not the ones with the biggest prompt. They are the ones with the best access to the right knowledge at the right time.