AI loses context in large codebase
Appears when: once the project passes a few hundred lines, the AI starts forgetting earlier decisions, re-hallucinating types, and introducing duplicate utilities.
AI loses context — fix large-codebase memory loss in Cursor, Claude Code, and v0
Context loss is not a bug in Cursor, Claude Code, or v0. It is the inherent limit of token-window LLMs meeting a growing codebase. The fix is not a better prompt — it is a workflow built on external memory files, tight per-prompt scope, and session checkpoints.
Your AI loses context because the model’s attention window cannot hold your whole codebase. Three levers solve it: (a) external memory files (CLAUDE.md, .cursorrules, AGENTS.md) that the tool re-reads every session; (b) tight file-level scope on every prompt — never @codebase for large repos; (c) summarization checkpoints between sessions. If you are mid-session and losing context, stop; write a session summary; start a fresh chat with scoped files.
Quick fix for AI loses context
```markdown
# CLAUDE.md / AGENTS.md / .cursorrules — project memory

## Architecture
- Next.js 16 App Router (breaking changes from 14 — read node_modules/next/dist/docs)
- Supabase for auth + database (RLS enabled on every table)
- Stripe for payments (webhooks verify raw body; never .json() before verify)
- Vercel deploy (pooled DATABASE_URL on port 6543)

## Key types
- User lives in src/types/user.ts — do not re-declare
- ApiResponse<T> is the wrapper for every route handler
- All dates are ISO strings in the API, Date objects in the client

## Naming conventions
- Files: kebab-case.tsx
- React components: PascalCase
- Hooks: useXxx, prefixed, in src/hooks/
- Server actions: verbNoun, async, in src/actions/

## Do not modify
- src/lib/schema.ts — JSON-LD contract for SEO
- src/components/ProblemPage.tsx — shared problem-page shell
- prisma/migrations/* — migrations are immutable once applied

## Current focus
- Building /fix/* pages — each follows the ProblemPage pattern
- See src/app/fix/stripe-webhook-not-firing/page.tsx as the reference
```
Deeper fixes when the quick fix fails
01 · Per-tool solutions: Cursor
Cursor’s attention is precise when you scope it and smeared when you do not. Never use @codebase for a repository larger than a few thousand lines — it pulls too many files, the working memory overflows, and the answer drifts. Instead use @-references to name the exact files: @src/lib/auth.ts @src/hooks/use-user.ts. Five files is the practical ceiling for a single task.
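A scoped prompt following this rule might look like the sketch below. The file names and the task are hypothetical, chosen only to show the shape: explicit @-references, a single concern, and an instruction not to wander.

```markdown
@src/lib/auth.ts @src/hooks/use-user.ts @src/actions/sign-in.ts

Add a "remember me" option to sign-in. Persist the choice on the
session row. Touch only the three files referenced above — if the
change seems to require a fourth file, stop and tell me which one.
```

The last sentence matters: it converts scope creep from a silent failure into a visible question.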
Split the work correctly between Cursor’s two modes. Composer is for implementation — it edits files directly and keeps attention on the diff. Ask is for design — it reasons without writing, which is what you want when the task is still ambiguous. Mixing them (“design and implement this in Composer”) blows context on the design phase and leaves nothing for implementation.
Maintain a .cursorrules file at repo root. Architecture, key types, naming conventions, files that are load-bearing. Cursor injects it into every prompt; your chat transcript no longer has to carry the same context by hand.
02 · Per-tool solutions: Claude Code
Claude Code reads CLAUDE.md at repo root automatically, plus any CLAUDE.md inside submodules along the current path. That cascading pattern is the highest-leverage feature in the tool — use it. Feature-level memory files are cheaper than bloating the root file, and they scope architectural notes to the part of the tree that needs them.
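A sketch of the cascading layout, with hypothetical directory names:

```markdown
CLAUDE.md                # root: stack, naming conventions, do-not-modify list
src/
  auth/
    CLAUDE.md            # auth-only notes: session shape, RLS expectations
  payments/
    CLAUDE.md            # Stripe notes: raw-body webhooks, test-mode keys
```

Working inside src/payments/ loads the root file plus the payments file; the auth notes never enter the prompt, which is exactly the point.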
Use /compact at natural breakpoints. After finishing a feature, after a long debug session, after a cross-file refactor. /compact summarizes the transcript into a short synopsis and discards the rest, resetting effective working memory without losing the decisions. Without compacting, a long Claude Code session ends up in the attention-decay zone quickly.
Prefer small focused PRs over long sessions. The model that handles one feature cleanly is the same model that degrades on the tenth. If the work truly is large, scope it across sessions, not within one. Use --add-dir to let Claude Code reach files outside the current working directory without forcing the whole monorepo into context.
03 · Per-tool solutions: v0
v0 is a front-end-only tool. Context loss in v0 almost always means you have hit its project-size ceiling — it was built to generate single components and short flows, not to carry a multi-page app in working memory. When the chat starts contradicting earlier components, rewriting a design system per turn, or losing track of shared types, the signal is: time to eject.
The migration path is v0 → Next.js. Export the generated code, move it into a real repository, and handle the app-level concerns (routing, auth, database) in your IDE with Cursor or Claude Code. The v0 migrate-out guide walks through the full handoff — folder structure, component boundaries, and the Next.js App Router idioms v0 does not generate by default.
04 · Per-tool solutions: Lovable
Lovable retains chat context per project, which helps until it doesn’t. The failure mode is global prompts that touch the whole app (“refactor our auth”) — Lovable tries to hold the entire project in working memory and drifts. Scope every prompt to a single component with “Edit this component.” The model attends to the current file and ships smaller, safer edits.
Lovable also lets you edit the project description, which functions as a memory file. Make it rich. Stack, architecture, load-bearing components, naming conventions. Lovable re-reads the description when you start a new chat, which is your cheapest insurance against a fresh session losing track of design decisions.
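A project description doing memory-file duty might look like the sketch below. The stack, component names, and rules are illustrative placeholders, not Lovable's required format:

```markdown
Stack: React + Tailwind, Supabase auth, deployed from Lovable.
Load-bearing components: AppShell, PricingTable — edit only when named.
Conventions: PascalCase components, one component per prompt.
Do not: add a second state library; do not restyle the design tokens.
```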
05 · The memory-file pattern (universal)
The pattern is the same across every AI coding tool: maintain a plain-text file the AI re-reads every session. Different tools pick different filenames — CLAUDE.md, .cursorrules, AGENTS.md, README.md for tools that respect it — but the content is identical: architecture decisions, key types, files that are load-bearing, naming conventions, explicit do-not-modify lists.
Keep the file short. Under 200 lines is the right order of magnitude. If it grows past that, split into feature-level memory files (src/auth/CLAUDE.md, src/payments/CLAUDE.md) so the AI loads only the section it needs. A 2,000-line memory file bloats every prompt and defeats the point. A 150-line memory file is the floor the AI reasons from on every turn.
Treat the memory file like a test: it earns its place by preventing a specific class of regression. Every time you catch the AI re-hallucinating, update the file. Every time the AI proposes a pattern you do not want, add a do-not rule. The file evolves into the shortest possible briefing that keeps the AI on the path.
06 · Summarization checkpoints
When a session runs long, working memory degrades. The fix is a checkpoint: write a 200-word summary of decisions plus open issues, paste it into a fresh chat, and continue from there. Treat chats as disposable; state lives in files, not in transcripts.
A good checkpoint has three sections: (1) decisions — what you chose and why, in bullets; (2) in-progress — the file you were editing, the function, the failing test; (3) next step — one sentence describing the immediate next action. Paste this into a new chat along with the 3-5 files the task touches, and working memory resets with full attention on the current problem.
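A checkpoint following that three-section shape might read like this — the file names, test, and decisions below are placeholders for illustration:

```markdown
## Checkpoint

### Decisions
- Session reads moved into a server action; client never touches cookies.
- API keeps dates as ISO strings (matches the memory file).

### In progress
- src/actions/sign-in.ts — adding a rememberMe flag; the
  "persists session" test is still failing.

### Next step
- Thread rememberMe through the session insert, re-run the auth tests.
```

Paste it at the top of the new chat, @-reference the files it names, and continue.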
Claude Code’s /compact automates most of this inside one session. Cursor and Lovable do not have an equivalent yet; you write the checkpoint by hand. Either way, the discipline is the same: short chats, long files.
07 · When memory loss means you need human help
- The AI rewrites the same function every session, even though it is already well-implemented in the codebase
- Team members cannot onboard to the project because the AI contradicts the docs and nobody knows which is current
- New features take longer to ship than old features did — the AI velocity curve is going the wrong way
- Your CLAUDE.md is longer than your README.md — a signal the memory file is doing work that belongs in real documentation or real abstractions
- Every deployment surfaces a new regression in a file the AI “helpfully” cleaned up during an unrelated task
Why AI-built apps hit AI loses context
How LLM context windows actually work
A context window is the number of tokens the model accepts on a single forward pass — currently 1M tokens for GPT-4.1 and Gemini 2.5 Pro, 200K for Claude 4 Opus, and similar ranges across the frontier. Those numbers are the nominal ceiling. The effective working window — the portion the model reliably attends to — is smaller, usually 30K-60K tokens depending on the task.
The gap between nominal and effective comes from how transformer attention scales. Every token attends to every other token through learned weights; at long distances those weights shrink, and competing signals near the current turn drown out older context. Benchmarks like “needle in a haystack” show frontier models can retrieve a literal string from anywhere in a 1M window, but retrieval is not reasoning. Ask the model to integrate twelve facts scattered across the window and performance collapses well before you hit the token limit. That is why the model “forgets” code it was shown 50 messages ago even though the message is technically still in the prompt.
Symptoms you are hitting context limits
- The model re-asks questions you already answered (“what database are you using?”)
- It hallucinates type fields that were defined earlier in the session
- It forgets the file structure and proposes new folders that duplicate existing ones
- It introduces duplicate utilities — a second formatDate that lives next to your existing one
- It re-imports already-imported modules inside the same file
- It mixes patterns from unrelated files — wraps a server component in "use client", then adds a database call to it
- It contradicts decisions it made two turns ago without acknowledging the reversal
Any one of these in isolation is noise. Three in the same session means the session is cooked. Stop, summarize, restart.
“Chats are disposable. State lives in files. If your memory plan depends on scrolling up, you do not have a memory plan.”
AI loses context by AI builder
How often each AI builder ships this error and the pattern that produces it.
| Builder | Frequency | Pattern |
|---|---|---|
| Cursor | Every mid-sized project | @codebase pulls too much; no .cursorrules file; Composer and Ask used interchangeably |
| Claude Code | Long sessions | No CLAUDE.md at repo root; /compact never used; single session covers multiple features |
| v0 | Past the first flow | Tool is front-end-only — context loss signals the project has outgrown v0 and needs to migrate to Next.js |
| Lovable | Global prompts | Thin project description + whole-app refactors; no per-component scope |
| Bolt.new | Medium | StackBlitz preview hides architecture drift; no persistent memory file by default |
| Replit Agent | Medium | Session-scoped memory; cross-session decisions are not retained without manual notes |
Stop AI loses context recurring in AI-built apps
- Create a memory file (CLAUDE.md / .cursorrules / AGENTS.md) on day one of the project — not after context loss becomes a problem.
- Cap every AI task at five files and 2,000 lines of changed surface area. Larger jobs get decomposed, not batched.
- Use /compact in Claude Code at natural breakpoints — after a feature, after a debug session, after a refactor.
- Treat chats as disposable: when a session starts drifting, write a 200-word summary and start fresh.
- Review and update the memory file weekly. Every do-not rule earns its place by preventing a specific regression you already hit.
- Split memory files along feature boundaries once the root file exceeds 200 lines — cascading CLAUDE.md files beat one monolith.
Still stuck with AI loses context?
AI loses context questions
Why does Cursor forget code I showed it yesterday?
What's the difference between context window and working memory?
Does a CLAUDE.md really help? It feels like ceremony.
Should I start a new chat for every task?
How big can my project get before Cursor starts losing context?
Can I just use a bigger model to fix this?
Ship the fix. Keep the fix.
Emergency Triage restores service in 48 hours. Break the Fix Loop rebuilds CI so this error cannot ship again.
Hyder Shah leads Afterbuild Labs, shipping production rescues for apps built in Lovable, Bolt.new, Cursor, Replit, v0, and Base44. Read about our rescue methodology.