Reverse-Engineer a Repo into an Engineering Guide
Reusable prompt. Copy the block below into a model and point it at any GitHub repository.
How to use
- Clone or open the target repository so the model has filesystem access.
- Paste the prompt below into the model.
- Replace `<REPO_PATH>` with the absolute path to the repo (or leave it and supply via context).
- The model will produce two HTML files in `./.ai-docs/` in the current working directory.
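If you fill in the placeholder programmatically instead of by hand, the substitution step could look like the sketch below. The function name `fill_prompt` and the idea of holding the prompt in a string are assumptions made for illustration; only the `<REPO_PATH>` placeholder comes from the prompt itself.

```python
from pathlib import Path

# The placeholder text as it appears in the prompt.
PLACEHOLDER = "<REPO_PATH>"


def fill_prompt(template: str, repo_path: str) -> str:
    """Return the prompt with the placeholder replaced by an absolute path.

    `fill_prompt` is a hypothetical helper, not part of any tool.
    """
    if PLACEHOLDER not in template:
        raise ValueError(f"template does not contain {PLACEHOLDER}")
    # Resolve relative paths and "~" so the model always sees an absolute path.
    absolute = Path(repo_path).expanduser().resolve()
    return template.replace(PLACEHOLDER, str(absolute))
```

Resolving the path up front avoids ambiguity when the model's working directory differs from the shell where the prompt was prepared.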
Output contract
- `./.ai-docs/<repo-name>-code-analysis.html`: working document with audit trail.
- `./.ai-docs/<repo-name>-engineering-guide.html`: polished engineering guide.
- Both files are single, self-contained HTML documents with inline minimal light-theme CSS, no external assets, no JavaScript.
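The self-containment part of this contract can be spot-checked after a run. The following is a minimal sketch using only the Python standard library; the class and function names (`SelfContainedCheck`, `check_html`) and the wording of the problem messages are assumptions for illustration. It flags only the most obvious violations (scripts, linked stylesheets, images, a missing inline `<style>` block) and does not judge the theme or layout.

```python
from html.parser import HTMLParser


class SelfContainedCheck(HTMLParser):
    """Collect obvious violations of the single-file, no-JS contract."""

    def __init__(self) -> None:
        super().__init__()
        self.problems: list[str] = []
        self.has_inline_style = False

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "script":
            self.problems.append("script element found")
        elif tag == "style":
            self.has_inline_style = True
        elif tag == "link" and "stylesheet" in (attributes.get("rel") or "").lower():
            self.problems.append("external stylesheet found")
        elif tag == "img":
            self.problems.append("image found")


def check_html(text: str) -> list[str]:
    """Return a list of contract problems; an empty list means none were found."""
    checker = SelfContainedCheck()
    checker.feed(text)
    checker.close()
    problems = list(checker.problems)
    if not checker.has_inline_style:
        problems.append("no inline <style> block")
    return problems
```

To use it, read each generated file as text and pass the contents to `check_html`; any returned message is a reason to rerun or hand-fix the output.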
The prompt
You are a senior/staff software engineer performing a reverse-engineering pass on an existing
codebase. Your job is to recover the engineering understanding that would let a new engineer
confidently work in the system: how the code is organized, how the major flows work, what
patterns are used, where responsibilities live, and what risks or ambiguities remain.
You will produce TWO HTML artifacts, both written to `./.ai-docs/` in the current working
directory (create the directory if it does not exist):
1. `<repo-name>-code-analysis.html` — the working document / audit trail
2. `<repo-name>-engineering-guide.html` — the polished engineering guide
Repository to analyze: <REPO_PATH>
================================================================================
GROUND RULES
================================================================================
- Describe HOW THE CODE WORKS, not just what the product does.
- Use implementation evidence directly: framework names, library names, file paths,
function names, class names, types, architectural patterns, data structures, algorithms,
commands, configuration files, and tests are all allowed and encouraged when useful.
- Distinguish observed facts from inferred intent. If something is inferred, label it clearly.
- If docs and code disagree, code is the source of truth. Note the divergence in the working
document and explain the code behavior in the final guide.
- Do not invent architecture, flows, conventions, or guarantees. If evidence is thin, mark it
as an open question.
- Prefer concrete references over vague summaries. Cite the files, symbols, tests, or commands
that support each important claim.
- Focus on helping an engineer modify, debug, extend, or review the system safely.
- Do not include secrets, credentials, tokens, or sensitive local environment details.
================================================================================
PROCESS
================================================================================
PHASE 1 — Documentation pass
Read and extract stated intent, architecture, setup, and operating guidance from:
- README, README.* in root and subdirectories
- /docs, /doc, /documentation directories
- SPEC.md, DESIGN.md, ARCHITECTURE.md, ROADMAP.md, VISION.md, RFC*.md
- CHANGELOG.md, HISTORY.md
- CONTRIBUTING.md, AGENTS.md, CLAUDE.md, COPILOT*.md
- issue/PR templates
- Wiki content if available
- Top-level comments in entry-point files
Produce a draft understanding of the system. This is a hypothesis, not the answer.
PHASE 2 — Repository map
Identify the high-level structure of the repository:
- Applications, packages, services, libraries, tools, scripts, and generated artifacts
- Main entry points and runtime processes
- Build, test, lint, type-check, packaging, deployment, or release surfaces
- Configuration files that shape runtime or developer behavior
- Important directories that should not be manually edited
PHASE 3 — Architecture discovery
Determine the system architecture from code:
- Major subsystems and their responsibilities
- Dependency direction and ownership boundaries
- Composition roots and dependency injection patterns
- External interfaces: CLI commands, HTTP routes, IPC channels, exported APIs,
UI entry points, scheduled jobs, event subscribers, background workers
- Persistence, cache, queue, filesystem, network, and credential boundaries
- Security and trust boundaries
PHASE 4 — Core flow tracing
Trace the most important end-to-end flows through the code. For each flow, identify:
- How the flow is entered
- The primary modules/classes/functions involved
- The data or state passed through the flow
- Error handling and cancellation behavior
- Observable outputs or side effects
- Tests that verify the behavior, if present
Choose flows based on actual system importance, such as:
- Application startup
- Authentication/session lifecycle
- Primary user action or request path
- Background job or scheduler execution
- Data persistence/update path
- External integration call path
- UI-to-backend communication path
- Shutdown/cleanup path
PHASE 5 — Pattern and convention extraction
Identify recurring implementation patterns:
- Naming conventions
- File organization conventions
- Service/module/class structure
- Error handling patterns
- Logging/observability patterns
- Validation and parsing patterns
- Test style and test fixture patterns
- State management patterns
- API or event payload conventions
- How new functionality is typically added
PHASE 6 — Quality, safety, and operability review
Identify:
- Existing test coverage by area
- Critical paths with weak or missing tests
- Lint/type/build guarantees
- Known TODOs, FIXME comments, feature flags, deprecated paths, or migration seams
- Error-prone areas, implicit contracts, or surprising coupling
- Security-sensitive code paths
- Performance-sensitive or concurrency-sensitive code paths
PHASE 7 — Synthesis
Produce:
- A working analysis document with evidence, uncertainty, and audit trail
- A polished engineering guide suitable for onboarding or future maintenance
================================================================================
ARTIFACT 1 — WORKING DOCUMENT (`<repo-name>-code-analysis.html`)
================================================================================
Audience: the engineer doing the reverse-engineering. Honest, annotated, shows seams,
uncertainty, and supporting evidence.
Structure:
1. Overview — repo identity, primary purpose, main technologies, and inferred runtime model
2. Sources consulted — docs, config files, entry points, tests, and other evidence reviewed
3. Repository map — major directories and what they contain
4. Entry points — application, API, CLI, UI, job, event, or package entry points
5. Architecture notes — subsystems, boundaries, dependencies, and composition patterns
6. Core flows traced — step-by-step technical traces with evidence
7. Implementation patterns — recurring conventions and examples
8. Data and state — important types, schemas, persisted state, caches, queues, or files
9. Error handling and observability — how failures are surfaced, logged, retried, or ignored
10. Testing and verification — test layout, commands, coverage signals, and gaps
11. Security and trust boundaries — credentials, sandboxing, permissions, validation, unsafe edges
12. Risks / sharp edges — surprising coupling, fragile assumptions, concurrency concerns, migrations
13. Divergences from documentation — where docs and code disagree
14. Open questions / ambiguities — what a maintainer should clarify next
Each important claim should include evidence such as:
- File path and symbol name
- Test name
- Command name
- Configuration key
- Runtime entry point
- Observed dependency relationship
================================================================================
ARTIFACT 2 — ENGINEERING GUIDE (`<repo-name>-engineering-guide.html`)
================================================================================
Audience: an engineer joining the project or preparing to make changes. Clean,
authoritative, practical, and implementation-focused. It should read like a technical
onboarding guide, not an audit log.
Structure:
1. Overview — what the system is, what it does, and how it runs
2. Mental model — the simplest accurate model of the system's architecture
3. Repository layout — important directories and files, with responsibilities
4. Runtime architecture — processes, entry points, service boundaries, and dependency flow
5. Major subsystems — each subsystem's responsibility, key files/classes/functions, and collaborators
6. Core workflows — concise traces of the most important flows through the code
7. Data model and state — important types, schemas, persistence, caches, queues, and filesystem usage
8. External interfaces — APIs, IPC, CLI, events, jobs, integrations, or exported package surfaces
9. Cross-cutting patterns — error handling, validation, logging, observability, configuration, security
10. Testing strategy — how tests are organized, what commands to run, and what areas are covered
11. How to make common changes — practical guidance for adding a new feature, route, command,
UI surface, service, job, integration, or test, depending on what exists in this repo
12. Operational concerns — build, packaging, deployment, release, migrations, feature flags,
local development, or troubleshooting
13. Known risks and open questions — only items that matter for future engineering work
For each major subsystem include:
- Responsibility
- Key implementation files/symbols
- Public or external interface
- Important collaborators
- State owned or modified
- Tests or verification points
- Common extension pattern
For each core workflow include:
- Trigger
- Step-by-step path through code
- Important data passed along the way
- Side effects
- Failure behavior
- Tests or verification points
================================================================================
HTML / STYLE REQUIREMENTS (both files)
================================================================================
- Single self-contained HTML file, inline <style> block, no external CSS, no JS,
no images, no web fonts.
- Light theme. White or near-white background. Near-black body text.
- System sans-serif font stack only.
- Max content width ~860px, generous line-height (~1.6).
- Subtle heading hierarchy using size and weight.
- One restrained accent color for links and code references.
- No sidebars, navs, banners, or decoration. Minimal chrome.
- Use semantic HTML: <h1>–<h3>, <p>, <ul>, <ol>, <dl>, <table>, <code>,
<pre>, and <section> where appropriate.
- Code references should use <code>.
- Must render correctly when opened directly in a browser with no server.
================================================================================
DELIVERABLE
================================================================================
After writing both files, print to the console:
- The absolute paths of both files
- A one-paragraph summary of the system as you now understand it
- A count of:
  - major subsystems mapped
  - entry points identified
  - core flows traced
  - recurring implementation patterns identified
  - risks / sharp edges found
  - open questions remaining
Do not begin writing the final guide until you have read enough of the repository to make
confident technical claims.
Do not invent architecture, conventions, or flows.
If evidence is thin, mark it as an open question in the working document and either omit it
from the guide or clearly qualify it.
Notes
- The working document is the receipts; the engineering guide is the onboarding map. Keep them separate in tone.
- If you run this against multiple repos, the `.ai-docs/` directory accumulates a portable library of engineering guides over time.
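Once several guides have accumulated, a small index helps you find them. This sketch assumes the file naming from the output contract (`<repo-name>-engineering-guide.html`); the helper name `list_guides` is hypothetical.

```python
from pathlib import Path

# Suffix taken from the output contract's file naming.
GUIDE_SUFFIX = "-engineering-guide.html"


def list_guides(docs_dir: str = ".ai-docs") -> dict[str, Path]:
    """Map each repo name to its engineering guide found in docs_dir."""
    root = Path(docs_dir)
    guides: dict[str, Path] = {}
    for path in sorted(root.glob(f"*{GUIDE_SUFFIX}")):
        # Strip the suffix to recover the <repo-name> part of the file name.
        repo_name = path.name[: -len(GUIDE_SUFFIX)]
        guides[repo_name] = path
    return guides
```

Calling `list_guides()` from the directory where the prompt was run returns the guides in alphabetical order by repo name; the working documents are skipped because they use a different suffix.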