Reversa Tutorial: Reverse-Engineer Legacy Code Into AI-Ready Specs
Editorial note: Star and commit counts in this article are snapshots as of May 1, 2026, and will move quickly given the project's age. All commands and capabilities are drawn from the public Reversa repository and documentation site; verify against the latest README before installing.
Spec-driven development had a breakout 2025. GitHub Spec Kit hit roughly 71,000 stars, per Martin Fowler's comparison of Kiro, spec-kit, and Tessl; AWS shipped Kiro as a full agentic IDE; and Andrej Karpathy's "vibe coding" coinage triggered a backlash that put structured specifications back at the center of AI-assisted engineering. Every one of those tools assumes the same thing: that you are starting from a spec.
That assumption breaks the moment you point an AI agent at a fifteen-year-old PHP monolith, or a Rails 4 app no one has touched in five years, or a .NET Framework codebase nobody wants to admit still runs payroll. There is no spec to drive from. There is only behavior buried in code that no single person on the team fully understands. Reversa, an MIT-licensed framework that just hit 351 GitHub stars, is built for exactly that situation.
TL;DR: Reversa (github.com/sandeco/reversa, 351 stars, MIT license) runs inside your existing AI coding agent and coordinates 14 specialist sub-agents through a five-phase pipeline (Reconnaissance, Excavation, Interpretation, Generation, Review) to turn an undocumented legacy codebase into traceable specs marked CONFIRMED 🟢, INFERRED 🟡, or GAP 🔴. Install with npx reversa install, activate with /reversa. Source: github.com/sandeco/reversa, May 2026.
What Is Reversa, and Why Does It Exist?
Reversa is an MIT-licensed specification reverse-engineering framework you install inside an existing legacy project. It coordinates 14 specialist sub-agents (Scout, Archaeologist, Detective, Architect, Writer, Reviewer, plus optional Visor, Data Master, Design System, and a handful of others) through a five-phase analysis pipeline. The output is a structured specification ready for any modern coding agent to act on: C4 architectural diagrams, ERDs for the data model, state machines for the major workflows, and API contracts for any exposed interfaces. The project landed on GitHub a few weeks before this writing and has 351 stars on the main repo as of May 1, 2026.
The reason it exists is that spec-driven development tools have a blind spot the size of the actual enterprise software estate. Pragmatic Coders' 2025 legacy code research found that 70% of Fortune 500 companies still run software more than two decades old, and that 60–80% of typical IT budgets go to maintaining it. Few of those codebases have a usable spec. Spec Kit, Kiro, BMAD, and Tessl all assume the spec exists or is being authored fresh. Reversa is the only tool I have seen that explicitly sets out to produce the spec from a codebase that was never specified in the first place.
The framing in the project's own documentation is direct: Reversa is "the bridge between the legacy system and AI agents." It does not generate human-readable documentation in the usual sense. It generates operational contracts: structured artifacts that an AI coding agent can ingest before being asked to safely change anything. That distinction is load-bearing, and it is the reason Reversa is more interesting than the dozens of "AI documentation tools" that have shipped this year. The economics are starting to back the approach. CIO Dive reported in 2025 that the average cost of a typical COBOL modernization project dropped from $9.1 million in 2024 to $7.2 million in 2025, a 21% cost reduction in a single year, driven primarily by AI tooling compressing the discovery and translation phases. Reversa is built for exactly that discovery phase.
How Does the Reversa Pipeline Work?
Reversa runs the analysis as a five-phase pipeline with named agents responsible for each stage. Reconnaissance sends the Scout to map the repo, identify entry points, build a directory model, and produce a personalized exploration plan for the rest of the run. Excavation dispatches the Archaeologist (history, commit patterns, deprecated paths) and the Detective (behavior, runtime traces, side effects) to dig into the code itself. Interpretation hands the dig results to the Architect, who builds the structural model: the C4 diagram, the ERD, the state machines. Generation hands that model to the Writer, who produces the actual spec documents. Review closes the loop with the Reviewer agent, which validates each claim and tags it with a confidence marker.
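The handoff order described above can be written down as a small data structure. This is a sketch of the article's description only: the `Phase` type, the `PIPELINE` name, and the `next_phase` helper are hypothetical and are not taken from Reversa's source.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Phase:
    name: str
    agents: Tuple[str, ...]
    produces: str


# Phase order and agent names follow the description above.
# The structure itself is illustrative, not Reversa's internal model.
PIPELINE = (
    Phase("Reconnaissance", ("Scout",), "repo map, entry points, exploration plan"),
    Phase("Excavation", ("Archaeologist", "Detective"), "history and behavior findings"),
    Phase("Interpretation", ("Architect",), "C4 diagram, ERD, state machines"),
    Phase("Generation", ("Writer",), "spec documents"),
    Phase("Review", ("Reviewer",), "confidence-marked claims"),
)


def next_phase(current: str) -> Optional[str]:
    """Return the phase that follows `current`, or None after Review."""
    names = [p.name for p in PIPELINE]
    i = names.index(current)
    return names[i + 1] if i + 1 < len(names) else None
```

Each phase consumes what the previous one produced, which is why the order is fixed and why a failed or interrupted phase blocks everything after it.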
The single most underrated feature is the confidence marker on every output. Each claim Reversa surfaces is tagged with one of three states. CONFIRMED 🟢 means the Reviewer can point at code that proves the claim: a function definition, a route handler, a database constraint. INFERRED 🟡 means the agents are confident but cannot cleanly verify, usually because the relevant behavior is split across files or buried in indirect calls. GAP 🔴 means the agents could not determine the answer at all and a human needs to look. Most LLM-generated documentation is written with the same confident voice regardless of whether the underlying claim was verified or hallucinated. Reversa refuses to do that, and that refusal is the entire reason its output is trustworthy enough to feed back into a coding agent.
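Read as a decision rule, the three markers reduce to two questions: is there code that proves the claim, and if not, are the agents still confident? The sketch below restates that rule. How the Reviewer actually decides is not documented here, so `classify` and its two parameters are assumptions made for illustration.

```python
from enum import Enum


class Confidence(Enum):
    CONFIRMED = "🟢"
    INFERRED = "🟡"
    GAP = "🔴"


def classify(has_code_evidence: bool, agents_confident: bool) -> Confidence:
    """Map the two questions from the text onto the three markers (hypothetical rule)."""
    if has_code_evidence:
        # A specific definition, route handler, or constraint proves the claim.
        return Confidence.CONFIRMED
    if agents_confident:
        # Plausible, but the behavior cannot be cleanly verified.
        return Confidence.INFERRED
    # Undetermined: a human needs to look.
    return Confidence.GAP
```

The point of the ordering is that evidence always wins: confidence without a code reference never earns a green marker.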
The Detective agent in particular tends to surface things you had forgotten the codebase did. Old feature flags that are still wired up. Auth fallbacks for browsers nobody uses anymore. Cron jobs that fire on dates that have already passed. None of those would show up in a fresh-spec workflow because they are not features anyone would choose to write today. But they are in the code, they will affect any modernization work, and Reversa is the first AI tool I have used that actually finds them.
How Do You Install and Run Reversa?
Installation is two commands. From the root of the legacy project you want to analyze, run npx reversa install to scaffold the agent definitions and state directory. Then, inside whichever AI coding agent you already use, activate it with the slash command /reversa. For agents that do not support slash commands, the bare token reversa works as a prompt prefix. The framework introduces itself, builds an exploration plan based on what the Scout finds in your repo, and asks for confirmation before kicking off the longer phases.
```shell
# From the root of your legacy codebase
npx reversa install

# Then, inside Claude Code, Cursor, Codex, Gemini, etc.
/reversa
```
Reversa creates two directories. .reversa/ holds state, including .reversa/state.json, which checkpoints progress between phases. If your agent context resets mid-run (a real risk on large codebases), Reversa picks up from the last checkpoint instead of restarting from Reconnaissance. _reversa_sdd/ holds the actual outputs: diagrams, spec documents, and confidence-marked artifacts. Both directories are namespaced explicitly, so they are easy to gitignore, easy to find, and impossible to confuse with project source.
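To make the checkpoint idea concrete, here is a minimal sketch of the general resume-from-last-phase pattern. The actual schema of .reversa/state.json is not described in this article, so the `{"completed": [...]}` shape and every function name below are assumptions; point it at a scratch file, not at Reversa's own state.

```python
import json
from pathlib import Path
from typing import List, Optional

PHASES = ["Reconnaissance", "Excavation", "Interpretation", "Generation", "Review"]


def load_completed(state_file: Path) -> List[str]:
    """Read completed phase names; a missing file means a fresh run."""
    if not state_file.exists():
        return []
    data = json.loads(state_file.read_text(encoding="utf-8"))
    return data.get("completed", [])


def mark_done(state_file: Path, phase: str) -> None:
    """Record a finished phase so a later session can skip it."""
    completed = load_completed(state_file)
    if phase not in completed:
        completed.append(phase)
    state_file.parent.mkdir(parents=True, exist_ok=True)
    state_file.write_text(json.dumps({"completed": completed}, indent=2), encoding="utf-8")


def resume_point(state_file: Path) -> Optional[str]:
    """Return the first phase not yet completed, or None when all are done."""
    completed = set(load_completed(state_file))
    for phase in PHASES:
        if phase not in completed:
            return phase
    return None
```

Because the state lives on disk rather than in the agent's context window, a context reset loses nothing that was already checkpointed.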
Prerequisites are minimal. You need Node.js 18 or newer for the install command, and you need an AI coding agent already configured. Reversa supports more than a dozen, including Claude Code, Cursor, Codex, and Gemini. It does not bring its own LLM, does not request API keys, and does not transmit anything to a Reversa-controlled service. The intelligence comes entirely from whichever agent you are already paying for. (For a different angle on Claude Code-native workflows, see our PPC audit automation tutorial.)
What Does the Output Actually Look Like?
After a full run, the _reversa_sdd/ directory contains a structured spec set. Expect a top-level architectural document built from the C4 model, ERDs for the data model, state machines for any non-trivial workflow the Detective surfaced, and per-component spec files for each major module. Inside each document, individual claims are tagged with the confidence marker and a reference back to the file and line range that supports them. A reviewer can click through from a CONFIRMED claim to the actual code that justifies it, which is what makes the output auditable rather than just confident.
A typical excerpt reads something like this. "Authentication uses bcrypt with a cost factor of 12 🟢 (src/auth/hash.js:14)." "Session timeout appears to be 30 minutes 🟡, observed in test fixtures and one middleware default, but no canonical config file sets it explicitly." "Rate-limiting strategy 🔴: endpoints under /api/v2/ appear to have inconsistent throttling. Manual review needed." That format is dense, reviewable, and structured for an agent to parse on the next pass. It is also a much more honest representation of what an LLM actually knows about a strange codebase than a polished prose document would be.
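Claims in that shape are easy to pull apart mechanically. The sketch below parses the example sentences above into a status and an optional file reference. It assumes the marker emoji appears inline and that references look like `(path:line)` or `(path:start-end)`; the real on-disk format may differ, and `parse_claim` is a hypothetical helper, not part of Reversa.

```python
import re
from typing import NamedTuple, Optional

MARKERS = {"🟢": "CONFIRMED", "🟡": "INFERRED", "🔴": "GAP"}

# Matches a reference such as "(src/auth/hash.js:14)" or "(src/auth/hash.js:14-20)".
REF = re.compile(r"\(([^()\s]+):(\d+)(?:-\d+)?\)")


class Claim(NamedTuple):
    status: str
    text: str
    path: Optional[str]
    line: Optional[int]


def parse_claim(raw: str) -> Optional[Claim]:
    """Extract the marker and optional file:line reference from one claim."""
    status = next((name for emoji, name in MARKERS.items() if emoji in raw), None)
    if status is None:
        return None  # not a confidence-marked claim
    ref = REF.search(raw)
    path, line = (ref.group(1), int(ref.group(2))) if ref else (None, None)
    return Claim(status, raw.strip(), path, line)
```

Run against the excerpt, the bcrypt claim yields a CONFIRMED status with `src/auth/hash.js` and line 14, while the rate-limiting claim yields GAP with no reference.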
The artifact set is designed to be fed back in. Once Reversa has produced the spec, you can hand it to Claude Code or Cursor with a prompt along the lines of "use the spec in _reversa_sdd/ as the source of truth for the next change." That is when the second half of the value lands. The agent doing modernization work is no longer guessing at the codebase. It is operating against a spec it can audit.
How Does Reversa Compare to Spec Kit, Kiro, and BMAD?
It does not compete with them. It complements them. Martin Fowler's comparison of the SDD landscape is the clearest framing on offer: Spec Kit is a CLI for slash-command scaffolding around an existing agent (and the most-starred of the bunch at roughly 71,000 GitHub stars); Kiro is a full agentic IDE built around a three-phase spec workflow; Tessl is the most radical of the three, treating spec as the primary source of truth with code as a derived artifact. All three assume the spec is being authored. Reversa is structurally different. It produces the spec from code that already exists.
The natural pipeline is Reversa first, Spec Kit or Kiro second. Reversa reads the legacy codebase and produces a confidence-marked spec. You review the GAP 🔴 markers and the worst INFERRED 🟡 ones with a human. You then feed the cleaned spec to whichever spec-first tool fits your team. The modernization work, which is the part where you are actually changing code, happens against an artifact generated by reading the existing system rather than by reading your hopes for it.
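The human review step amounts to building a queue: gaps first, then inferred claims, with confirmed claims left alone. A minimal sketch, assuming claims are available as `(status, text)` pairs; that representation and the `review_queue` name are illustrative choices rather than anything Reversa provides.

```python
from typing import List, Tuple

# Lower number means reviewed earlier. CONFIRMED claims are not queued.
PRIORITY = {"GAP": 0, "INFERRED": 1}


def review_queue(claims: List[Tuple[str, str]]) -> List[Tuple[str, str]]:
    """Return claims needing a human, gaps first, keeping input order within each group."""
    pending = [c for c in claims if c[0] in PRIORITY]
    # sorted() is stable, so claims with the same status keep their original order.
    return sorted(pending, key=lambda c: PRIORITY[c[0]])
```

Deciding which INFERRED claims are "worst" still takes judgment; this only guarantees that nothing unverified is skipped.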
When not to use Reversa: greenfield work where there is no legacy code to read, well-documented codebases where the spec already effectively exists, anything under roughly 2,000 lines that a senior engineer can hold in their head in an afternoon, and projects with no AI coding agent already in place (since Reversa relies on one). For everything else in the modernization category, it is currently the most interesting open-source tool in the space.
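The 2,000-line rule of thumb is quick to check before installing anything. The sketch below counts non-blank lines for a chosen set of file extensions and skips common dependency directories. The skip list, the function names, and the decision to ignore blank lines are all assumptions to adjust for your own stack.

```python
from pathlib import Path
from typing import Set

# Directories that usually hold dependencies or generated output, not project source.
SKIP_DIRS = {".git", "node_modules", "vendor", ".reversa", "_reversa_sdd"}


def count_lines(root: Path, extensions: Set[str]) -> int:
    """Count non-blank lines in files under root whose suffix is in extensions."""
    total = 0
    for path in root.rglob("*"):
        if not path.is_file() or path.suffix not in extensions:
            continue
        if SKIP_DIRS & set(path.relative_to(root).parts):
            continue
        with path.open(encoding="utf-8", errors="ignore") as fh:
            total += sum(1 for line in fh if line.strip())
    return total


def small_enough_to_read(root: Path, extensions: Set[str], threshold: int = 2000) -> bool:
    """Apply the article's rough 2,000-line threshold."""
    return count_lines(root, extensions) < threshold
```

For a PHP project, something like `small_enough_to_read(Path("."), {".php"})` gives a first answer; a result near the threshold is a prompt to think, not a verdict.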
Is It Safe to Point Reversa at a Real Legacy Codebase?
Yes, by design. Reversa agents are scoped to write only into .reversa/ and _reversa_sdd/. No source file in your project is modified, deleted, or overwritten. The framework itself does not request, store, or transmit API keys; the LLM intelligence comes entirely from the agent already configured in your environment. The maintainer still recommends working from a clean Git state as a belt-and-braces measure, which is good practice with any agentic tool but not strictly required by Reversa's architecture.
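Starting from a clean Git state also lets you verify the write scope yourself after a run. The sketch below parses `git status --porcelain` output and lists any changed path outside the two Reversa directories. It is a check you run on your own, not a Reversa feature; the function names are hypothetical, and the parser handles only the common short-format cases (it does not unescape quoted paths).

```python
import subprocess
from typing import List

ALLOWED_PREFIXES = (".reversa/", "_reversa_sdd/")


def git_status(repo: str = ".") -> str:
    """Return the short-format status of the working tree."""
    result = subprocess.run(
        ["git", "status", "--porcelain"],
        cwd=repo, capture_output=True, text=True, check=True,
    )
    return result.stdout


def changed_paths(porcelain: str) -> List[str]:
    """Parse porcelain v1 output into a list of changed paths."""
    paths = []
    for line in porcelain.splitlines():
        if len(line) < 4:
            continue
        path = line[3:]          # skip the two status letters and the separating space
        if " -> " in path:       # renames are reported as "old -> new"
            path = path.split(" -> ", 1)[1]
        paths.append(path.strip('"'))
    return paths


def unexpected_changes(porcelain: str) -> List[str]:
    """Return changed paths that fall outside the two Reversa directories."""
    return [p for p in changed_paths(porcelain) if not p.startswith(ALLOWED_PREFIXES)]
```

After a run, `unexpected_changes(git_status())` should return an empty list. Anything it does return deserves a look before you commit.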
Our reading: The realistic risk vector is not Reversa damaging your code. It is token spend. Multi-agent pipelines on a large codebase can run up a meaningful Claude or Codex bill. Start by pointing Reversa at one subdirectory, watch the Reconnaissance phase complete, get a feel for the cost, and only then commit to a full-repo run. The token economics will dominate the rollout decision more than anything else.
One operational note: if your codebase is not in Git, snapshot it first. Reversa will not corrupt it, but you want a recovery point before introducing any agent activity into a directory tree, on principle. Two minutes of git init && git add . && git commit is cheap insurance.
Should You Try Reversa?
Use Reversa if you have a codebase older than roughly three years that no single person on the team fully understands and you are about to ask an AI agent to change it. Skip it if your code is well-documented, under 2,000 lines, actively maintained by the people who originally wrote it, or in a language or framework where good static-analysis tooling already produces most of what Reversa would.
| Situation | Fit | Why |
|---|---|---|
| Inherited Rails 4 / Laravel 5 / .NET Framework codebase | Strong | Exactly the situation Reversa was built for |
| Modernization or platform-migration consulting | Strong | Confidence markers make the spec defensible to clients |
| Team handing off ownership of a service | Strong | Generates the artifact a new owner would otherwise have to author by hand |
| Greenfield project | Skip | Use Spec Kit or Kiro. There is no code to read yet. |
| Small, well-understood codebase (< 2k LOC) | Skip | A senior engineer can read it faster than the pipeline can analyze it |
| No AI coding agent in place yet | Skip for now | Reversa relies on an existing agent; set that up first |
The honest case against trying it right now is that Reversa is early. As of this writing the project has 351 stars and a small commit history. Breaking changes are likely. The agent-orchestration logic will get refined. The set of bundled specialist agents will probably grow. None of that is a reason to wait if you have a real legacy codebase you need to make sense of this quarter (early friction is the cost of being early), but it is worth knowing going in. (For a parallel example of a small, opinionated open-source tool that pays off this kind of early bet, see our piece on the text-to-cad open-source harness for Claude Code.)
Frequently Asked Questions
What is Reversa?
Reversa is an MIT-licensed specification reverse-engineering framework on GitHub (sandeco/reversa) that coordinates a team of specialist AI sub-agents through a five-phase pipeline to extract structured specs (C4 diagrams, ERDs, state machines, API contracts) from a legacy codebase. It runs inside an AI coding agent you already use, such as Claude Code, Cursor, or Codex, and had 351 stars as of May 1, 2026.
Does Reversa modify my code?
No. Reversa agents are scoped to write only into the .reversa/ state directory and the _reversa_sdd/ output directory. No source file in your project is modified, deleted, or overwritten. The maintainer still recommends working from a clean Git state as a precaution, which is sensible practice with any agentic tool.
Which AI agents does Reversa work with?
Reversa supports more than a dozen coding agents, including Claude Code, Cursor, Codex, and Gemini. It does not bring its own LLM. The intelligence comes from whichever agent is already configured in your environment, so there is no separate Reversa subscription and no API key handling. A run does consume tokens on the agent you already pay for, so expect the usage to show up on that bill rather than on a second one.
How is Reversa different from GitHub Spec Kit and Kiro?
Spec Kit (roughly 71,000 GitHub stars) and Kiro are spec-first tools: you author a specification and they help an agent generate code from it. Reversa is the inverse. It reads code that already exists and writes the specification back out. The two are complementary rather than competing. The natural workflow is Reversa first to extract the spec, then Spec Kit or Kiro to drive the modernization work against that spec.
The Bottom Line
Reversa is the tool that finally addresses the part of the spec-driven development story everyone else has been quietly avoiding. Spec Kit, Kiro, and the rest of that category implicitly assume the spec exists or is being written. Most enterprise software does not match that assumption. The codebases that need AI-assisted modernization the most are the ones that have no spec, no fully-knowledgeable owner, and no realistic path to either without a tool like this. Reversa leaves your source files untouched, is agent-agnostic and free, and its confidence markers make the output something a senior engineer can actually trust enough to act on. The cost of trying it is roughly an evening of token spend on one of your worst-documented internal services.
Pick the codebase you have been dreading. Run npx reversa install on a branch. Let Reconnaissance finish overnight. In the morning, see what the Detective found that you had forgotten was in the code. That single exercise will tell you more than any blog post (including this one) about whether this workflow fits how your team actually ships software. (For more in this open-source-agents lane, see our coverage of Agent Zero's autonomous agent framework.)