Matt Pocock's Dictionary of AI Coding: 70+ Terms Explained
Editorial note: Term counts and section structure in this article reflect the public dictionary-of-ai-coding repository as of May 2, 2026. The repo ships frequently. If a definition reads slightly differently when you check, the upstream README is the source of truth.
Most "AI coding glossary" posts try to define every term anyone has ever used. Two hundred entries, alphabetical, no narrative, no opinion on what matters. They are reference works, not curricula. Matt Pocock's dictionary-of-ai-coding is a different shape entirely, and that is the whole reason it is interesting.
Pocock organised 70+ terms into seven sections that run in a deliberate order: model, then sessions and context windows, then tools, then failure modes, then handoffs, then memory and steering, then patterns of work. Read straight through, the dictionary is closer to a syllabus than a reference. This post walks the seven sections, pulls out the terms that most change how you actually use Claude Code or Codex, and points to where each concept shows up in real workflows.
TL;DR: Matt Pocock's open-source dictionary defines 70+ AI coding terms across seven ordered sections. The dictionary is free at github.com/mattpocock/dictionary-of-ai-coding, written in plain English, and structured so reading start-to-finish gives you a working mental model of how a modern coding agent operates. The seven concepts that matter most: harness, context window, attention degradation, compaction, AGENTS.md, skill, and the spectrum from vibe coding through human-in-the-loop to AFK.
What Is the Dictionary of AI Coding, and Who Is Matt Pocock?
The dictionary is an open-source GitHub repository that defines more than 70 terms used in modern AI-assisted coding, written in plain English, organised into seven thematic sections. Pocock states the motivation directly: AI coding "can feel like it's just for experts," with unexplained jargon and mysterious failures, and the dictionary is the translation layer.
Matt Pocock is best known as a TypeScript educator (Total TypeScript) who pivoted hard into agentic engineering content over the past year. The dictionary shipped alongside his mattpocock/skills repo, a separate collection of agent skills that went through the developer Twitter cycle in early 2026. The two repos are designed to be read together: the dictionary teaches the vocabulary, the skills repo shows what disciplined daily work looks like once you have it.
The seven sections, in order:

1. The Model (14 terms): what the model actually is and how billing works
2. Sessions, Context Windows & Turns (8 terms): how state accumulates
3. Tools & Environment (9 terms): how the agent acts on the world
4. Failure Modes (9 terms): what goes wrong and why
5. Handoffs (7 terms): transferring work between sessions
6. Memory and Steering (5 terms): cross-session continuity
7. Patterns of Work (8 terms): review and oversight styles

The progression mirrors how a working developer actually meets these ideas: first you need to understand the engine, then the vehicle, then how to drive it without crashing.
Why Does This Vocabulary Matter Right Now?
Per the Pragmatic Engineer's February 2026 developer survey of roughly 900 working engineers, Claude Code now handles 44% of complex coding work (multi-file refactoring, architecture decisions, large-scale debugging) versus 19% for Codex/ChatGPT (The Pragmatic Engineer, 2026). The category leader's documentation, in turn, casually drops terms like harness, compaction, and AGENTS.md with little or no definition. The vocabulary gap is between most developers and the tool they are already using daily.
Two practical consequences flow from that gap. First, bug reports get muddled. "The agent went off the rails" is not a useful description; "attention degradation past about 75% context fill, after which the agent began ignoring the system prompt's coding conventions" is. Second, team conventions drift. Without shared words for handoff artifact, compaction, and clearing, every developer invents their own habit and writes their own undocumented rules.
For a parallel look at how teams are wiring agents into actual workflows, see our coverage of agent skills as a marketing operations layer — the same vocabulary applies on the non-engineering side of the org.
The Model: How Does It Actually Work?
Section one (14 terms) tries to cure one specific confusion: developers treating the model as if it remembers things. Pocock's framing is that the model itself is stateless. It takes context in, predicts one token, appends it, and runs again. The "memory," the personality, the project awareness — all of that is stitched on by the harness around the model, not by the parameters inside it.
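To make the statelessness concrete, here is a toy version of the loop in TypeScript. The `predictNextToken` stub is invented and stands in for a real inference call; the shape of the loop is the point: the model sees the whole context on every step and carries nothing between steps.

```typescript
// Toy sketch of autoregressive generation. `predictNextToken` is a stub
// standing in for a real inference API; the loop shape is what matters.
type Token = string;

function predictNextToken(context: Token[]): Token {
  // A real model runs a forward pass over the whole context here.
  return `tok${context.length}`; // placeholder output
}

function generate(prompt: Token[], maxNew: number): Token[] {
  const context = [...prompt]; // the ONLY state, and the caller owns it
  for (let i = 0; i < maxNew; i++) {
    const next = predictNextToken(context); // stateless: reads everything, retains nothing
    context.push(next);                     // "memory" is just appending
  }
  return context;
}
```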
Three terms in this section earn their keep on day one. Token, the atomic unit the model reads and writes (roughly word-sized but not exactly). Inference, the act of running the trained model, which is what every chat turn actually is and what every dollar on your bill pays for. Prefix cache, the provider-side trick where shared request prefixes get reused at a lower rate — the reason a long system prompt is cheaper than it looks and the reason restructuring messages can quietly double your cost.
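The prefix cache point is easiest to see as arithmetic. The rates in this sketch are invented round numbers, not any provider's actual pricing; the structure of the calculation is what matters.

```typescript
// Hypothetical pricing, per million tokens. Real rates vary by provider;
// the point is the ratio: cache-hit prefix tokens bill at a steep discount.
const INPUT_RATE = 3.0;   // $ per 1M uncached input tokens (made up)
const CACHED_RATE = 0.3;  // $ per 1M cache-hit input tokens (made up)

function turnCost(cachedPrefixTokens: number, freshTokens: number): number {
  return (cachedPrefixTokens * CACHED_RATE + freshTokens * INPUT_RATE) / 1_000_000;
}

// A 20k-token system prompt that hits the cache costs roughly what a
// 2k-token fresh prompt would. Restructure the messages so the prefix
// changes every turn, and the same tokens bill at the full rate.
console.log(turnCost(20_000, 500)); // prefix cached: ~$0.0075
console.log(turnCost(0, 20_500));   // prefix busted: ~$0.0615
```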
The most useful single sentence in the section is the definition of harness: everything around the model that turns it into something agentic — tools, system prompts, context management, recovery logic. The shorthand Pocock uses is Agent = Model + Harness. Once that frame is in place, a lot of confused conversations resolve themselves. Claude is a model. Claude Code is a harness on top of Claude. Cursor is a harness too, with different defaults. The model you rent and the harness you run are different products with different choices, and most of what you experience as "this tool's behavior" is the harness, not the model. (For an open-source example of what that means in practice, see our piece on a custom Claude Code harness for engineering work.)
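Here is Agent = Model + Harness as a sketch. Nothing below is Claude Code's actual implementation; the types and stubs are invented for illustration. What it shows is where everything agentic lives: the loop, the dispatch, and the appending are all harness.

```typescript
// A toy harness. The model emits either text or a structured tool call;
// the harness executes the tool, appends the result, and loops. Everything
// that feels "agentic" lives here, not inside the model.
type ModelOutput =
  | { kind: "text"; content: string }
  | { kind: "toolCall"; tool: string; args: Record<string, unknown> };

// Stubs standing in for a real inference API and a real tool registry.
function callModel(context: string[]): ModelOutput {
  return { kind: "text", content: `responding to ${context.length} messages` };
}
function runTool(tool: string, args: Record<string, unknown>): string {
  return `result of ${tool}(${JSON.stringify(args)})`;
}

function agentLoop(context: string[]): string {
  while (true) {
    const out = callModel(context);              // stateless model call
    if (out.kind === "text") return out.content; // final answer ends the loop
    const result = runTool(out.tool, out.args);  // the harness acts on the world
    context.push(`tool_result(${out.tool}): ${result}`); // the model sees only this text
  }
}
```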
Why Do Long Sessions Get Dumber?
Sections two and four (sessions, context windows, and failure modes) together explain the phenomenon every regular user has noticed: the agent is sharper at the start of a session than at the end. Pocock has names for both halves. Early in the session, the model sits in the smart zone: low context fill, attention well-distributed across what matters, instructions followed cleanly. As context accumulates, the model drifts into the dumb zone: more noise to ignore, finite attention budget spread thinner, signal on meaningful relationships fading.
The technical name is attention degradation. Each token in the context has a finite capacity to influence other tokens; as the session grows, that capacity gets diluted. The model does not announce the transition. You just notice that it has stopped reading AGENTS.md, started inventing function signatures, or quietly forgotten a constraint you set 40 messages ago.
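There is no API that reports attention degradation; the practical proxy is context fill. A rough monitor like the sketch below, with an invented 75% threshold and a crude four-characters-per-token estimate, is about as good as outside observation gets.

```typescript
// Rough context-fill monitor. Both numbers are heuristics: ~4 chars/token
// is a crude English-text estimate, and 75% is this article's invented
// threshold for "the dumb zone is approaching", not a published constant.
const CONTEXT_WINDOW = 200_000; // tokens; varies by model
const DUMB_ZONE_FILL = 0.75;

function estimateTokens(messages: string[]): number {
  return Math.ceil(messages.join("\n").length / 4);
}

function shouldHandOff(messages: string[]): boolean {
  return estimateTokens(messages) / CONTEXT_WINDOW > DUMB_ZONE_FILL;
}
// When this trips mid-ticket, the fix is usually to shrink the ticket,
// not to compact; see the handoff section below.
```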
From our own sessions: The signature pattern is that the agent stops grounding answers in files it has read. Mid-session, asked to extend a function, it will write code that ignores the existing import block and invent a helper that already exists three files away. It is not "lying"; the attention budget for the original file has shrunk below the threshold needed to keep the constraint live. Clearing the session and re-attaching the relevant files fixes it almost every time.
The dictionary's other gift here is the split between parametric knowledge (facts encoded in the model's weights at training time, frozen at the knowledge cutoff) and contextual knowledge (facts the model is reading in this session). When you debug an "it just made up that API," the question to ask is which kind of knowledge it was relying on, and whether you actually loaded the source of truth into context or only assumed the parametric memory would carry. Most hallucinations are not mysterious once you have those two terms.
How Does an Agent Actually Touch the World?
Section three (9 terms) covers tools and environment, and the central observation is that the agent has exactly one window onto reality: tool results. A tool call is structured text the model emits, naming a tool and its arguments. The harness executes it. The tool result comes back as text the model reads on the next turn. The model never sees your filesystem; it only sees what the harness tells it about the filesystem after a Read or Bash call returns.
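The round trip is easier to hold onto as data shapes. These types are illustrative, not any real harness's wire format; the detail worth noticing is the type of `output`.

```typescript
// Illustrative shapes, not a real harness's wire format. The model never
// receives a file handle or a live directory, only `output`: a text
// snapshot taken at the moment the tool ran.
interface ToolCall {
  tool: "Read" | "Bash";                     // the model names a tool...
  args: { path?: string; command?: string }; // ...and its arguments, as structured text
}

interface ToolResult {
  tool: string;
  output: string; // plain text: the model's entire window onto reality
}

// If the file changes after this Read returns, the model's picture of it
// is stale until another tool call refreshes the snapshot.
const call: ToolCall = { tool: "Read", args: { path: "src/index.ts" } };
const result: ToolResult = { tool: "Read", output: "export const x = 1;\n" };
```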
This sounds pedantic until you notice how much it explains. Why does the agent sometimes "forget" what is in a file? Because it never read the file in this session and the parametric memory is wrong. Why does asking it to "look around" produce confident nonsense? Because looking around in plain English does nothing; only a tool call does. Why does Claude Code feel different from a chat window with the same model? Because the harness exposes a different toolset and a different permission model.
The other five terms in the section are practical sandbox knobs: permission request (the user-approval pause before an unapproved tool runs), permission mode (which tool calls trigger approval versus run automatically), agent mode (preset bundles of permissions and behavior), sandbox (the isolated environment that contains the blast radius of agent actions), and environment (the world outside the harness, usually a filesystem). Together they define how much trust you have extended to the agent and how much damage it can do without checking with you. (For an extreme worked example, see our review of an open-source framework where the OS itself is the sandbox.)
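A permission mode reduces to a gate inside the harness loop. The mode names below are invented; real harnesses each define their own vocabulary and defaults, but the mechanism is the same everywhere: a gate between "model emitted a tool call" and "harness executes it".

```typescript
// Invented mode names; real harnesses define their own. The gate sits
// between the model emitting a tool call and the harness running it.
type PermissionMode = "ask-always" | "ask-for-writes" | "auto";

function needsApproval(mode: PermissionMode, tool: string): boolean {
  switch (mode) {
    case "ask-always":     return true;            // every call pauses for the user
    case "ask-for-writes": return tool !== "Read"; // reads run free, mutations pause
    case "auto":           return false;           // full trust: the sandbox is the only limit
  }
}
```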
When Should You Hand Off, Compact, or Just Clear?
Section five gives the cleanest practical decision in the whole dictionary. As context fills and the dumb zone approaches, you have three moves. Clearing ends the session and starts fresh with empty context — cheap, total, no continuity. Compaction is an in-memory handoff: the agent summarises the session so far and uses that summary to seed a new one — preserves momentum, loses fidelity. Handoff is the explicit version, where you write a handoff artifact (a document in the environment) that the next session reads at startup.
Pocock's quietly useful term here is the distinction between a ticket (single-session work scope) and a spec (multi-session work made of tickets). A ticket should never need a handoff. If you are reaching for compaction inside a ticket, the real problem is that the ticket is too large; carve it down. Specs are different. They are designed to span sessions, and the handoff artifact — usually a checked-in markdown file describing state, decisions, and next steps — is the connective tissue.
| Move | When to use it | What it costs |
|---|---|---|
| Clear | Task is finished, or the session has gone off the rails | Total context loss; cheap and clean |
| Compact (autocompact) | Mid-task, hitting context limits, momentum matters more than fidelity | Summary loses detail; the next session inherits a model's interpretation, not the raw record |
| Handoff (artifact) | Multi-session spec; the work will outlive any one session | Up-front authoring effort; durable and reviewable |
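A handoff artifact does not need to be elaborate. A shape like the following (entirely illustrative, checked into the repo where the next session will find it) covers the three things the next session cannot reconstruct on its own: state, decisions, and next steps.

```markdown
<!-- HANDOFF.md: an illustrative shape, not a prescribed format -->
# Handoff: payments-retry spec, session 3 of N

## State
- Retry queue implemented and unit-tested (src/payments/retry.ts)
- Migration written but NOT yet run against staging

## Decisions
- Exponential backoff capped at 5 attempts (agreed in session 2; do not revisit)

## Next steps
1. Run the staging migration and verify idempotency
2. Wire the dead-letter handler; see the stub in src/payments/dlq.ts
```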
Skills, AGENTS.md, and Subagents: The Memory Layer
Section six (5 terms) covers the cross-session continuity layer, and it is probably where the dictionary teaches a habit most readers do not yet have. A memory system persists information to the environment (files, a database) and reloads it into fresh sessions. AGENTS.md is the canonical example: a project brief checked into the repo, loaded into context at session start, that tells the agent what the project is, what conventions to follow, and where things live.
The discipline term is progressive disclosure. Do not load every reference document into context up front. Load only what is needed, with clear pointers to the rest, so the agent can pull the next thing on demand. That is exactly what an AGENTS.md plus a skills/ directory gives you: a small front door with named branches off it. Pocock's companion mattpocock/skills repo is a living example of skill bundling at scale.
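In an AGENTS.md, progressive disclosure looks something like this sketch (ours, not Pocock's format): a few always-loaded lines of orientation, then named pointers the agent follows only when the task calls for them.

```markdown
<!-- AGENTS.md: an illustrative progressive-disclosure front door -->
# Project brief
TypeScript monorepo for the billing service. pnpm workspaces, vitest, strict mode.

## Conventions (always applies)
- No default exports; errors are returned, not thrown, in src/core

## Load on demand (do not read these up front)
- Database schema and migration rules: see skills/db-migrations/SKILL.md
- Release process: see skills/release/SKILL.md
```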
The last term in the section is subagent, an agent spawned by another agent's tool call, running in its own session with its own context. Subagents are how you keep the parent's smart zone clean during a long task: instead of polluting the orchestrator's context with a 200-line database query result, the orchestrator dispatches a subagent that reads, reasons, and reports a five-line summary. Combined with progressive disclosure, this is the practical mechanism for staying in the smart zone for hours instead of minutes. (For a deeper example of where the memory question gets hard, see our piece on a self-hosted agent memory layer.)
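As a sketch, subagent dispatch is just the agent loop called again with a fresh context. The function names are invented; the point is what comes back to the parent: a short summary, not the raw 200 lines.

```typescript
// Conceptual only; names are invented. A subagent is the same agent loop
// run with a fresh, isolated context (the toy `agentLoop` from the
// harness sketch above slots in here).
function runSubagent(task: string): string {
  const freshContext = [`You are a subagent. Task: ${task}. Reply with a short summary.`];
  return agentLoop(freshContext); // own session, own context, own smart zone
}

function orchestrate(parentContext: string[]): void {
  // Instead of pasting a 200-line query result into its own context,
  // the orchestrator pays five lines for the conclusion.
  const summary = runSubagent("Read the slow-query log and name the top offender");
  parentContext.push(`subagent summary: ${summary}`);
}
```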
Vibe Coding, Human-in-the-Loop, AFK: Picking the Right Pattern
Section seven (8 terms) is where the dictionary stops being neutral and starts implying a worldview. Pocock lists six review styles in a single section and the implicit message is that they are not interchangeable. Human-in-the-loop is real-time pairing during a session. AFK is unattended execution, often parallel sessions while the user is doing something else. Automated check is deterministic verification — tests, type checks, lints — that the agent can self-correct against. Automated review is one agent reviewing another agent's work. Human review is the user reading the actual diff. Vibe coding is accepting the agent's code without human review and only checking that the behavior works.
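Of those six, the automated check is the one that naturally turns into a loop. Here is a sketch with invented helper names: run the deterministic checks, feed failures back as text, stop at green or at a budget.

```typescript
// Invented helpers, real pattern: the agent self-corrects against
// deterministic verification (tests, type checks, lints) until the
// checks pass or the attempt budget runs out.
interface CheckResult { passed: boolean; failures: string; }

function runChecks(): CheckResult {
  // Stand-in for running `tsc --noEmit`, the linter, and the test suite.
  return { passed: true, failures: "" };
}

function fixLoop(context: string[], maxAttempts = 3): boolean {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const result = runChecks();
    if (result.passed) return true;                     // deterministic green: done
    context.push(`checks failed:\n${result.failures}`); // feed failures back as text
    agentLoop(context); // the agent (toy loop from earlier) attempts a fix
  }
  return false; // out of budget: escalate to human review instead of spinning
}
```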
The term vibe coding was coined by Andrej Karpathy in February 2025 (Wikipedia, 2026), and it has since become the lazy default that everyone who knows better is trying to get past. Pocock's framing helps: vibe coding is one valid pattern among several, appropriate when the cost of being wrong is low and the cost of careful review is high (a throwaway script, a one-off prototype). It is the wrong default for code that other humans will run, maintain, or trust. The right default is some mix of automated checks, automated review, and selective human review — and the mix should be chosen on purpose, not drifted into.
Our reading: The single most under-discussed term in the section is grilling — Socratic-style agent interviewing to develop a design concept before any planning starts. Most quality problems in agent output trace back to skipping this step. The agent built the wrong thing well, because nobody pinned down what "the thing" actually was.
That mental shift — from "what should the agent build" to "what is the design concept we are both committing to before any code is written" — is the line between vibe coding and what Karpathy himself later called agentic engineering. Pocock's dictionary gives you the words to make the line concrete on a real team.
How Should You Actually Use the Dictionary?
The dictionary is a 30-minute end-to-end read, then a lookup forever. The mistake is treating it like a textbook to memorise. Read it once in order, in a single sitting, so the seven-section progression lands as a mental model. Bookmark the repo. Drop the link into your team's AGENTS.md as a recommended onboarding read. When a teammate uses a term you do not have in your head — compaction, attention degradation, autocompact, grilling — check the entry, not a Stack Overflow answer.
One stronger move, if you run a team: spend an hour aligning on the vocabulary. Pick the eight or ten terms your team uses most (probably harness, context window, tool call, handoff, compaction, AGENTS.md, skill, and the review patterns) and agree to use them consistently in standups, PRs, and incident reviews. Vocabulary discipline is the cheapest team-quality upgrade in 2026, and it has compounding returns; every later conversation is shorter when both parties already share the words. (For a worked example of this on the tooling side, see our review of Claude Code drop-in alternatives — the differences are easier to describe once you have the harness vocabulary.)
Frequently Asked Questions
What is the dictionary of AI coding?
It is a free, open-source glossary by Matt Pocock that defines 70+ AI coding terms across seven sections: model, sessions and context windows, tools and environment, failure modes, handoffs, memory and steering, and patterns of work. The project lives at github.com/mattpocock/dictionary-of-ai-coding and is written in plain English on purpose.
What is a "harness" in AI coding?
The harness is everything around the model that turns it into a working agent: the agent loop, tool dispatch, the permission system, context management, the system prompt, and recovery logic. Pocock's shorthand is Agent = Model + Harness. The model is the brain you rent; the harness is the body and workplace built around it.
What does "vibe coding" actually mean?
Vibe coding is a working pattern where the user accepts the agent's code without human review and only checks that the behavior works. The diff is treated as opaque. The term was coined by Andrej Karpathy in February 2025, and Pocock's dictionary lists it in the Patterns of Work section alongside human-in-the-loop, AFK, and the automated and human review patterns.
What is compaction and when should I use it?
Compaction is an in-memory handoff. The agent summarises the current session and uses that summary to seed a fresh one, so context capacity resets but task momentum survives. Use it when you are approaching context limits but the work is unfinished. Clear instead if the task is genuinely done.
Is the dictionary the same thing as "agentic engineering"?
Closely related but not the same. Agentic engineering is the discipline of working with coding agents under human oversight, with verification and review baked in. Pocock's dictionary is the vocabulary you need to do that work without faking it. One is the practice; the other is the language the practice runs on.
The Bottom Line
Most AI glossaries define everything and teach nothing. Pocock's dictionary defines fewer things on purpose, in an order chosen on purpose, and the result is a 30-minute read that quietly upgrades how you think about every coding agent you use after it. The terms that change the most behavior are not the obscure ones; they are the basics most developers fake their way around — harness, tool result, attention degradation, compaction, AGENTS.md, grilling.
Read it once, end to end. Drop the link in your team's project brief. Argue about which terms your team will use consistently and write them down. The next time someone says "the agent went off the rails" you will have a real conversation about which rail and at what context fill, and that conversation is most of agentic engineering in 2026.