// guide · primer

Agentic coding, explained: from autocomplete to autonomous agents.

Published April 17, 2026 · 9 min read.

TL;DR

Agentic coding is software development where an AI system takes a task, decides on a sequence of actions, executes them against a real codebase, and iterates on feedback — with the developer supervising instead of typing. The four levels of autonomy (autocomplete → assisted edit → supervised agent → autonomous agent) are the mental model that survives every tool change. Well-tested codebases, well-specified tasks, and senior engineers benefit most; messy codebases, novel architecture decisions, and high-risk work still want close human review.

"Agentic coding" is the phrase the industry landed on for what's been happening since around 2024: AI tools that don't just complete lines of code, but plan, execute, and verify multi-step tasks on a real codebase. It's a clumsy term — "agentic" is a word almost nobody used in conversation three years ago — but it's the one we've got.

Here's what it actually means, how we got here, and the mental model that will save you the most arguments at work.

A definition that holds up

Agentic coding is software development in which an AI system takes a task, decides on a sequence of actions, executes them against a real codebase or environment, and iterates on feedback — with the developer supervising rather than typing.

The load-bearing words are sequence of actions, executes, and iterates. A chatbot that writes a snippet isn't agentic. A tool that writes a snippet, runs it, sees the error, fixes it, runs the tests, and opens a PR is agentic. The line isn't the model; it's the loop.

How we got here: a brief timeline

2021–2022: the autocomplete era

GitHub Copilot shipped. You typed, it guessed the next line. It was magical for the first month and then became invisible — which is the strongest thing you can say about a tool. The mental model was autocomplete, and autocomplete was about all that was on offer.

2023: chat in the editor

ChatGPT trained everyone to talk to LLMs. Cursor, Continue, and others started embedding a chat pane next to your code. You could say "refactor this function" and see a diff. Editing was still manual — you clicked "apply" — but the scope of a single AI action grew from a line to a function to a file.

2024: tool use and the first real agents

Two things clicked. First, the major model vendors shipped reliable tool use — the model could reliably produce structured function calls, and frameworks sprang up to feed the results back into the conversation. Second, context windows grew to the point where "read the whole repo" became a viable strategy.
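The mechanics of that feedback loop are simple to sketch. Below is a minimal, self-contained simulation — `fake_model` stands in for a real LLM API, and the tool registry is hypothetical; the point is the shape of the conversation, not any vendor's SDK:

```python
import json

# Hypothetical tool registry -- in a real agent these would hit a shell,
# a filesystem, or an API. Here they're stubs.
TOOLS = {
    "run_tests": lambda args: {"passed": False, "error": "assert add(2, 2) == 4"},
    "edit_file": lambda args: {"ok": True},
}

def fake_model(messages):
    """Stand-in for an LLM call. A real model would decide which tool to
    call from the conversation; this stub scripts two turns, then stops."""
    turn = sum(1 for m in messages if m["role"] == "tool")
    if turn == 0:
        return {"tool": "run_tests", "args": {}}
    if turn == 1:
        return {"tool": "edit_file", "args": {"path": "add.py"}}
    return {"tool": None, "content": "done"}

def agent_loop(task):
    messages = [{"role": "user", "content": task}]
    while True:
        reply = fake_model(messages)
        if reply["tool"] is None:                     # model is finished
            return reply["content"]
        result = TOOLS[reply["tool"]](reply["args"])  # execute the call
        # Feed the structured result back into the conversation -- this
        # feedback edge is what makes the system agentic.
        messages.append({"role": "tool", "content": json.dumps(result)})

print(agent_loop("fix the failing test"))  # prints "done"
```

Everything interesting lives in that `while True`: the model proposes an action, the harness executes it, and the result becomes context for the next decision.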

That's the year Claude Code, Cursor Agent, Devin, and a long tail of open-source agents arrived. For the first time you could hand off multi-step work and come back to see it done. A lot of the early demos over-promised and under-delivered, but the primitive — a model that plans, acts, and verifies — was genuinely new.

2025: MCP, protocols, standardization

Anthropic published MCP in late 2024 and it took over in 2025. For the first time, the tools agents could call were portable between clients. An agent that knew how to query your Postgres, commit to your git, and update your Linear ticket wasn't a bespoke integration — it was a stack of off-the-shelf MCP servers.
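Concretely, wiring a client to those servers became declarative config rather than custom integration code. A sketch in the `mcpServers` JSON shape several clients adopted — the server entries and package names here are illustrative, not copied from any tool's docs:

```json
{
  "mcpServers": {
    "postgres": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-postgres", "postgresql://localhost/mydb"]
    },
    "linear": {
      "command": "npx",
      "args": ["-y", "example-linear-mcp-server"]
    }
  }
}
```

Swapping clients means pointing a different frontend at the same stack of servers — that portability is what "protocol layer" buys you.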

The same year, Google shipped A2A for agent-to-agent communication, and OpenAI reluctantly began supporting MCP in their Agents SDK. The ecosystem stopped being proprietary fiefdoms and started looking like a protocol layer.

2026: agentic by default

Which brings us to now. Every major coding assistant has an "agent mode." The debate is no longer whether agents work — they clearly do, on well-scoped tasks — but where the failure modes are and how much autonomy to grant. Which is the useful question.

The four levels of autonomy

Here's the mental model that has held up best. Think of AI coding on a spectrum of who's driving:

Level 1 — Autocomplete

AI suggests, you accept or reject, one fragment at a time. Copilot circa 2022. You're fully in control; the AI is a very fast pair of hands.

Level 2 — Assisted edit

You select a region, describe the change, AI proposes a diff, you apply. Cmd-K in Cursor. Still human-driven; the AI is a collaborator on single-step tasks.

Level 3 — Supervised agent

You describe a multi-step task, the AI plans and executes, you review each tool call and diff before it proceeds. Cursor Agent with confirm-each-step on. Claude Code with permission prompts. This is where most production teams live in 2026.

Level 4 — Autonomous agent

You describe a task, the AI executes end-to-end — tests, commits, opens a PR — and you come back to review the result. Claude Code with permissions relaxed. Devin-style "just do the task." Useful, but only when the task is well-scoped, well-tested, and the blast radius is bounded.

Teams that get value from agentic coding know which level each task belongs at. Greenfield script? Level 4. Touching the auth system? Level 2 at most. The mistake is treating "agentic" as a binary.

What actually works in 2026

Well-specified tasks with clear success criteria

Agents thrive when "done" is obvious. A failing test to fix, a feature described in a ticket, a migration to write. Give them a target and they hit it.

Codebases with strong test suites

The most reliable agent behavior in the wild is "run the tests, iterate until green." Teams with 80%+ test coverage get dramatically more value from agents than teams without. This one fact has quietly shifted how a lot of companies think about testing ROI.
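That behavior is a short loop in code. Here's a minimal sketch — `propose_patch` and `apply_patch` stand in for the model call and the file write, and the pytest command is just one example of a suite runner, not a requirement:

```python
import subprocess

def pytest_runner():
    """One possible test runner: returns (passed, output) from a pytest
    suite. Swap in whatever command your project uses."""
    proc = subprocess.run(["pytest", "-q"], capture_output=True, text=True)
    return proc.returncode == 0, proc.stdout + proc.stderr

def fix_until_green(run_tests, propose_patch, apply_patch, max_iters=5):
    """The core behavior agents get right: run the tests, show the model
    the real failure output, apply its patch, repeat until green."""
    for _ in range(max_iters):
        passed, output = run_tests()
        if passed:
            return True
        apply_patch(propose_patch(output))  # model iterates on feedback
    return run_tests()[0]                   # final check after budget spent
```

Notice what the loop depends on: a test suite that actually fails when the code is wrong. Without that signal, the agent converges on nothing.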

Boring, repetitive work

Porting a config format. Adding logging to 40 endpoints. Upgrading a dependency across a monorepo. Tasks that are tedious but well-defined are where agents save the most real engineering hours.

Greenfield prototyping

Weekend projects, internal tools, proofs-of-concept. The "you're going to rewrite it anyway" category. Agentic tools shine here because the cost of a wrong call is low.

What still doesn't work

Novel architecture decisions

Agents are good at executing plans. They're not good at making the one-way-door architectural decisions that determine whether a system scales. Use them to prototype three options; don't let them pick which one ships.

Messy, under-documented codebases

Agents are only as good as the context they can load. If your codebase has tribal knowledge, implicit contracts, and no docs, agents will confidently do the wrong thing. The fix isn't a better agent — it's a CLAUDE.md and some READMEs.
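What goes in such a file is mundane but high-leverage: build commands, conventions, the implicit contracts nobody wrote down. An illustrative sketch — every path and rule below is a made-up example, not a template from any tool's docs:

```markdown
# CLAUDE.md

## Commands
- Build: `make build` · Test: `make test` (always run before committing)

## Conventions
- All DB access goes through `internal/store`; never import the driver elsewhere.
- Feature flags live in `config/flags.yaml`; new flags default to off.

## Gotchas
- `scripts/migrate.sh` is destructive on local databases — ask before running.
```

Ten minutes writing this down buys you agents that stop guessing at tribal knowledge.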

Work with high security or financial risk

Anything where a wrong call costs real money or real user trust wants human review on every step. Level 3, minimum.

How to think about your team

Three questions:

Where this is headed

Honest answer: fast, in ways that are hard to predict specifically. The bets worth making are slower-moving:

If you want to stay on top of where this all goes, that's what Agentic Dev exists for — a daily read-out on the tools, releases, and workflow shifts, filtered for signal. Subscribe below and we'll do the curation.

Further reading

Stay current on agentic coding.

Daily email. Tools, workflows, and the state of agent autonomy — curated.