abelcastro.dev

The Problem With Splitting Human and Agent Docs

The short version. Sort docs by content type, not by audience. One README per meaningful module, read by both humans and agents. Automated checks go in tooling. Judgment calls go in a small conventions file. Contracts live in code. Per-task intent lives in the ticket. Agent-only rules exist but stay rare. Specs exist, but only for features that cross modules and stakeholders. That's the whole playbook. Most of it is standard advice; the wrinkle is how the agent-specific pieces fit in without taking over.

This is the follow-up to a previous post where I argued that spec-driven development is solving the wrong problem and that agent context should live where agents naturally look. The question here: given that an AI agent is now one of the readers, what should the documentation that does exist actually look like? The rest of this post is the answer, piece by piece, along with what goes wrong when you split by audience instead.

Sort by content type, not by audience

The common instinct is to split docs by audience. Architecture docs for humans, CLAUDE.md or AGENTS.md for agents, keep them in their lanes. I've tried this and it falls apart fast: the two docs drift, they contradict each other, and I end up maintaining two versions of the same content.

The better cut is by content type. The same system gets documented in multiple ways depending on what you're communicating. Imagine an orders module in some backend app:

The explanation of what the orders module does and why.
The constraint that only the auth module can verify tokens.
The CreateOrderDto that defines what a valid order looks like.

Three kinds of content, three different natural homes:

a README
a conventions file or a lint rule
the code itself

Once content types are sorted, the audience question mostly disappears. A module README that describes what the module does doesn't need a twin anywhere. Both readers want the same information.

Here's the split:

What each module does and why goes in a README, co-located with the code. The README also covers dependencies and what patterns to follow inside the module. I'll use "module context" as shorthand for this in the tree below. Both humans and agents read the same file.

Constraints split into two kinds. The ones a linter or type checker can enforce belong in tooling: ESLint rules, ruff configs, strict TypeScript, pre-commit hooks, CI checks. These run automatically and fail the build, which is stronger than any prose could be. The ones that can't be automated, usually judgment calls and architectural patterns, go in a short conventions file. An example of each: ESLint can enforce "no any," but it can't enforce "business logic lives in services, not in controllers." The first belongs in .eslintrc. The second belongs in docs/conventions.md.
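As a rough sketch of the automatable half, this is roughly what the rules section of a legacy-format ESLint config could look like. It is written as a typed TypeScript object so the shape is easy to read; in a real .eslintrc.js it would be plain JavaScript assigned to module.exports. It assumes the @typescript-eslint parser and plugin are installed, and the import path pattern and its message are hypothetical.

```typescript
// Sketch only: the shape of a legacy-format ESLint config, typed for readability.
type Severity = "off" | "warn" | "error";
type RuleEntry = Severity | [Severity, ...unknown[]];

interface EslintConfigSketch {
  parser: string;
  plugins: string[];
  rules: Record<string, RuleEntry>;
}

const eslintConfig: EslintConfigSketch = {
  // Assumes @typescript-eslint/parser and the matching plugin are installed.
  parser: "@typescript-eslint/parser",
  plugins: ["@typescript-eslint"],
  rules: {
    // Automatable: an explicit `any` fails the lint step.
    "@typescript-eslint/no-explicit-any": "error",
    // Automatable: block deep imports into another module's internals.
    // The glob and message are hypothetical, for illustration only.
    "no-restricted-imports": [
      "error",
      {
        patterns: [
          {
            group: ["**/payments/internal/*"],
            message: "Import from the payments module's public entry point instead.",
          },
        ],
      },
    ],
  },
};
```

"Business logic lives in services, not in controllers" has no equivalent entry here, which is exactly why it stays in docs/conventions.md.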

Navigation goes in a small index like CLAUDE.md. Its job is to point at the READMEs and the conventions doc, not to re-describe content that lives elsewhere. If the repo structure is clear enough, this file can be thin or skipped entirely.

Code contracts already live in the code. TypeScript interfaces, Django models, OpenAPI schemas, typed function signatures. These are machine-readable and accurate by construction. Writing a prose version of them in a README creates two sources of truth and one of them will drift.
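To make "the contract lives in the code" concrete, here is a minimal sketch of what a CreateOrderDto could look like. The field names are my own assumptions for illustration. A real NestJS app would more likely use validation decorators on a class; this version uses a plain interface and a hand-written guard so it stays self-contained.

```typescript
// Hypothetical order shape; the field names are illustrative assumptions.
interface OrderItem {
  productId: string;
  quantity: number;
}

interface CreateOrderDto {
  customerId: string;
  items: OrderItem[];
}

function isRecord(value: unknown): value is Record<string, unknown> {
  return typeof value === "object" && value !== null;
}

// An item is valid when it has a string id and a positive whole-number quantity.
function isOrderItem(value: unknown): value is OrderItem {
  if (!isRecord(value)) return false;
  const productId = value["productId"];
  const quantity = value["quantity"];
  return (
    typeof productId === "string" &&
    typeof quantity === "number" &&
    quantity > 0 &&
    Math.floor(quantity) === quantity
  );
}

// Runtime guard that mirrors the type, so the contract is checked rather than described.
function isCreateOrderDto(value: unknown): value is CreateOrderDto {
  if (!isRecord(value)) return false;
  const customerId = value["customerId"];
  const items = value["items"];
  return (
    typeof customerId === "string" &&
    Array.isArray(items) &&
    items.length > 0 &&
    items.every(isOrderItem)
  );
}
```

Anything a README could say about "what a valid order looks like" is already stated here, and this version can't silently go stale without a type error or a failing check.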

Per-task intent goes in the ticket, not in the repo. A GitHub issue or Jira ticket is where "what are we trying to do right now" lives. Copying it into a markdown file just adds something else to keep in sync.
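The whole split fits in a small lookup table. This is only a summary sketch: the ContentKind names are labels I made up for the categories above, and the homes are the ones already described.

```typescript
// Labels for the content categories described above (names are illustrative).
type ContentKind =
  | "module-context"
  | "automatable-constraint"
  | "judgment-constraint"
  | "navigation"
  | "code-contract"
  | "task-intent";

// Record<ContentKind, ...> forces every category to have exactly one home.
const homeFor: Record<ContentKind, string> = {
  "module-context": "README.md next to the code",
  "automatable-constraint": "tooling (linter, type checker, pre-commit, CI)",
  "judgment-constraint": "docs/conventions.md",
  "navigation": "CLAUDE.md",
  "code-contract": "the code itself",
  "task-intent": "the ticket",
};

// The three orders-module examples from earlier, sorted by content type.
const ordersExamples: Array<{ content: string; kind: ContentKind }> = [
  { content: "what the orders module does and why", kind: "module-context" },
  { content: "only the auth module can verify tokens", kind: "judgment-constraint" },
  { content: "CreateOrderDto", kind: "code-contract" },
];

const ordersHomes = ordersExamples.map((e) => homeFor[e.kind]);
```

Note that audience never appears as a key. That is the point of sorting this way.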

What this looks like in practice

A monorepo with a NestJS API and a Next.js web app, structured around these categories:

my-monorepo/
│
├── README.md                          # Navigation: what this is, how to run it, where to go next
├── CLAUDE.md                          # Navigation: index pointing at READMEs and conventions
│
├── .eslintrc.js                       # Tooling: automated checks (no-any, import rules)
├── .prettierrc                        # Tooling: formatting checks
├── .husky/                            # Tooling: pre-commit hooks
│   └── pre-commit
├── .github/
│   └── workflows/
│       └── ci.yml                     # Tooling: lint + type-check + test gates
│
├── .claude/                           # Agent rules (only when they earn their place)
│   └── rules/
│       ├── no-secrets-in-code.md      # Rule: scoped to the whole repo, high-stakes
│       └── auth-boundary.md           # Rule: enforces service boundary the linter can't catch
│
├── docs/
│   ├── architecture.md                # System overview: diagram, boundaries, how modules fit together
│   ├── conventions.md                 # Constraint (judgment-based): architectural patterns, testing philosophy
│   └── specs/                         # Exception: one spec per cross-cutting feature
│       └── place-an-order.md          # Example: feature touching UI, API, payments, inventory
│
├── apps/
│   │
│   ├── api/                           # NestJS app
│   │   ├── README.md                  # Module context: what this app does, how it's shaped
│   │   ├── ...
│   │   │
│   │   └── src/
│   │       ├── main.ts
│   │       │
│   │       ├── orders/
│   │       │   ├── README.md          # Module context: orders module purpose, deps, patterns
│   │       │   ├── orders.controller.ts
│   │       │   ├── orders.service.ts
│   │       │   ├── orders.service.spec.ts
│   │       │   └── dto/
│   │       │       ├── create-order.dto.ts     # Contract: input shape lives in the code
│   │       │       └── order-response.dto.ts   # Contract: output shape lives in the code
│   │       │
│   │       ├── payments/
│   │       │   ├── README.md          # Module context: payments module, Stripe integration notes
│   │       │   ├── payments.service.ts
│   │       │   └── ...
│   │       │
│   │       └── auth/
│   │           ├── README.md          # Module context: auth module, token flow, guard usage
│   │           └── ...
│   │
│   └── web/                           # Next.js app
│       ├── README.md                  # Module context: what this app does, routing notes
│       ├── ...
│       │
│       └── app/
│           ├── checkout/
│           │   ├── README.md          # Module context: checkout flow, states, key components
│           │   ├── page.tsx
│           │   └── ...
│           │
│           └── account/
│               ├── README.md          # Module context: account section purpose and structure
│               └── ...
│
└── packages/
    │
    ├── shared-types/
    │   ├── README.md                  # Module context: what types live here and why
    │   └── src/
    │       ├── order.ts               # Contract: shared type definitions
    │       └── ...
    │
    ├── ui/
    │   ├── README.md                  # Module context: component library purpose, patterns
    │   └── ...
    │
    └── eslint-config/
        ├── README.md                  # Module context: what this config enforces
        └── index.js                   # Tooling: shared lint rules for the monorepo

A few things worth pointing out. Not every folder gets a README, only meaningful modules and packages. CLAUDE.md is pure navigation; it doesn't re-describe the architecture. DTOs and shared types are contracts in code, not prose. .claude/rules/ has exactly two files, not a sprawl, and each spec in docs/specs/ is there because it can't be scoped to a module or ticket. None of that is accidental.

If a tool can enforce it, let it

The best constraint is one that fails the build. ESLint catches bad imports and, with the typescript-eslint plugin, explicit any. TypeScript's strict mode catches implicit any. ruff and Black handle formatting. pre-commit hooks catch common mistakes before they get committed. When tooling can enforce a constraint, writing it in prose is strictly worse: the feedback is slower, it relies on someone reading the doc, and it drifts away from the code over time.

This has a second benefit for working with agents. A constraint the build enforces applies every time, including when the agent writes code. A constraint that lives only in prose applies when the agent reads and remembers it, which is less reliable. If it matters, making it automated makes it real.

The conventions file ends up smaller than you'd think when tooling handles the automatable checks. What's left is the stuff that genuinely needs judgment: architectural patterns, testing philosophy, workflows that span multiple tools. Those still benefit from being written once and read by both humans and agents, but the list is short.

Why prose isn't the default answer

Even for content that stays in prose, the instinct to write more because "the agent needs full context" pushes in the wrong direction. Prose has real costs for agents that I didn't appreciate at first.

Token cost is real and compounds. A 500-line README burns tokens on every agent session that loads it. Across a team running many sessions a day, that adds up to a cost that shows up on the bill.
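A back-of-the-envelope sketch of how that compounds. Every number here is an illustrative assumption, not a measurement: the tokens-per-line figure depends on the tokenizer and the content, and the session counts are made up.

```typescript
// All inputs are illustrative assumptions, not measurements.
interface DocLoadEstimate {
  lines: number;          // length of the doc
  tokensPerLine: number;  // rough average; varies by tokenizer and content
  sessionsPerDay: number; // agent sessions that load the doc, team-wide
  workingDays: number;    // days in the period being estimated
}

// Total tokens spent just on loading the doc over the period.
function tokensLoaded(e: DocLoadEstimate): number {
  return e.lines * e.tokensPerLine * e.sessionsPerDay * e.workingDays;
}

const longReadme = tokensLoaded({
  lines: 500,
  tokensPerLine: 12,
  sessionsPerDay: 40,
  workingDays: 20,
}); // 4,800,000 tokens

const shortReadme = tokensLoaded({
  lines: 80,
  tokensPerLine: 12,
  sessionsPerDay: 40,
  workingDays: 20,
}); // 768,000 tokens
```

Under these assumed inputs the long README costs about six times as much to load, before the agent has done any work. Plug in your own numbers; the multiplication is the part that matters.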

More context isn't always better context. When a constraint is buried in narrative prose, the agent has more surface area to get distracted by. From what I've seen, shorter and more structured tends to produce more reliable behavior than thorough and narrative.

Over-specification invites over-production. If the README lists fifteen edge cases because I wanted to be thorough, the agent may write code for all fifteen even when the task only needs three. That's slop, caused by the doc, not the agent.

There's no feedback loop for overexplaining. A doc that's too vague shows up quickly: the agent produces wrong output, tests fail, or the reviewer pushes back. A doc that's too long has no equivalent signal. The agent still produces working output, just after chewing through more context than it needed. Nothing fails, so nothing tells you the doc got bloated. The feedback is asymmetric, and the natural drift is toward more.

The practical implication is that writing for both humans and agents doesn't mean writing more. It means writing clearly and keeping each doc as short as it can be while still being explicit.

When agent-only rules earn their place

Rules in .claude/rules/ are the one place where agent-only content is genuinely the right answer. But they're easy to overuse, and when they grow unchecked they create hidden behavior: the agent follows a directive from a rule file the human never opened, and when the output surprises you, the reason is scattered somewhere you don't think to look.

This reminds me of Django signals. A signal can fire from code you didn't write, triggered by an action you took somewhere else. Useful, but it surprises you when something goes wrong, because the behavior doesn't live where you're looking.

So rules earn their place when three things are true:

tooling can't enforce the same constraint
a README wouldn't reach it because the rule applies across modules
the constraint is specific and stable

If any of those fails, the content belongs in tooling, in a README, or in conventions.md instead. That's why the example tree has two rule files, not twenty.
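The three checks can be read as a small decision function. This is a sketch of my reading, not a tool: the type and function names are hypothetical, and pairing each failed check with a specific fallback home is my interpretation of the order in which the alternatives are listed.

```typescript
// Hypothetical description of a constraint someone wants to add as an agent rule.
interface CandidateRule {
  enforceableByTooling: boolean; // could a linter, type checker, or CI gate do this?
  crossesModules: boolean;       // does it apply beyond a single module?
  specificAndStable: boolean;    // is it concrete and unlikely to change soon?
}

type Home = "tooling" | "module README" | "docs/conventions.md" | ".claude/rules/";

// Checks run in the same order as the three conditions in the list above.
function homeForRule(c: CandidateRule): Home {
  if (c.enforceableByTooling) return "tooling";
  if (!c.crossesModules) return "module README";
  if (!c.specificAndStable) return "docs/conventions.md";
  return ".claude/rules/";
}

// The auth-boundary rule from the example tree: not lintable, cross-module, stable.
const authBoundary = homeForRule({
  enforceableByTooling: false,
  crossesModules: true,
  specificAndStable: true,
}); // ".claude/rules/"
```

Only one path through the function ends in .claude/rules/, which is a decent picture of why that folder stays small.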

When a spec earns its place

Specs in docs/specs/ are the home for features that don't fit the other homes. A module README is scoped to one module. A ticket is scoped to one unit of work. But some features don't sit inside a single module or a single ticket, and for those a spec is the right place.

The shape that usually needs one is a feature that crosses several modules and several stakeholders at the same time. "Place an order" is the example I keep coming back to. It touches the UI, the API, the payments module, inventory, and notifications. A Jira ticket can describe what the user wants at a high level, but it can't capture cross-module behavior, edge cases, and the alignment between product, design, and engineering that has to happen before anyone writes code. A single spec document gives everyone one place to converge.

The real work with docs/specs/ is figuring out the scope of each file. A spec earns its place when three things are true:

the feature spans enough modules that no single README can describe it
stakeholders need to align on behavior and edge cases before implementation
the behavior is stable enough to be worth writing down, not a quick experiment

If any of those fails, the content belongs in the ticket, in a module README, or in the code. Getting that scoping call right is what keeps the specs folder useful instead of turning it into a second documentation layer that drifts from the code.

Closing

If I am starting a repo, the playbook above is my default. If I am working in a repo where CLAUDE.md and the human docs have already drifted apart, the migration is less scary than it looks: pick one module, merge its agent-facing and human-facing docs into a single README, delete what the tooling already enforces, move cross-module constraints to docs/conventions.md, and see what's left. It's probably a lot less than I started with.

Write less, put each thing where it belongs, and most of the drift and duplication just goes away.

Bonus: ai-first docs skill

I've packaged this as a Claude skill if you want to try it on an existing repo: ai-first-docs.