OpenAI just released AGENTS.md, a ‘README for agents.’ The timing is critical. Developer infrastructure is becoming agent-majority. Fly.io already has more AI agents than human users. I predict agents will outnumber human developers 10-to-1 by the end of 2025, as each developer spawns multiple assistants.
Yet these agents are drowning. They burn tokens and make basic mistakes because they cannot understand our codebases. AGENTS.md is OpenAI’s proposed solution. But is it the answer, or does it merely highlight how primitive our agent-codebase communication truly is?
The Crisis in Every Codebase
AI agents struggle with simple tasks. They can’t find the test command, run the right database migration, or discover critical context buried in documentation. Developers spend more time correcting agents than the agents save them.
I’ve seen an agent read 50,000 lines of code to find a single configuration value any team member could provide instantly. I’ve watched another modify a deprecated module because nothing marked it as off-limits. And I’ve seen a third run `npm test` when the project used pnpm and a custom test script.
We are applying our most advanced AI to problems an intern would solve on day one. Our codebases were never designed to communicate with non-human intelligence.
Enter AGENTS.md: A Temporary Fix
OpenAI’s solution is a markdown file, AGENTS.md, in the repository root. This is their minimal example:
```markdown
# Sample AGENTS.md file

## Dev environment tips
- Use `pnpm dlx turbo run where <project_name>` to jump to a package
- Run `pnpm install --filter <project_name>` to add the package
- Use `pnpm create vite@latest <project_name>` for new React packages

## Testing instructions
- Find the CI plan in .github/workflows
- Run `pnpm turbo run test --filter <project_name>` for tests
- Fix any test or type errors until the suite is green
```
The format is clean, readable, and underwhelming. While standardization is vital, AGENTS.md is a band-aid on a broken leg. It addresses the symptom—agents not knowing commands—but ignores the disease: tools that cannot speak to AI.
What AGENTS.md Gets Right
The standard succeeds on three points:
- A Predictable Location: A single file, `/AGENTS.md`, ends the frustrating search through READMEs, wikis, and contribution guides.
- Agent-Specific Context: It correctly separates human documentation from the explicit, complete, and parseable instructions that agents need.
- Simplicity: The format is just Markdown, which both humans and machines can read without complex parsers.
The Deeper Problems It Ignores
AGENTS.md falls short because it overlooks the fundamental needs of an agentic workflow.
1. It Lacks Token Awareness
The standard tells an agent what to run but not the token cost. An agent cannot know whether `pnpm turbo run test` will generate 100 tokens or 100,000. This is the exact token blindness that `ctx` solves.
A token-aware AGENTS.md would look like this:
```markdown
## Testing instructions
- Run tests: `pnpm test` (~500 tokens typical output)
- Full test suite: `pnpm test:all` (~5,000 tokens, includes integration)
- Quick smoke test: `pnpm test:smoke` (~50 tokens)
```
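Those per-command estimates have to come from somewhere. One way to produce them is to run each command once and measure its output; the sketch below uses the rough ~4-characters-per-token heuristic (an assumption, not an exact tokenizer) and a hypothetical `measure` helper:

```python
import subprocess

def estimate_tokens(text: str) -> int:
    # Rough heuristic: ~4 characters per token for English/CLI output.
    return max(1, len(text) // 4)

def measure(command: list[str]) -> int:
    # Run the command once and estimate the token cost of its output.
    result = subprocess.run(command, capture_output=True, text=True)
    return estimate_tokens(result.stdout + result.stderr)

if __name__ == "__main__":
    # Hypothetical usage: record the measured cost next to the command
    # in AGENTS.md, e.g. "(~500 tokens typical output)".
    print(f"~{measure(['echo', 'hello world'])} tokens")
```

A real annotation pipeline would average several runs, since output size varies, but even a one-shot estimate beats total token blindness.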
2. It Is Static, Not Dynamic
AGENTS.md is a static file in a living codebase. It cannot report the current build status, active feature flags, or recent breaking changes.
Instead of reading outdated documentation, agents should discover a tool’s capabilities dynamically using flags like --help. A tool’s help text is always current. Static documentation is a temporary fix; dynamic discovery is the solution.
3. It Prevents Conversational Learning
The standard misses the most powerful improvement mechanism: self-reflection. Agent interactions, like the JSONL files Claude saves to `~/.claude/projects/`, are a goldmine of untapped learning opportunities.
Agents should analyze their own conversation histories to identify recurring failures, learn successful approaches, and avoid repeating mistakes. Each conversation should be training data for the next. AGENTS.md treats every interaction as the first.
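As a sketch of what that self-reflection could look like (the JSONL record schema below is hypothetical, not Claude's actual format), an agent might tally which commands failed most often across past sessions:

```python
import json
from collections import Counter

# Hypothetical conversation log: one JSON record per line.
SAMPLE_LOG = """\
{"command": "npm test", "exit_code": 1}
{"command": "pnpm test", "exit_code": 0}
{"command": "npm test", "exit_code": 1}
"""

def failing_commands(jsonl: str) -> Counter:
    # Count non-zero exits per command so the next session can avoid them.
    failures = Counter()
    for line in jsonl.splitlines():
        record = json.loads(line)
        if record["exit_code"] != 0:
            failures[record["command"]] += 1
    return failures

if __name__ == "__main__":
    print(failing_commands(SAMPLE_LOG).most_common(1))  # → [('npm test', 2)]
```

A summary like this, injected at session start, is exactly the "don't repeat yesterday's mistake" context a static AGENTS.md cannot provide.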
4. It Blocks Meta-Tool Creation
The most significant omission is the next level of agentic work. Agents should not just use tools—they should create them. When an agent finds a repetitive task, it should be able to build a new tool, add it to the project’s tools/ directory, and expand its own capabilities.
AGENTS.md assumes a fixed toolkit. True agentic coding begins when agents can build their own.
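The mechanics of meta-tool creation are simple, assuming a project-local `tools/` directory: the agent writes a script, marks it executable, and can call it from then on. A minimal sketch (the `create_tool` helper and the `greet` tool are illustrative, not part of any standard):

```python
import os
import stat
import subprocess
import tempfile

def create_tool(tools_dir: str, name: str, body: str) -> str:
    # Write a new executable shell script into the project's tools/ directory.
    path = os.path.join(tools_dir, name)
    with open(path, "w") as f:
        f.write("#!/bin/sh\n" + body)
    os.chmod(path, os.stat(path).st_mode | stat.S_IXUSR)
    return path

if __name__ == "__main__":
    # Demonstration in a temp dir standing in for the repo's tools/ folder.
    with tempfile.TemporaryDirectory() as tools_dir:
        tool = create_tool(tools_dir, "greet", 'echo "hello from a new tool"\n')
        out = subprocess.run(["/bin/sh", tool], capture_output=True, text=True)
        print(out.stdout.strip())
```

The interesting part is not the file write; it is that the agent's capability set grows with the codebase instead of being frozen at deployment time.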
The Real Solution: Agent-Native Tools
AGENTS.md is a conversation starter. The real solution requires rewriting our developer tools for an AI-first world.
1. Meta-Tool Creation: The goal is not to document existing tools but to empower agents to build new ones. An agent that can write its own scripts in a cli/ or tools/ folder becomes a partner that evolves with the codebase.
2. Context-Aware Tools: This is already solved. Context (`ctx`) wraps any CLI command, making it agent-aware. Instead of a 500,000-line log dump from `docker logs myapp`, `ctx` provides a structured, token-counted summary:
```shell
$ ctx docker logs myapp
{
  "tokens": 850,
  "output": "[last 100 lines of logs]",
  "input": "docker logs myapp",
  "metadata": { "success": true, "exit_code": 0, "duration": 127 },
  ...
}
```
3. Dynamic Help Discovery: Why document commands in a static file when tools can explain themselves? An agent querying git --help or docker run --help gets real-time, accurate information. Dynamic discovery makes static documentation obsolete.
The Competitive Landscape
OpenAI, Anthropic, and Google are racing to own the developer workflow. AGENTS.md is a strategic move in this race. But they are all focused on making agents smarter when they should be making tools more communicative. The bottleneck is not agent intelligence; it is tool communication.
How to Use AGENTS.md Now
Despite its limits, you should adopt the standard. Here’s how to make it useful:
1. Focus on Problems, Not Commands: Explain the ‘why’ behind each command.
```markdown
## When tests fail mysteriously
- Check if the database is running: `docker ps | grep postgres`
- Reset the test database: `npm run db:test:reset`
```
2. Warn About Token Traps: Call out expensive operations.
```markdown
## Debugging
- EXPENSIVE: `git log --all --graph` (can be 100k+ tokens)
- Better: `git log --oneline -10` (~100 tokens)
```
3. Version the File: Note when the instructions were last updated.
```markdown
## Recent changes (Updated: 2025-08-20)
- Migrated from npm to pnpm.
- Test database now requires an explicit start command.
```
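The token-trap warnings in tip 2 can even be enforced mechanically rather than just documented. A sketch of a guard that truncates any command output exceeding a budget (the ~4-characters-per-token heuristic is an assumption, and `run_with_budget` is a hypothetical helper):

```python
import subprocess

CHARS_PER_TOKEN = 4  # rough heuristic, not an exact tokenizer

def run_with_budget(command: list[str], max_tokens: int = 1000) -> str:
    # Run the command, but truncate output that would blow the token budget.
    result = subprocess.run(command, capture_output=True, text=True)
    budget_chars = max_tokens * CHARS_PER_TOKEN
    output = result.stdout
    if len(output) > budget_chars:
        # Keep the tail, where errors usually appear in logs and test runs.
        return f"[truncated to last ~{max_tokens} tokens]\n{output[-budget_chars:]}"
    return output

if __name__ == "__main__":
    print(run_with_budget(["echo", "short output"]))
```

A wrapper like this turns an advisory "EXPENSIVE" note into a hard ceiling the agent cannot accidentally blow through.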
The Path Forward
AGENTS.md is a faster horse—an improvement on a broken model. It makes the current workflow slightly better without fixing the underlying problem.
The revolution will come when:
- Tools are token-aware by default.
- Agents discover capabilities dynamically.
- Codebases self-document through introspection.
- Tool-agent communication protocols replace static files.
We are connecting nuclear reactors (AI agents) with copper wires (CLI tools). AGENTS.md is just better insulation. We need a complete rewiring.
Adopt AGENTS.md as a necessary, temporary standard. But the real work, and the real opportunity, is in building the next generation of truly agent-native developer tools.
I hope you found this article helpful. If you want to take your agentic AI to the next level, consider booking a consultation or subscribing to premium content.