Single-model AI coding is broken. LLMs hallucinate, lose context, and produce inconsistent code. The solution isn’t a bigger model; it’s a better workflow.
I’ve found success by orchestrating multiple AI systems in a simple, two-step process: Plan → Execute. This workflow uses specialized AI for each phase, coordinated by the same CLI tools developers use daily. No complex protocols, no abstract definitions—just pure, efficient code generation.
This is a CLI-first revolution that transforms a specification into production-ready code in minutes.
The Problem: Why Single Models Fail
Current AI coding assistance fails because it mixes planning, execution, and validation in a single, fragile process. This leads to:
- Context Collapse: The AI forgets the goal midway through implementation.
- Inconsistent Quality: Output quality degrades unpredictably as task complexity grows.
- All-or-Nothing Execution: A single failure destroys the entire session.
- No Visibility: It’s a black box until it’s too late.
The root cause is that agents build context on the fly through inefficient tool calls. They are forced to discover what they don’t know, one API call at a time.
The Two-Step Solution: Plan-Execute
Our breakthrough separates the cognitive load, assigning each task to the best-suited AI model.
Step 1: Plan with Gemini 2.5 Pro
We start with Gemini 2.5 Pro and its 1M token context window. It ingests the user’s specification and the entire relevant codebase to produce a detailed AI-PLAN.md. This plan is exhaustive, covering file paths, code changes, dependencies, and test strategies.
Create the Plan:
# 1. Gather relevant codebase context using git and repomix
git ls-files "*.go" "go.mod" "Makefile" | repomix --stdin --output=codebase.xml
# 2. Generate a detailed plan using the spec and codebase context
cat USER-SPEC.md codebase.xml | gemini "Create a plan from this spec" > AI-PLAN.md
Gemini’s massive context allows it to create a holistic, actionable plan that respects existing patterns and architecture.
Step 2: Execute with Claude Opus/Sonnet
Next, we hand the AI-PLAN.md to Claude Opus or Sonnet. With a smaller 200K token window, Claude excels at precise, reliable execution when given clear instructions. It doesn’t need to understand the whole system—only the plan and the files it needs to modify.
Execute the Plan:
# 1. Extract only the files mentioned in the plan for focused context
rg -o '[a-zA-Z0-9_/]+\.go' AI-PLAN.md | sort -u | repomix --stdin --output=plan-files.xml
# 2. Execute the plan with the focused context
cat AI-PLAN.md plan-files.xml | claude "Implement this plan"
Claude works through the plan task-by-task, modifying code, running tests, and validating each step.
Why It Works: CLI-First Orchestration
This workflow is effective because it relies on simple, powerful principles.
Code Is All You Need
Instead of abstract protocols, we give AI agents the same tools developers use: git, grep, rg, and repomix. These CLI tools are:
- Direct: No ambiguity.
- Composable: Unix pipes have solved integration for decades.
- Familiar: AIs trained on code already understand them.
- Deterministic: They execute predictably every time.
The repomix tool is key, allowing us to package specific codebase slices into a token-efficient XML format for the AI to consume.
Smart Token Economics
We use the right model for the right job, optimizing for both capability and cost.
| Phase | Model | Context Limit | Optimized For |
|---|---|---|---|
| Plan | Gemini 2.5 Pro | 1M tokens | Holistic understanding, architecture, planning |
| Execute | Claude Opus/Sonnet | 200K tokens | Precise edits, error handling, validation |
This division of labor prevents any single model from being overwhelmed.
Token Usage by Project Size
The Plan-Execute workflow adapts to projects of any size by managing context intelligently.
Small Projects (<100K Tokens)
For small codebases, the entire project fits comfortably within both Gemini’s and Claude’s context windows. This allows for straightforward, holistic analysis and execution without complex context engineering.
# The entire codebase can be passed to both models
git ls-files | repomix --stdin --output=codebase.xml
cat spec.md codebase.xml | gemini "Create a plan..." > plan.md
cat plan.md codebase.xml | claude "Implement this plan..."
Medium Projects (100K-200K Tokens)
Here, the full codebase fits into Gemini’s 1M context window for planning, but exceeds Claude’s 200K limit. For execution, we must provide only the files referenced in the AI-PLAN.md.
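One practical wrinkle at this size: a plan can mention paths that no longer exist, and a missing file would break or pollute the packed context. Here is a small sketch that keeps only plan-referenced files actually present on disk; the plan contents and the `/tmp/demo` layout are invented for the demo, and the surviving list is what would be piped into `repomix --stdin`.

```shell
# Demo setup: a fake plan plus a fake repo layout under /tmp.
mkdir -p /tmp/demo/cmd /tmp/demo/internal
touch /tmp/demo/cmd/main.go /tmp/demo/internal/run.go
printf 'Edit cmd/main.go and internal/run.go; old/gone.go was deleted.\n' \
  > /tmp/demo/AI-PLAN.md

cd /tmp/demo
# Extract .go paths from the plan, drop duplicates, keep only real files.
grep -oE '[a-zA-Z0-9_/]+\.go' AI-PLAN.md | sort -u \
  | while read -r f; do
      if [ -f "$f" ]; then echo "$f"; fi
    done
```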
Large Projects (>200K Tokens)
For large codebases, even Gemini’s 1M context window may be challenged. Here, repomix with git becomes critical for creating a representative, token-efficient slice of the codebase for planning. Execution remains targeted, using only the files specified in the plan.
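For building that planning slice, git history is one useful relevance signal: files touched recently are often the files the spec is about. That relevance assumption is mine, not a claim from the workflow, and the throwaway repo below is fabricated so the example is self-contained.

```shell
# Build a throwaway repo with two commits to demonstrate history-based slicing.
rm -rf /tmp/bigrepo
git -c init.defaultBranch=main init -q /tmp/bigrepo && cd /tmp/bigrepo
touch cold.go && git add cold.go
git -c user.email=demo@x -c user.name=demo commit -qm "old code"
touch hot.go && git add hot.go
git -c user.email=demo@x -c user.name=demo commit -qm "recent work"

# Files touched by the most recent commit; in the real workflow this list
# would be piped into: repomix --stdin --output=slice.xml
git log -1 --name-only --pretty=format: | grep -v '^$' | sort -u
```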
The Workflow in Action
We used this process to build the two-step CLI itself.
- Input: A USER-SPEC.md file detailing the goal.
- Planning: Gemini analyzed our spec and the existing codebase, producing a 475-line AI-PLAN.md with 12 distinct steps.
- Execution: Claude followed the plan, modifying files, running tests, and validating each change.
Result: A fully functional, two-step CLI workflow with error handling and safety assertions, built and validated in just six minutes.
Conclusion: The Future is Orchestrated
The Plan-Execute workflow represents a paradigm shift from single-model prompting to multi-system orchestration. By separating planning from execution and using standard developer tools, we create AI coding systems that are robust, scalable, and practical for production use.
The breakthrough isn’t a complex new protocol. It’s the realization that code is all you need. The most powerful tool for an AI agent is a shell prompt and access to git, grep, and repomix. The future of AI development lies in augmenting human creativity with simple, powerful, and orchestrated workflows.