GraphX: Why I'm Building a Graph-Based Execution Engine in Zig

Two seemingly unrelated discoveries changed how I think about building AI agents: a 10-year-old plain-text accounting system and a 2011 blog post about graph theory.

The Plain-Text Accounting Revelation

I recently came across Sid Goel’s account of managing a decade of personal finances in plain-text files using Beancount. The numbers were striking: 9,895 transactions, 1,086 accounts, 507 attached documents—all stored in human-readable text files that will outlive any proprietary app or service.

What captivated me wasn’t the tooling. It was the philosophy.

Plain-text accounting embodies principles that software engineers intuitively understand but rarely apply outside version control:

Data ownership: Your records live on your machine, not in someone else’s data center
Longevity: Text files open in any editor and are immune to vendor lock-in
Auditability: Every transaction visible, version-controlled, traceable
Deterministic replay: Given the same input ledger, you always get the same balances

This last point—deterministic replay—is what made me sit up. Accounting systems aren’t just databases. They’re state machines with strict invariants. You can’t create money from nothing. Debits must equal credits. The rules are absolute, and violations are rejected.

Accounting Is Graph Theory

Then I read Martin Kleppmann’s ‘Accounting for Computer Scientists’. His core insight hit me:

Basic accounting is just graph theory.

Accounts are nodes. Transactions are directed edges. Money flows along edges from one account to another. The balance at each node follows a simple rule: add the incoming edges, subtract the outgoing edges. And critically, the sum of all account balances is always zero, because every transaction is counted twice: once as a subtraction at its source node and once as an addition at its target node.
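A minimal sketch of that model, written in Python for brevity rather than Zig. The account names and amounts are invented for illustration; they come from neither Kleppmann’s post nor any real ledger.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Transaction:
    """A directed edge: `amount` flows from `source` to `target`."""
    source: str
    target: str
    amount: int  # integer cents, to avoid floating-point drift


def balances(transactions):
    """Balance per node: add incoming edges, subtract outgoing edges."""
    totals = defaultdict(int)
    for tx in transactions:
        totals[tx.source] -= tx.amount
        totals[tx.target] += tx.amount
    return dict(totals)


# Hypothetical ledger, purely for illustration.
ledger = [
    Transaction("income:salary", "assets:checking", 300_000),
    Transaction("assets:checking", "expenses:rent", 120_000),
    Transaction("assets:checking", "expenses:food", 45_000),
]

result = balances(ledger)
assert result["assets:checking"] == 135_000
# Each edge contributes +amount at one node and -amount at another,
# so the balances sum to zero no matter what the ledger contains.
assert sum(result.values()) == 0
```

Because `balances` is a pure function of the ledger, running it twice on the same input gives the same result, which is the deterministic replay property in its simplest form.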

This reframing eliminated years of confusion about debits and credits. But more importantly, it revealed a pattern I’d been circling around in my own work.

The Connection to Agentic Systems

For the past year, I’ve been writing about the evolution of agentic systems—from pure LLMs to tool-calling agents to graph-based architectures. In Graph-Based Agents, I argued that true computational agency requires formal graph structures: nodes represent computational units, edges define control flow, and state persists across the entire execution.

In Agent Mode, I explored how we’re transitioning from ‘questions to answers’ toward ‘goals to completed tasks’—where transparency, audit trails, and human intervention points become essential.

The common thread kept surfacing: graphs.

Not graphs as a visualization aid. Graphs as the fundamental computational substrate.

The Problem with Current Agent Frameworks

While building my own AI agent (internally called AgentX), I evaluated every major graph-based execution framework: LangGraph, PydanticAI Graph, and others. They share one limitation: they’re graph-shaped wrappers around imperative execution.

When an agent runs, what actually happens?

Steps execute in some order
Tools get called
Decisions get made
Sometimes things fail and retry
Sometimes humans need to approve actions

But where does this execution history live? How do you replay it? How do you audit it? How do you pause an agent, resume it three days later, and continue exactly where it left off?

Most frameworks treat execution as ephemeral. The agent runs, produces output, and the intermediate steps disappear into logs that aren’t structured for replay. You can’t hand someone a run artifact and say ‘here’s exactly what happened, cryptographically verifiable.’

The GraphX Insight

Here’s what clicked: accounting systems and agent execution systems are solving the same problem.

Both need:

Accounting	Agent Execution
Append-only transaction ledger	Append-only execution trace
Strict invariants (debits = credits)	Policy enforcement (tool approval gates)
Deterministic replay from ledger	Deterministic replay from run graph
Audit trail for compliance	Audit trail for AI safety
Typed accounts (assets, liabilities)	Typed events (steps, tool calls, decisions)

The insight isn’t that agents should use double-entry bookkeeping. The insight is that both domains reduce to constrained graph state plus immutable events plus replay.

What GraphX Actually Is

GraphX is my attempt to build this unified substrate. It’s a general-purpose system for:

Domain graph machines: State modeled as a typed graph, transitions as append-only events, invariants that reject illegal states
Run graphs: Every agent/workflow execution produces an immutable graph recording steps, routing decisions, retries, tool calls, and approvals
Deterministic gating policies: Tool permissions, human approvals, and compliance rules as structural graph constraints—not ad-hoc callbacks
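To make the first item concrete, here is a rough Python sketch of a domain graph machine. It is not GraphX code: `DomainGraph`, `AddNode`, `AddEdge`, and the `ALLOWED_EDGES` schema are hypothetical names chosen for this example, and the real system may model schemas and invariants quite differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AddNode:
    node_id: str
    node_type: str


@dataclass(frozen=True)
class AddEdge:
    source: str
    target: str
    edge_type: str


class InvariantViolation(Exception):
    pass


# Hypothetical schema: legal (source type, edge type, target type) triples.
ALLOWED_EDGES = {
    ("agent", "may_call", "tool"),
    ("tool", "requires", "approval_policy"),
}


class DomainGraph:
    def __init__(self):
        self.events = []   # append-only history; never edited in place
        self.nodes = {}    # node_id -> node_type, derived from events
        self.edges = set()

    def apply(self, event):
        """Validate first, then record. A rejected event changes nothing."""
        if isinstance(event, AddNode):
            if event.node_id in self.nodes:
                raise InvariantViolation(f"duplicate node {event.node_id!r}")
            self.nodes[event.node_id] = event.node_type
        elif isinstance(event, AddEdge):
            for end in (event.source, event.target):
                if end not in self.nodes:
                    raise InvariantViolation(f"unknown node {end!r}")
            triple = (self.nodes[event.source], event.edge_type,
                      self.nodes[event.target])
            if triple not in ALLOWED_EDGES:
                raise InvariantViolation(f"illegal edge {triple}")
            self.edges.add((event.source, event.edge_type, event.target))
        else:
            raise InvariantViolation(f"unknown event {event!r}")
        self.events.append(event)

    @classmethod
    def replay(cls, events):
        """Rebuild state from nothing but the event history."""
        graph = cls()
        for event in events:
            graph.apply(event)
        return graph


g = DomainGraph()
g.apply(AddNode("planner", "agent"))
g.apply(AddNode("shell", "tool"))
g.apply(AddEdge("planner", "shell", "may_call"))
```

An edge from the tool back to the agent is not in the schema, so `g.apply(AddEdge("shell", "planner", "may_call"))` raises `InvariantViolation` and the history stays untouched. Current state is always recoverable with `DomainGraph.replay(g.events)`.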

The key design choice: everything is an event, and events are immutable.

When an agent proposes a tool call, that’s an event. When the tool call is approved, that’s an event. When it executes and returns, that’s an event. When a human provides input, that’s an event. The run graph isn’t a log—it’s the canonical representation of what happened.
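The lifecycle described above can be sketched as an append-only list of frozen records, where the approval gate is a structural rule about which event may follow which. This is an illustrative Python sketch under my own assumptions: the `RunGraph` class, the event kind strings, and the tool call in the usage lines are all hypothetical, not GraphX’s actual types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Event:
    event_id: int
    kind: str              # e.g. "tool_proposed", "tool_approved", "tool_executed"
    payload: str
    parent: Optional[int]  # the event this one follows from, if any


class GateError(Exception):
    pass


class RunGraph:
    def __init__(self):
        self._events = []

    @property
    def events(self):
        return tuple(self._events)  # read-only view of the history

    def append(self, kind, payload, parent=None):
        if parent is not None and not 0 <= parent < len(self._events):
            raise GateError(f"parent {parent} does not exist")
        # The gate is a rule about graph shape, not a callback:
        # approvals must follow proposals, executions must follow approvals.
        if kind == "tool_approved":
            self._require(parent, "tool_proposed")
        if kind == "tool_executed":
            self._require(parent, "tool_approved")
        event = Event(len(self._events), kind, payload, parent)
        self._events.append(event)
        return event

    def _require(self, parent, expected_kind):
        if parent is None or self._events[parent].kind != expected_kind:
            raise GateError(f"needs a {expected_kind} parent")


run = RunGraph()
proposal = run.append("tool_proposed", "delete_branch(name='old-feature')")
approval = run.append("tool_approved", "approved by reviewer",
                      parent=proposal.event_id)
outcome = run.append("tool_executed", "ok", parent=approval.event_id)
```

Trying to append a `tool_executed` event whose parent is the proposal rather than an approval raises `GateError`, so an unapproved execution cannot appear in the history at all.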

Treating the run graph as the canonical record means:

Replay: Given a run graph and recorded outputs, reproduce the exact same execution path without re-calling LLMs or tools
Audit: Every decision has a traceable lineage
Pause/Resume: Checkpoint the run graph, shut down, resume days later
Human-in-the-loop: Approval gates are first-class events, not interrupt handlers
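Replay from recorded outputs is the least obvious of these, so here is a small Python sketch of the idea. The workflow, the tool names, and the `execute`, `live_tools`, and `recorded_tools` helpers are all made up for this example; the point is only that the same control flow can run against recorded outputs instead of live tools.

```python
def execute(task, call_tool):
    """Tiny workflow: check something, then route on the result.

    Returns the (tool_name, argument, output) records it produced.
    """
    trace = []

    def step(tool_name, argument):
        output = call_tool(tool_name, argument)
        trace.append((tool_name, argument, output))
        return output

    status = step("check_status", task)
    if status == "failing":
        step("open_ticket", task)
    else:
        step("post_summary", task)
    return trace


def live_tools(calls):
    """Stand-in for real tools; `calls` records every invocation."""
    def call_tool(tool_name, argument):
        calls.append(tool_name)
        return "failing" if tool_name == "check_status" else "done"
    return call_tool


def recorded_tools(trace):
    """Serve outputs from a recorded trace instead of calling anything."""
    remaining = iter(trace)

    def call_tool(tool_name, argument):
        recorded_name, recorded_arg, output = next(remaining)
        if (recorded_name, recorded_arg) != (tool_name, argument):
            raise RuntimeError("replay diverged from the recorded run")
        return output
    return call_tool


calls = []
original = execute("nightly-build", live_tools(calls))
replayed = execute("nightly-build", recorded_tools(original))

assert replayed == original                       # same path, same outputs
assert calls == ["check_status", "open_ticket"]   # tools ran only once, live
```

The divergence check matters: if the code under replay asks for a different tool call than the one recorded, the replay stops instead of silently producing a different history.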

Why Zig?

A practical note on implementation: I’m building GraphX in Zig.

The reasoning is straightforward. Graph-based state machines with strict invariants need:

Predictable performance (no GC pauses during critical invariant checks)
Memory efficiency (large graphs shouldn’t require large heaps)
Correct concurrency (parallel execution with deterministic ordering)
Portability (embed in other systems, compile to WASM)

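Zig gets me there without the complexity overhead of Rust’s borrow checker and with fewer memory-safety footguns than C: there is no garbage collector, allocators are explicit, and safety-checked build modes catch out-of-bounds access and integer overflow. Deterministic ordering under parallelism is still something GraphX has to design for; the language supplies the control, not the guarantee. It’s the right tool for infrastructure that other systems will depend on.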

The Bigger Picture

I don’t know if Beancount uses graph-based state machines internally. That’s not the point. The point is that a 10-year-old personal finance system and a modern AI agent framework share the same fundamental requirements:

Immutable history
Strict constraints
Deterministic replay
Queryable state

The accounting world figured this out centuries ago with double-entry bookkeeping. Software systems are still catching up.

As AI agents take on more consequential tasks (managing finances, making purchases, modifying production systems), we need execution substrates that provide the same guarantees accountants expect from their ledgers. Not ‘best effort’ logging. Not ‘we can probably reproduce what happened.’ A history whose integrity can be checked cryptographically.

What’s Next

GraphX is in early development. The specification outlines the core requirements:

Domain graphs with schema validation and invariant enforcement
Run graphs with typed events for steps, routing, tools, and approvals
Checkpoint/resume across process restarts
Content-addressed storage for large artifacts
Hash-chain integrity verification
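The last two items can be illustrated with Python’s standard `hashlib` and `json` modules. This is a sketch under my own assumptions about entry layout (SHA-256, canonical JSON, an all-zero genesis hash); none of the names or the format come from the GraphX specification.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder hash for the first entry


def entry_hash(prev_hash, event):
    # Canonical JSON so the same event always hashes to the same value.
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain, event):
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    chain.append({"event": event, "prev": prev_hash,
                  "hash": entry_hash(prev_hash, event)})


def verify(chain):
    """Recompute every link; any edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in chain:
        if entry["prev"] != prev_hash:
            return False
        if entry["hash"] != entry_hash(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True


def artifact_key(data: bytes) -> str:
    """Content address: the storage key is the SHA-256 of the bytes."""
    return hashlib.sha256(data).hexdigest()


chain = []
report = b"large tool output stored outside the run graph"
append(chain, {"kind": "tool_proposed", "tool": "fetch_report"})
append(chain, {"kind": "tool_executed", "artifact": artifact_key(report)})
assert verify(chain)

chain[0]["event"]["tool"] = "delete_everything"  # tamper with history
assert not verify(chain)
```

A chain like this only makes tampering evident to someone who holds a trusted copy of the latest hash. Anyone able to rewrite the whole file can also recompute every hash, so the head of the chain would still need to be signed or stored somewhere independent.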

If you’re building agents that need auditability, or domain systems that need replay, or workflows that need human gates—this is the problem space GraphX addresses.

The goal isn’t to replace LangGraph or PydanticAI. It’s to provide a lower-level primitive that any execution framework can build on. Graph-based state machines aren’t specific to AI. They’re the right abstraction for any system where state changes must be tracked, constrained, and reproduced.

Accounting taught us this centuries ago. We’re just applying it to a new domain.

This post is part of a series on agentic AI systems. Related reading: Graph-Based Agents, Evolution of Agentic Systems, Agent Mode.