I wrote about OpenClaw in Local Agents, but after talking to developers and clients, I realized the architecture itself deserves explanation. Understanding how it works reveals both the design choices and their reliability implications.
What OpenClaw actually is
OpenClaw is an agent runtime plus a gateway in front of it.
The gateway is a long-running process on your machine that:
- accepts inputs from messaging apps, timers, webhooks, and internal events
- routes each input to the right agent
- queues work so turns don’t interleave chaotically
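To make the routing step concrete, here is a minimal sketch of how a gateway might map an incoming input to an agent. The `Inbound` type, the `ROUTES` table, and the agent names are all hypothetical and chosen for illustration; this is not OpenClaw's actual configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Inbound:
    kind: str      # e.g. "message", "heartbeat", "cron", "hook", "webhook"
    channel: str   # e.g. "whatsapp", "slack", "github"
    text: str


# Hypothetical routing table: (kind, channel) -> agent name.
ROUTES = {
    ("message", "whatsapp"): "personal",
    ("message", "slack"): "work",
    ("webhook", "github"): "work",
}
DEFAULT_AGENT = "personal"  # assumed fallback when no route matches


def route(inbound: Inbound) -> str:
    """Pick the agent that should handle this input."""
    return ROUTES.get((inbound.kind, inbound.channel), DEFAULT_AGENT)


github_agent = route(Inbound("webhook", "github", "PR opened"))
timer_agent = route(Inbound("heartbeat", "timer", "tick"))
```

In this sketch the GitHub webhook goes to the `work` agent, while the unmatched heartbeat falls back to the default agent.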
Here is a simple diagram:
The gateway accepts five inputs:
- messages
- heartbeats
- crons
- hooks
- webhooks
Plus one unique input: agent-to-agent messages.
Here is a more complex diagram:
The ‘alive’ feeling comes from inputs arriving even when you’re not typing, not from independent reasoning.
The core illusion: ‘proactive’ behavior is just scheduled + event-driven input
OpenClaw has four components:
- time that produces events
- events that trigger agents
- state that persists across interactions
- loop that keeps processing
The entire system is a simple loop:
inputs → queue → agent turn → actions → persisted state → repeat
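The loop above can be sketched in a few lines. This is a toy model under stated assumptions, not OpenClaw's implementation: `Event`, `Runtime`, and `agent_turn` are hypothetical names, the model call is replaced by a string, and "persisted state" is an in-memory list.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class Event:
    source: str   # where the input came from, e.g. "message" or "heartbeat"
    payload: str


@dataclass
class Runtime:
    queue: deque = field(default_factory=deque)  # pending inputs
    state: list = field(default_factory=list)    # stand-in for persisted state

    def submit(self, event: Event) -> None:
        self.queue.append(event)

    def agent_turn(self, event: Event) -> str:
        # Placeholder for the model call. A real turn would build a prompt
        # from the event plus persisted state and might invoke tools.
        return f"handled {event.source}: {event.payload}"

    def run_until_idle(self) -> list:
        actions = []
        while self.queue:
            event = self.queue.popleft()     # take one input at a time
            action = self.agent_turn(event)  # agent turn
            actions.append(action)           # actions
            self.state.append(action)        # persist, then repeat
        return actions


runtime = Runtime()
runtime.submit(Event("message", "what's on my calendar?"))
runtime.submit(Event("heartbeat", "check inbox"))
results = runtime.run_until_idle()
```

The human message and the heartbeat go through exactly the same path; the loop does not care where an input came from.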
Inputs (the 5 + 1 model)
- Human messages: WhatsApp, iMessage, Slack, and so on. The gateway routes each message to an agent.
- Heartbeats (timer ticks): a periodic timer (every 30 minutes) that triggers an agent turn with a prewritten prompt. Time becomes an input source. If there’s nothing to report, the agent returns a special token and the system suppresses the message.
- Cron jobs: like heartbeats, but scheduled at specific times with specific prompts (for example, a daily 9am email triage). This creates ‘good morning’ texts that look autonomous.
- Hooks (internal state-change events): system events such as gateway startup, the start of an agent task, or a user-issued stop/reset. Hooks run setup steps, manage memory, or modify context.
- Webhooks (external systems): inputs from Slack, GitHub, Jira, and email providers. These extend the agent’s responses beyond chat into your digital workflows.
- Bonus input, agent-to-agent messaging: multiple agents with isolated workspaces can message each other, creating the appearance of collaboration (but these are still just queued messages).
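The heartbeat suppression described above is easy to picture in code. A minimal sketch follows, assuming a sentinel string called `NOTHING_TO_REPORT`; that name, the prompt text, and the two stand-in agents are hypothetical, and the real token may differ.

```python
from typing import Callable, Optional

# Hypothetical sentinel; the actual token name may differ.
NOTHING_TO_REPORT = "NOTHING_TO_REPORT"

# Prewritten prompt sent on every tick (wording is illustrative).
HEARTBEAT_PROMPT = (
    "Check the task list and inbox. "
    f"If nothing needs attention, reply with {NOTHING_TO_REPORT}."
)


def run_heartbeat(agent: Callable[[str], str]) -> Optional[str]:
    """Run one heartbeat turn; return a message to deliver, or None to stay silent."""
    reply = agent(HEARTBEAT_PROMPT).strip()
    if reply == NOTHING_TO_REPORT:
        return None  # suppress: the user never sees this turn
    return reply


def quiet_agent(prompt: str) -> str:
    # Stand-in for a model that found nothing worth reporting.
    return NOTHING_TO_REPORT


def busy_agent(prompt: str) -> str:
    # Stand-in for a model that found something to say.
    return "Your 3pm meeting moved to 4pm."


silent = run_heartbeat(quiet_agent)
delivered = run_heartbeat(busy_agent)
```

Most ticks end in silence; the user only sees the turns where the agent had something to say, which is what makes the occasional message feel spontaneous.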
Queue + sessions = why it feels coherent
Two design details matter:
- Sessions are per channel: WhatsApp and Slack don’t share conversational context; each channel has its own session boundary.
- Processing is sequential within a session: if you send multiple messages while the agent is busy, they queue and process in order, preventing jumbled replies.
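Both details fit in one small data structure: a separate FIFO queue and a separate history per session. The sketch below is an assumption about how this could be modeled, with hypothetical names (`SessionQueues`, `handle`, `histories`); it is single-threaded for clarity and is not OpenClaw's code.

```python
from collections import defaultdict, deque


class SessionQueues:
    """One FIFO queue per session, so turns in a session never interleave."""

    def __init__(self):
        self._queues = defaultdict(deque)

    def enqueue(self, session_id: str, message: str) -> None:
        self._queues[session_id].append(message)

    def drain(self, session_id: str, handle) -> list:
        """Process everything queued for one session, in arrival order."""
        replies = []
        queue = self._queues[session_id]
        while queue:
            replies.append(handle(session_id, queue.popleft()))
        return replies


# Per-session history: each channel only ever sees its own context.
histories = defaultdict(list)


def handle(session_id: str, message: str) -> str:
    histories[session_id].append(message)
    return f"[{session_id}] turn {len(histories[session_id])}: {message}"


queues = SessionQueues()
queues.enqueue("whatsapp", "hi")
queues.enqueue("slack", "deploy status?")
queues.enqueue("whatsapp", "also, remind me at 5")

whatsapp_replies = queues.drain("whatsapp", handle)
slack_replies = queues.drain("slack", handle)
```

The two WhatsApp messages are answered in the order they arrived, and the Slack session's history never contains them.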
Memory/state: ‘it remembers’ because it reads files
Persistence isn’t mystical learning. State is stored locally as Markdown files:
- preferences
- conversation snippets and context
- notes from prior sessions
When a heartbeat fires, the agent reloads state and continues. It reads MEMORY.md files. Nothing magic.
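In code, "reloads state" amounts to reading a file and pasting it into the prompt. A minimal sketch, assuming a single `MEMORY.md` at the root of a workspace directory; the helper names, the note format, and the prompt layout are hypothetical.

```python
import tempfile
from pathlib import Path

MEMORY_FILE = "MEMORY.md"


def load_memory(workspace: Path) -> str:
    """Return the memory file's contents, or an empty string if none exists yet."""
    path = workspace / MEMORY_FILE
    return path.read_text(encoding="utf-8") if path.exists() else ""


def append_note(workspace: Path, note: str) -> None:
    """Append one bullet to the memory file, creating it if needed."""
    with (workspace / MEMORY_FILE).open("a", encoding="utf-8") as f:
        f.write(f"- {note}\n")


def build_prompt(workspace: Path, user_input: str) -> str:
    # The agent "remembers" only because stored text is placed in the prompt.
    return f"# Memory\n{load_memory(workspace)}\n# Input\n{user_input}"


# Use a throwaway directory as the workspace for this example.
with tempfile.TemporaryDirectory() as tmp:
    workspace = Path(tmp)
    empty = load_memory(workspace)
    append_note(workspace, "Prefers morning summaries before 9am")
    memory_text = load_memory(workspace)
    prompt = build_prompt(workspace, "heartbeat tick")
```

Anything not written to the file is gone on the next turn, which is why what gets saved, and who can read it, matters.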
The four-component pattern
Reusable blueprint:
- Time produces events (heartbeats + cron)
- Events trigger agent turns (messages/webhooks/hooks)
- State persists (files/db)
- Event loop keeps processing (queue/dispatcher)
This is the ‘agent heartbeat’ pattern you’ll see across frameworks.
Security implications
The agent can run shell commands, read/write files, execute scripts, and control the browser.
Your threat model must assume:
- prompt injection via emails/docs/web pages
- malicious or vulnerable skills/plugins
- credential exposure
- dangerous command interpretation
Operational mitigations:
- run on a secondary machine
- use isolated accounts
- limit enabled skills
- monitor logs
- optionally containerize to reduce blast radius
Builder takeaways
If you’re designing something OpenClaw-like, answer these questions:
- What counts as an input? (human, time, internal, external, agent-to-agent)
- How do you queue + serialize work? (per-session ordering avoids chaos)
- How do you persist state safely? (files/db + redaction + access controls)
- How do you bound tool power? (least-privilege tools, allowlists, sandboxing)
- How do you make it feel alive without being unsafe? (heartbeats/cron + narrow prompts)
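As one illustration of bounding tool power, here is a sketch of a command allowlist check that could run before a shell tool executes anything. The allowed set, the rejected characters, and the function name are assumptions for the example; a check like this is a coarse first filter, not a substitute for sandboxing.

```python
import shlex

# Hypothetical allowlist: only these executables may be invoked.
ALLOWED_COMMANDS = {"ls", "cat", "grep"}

# Reject shell metacharacters outright so one approved command
# cannot be chained, piped, or redirected into something else.
SHELL_METACHARS = set(";|&`$><\n")


def is_allowed(command_line: str) -> bool:
    """Return True only for a single allowlisted command with no shell tricks."""
    if any(ch in SHELL_METACHARS for ch in command_line):
        return False
    try:
        argv = shlex.split(command_line)
    except ValueError:
        return False  # e.g. unbalanced quotes
    return bool(argv) and argv[0] in ALLOWED_COMMANDS


checks = {
    "ls -la": is_allowed("ls -la"),
    "rm -rf /": is_allowed("rm -rf /"),
    "ls; rm -rf /": is_allowed("ls; rm -rf /"),
    "": is_allowed(""),
}
```

Only the first entry passes. Even an allowlisted command can still read sensitive files, so this works best alongside the isolation measures listed earlier.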