# Workflow Recipes: Core Multi-Agent Processes
## Recipe 1: Standard task flow

Every task's complete path from human to delivery:
```
Human (Telegram) → COO receives
  ↓
COO writes Task Brief (Goal / Context / Definition of Done / Priority)
  ↓
COO dispatches to agent Discord channel
  ↓
Agent executes → @mentions COO on completion
  ↓
COO receives result → dispatches to QA
  ↓
QA reviews → PASS / FAIL
  ↓
PASS → COO reports to human
FAIL → feedback to original agent → fix → QA re-reviews → loop until PASS
```

Key constraints:
- COO never touches execution (the COO Principle)
- No QA PASS = never say "done"
- Every step is logged to `memory/YYYY-MM-DD.md`
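The loop in Recipe 1 can be sketched as a small control function. This is a minimal illustration, not OpenClaw code; `agent_execute` and `qa_review` are hypothetical stand-ins for the real Discord dispatch and QA review steps:

```python
def run_task(brief, agent_execute, qa_review):
    """Dispatch a brief, then loop on QA feedback until PASS."""
    result = agent_execute(brief)
    while not qa_review(result):                      # FAIL → feedback → fix
        result = agent_execute(brief + " (fix per QA feedback)")
    return "done"                                     # only ever after QA PASS

# Toy run: QA passes once the fix note is present.
print(run_task("Ship landing page",
               agent_execute=lambda b: b,
               qa_review=lambda r: "fix" in r))  # done
```

Note that "done" is unreachable without a PASS, which mirrors the "No QA PASS = never say done" constraint.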
## Recipe 2: QA Gate

QA isn't optional. It's hard-coded into the system.
### QA's review checklist

Code deliveries:
- Code logic correct
- No security vulnerabilities (hardcoded API keys, unvalidated input)
- Build passes
- Deployment succeeds (HTTP 200, not 404)
- Core features work (not just homepage loads — this is a direct result of the Feb 8 incident)
Content deliveries:
- Facts accurate
- Grammar/spelling
- Format matches target platform
- Sensitive info check (no leaked API keys, personal info)
### QA failure flow

```
QA FAIL → COO extracts failure reason
  ↓
COO creates fix Brief (with QA's specific feedback)
  ↓
Routes back to original agent
  ↓
Agent fixes → resubmits
  ↓
QA second review
  ↓
If it fails again → escalate to human
```

Hard rule: if the same task fails QA twice consecutively, escalate to human and stop cycling. Infinite QA loops are as harmful as infinite API retries.
## Recipe 3: Task queue

All tasks are tracked in `memory/task-queue.md`:

```
- [ ] Task description | agent: xxx | priority: P1 | added: ISO-time
- [~] In-progress task | dispatched: ISO-time
- [x] Completed task | completed: ISO-time | qa_status: PASS
- [-] Recycled task | reason: timeout/no-response
```

### Priority levels
| Level | Meaning | Behavior |
|---|---|---|
| P0 | Human explicitly requested | Needs human approval before execution |
| P1 | Important | Notify human then execute |
| P2 | Automated | Execute directly |
| P3 | Low-risk | Execute directly, use budget model |
### Queue rules

- Every human instruction → immediately becomes a `[ ]` task
- COO checks the queue every heartbeat (15 min)
- If any `[ ]` tasks exist → dispatch immediately
- Dispatched → `[~]`
- Completed → `[x]`
- Timeout (30 min with no response) → `[-]` + alert
Never allowed: Writing “waiting for human confirmation” unless genuinely needing human decision (payment, publishing, irreversible actions).
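Because the queue format is plain markdown, the heartbeat check can parse it with a few lines of code. A sketch, assuming the exact `- [x] description | key: value` line format shown above:

```python
import re

STATUS = {" ": "pending", "~": "in_progress", "x": "done", "-": "recycled"}

def parse_queue_line(line):
    """Parse one task-queue.md entry into (status, description, metadata)."""
    m = re.match(r"- \[([ ~x-])\] (.+)", line.strip())
    if not m:
        return None                      # not a task line
    fields = [p.strip() for p in m.group(2).split("|")]
    meta = dict(f.split(": ", 1) for f in fields[1:] if ": " in f)
    return STATUS[m.group(1)], fields[0], meta

print(parse_queue_line("- [~] Ship landing page | dispatched: 2026-03-09T08:00"))
# ('in_progress', 'Ship landing page', {'dispatched': '2026-03-09T08:00'})
```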
Origin story: February 11, 2026. Sub-agent finished at 7:14 AM. Remaining tasks written on a “Still TODO” list. Not executed. COO idle for 4 hours. Human woke up to find nothing done.
## Recipe 4: Heartbeat monitoring

### Three layers of defense

Layer 1: COO self-check (every 15 min)

```
1. Read memory/task-queue.md
2. Check [~] tasks for timeout
3. Check today's log for anomalies
4. Issue found → Telegram human
5. All clear → HEARTBEAT_OK
```

Layer 2: Agent block detection (every 10 min)

Scan agent Discord channels. If an agent posted a question with no reply for 15+ min → alert.
Known limitation: only detects “agent asked, nobody answered.” Cannot detect “agent promised delivery then disappeared.” That needs Layer 3.
Layer 3: Task-level timeout (30 min)

```
Each [~] task records a dispatched timestamp.
If now - dispatched > 30 min with no completion
→ Telegram alert: "Task XXX dispatched to Coder 30+ min ago, no completion"
```

This is the most reliable layer: purely time-based, with no dependency on agent behavior.
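Layer 3 is simple enough to sketch directly: compare each dispatched timestamp against the clock. The function and task names here are illustrative:

```python
from datetime import datetime, timedelta, timezone

TIMEOUT = timedelta(minutes=30)

def overdue_tasks(in_progress, now=None):
    """in_progress: list of (task_name, dispatched_at) for [~] entries."""
    now = now or datetime.now(timezone.utc)
    return [name for name, dispatched in in_progress
            if now - dispatched > TIMEOUT]

now = datetime(2026, 3, 9, 9, 0, tzinfo=timezone.utc)
tasks = [("deploy-site", now - timedelta(minutes=45)),   # overdue → alert
         ("write-blog", now - timedelta(minutes=10))]    # still within budget
print(overdue_tasks(tasks, now))  # ['deploy-site']
```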
### Agent offline diagnosis flow

When a timeout is detected:

```
Step 1: Check agent logs
├── Error logs → Identify error type
│   ├── API error (429/403/500) → Switch model or wait
│   ├── Tool call timeout → Likely SSH/network issue
│   └── Context overflow → Needs new session
└── Logs normal → Agent may just be slower than expected
  ↓
Step 2: Telegram alert to human
Include: who, what task, how long, log summary
  ↓
Step 3: Human decides
├── Retry → New session, re-dispatch
├── Switch agent → Dispatch to different capable agent
├── Switch model → Same agent, more suitable model
└── Manual → Human handles it
```

## Recipe 5: Cross-gateway coordination
MacBook Air (COO + QA) and Mac Mini (Coder + Research + Marketing) run on different gateways. `sessions_send` only works within the same gateway.
### Methods

- COO → Coder/Research/Marketing: Discord messages
- Coder/Research/Marketing → COO: Discord @mention
- COO → QA: `sessions_send` (same gateway, direct)
### Cross-gateway dispatch template

```shell
~/.openclaw/bin/openclaw message send \
  --channel discord \
  --target "⚡・coder" \
  --message "[Task brief] — COO (COO)"
```

### Cross-gateway gotchas
| Problem | Cause | Fix |
|---|---|---|
| `sessions_send` not working | Only works within the same gateway | Use Discord messages |
| @mention not triggering | Plain text `@name`, not `<@BOT_ID>` | Use the correct Discord ID format |
| Messages silently dropped | Receiver's `users` list missing the bot ID | Add all bot IDs + `ignoreBots: false` |
| Config changes not taking effect | Session cached the old config | Run `openclaw gateway restart` after config changes |
## Recipe 6: Memory management & optimization

### Memory architecture

```
memory/
├── MEMORY.md            # Long-term decision rules (manual, <200 lines)
├── task-queue.md        # Current task queue
├── improvements-log.md  # Improvement log (updated after each failure)
├── 2026-01-25.md        # Daily logs
├── ...
└── 2026-03-09.md
```

### Three-layer memory strategy
| Layer | File | Content | Loading |
|---|---|---|---|
| L1: Hard rules | SOUL.md | Role definition, behavior constraints, hard rules | Auto-loaded every session start |
| L2: Working memory | memory/YYYY-MM-DD.md | Today’s tasks, decisions, progress | Auto-read every heartbeat |
| L3: Long-term rules | MEMORY.md | Cross-day decisions, lessons, config rules | Auto-loaded every session start |
### Context management in practice

```
# SOUL.md rules:
# ⛔ HARD RULE: When context usage > 60%:
#   1. Save current state to memory/YYYY-MM-DD.md
#   2. Execute /compact
#   3. Continue working
# ⛔ HARD RULE: Never let context exceed 70%
```
Section titled “Compaction strategy”When to trigger Compaction:
- Context usage hits 60% → auto-trigger
- After task completion → archive results to daily log, then compact
- Session running >2 hours → proactive compact
Compaction isn't a built-in OpenClaw feature; it's implemented via the `/compact` command or auto-trigger rules in SOUL.md:

- `/compact` tells the model to summarize the conversation, discarding early details
- You can also trigger similar behavior via cron
- Critical info should be saved to memory files before compacting
### Why not vector databases / RAG?

We tried embedding-based memory retrieval but found:
- Unreliable recall — semantic search recall rate isn’t high enough, critical info may not be retrieved
- Noisy results — “relevant” content mixed with irrelevant info, diluting useful context
- Maintenance overhead — requires additional infrastructure (vector DB, embedding API)
- Non-deterministic — you don’t know what the agent will retrieve each time
Our alternative: SOUL.md + MEMORY.md = deterministic documents loaded every session. 100% reliable.
## Recipe 7: Sub-agent efficiency

### Fat task, thin agent

Provide all context upfront. Target: each sub-agent completes its task in 3-5 tool calls.
### Task brief template

```markdown
## Task
[One sentence describing the goal]

## Context
- Relevant file paths: [specific paths]
- Background: [what the agent needs to know]
- References: [links or files]

## Done Criteria
- [ ] Specific completion standard 1
- [ ] Specific completion standard 2
- [ ] Specific completion standard 3

## Constraints
- Priority: P[0-3]
- Deadline: [time]
- Model: [recommended model]
```

### Script-first principle
If a task involves API calls, write a script first, then run it. Don't let agents make API calls one by one.
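The principle in practice: put the loop inside a script so the whole batch costs one tool invocation. A sketch where `call_api` is a hypothetical stand-in for a real HTTP client call:

```python
def call_api(item):
    """Placeholder for a real API request (e.g. one HTTP GET per item)."""
    return {"item": item, "status": "ok"}

def run_batch(items):
    """Process every item in one script run instead of one tool call each."""
    return [call_api(item) for item in items]

# One tool call total (running this script), versus N tool calls if the
# agent issued each API request individually.
print(run_batch(["user-1", "user-2"]))
```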
### Retry discipline

- Infrastructure failure (API timeout, 429) → max 2 retries
- After 2 failures → stop for 30 minutes or change approach
- Never retry infinitely
Why this rule exists: one sub-agent once spent 30 minutes retrying an exhausted API. 30 minutes = zero output.
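The retry cap can be enforced with a small wrapper. A sketch, assuming infrastructure failures surface as exceptions; the backoff values and names are illustrative:

```python
import time

MAX_RETRIES = 2  # hard rule: at most 2 retries, never infinite

def call_with_retries(fn):
    """Run fn; retry at most MAX_RETRIES times, then give up for real."""
    for attempt in range(MAX_RETRIES + 1):
        try:
            return fn()
        except OSError:                       # stand-in for timeout / 429
            if attempt == MAX_RETRIES:
                raise                         # stop: back off or change approach
            time.sleep(0.1 * 2 ** attempt)    # brief backoff between tries

attempts = {"n": 0}
def flaky():
    attempts["n"] += 1
    raise OSError("429 Too Many Requests")

try:
    call_with_retries(flaky)
except OSError:
    print(attempts["n"])  # 3 attempts total: 1 initial + 2 retries
```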
Further reading: See Anthropic’s agent design best practices and OpenAI’s agent building guide.
## Version compatibility

All configs and commands were tested on OpenClaw 2026.3.2.
| Your version | Notes |
|---|---|
| 2026.2.x | Most configs compatible; `openclaw message send` params may differ slightly |
| 2026.3.x | This handbook fully applies |
| 2026.4.x+ | Check OpenClaw Changelog for breaking changes |
Next: Opinions — contrarian views on AI agents.