
Workflow Recipes: Core Multi-Agent Processes

Every task’s complete path from human to delivery:

```
Human (Telegram) → COO receives
COO writes Task Brief (Goal / Context / Definition of Done / Priority)
COO dispatches to agent Discord channel
Agent executes → @mentions COO on completion
COO receives result → dispatches to QA
QA reviews → PASS / FAIL
PASS → COO reports to human
FAIL → feedback to original agent → fix → QA re-reviews → loop until PASS
```

Key constraints:

  • COO never touches execution (COO Principle)
  • No QA PASS = never say “done”
  • Every step logged to memory/YYYY-MM-DD.md

QA isn’t optional. It’s hard-coded into the system.

Code deliveries:

  1. Code logic correct
  2. No security vulnerabilities (hardcoded API keys, unvalidated input)
  3. Build passes
  4. Deployment succeeds (HTTP 200, not 404)
  5. Core features work (not just homepage loads — this is a direct result of the Feb 8 incident)
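Items 4 and 5 can be automated. A minimal sketch of a deployment smoke check, assuming hypothetical core paths (`/login`, `/api/health` are placeholders for your app's actual routes):

```python
import urllib.request, urllib.error

def smoke_check(base_url, core_paths=("/", "/login", "/api/health")):
    """Hit each core route; returns a list of (path, status)."""
    results = []
    for path in core_paths:
        try:
            with urllib.request.urlopen(base_url.rstrip("/") + path, timeout=10) as r:
                results.append((path, r.status))
        except urllib.error.HTTPError as e:
            results.append((path, e.code))   # e.g. 404: deployed but broken
        except (urllib.error.URLError, TimeoutError):
            results.append((path, None))     # unreachable

    return results

def qa_verdict(results):
    """Anything that is not HTTP 200 is a QA FAIL."""
    failures = [(p, s) for p, s in results if s != 200]
    return ("PASS", []) if not failures else ("FAIL", failures)
```

Checking more than the homepage is the point: a deploy where `/` loads but `/login` 404s should fail QA.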

Content deliveries:

  1. Facts accurate
  2. Grammar/spelling
  3. Format matches target platform
  4. Sensitive info check (no leaked API keys, personal info)
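Check 4 is mechanical enough to script. A sketch of a leak scanner; the patterns below are illustrative examples, not a complete list, and should be extended for the providers you actually use:

```python
import re

# Illustrative patterns only; add patterns for your own providers.
SECRET_PATTERNS = [
    re.compile(r"sk-[A-Za-z0-9]{20,}"),           # OpenAI-style API keys
    re.compile(r"AKIA[0-9A-Z]{16}"),              # AWS access key IDs
    re.compile(r"(?i)api[_-]?key\s*[:=]\s*\S+"),  # generic key assignments
]

def find_leaks(text):
    """Return every substring that looks like a leaked credential."""
    hits = []
    for pat in SECRET_PATTERNS:
        hits.extend(pat.findall(text))
    return hits
```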
```
QA FAIL → COO extracts failure reason
COO creates fix Brief (with QA's specific feedback)
Routes back to original agent
Agent fixes → resubmits
QA second review
If fails again → escalate to human
```

Hard rule: Same task fails QA twice consecutively → escalate to human, stop cycling. Infinite QA loops are as harmful as infinite API retries.
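The fix loop with the two-strikes cap can be sketched as follows. `dispatch`, `run_qa`, and `report` are stand-ins for the COO's actual Discord routing, not real OpenClaw APIs:

```python
def run_task(brief, dispatch, run_qa, report, max_fails=2):
    """Drive one task through dispatch → QA → fix loop → escalation."""
    result = dispatch(brief)
    fails = 0
    while True:
        verdict, feedback = run_qa(result)
        if verdict == "PASS":
            report(result)            # only now may the COO say "done"
            return "DONE"
        fails += 1
        if fails >= max_fails:        # two consecutive fails: stop cycling
            return "ESCALATE"
        result = dispatch(f"{brief}\nQA feedback: {feedback}")
```

The counter is the whole point: without `max_fails`, a task that can never pass QA burns tokens forever.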


All tasks tracked in memory/task-queue.md:

```
- [ ] Task description | agent: xxx | priority: P1 | added: ISO-time
- [~] In-progress task | dispatched: ISO-time
- [x] Completed task | completed: ISO-time | qa_status: PASS
- [-] Recycled task | reason: timeout/no-response
```
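The format is deliberately line-oriented, so a few lines of code can parse it. A sketch (field names follow the example entries above):

```python
import re

LINE_RE = re.compile(r"^- \[([ ~x-])\] (.+)$")
STATUS = {" ": "pending", "~": "in_progress", "x": "done", "-": "recycled"}

def parse_task_line(line):
    """Parse one task-queue.md entry into a dict of its fields."""
    m = LINE_RE.match(line.strip())
    if not m:
        return None                            # not a task line
    marker, rest = m.groups()
    parts = [p.strip() for p in rest.split("|")]
    task = {"status": STATUS[marker], "description": parts[0]}
    for field in parts[1:]:
        key, _, value = field.partition(":")   # ISO times keep their colons
        task[key.strip()] = value.strip()
    return task
```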
| Level | Meaning | Behavior |
|---|---|---|
| P0 | Human explicitly requested | Needs human approval before execution |
| P1 | Important | Notify human, then execute |
| P2 | Automated | Execute directly |
| P3 | Low-risk | Execute directly, use budget model |
  1. Every human instruction → immediately becomes a [ ] task
  2. COO checks queue every heartbeat (15 min)
  3. If any [ ] tasks exist → dispatch immediately
  4. Dispatched → [~]
  5. Completed → [x]
  6. Timeout (30 min no response) → [-] + alert
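Rules 3-6 make up one heartbeat pass. A minimal sketch over parsed queue entries; `dispatch` and `alert` are stand-ins for the COO's actual messaging calls:

```python
from datetime import datetime, timedelta

TIMEOUT = timedelta(minutes=30)

def heartbeat(tasks, now, dispatch, alert):
    """One COO heartbeat pass; each task is a dict with at least 'status'."""
    for task in tasks:
        if task["status"] == "pending":            # rule 3: dispatch immediately
            dispatch(task)
            task["status"] = "in_progress"         # rule 4: [ ] → [~]
            task["dispatched"] = now.isoformat()
        elif task["status"] == "in_progress":
            started = datetime.fromisoformat(task["dispatched"])
            if now - started > TIMEOUT:            # rule 6: 30 min, no response
                task["status"] = "recycled"        # [~] → [-]
                alert(task)
```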

Never allowed: Writing “waiting for human confirmation” unless genuinely needing human decision (payment, publishing, irreversible actions).

Origin story: February 11, 2026. Sub-agent finished at 7:14 AM. Remaining tasks written on a “Still TODO” list. Not executed. COO idle for 4 hours. Human woke up to find nothing done.


Layer 1: COO self-check (every 15 min)

1. Read memory/task-queue.md
2. Check [~] tasks for timeout
3. Check today's log for anomalies
4. Issue found → Telegram human
5. All clear → HEARTBEAT_OK

Layer 2: Agent block detection (every 10 min). Scan agent Discord channels; if an agent has posted a question with no reply for 15+ minutes → alert.

Known limitation: only detects “agent asked, nobody answered.” Cannot detect “agent promised delivery then disappeared.” That needs Layer 3.

Layer 3: Task-level timeout (30 min)

```
Each [~] task records a dispatched timestamp
If now - dispatched > 30 min with no completion
→ Telegram alert: "Task XXX dispatched to Coder 30+ min ago, no completion"
```

Most reliable layer. Pure time-based. No dependency on agent behavior.

When timeout detected:

```
Step 1: Check agent logs
├── Error logs → Identify error type
│   ├── API error (429/403/500) → Switch model or wait
│   ├── Tool call timeout → Likely SSH/network issue
│   └── Context overflow → Needs new session
└── Logs normal → Agent may just be slower than expected

Step 2: Telegram alert to human
    Include: who, what task, how long, log summary

Step 3: Human decides
├── Retry → New session, re-dispatch
├── Switch agent → Dispatch to different capable agent
├── Switch model → Same agent, more suitable model
└── Manual → Human handles it
```
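Step 1's triage can be sketched as a simple heuristic classifier. The string matches are assumptions about what typically shows up in the logs, not a spec of OpenClaw's log format:

```python
def classify_failure(log_text):
    """Heuristic mapping from agent log contents to the branches above."""
    text = log_text.lower()
    if any(code in log_text for code in ("429", "403", "500")):
        return "api_error: switch model or wait"
    if "timeout" in text:
        return "tool_timeout: check SSH/network"
    if "context" in text and "overflow" in text:
        return "context_overflow: needs new session"
    return "logs_normal: agent may just be slow"
```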

MacBook Air (COO + QA) and Mac Mini (Coder + Research + Marketing) on different gateways. sessions_send only works within the same gateway.

  • COO → Coder/Research/Marketing: Discord messages
  • Coder/Research/Marketing → COO: Discord @mention
  • COO → QA: sessions_send (same gateway, direct)
```sh
~/.openclaw/bin/openclaw message send \
  --channel discord \
  --target "⚡・coder" \
  --message "[Task brief] — COO (COO)"
```
| Problem | Cause | Fix |
|---|---|---|
| sessions_send not working | Only works within same gateway | Use Discord messages |
| @mention not triggering | Plain text @name, not `<@BOT_ID>` | Use correct Discord ID format |
| Messages silently dropped | Receiver users list missing bot ID | Add all bot IDs + `ignoreBots: false` |
| Config changes not taking effect | Session cached old config | Must `openclaw gateway restart` after config changes |
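The second row is worth spelling out: Discord only fires a mention event when the message contains the raw ID form, not the display name. A tiny sketch (the ID below is a placeholder):

```python
BOT_ID = "123456789012345678"    # placeholder; substitute the real bot's ID

def mention(user_id):
    """Discord markup that actually triggers a ping, vs. inert plain text."""
    return f"<@{user_id}>"

# "@coder please review" renders as plain text and notifies nobody;
# the form below fires the mention event the listener is waiting for.
message = f"{mention(BOT_ID)} please review the deploy logs"
```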

Recipe 6: Memory management & optimization

```
memory/
├── MEMORY.md              # Long-term decision rules (manual, <200 lines)
├── task-queue.md          # Current task queue
├── improvements-log.md    # Improvement log (updated after each failure)
├── 2026-01-25.md          # Daily logs
├── ...
└── 2026-03-09.md
```
| Layer | File | Content | Loading |
|---|---|---|---|
| L1: Hard rules | SOUL.md | Role definition, behavior constraints, hard rules | Auto-loaded every session start |
| L2: Working memory | memory/YYYY-MM-DD.md | Today's tasks, decisions, progress | Auto-read every heartbeat |
| L3: Long-term rules | MEMORY.md | Cross-day decisions, lessons, config rules | Auto-loaded every session start |
```
# SOUL.md rules:
# ⛔ HARD RULE: When context usage > 60%:
#   1. Save current state to memory/YYYY-MM-DD.md
#   2. Execute /compact
#   3. Continue working
# ⛔ HARD RULE: Never let context exceed 70%
```
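As pure decision logic, the two hard rules reduce to a threshold check. A sketch (how you obtain the token counts depends on your setup and is not shown):

```python
COMPACT_AT = 0.60    # save state, then /compact
HARD_CAP   = 0.70    # never exceed

def context_action(used_tokens, window_tokens):
    """What the SOUL.md context rules require right now."""
    usage = used_tokens / window_tokens
    if usage > HARD_CAP:
        return "compact_now"         # hard cap breached
    if usage > COMPACT_AT:
        return "save_and_compact"    # 1) write daily log  2) /compact
    return "continue"
```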

When to trigger compaction:

  • Context usage hits 60% → auto-trigger
  • After task completion → archive results to daily log, then compact
  • Session running >2 hours → proactive compact

Compaction isn’t a built-in OpenClaw feature — it’s implemented via the /compact command or auto-trigger rules in SOUL.md:

  1. /compact tells the model to summarize the conversation, discarding early details
  2. You can also trigger similar behavior via cron
  3. Critical info should be saved to memory files before compacting

We tried embedding-based memory retrieval but found:

  1. Unreliable recall — semantic search recall rate isn’t high enough, critical info may not be retrieved
  2. Noisy results — “relevant” content mixed with irrelevant info, diluting useful context
  3. Maintenance overhead — requires additional infrastructure (vector DB, embedding API)
  4. Non-deterministic — you don’t know what the agent will retrieve each time

Our alternative: SOUL.md + MEMORY.md = deterministic documents loaded every session. 100% reliable.


Provide all context upfront. Target: each sub-agent completes in 3-5 tool calls.

```md
## Task
[One sentence describing the goal]

## Context
- Relevant file paths: [specific paths]
- Background: [what the agent needs to know]
- References: [links or files]

## Done Criteria
- [ ] Specific completion standard 1
- [ ] Specific completion standard 2
- [ ] Specific completion standard 3

## Constraints
- Priority: P[0-3]
- Deadline: [time]
- Model: [recommended model]
```
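A filled-in brief might look like this (hypothetical task; the file paths and times are placeholders):

```md
## Task
Add a /health endpoint to the API server.

## Context
- Relevant file paths: src/server.ts, src/routes/
- Background: QA needs an endpoint for deployment smoke checks
- References: the existing /status route in src/routes/

## Done Criteria
- [ ] GET /health returns HTTP 200 with {"ok": true}
- [ ] Unit test added and passing
- [ ] No hardcoded secrets introduced

## Constraints
- Priority: P2
- Deadline: today 18:00
- Model: budget model is fine
```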

If a task involves API calls, write a script first then run it. Don’t let agents make API calls one by one.

  • Infrastructure failure (API timeout, 429) → max 2 retries
  • After 2 → stop 30 minutes or change approach
  • Never retry infinitely
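The retry cap as code, a minimal sketch (`InfraError` is a stand-in for whatever exception your API client raises on timeouts and 429s):

```python
import time

class InfraError(Exception):
    """Stand-in for infrastructure failures: API timeouts, HTTP 429, etc."""

def call_with_retry(call, max_retries=2, wait_seconds=1):
    """At most max_retries retries; never retry infinitely."""
    for attempt in range(max_retries + 1):
        try:
            return call()
        except InfraError:
            if attempt == max_retries:
                raise        # retries exhausted: stop 30 min or change approach
            time.sleep(wait_seconds)
```

The bounded `range` is what makes an infinite loop structurally impossible.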

The incident behind this rule: one sub-agent spent 30 minutes retrying an exhausted API. Thirty minutes, zero output.

Further reading: See Anthropic’s agent design best practices and OpenAI’s agent building guide.


All configs and commands tested on OpenClaw 2026.3.2.

| Your version | Notes |
|---|---|
| 2026.2.x | Most configs compatible; `openclaw message send` params may differ slightly |
| 2026.3.x | This handbook fully applies |
| 2026.4.x+ | Check the OpenClaw Changelog for breaking changes |

Next: Opinions — contrarian views on AI agents.