OpenClaw tasks capability analysis

Scope: background tasks and task flow surfaces only. This is the working audit trail for a reverse-engineered product specification.

Overview

OpenClaw is a local-first personal assistant. The task capability gives operators and agents a shared activity ledger for work that continues outside the immediate conversation. It does not schedule work by itself; it records, exposes, recovers, and controls detached work started by scheduled jobs, child-agent delegation, background agent runs, and long-running tool actions.

Sources consulted

README.md for product identity and operator surface context.
docs/automation/tasks.md for the stated background task model.
docs/cli/tasks.md for task operator command expectations.
docs/automation/taskflow.md for flow-level capabilities above individual task records.
docs/automation/cron-jobs.md and docs/cli/cron.md for scheduled-job interaction with task records.
CHANGELOG.md for the product history around shared background-run tracking, chat task boards, status visibility, and recovery behavior.
Task command, gateway, chat, status, scheduled-job, child-agent, and task-flow tests under src/ to validate executable behavior.

Note: The docs index command was attempted but failed during dependency setup because a local postinstall process could not spawn the runtime executable. The docs pass continued by reading the documentation source files directly.

Capabilities - documented and verified

Users can see detached work as durable activity records VERIFIED: Background work from scheduled jobs, child-agent delegation, external agent sessions, command-initiated runs, and media generation becomes visible as task records with status, owner, timing, progress, delivery state, and outcome. Evidence: docs/automation/tasks.md:15, docs/automation/tasks.md:91, src/tasks/task-registry.ts:1490, src/tasks/task-registry.types.ts:3.
Users can distinguish task progress from scheduling responsibility VERIFIED: The product explicitly positions task records as an activity ledger, while scheduled jobs and heartbeat decide when work runs. Evidence: docs/automation/tasks.md:11, docs/automation/tasks.md:23, docs/automation/cron-jobs.md:41.
Users can track lifecycle outcomes consistently across detached work VERIFIED: Every task record carries one of seven states: queued, running, success, error, timeout, cancellation, or lost. Runtime completion updates active records, and a stronger terminal outcome already on a record is not overwritten by a later success signal. Evidence: docs/automation/tasks.md:119, src/tasks/task-registry.types.ts:5, src/tasks/task-registry.ts:1430, src/tasks/task-registry.ts:1743, src/tasks/task-registry.ts:1782.
Operators can list and filter task records VERIFIED: Operators can view all recent task records and filter by kind or status, with structured output available for automation. Evidence: docs/cli/tasks.md:40, src/cli/program/register.status-health-sessions.ts:381, src/commands/tasks.ts:357.
Operators can inspect one task by multiple lookup tokens VERIFIED: Operators can show a single record by task identity, run identity, or session identity and see timing, delivery, error, and summary fields. Evidence: docs/automation/tasks.md:202, docs/cli/tasks.md:48, src/commands/tasks.ts:409.
Operators can change notification policy for a running task VERIFIED: Operators can set a running task to send terminal-only updates, updates on every state change, or no updates at all. Evidence: docs/automation/tasks.md:175, docs/cli/tasks.md:56, src/commands/tasks.ts:455.
Operators can cancel background work from the task surface VERIFIED: Operators can request cancellation for running work. Child-agent and external agent sessions can be stopped through their backing control surfaces; command-tracked work can be recorded as cancelled. Evidence: docs/automation/tasks.md:210, docs/cli/tasks.md:64, src/commands/tasks.ts:477, src/gateway/server-methods/tasks.test.ts:185.
Operators can audit unhealthy task records VERIFIED: The product surfaces stale queued work, stale running work, lost backing state, delivery failures, missing cleanup metadata, and inconsistent timing. Evidence: docs/automation/tasks.md:223, src/tasks/task-registry.audit.ts:92, src/tasks/task-registry.audit.test.ts:21, src/commands/tasks.test.ts:74.
Operators can preview and apply maintenance VERIFIED: Maintenance can reconcile active records with their backing owner, recover scheduled-run outcomes from durable run history, stamp cleanup dates, prune expired records, and clean old scheduled-run session rows. Evidence: docs/automation/tasks.md:240, docs/automation/tasks.md:316, src/tasks/task-registry.maintenance.ts:953, src/tasks/task-registry.maintenance.ts:1068, src/commands/tasks.ts:584.
Users can receive completion updates without polling VERIFIED: Completion is push-oriented: the product can deliver directly to the remembered channel target or queue an event for the requester session and wake the session. Evidence: docs/automation/tasks.md:161, src/tasks/task-registry.ts:1119, src/tasks/task-registry.test.ts:21.
Chat users can view a current-session task board VERIFIED: A chat command shows active and recent task records linked to the current session. When no linked records are visible, it falls back to a count of the same agent's tasks, which preserves privacy by omitting their details. Evidence: docs/automation/tasks.md:278, src/auto-reply/reply/commands-tasks.ts:84, src/auto-reply/reply/commands-tasks.test.ts:48, src/auto-reply/reply/commands-tasks.test.ts:212.
Status surfaces summarize task pressure VERIFIED: Status views summarize active work and recent failures while hiding stale completed records. Evidence: docs/automation/tasks.md:286, src/status/status-text.ts:116, src/tasks/task-status.ts:162, src/auto-reply/reply/commands-status.test.ts:317.
Programmatic clients can list, read, and cancel tasks VERIFIED: External clients can request task summaries, retrieve a single task, and cancel a task using a stable response shape with sanitized text. Evidence: CHANGELOG.md:1146, src/gateway/protocol/schema/tasks.ts:15, src/gateway/server-methods/tasks.ts:133, src/gateway/server-methods/tasks.test.ts:55, src/gateway/server-methods/tasks.test.ts:139.
Task records survive process restarts and are automatically pruned VERIFIED: Task records persist outside process memory and are restored on demand. Terminal records receive a bounded cleanup time, after which they are pruned. Evidence: docs/automation/tasks.md:302, src/tasks/task-registry.ts:940, src/tasks/task-registry.ts:986, src/tasks/task-registry.store.sqlite.ts:390.
Scheduled-job runs can be recovered before being declared lost VERIFIED: When scheduled-job ownership is no longer live, maintenance checks durable scheduled-run history and job state before marking a task lost. Evidence: docs/automation/tasks.md:147, docs/automation/cron-jobs.md:61, CHANGELOG.md:4205, src/tasks/task-registry.maintenance.issue-60299.test.ts:207.
Users can inspect and cancel higher-level task flows VERIFIED: When work is tracked as a multi-step flow, operators can list, inspect, and request cancellation at the flow level rather than only at the individual task level. Evidence: docs/automation/taskflow.md:95, docs/automation/taskflow.md:123, src/commands/flows.ts:144, src/commands/flows.ts:192, src/commands/flows.ts:252.
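
Illustration: the lifecycle entry above says that a terminal outcome already on a record is protected from a later success signal. The sketch below shows one way such a guard could look. It is not taken from src/tasks/task-registry.ts. The status names, `isTerminal`, and `applyStatus` are hypothetical, and freezing every terminal record (rather than only blocking late success) is a simplifying assumption.

```typescript
// Hypothetical status names based on the states listed in this audit;
// the real identifiers in src/tasks/task-registry.types.ts may differ.
type TaskStatus =
  | "queued"
  | "running"
  | "success"
  | "error"
  | "timeout"
  | "cancelled"
  | "lost";

// Assumption: everything except queued and running is terminal.
const TERMINAL: Record<TaskStatus, boolean> = {
  queued: false,
  running: false,
  success: true,
  error: true,
  timeout: true,
  cancelled: true,
  lost: true,
};

function isTerminal(status: TaskStatus): boolean {
  return TERMINAL[status];
}

// Active records accept whatever the runtime reports next.
// Terminal records keep their outcome, so a late "success" signal
// cannot replace an earlier "cancelled", "error", or "timeout".
function applyStatus(current: TaskStatus, incoming: TaskStatus): TaskStatus {
  if (isTerminal(current)) {
    return current;
  }
  return incoming;
}
```

With this guard, `applyStatus("running", "success")` yields `"success"`, while `applyStatus("cancelled", "success")` stays `"cancelled"`.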

Capabilities - divergent

All terminal tasks notify the user DIVERGENT: The documentation first states that terminal tasks notify the user, then later describes silent notification policies. Code confirms that silent or not-applicable records do not send user-visible completion messages. The formal spec treats notification as policy-controlled, not universal. Evidence: docs/automation/tasks.md:161, docs/automation/tasks.md:175, src/tasks/task-registry.ts:345, src/agents/tools/media-generate-background-shared.ts:117.
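
Illustration: the divergence above comes down to a policy gate in front of every user-visible update. The sketch below shows how such a gate could be expressed. The policy names and the `shouldNotify` function are hypothetical stand-ins for the three options this audit describes (terminal-only, all state changes, silent). They are not the identifiers used in src/tasks/task-registry.ts.

```typescript
// Hypothetical policy names; the product's real identifiers may differ.
type NotifyPolicy = "terminal_only" | "state_changes" | "silent";

// Decide whether a state transition should produce a user-visible update.
// `isTerminalTransition` is true when the task has just reached a terminal state.
function shouldNotify(policy: NotifyPolicy, isTerminalTransition: boolean): boolean {
  switch (policy) {
    case "silent":
      // Silent tasks never send user-visible updates, even on completion.
      return false;
    case "state_changes":
      // Every transition is reported, terminal or not.
      return true;
    case "terminal_only":
      // Only the final outcome is reported.
      return isTerminalTransition;
  }
}
```

Under this model, "terminal tasks notify the user" holds only for the first two policies, which is why the spec treats notification as policy-controlled.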

Capabilities - missing or partial

Operator docs index validation PARTIAL: The task documentation is present and internally cross-linked, but the documented docs-index command could not be completed in this local checkout because dependency setup failed before the index command ran. This is a verification gap for this audit, not evidence that the product capability is absent.

Capabilities - discovered from code

User-visible task text is sanitized before leaving the trusted context DISCOVERED: Task titles, progress, terminal summaries, and errors are stripped or truncated before appearing in chat, status, or client responses, preventing internal runtime context from leaking to users. Evidence: src/tasks/task-status.ts:49, src/gateway/server-methods/tasks.test.ts:139, src/auto-reply/reply/commands-tasks.test.ts:133, src/agents/openclaw-tools.session-status.test.ts:1111.
Task views respect session and agent visibility boundaries DISCOVERED: Session task boards show linked details only for the current session and fall back to aggregate same-agent counts instead of exposing details from other sessions. Programmatic task listing can filter by agent and session. Evidence: src/auto-reply/reply/commands-tasks.test.ts:212, src/gateway/server-methods/tasks.ts:99, src/gateway/server-methods/tasks.test.ts:55.
Media generation prevents accidental duplicate long-running work DISCOVERED: Active media-generation task records are reused to report status for repeated requests, helping users avoid starting duplicate long-running generation. Evidence: docs/automation/tasks.md:108, src/agents/media-generation-task-status-shared.ts:32, src/auto-reply/reply/commands-tasks.test.ts:87.
Task maintenance explains why stale work is retained or reconciled DISCOVERED: Structured maintenance output includes decision reasons for stale running tasks, such as active backing ownership versus missing backing state. Evidence: CHANGELOG.md:79, src/commands/tasks.test.ts:145, src/tasks/task-registry.maintenance.ts:166.
Flow health can identify stalled or inconsistent orchestration state DISCOVERED: Audit behavior covers stuck flows, missing linked tasks, missing blocked tasks, stuck cancellation, and restore failure. Evidence: src/tasks/task-flow-registry.audit.test.ts:70, src/tasks/task-flow-registry.audit.test.ts:114, src/commands/tasks.test.ts:74.
Plugins can access bounded task and flow controls for their own sessions DISCOVERED: Plugin-facing runtime capabilities let a plugin list, resolve, inspect, and cancel task records owned by its bound session, and inspect related flow state. Evidence: CHANGELOG.md:5086, src/plugins/runtime/runtime-tasks.ts:60, src/plugins/runtime/runtime-tasks.ts:119.
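
Illustration: the sanitization entry above describes task text being stripped or truncated before display. The audit does not record the exact rules, so the sketch below is a generic example of the technique rather than the behavior of src/tasks/task-status.ts. The function name `sanitizeTaskText`, the 120-character default, and the choice to strip control characters are all assumptions.

```typescript
// Hypothetical default bound; the product's real limit is not recorded in this audit.
const DEFAULT_MAX_LEN = 120;

function sanitizeTaskText(raw: string, maxLen: number = DEFAULT_MAX_LEN): string {
  // Replace ASCII control characters (including newlines and tabs) with spaces.
  const noControl = raw.replace(/[\u0000-\u001f\u007f]/g, " ");
  // Collapse whitespace runs and trim both ends.
  const collapsed = noControl.replace(/\s+/g, " ").replace(/^ | $/g, "");
  if (collapsed.length <= maxLen) {
    return collapsed;
  }
  // Truncate and append "..." so the result never exceeds maxLen characters.
  const keep = Math.max(0, maxLen - 3);
  return collapsed.slice(0, keep).replace(/\s+$/, "") + "...";
}
```

For example, `sanitizeTaskText("  build\n\tdone  ")` returns `"build done"`, and `sanitizeTaskText("abcdefghij", 8)` returns `"abcde..."`.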

Constraints observed

Tasks are records and controls, not schedulers. Scheduling remains owned by the feature that starts the work.
Heartbeat and normal interactive chat do not create task records.
Terminal records are retained for a bounded period and then pruned.
A task is reported as lost only after a grace period has elapsed and the owning runtime has been checked for live backing state.
Silent notification policy must suppress user-facing task updates.
Session and agent boundaries limit which task details can be shown in chat and tool surfaces.
Task text shown to users and clients must be sanitized and bounded.
Cancellation is best-effort and depends on what owns the underlying work.
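
Illustration: two of the constraints above are time-based checks, the grace period before a task is reported lost and the bounded retention of terminal records. The sketch below shows how such checks could be written. The `TaskSnapshot` fields, both function names, and the durations used in any call are hypothetical. The audit does not record the real grace period or retention window.

```typescript
// Hypothetical record shape; field names are illustrative only.
interface TaskSnapshot {
  active: boolean;       // true while the task is queued or running
  lastEventAt: number;   // epoch milliseconds of the most recent update
  cleanupAfter?: number; // epoch milliseconds after which a terminal record may be pruned
}

// A task is a candidate for "lost" only if it is still active, its owning
// runtime no longer reports it, and the grace period has fully elapsed.
function shouldMarkLost(
  task: TaskSnapshot,
  backingAlive: boolean,
  now: number,
  graceMs: number
): boolean {
  if (!task.active) return false;  // terminal records are never marked lost
  if (backingAlive) return false;  // the owner still claims the work
  return now - task.lastEventAt > graceMs;
}

// A record is prunable only once it is terminal and its cleanup time has passed.
function isPrunable(task: TaskSnapshot, now: number): boolean {
  return !task.active && task.cleanupAfter !== undefined && now >= task.cleanupAfter;
}
```

Keeping both checks as pure functions of a snapshot and a clock value makes it straightforward to preview maintenance decisions before applying them.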

Integrations observed

Scheduled-job runner: creates a task record for every run and supplies recovery evidence.
Child-agent delegation: creates and updates records for delegated work.
External agent session runner: creates and updates records for separate background sessions.
Command-initiated agent runner: records command-backed background work.
Media generation providers: create session-backed background records and use them for status and duplicate guards.
Messaging channels: receive task completion notifications when policy and routing allow.
Operator clients: list, inspect, and cancel task records through a stable client contract.
Plugin runtime: exposes owner-scoped task and flow controls to extensions.

Open questions and ambiguities

Whether the public documentation should soften the phrase "When a task reaches a terminal state, OpenClaw notifies you" to "can notify you" to match silent task policies.
Whether task retention should become configurable for operators with heavier audit requirements.
Whether all task-producing integrations should publish the same level of progress detail, or whether progress is intentionally best-effort by producer.
Whether flow capabilities should be documented as an operator feature, a plugin author feature, or both.