Autonomous Agent Workflow

When you run cspace up <name> --prompt-file <path> with a prompt that references a GitHub issue, cspace provisions a devcontainer instance and launches an autonomous Claude agent inside it. The agent follows a 7-phase workflow defined in the implementer playbook (lib/agents/implementer.md) to resolve the issue end-to-end — from reading the issue through shipping a tested PR.

The agent is fully autonomous: there is no human in the loop. It makes all decisions itself, does not wait for approvals, and is expected to ship a complete, tested pull request.

The agent reads the GitHub issue and creates the development branch:

  1. gh issue view <number> — read the issue description and acceptance criteria
  2. Create a branch from the base: git checkout -b issue-<number> origin/<base-branch>
  3. Push the branch with an empty initial commit
  4. Open a draft PR linking to the issue: gh pr create --draft --base <base-branch>

The base branch is typically main, but the coordinator can set it to a feature branch or another issue branch when managing dependencies.
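As a rough illustration, the setup steps above can be expressed as a list of commands. This is a sketch, not the playbook's actual implementation: the function name `setup_commands`, the commit message, and the PR title and body text are assumptions made for the example. Only the `gh` and `git` subcommands come from the steps above.

```python
def setup_commands(issue: int, base_branch: str = "main") -> list[list[str]]:
    """Build the setup commands for one issue (hypothetical helper)."""
    branch = f"issue-{issue}"
    return [
        # 1. Read the issue description and acceptance criteria.
        ["gh", "issue", "view", str(issue)],
        # 2. Create the working branch from the chosen base.
        ["git", "checkout", "-b", branch, f"origin/{base_branch}"],
        # 3. Push the branch with an empty initial commit
        #    (the commit message is an assumption).
        ["git", "commit", "--allow-empty", "-m", f"Start work on #{issue}"],
        ["git", "push", "-u", "origin", branch],
        # 4. Open a draft PR against the base branch
        #    (title and body are assumptions).
        ["gh", "pr", "create", "--draft", "--base", base_branch,
         "--title", f"Resolve #{issue}", "--body", f"Refs #{issue}"],
    ]


if __name__ == "__main__":
    for command in setup_commands(42, "main"):
        print(" ".join(command))
```

Passing a different `base_branch` covers the case where the coordinator stacks one issue branch on top of another.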

Goal: Build deep understanding of the relevant code before designing a solution.

The agent launches 2–3 parallel code-explorer sub-agents, each targeting a different aspect of the codebase:

  • Similar features — find existing code that does something comparable
  • Architecture — understand the high-level structure, abstractions, and control flow
  • User experience — trace the feature from the user’s perspective

Each explorer identifies 5–10 key files. Once all explorers return, the agent reads every identified file to build comprehensive understanding before moving to design.

Goal: Design multiple implementation approaches with explicit trade-offs.

The agent launches 2–3 parallel code-architect sub-agents, each with a different focus:

Approach            Focus
Minimal changes     Smallest diff, maximum reuse of existing patterns
Clean architecture  Maintainability, elegant abstractions, future extensibility
Pragmatic balance   Speed and quality: good-enough architecture shipped fast

The agent reviews all approaches and selects the one that best fits the specific task. This avoids the common failure mode of jumping to the first idea without considering alternatives.

The agent creates an implementation plan and writes the code. If it encounters work that is important but out of scope for the current issue, it creates a new GitHub issue for that work and continues with the original task.

The agent runs the project’s configured verification commands:

  1. Lint, typecheck, and tests: the command from verify.all in .cspace.json
  2. E2E tests: the command from verify.e2e in .cspace.json

If any checks fail, the agent fixes the issues and re-runs verification until everything passes.
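The verification step can be sketched as follows. The example assumes that verify.all and verify.e2e are string values nested under a top-level "verify" object in .cspace.json, which the text implies but does not spell out. The helper names `load_verify_commands` and `run_checks` are hypothetical.

```python
import json
import subprocess


def load_verify_commands(config_text: str) -> list[str]:
    """Extract the verification commands, in order, from .cspace.json text.

    Assumes a layout like {"verify": {"all": "...", "e2e": "..."}}.
    """
    verify = json.loads(config_text).get("verify", {})
    return [verify[key] for key in ("all", "e2e") if key in verify]


def run_checks(commands, run=None) -> list[str]:
    """Run each command and return the ones that exited non-zero."""
    if run is None:
        # Default runner: execute the command through the shell.
        run = lambda cmd: subprocess.run(cmd, shell=True).returncode
    return [cmd for cmd in commands if run(cmd) != 0]
```

An agent following the loop described above would call `run_checks` again after each round of fixes and stop once it returns an empty list.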

Once verification passes, the agent ships the change:

  1. Commit all changes with a message that includes Closes #<number>
  2. Push the branch: git push
  3. Mark the PR as ready: gh pr ready

The agent performs its own review before considering the task done:

  1. Screenshots — uses Playwright MCP browser tools to capture screenshots of new or changed features from the running preview server
  2. Code review — runs a self-review pass on the PR diff, fixing any issues found
  3. AC verification — re-reads the issue and compares every acceptance criterion against the actual changes, going back to implement anything missing

The supervisor uses the SDK’s streaming-input mode with a queue-backed async iterable:

  1. The initial prompt (the rendered implementer playbook) is pushed to a PromptQueue
  2. The SDK’s query() function consumes the queue as an async iterator
  3. External commands (from the coordinator or host) can push additional user turns into the queue while the agent is running

This allows the coordinator to send directives mid-task — for example, “rebase onto the latest feature branch” or “the requirements changed, also add X” — without restarting the session.
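The queue-backed pattern can be illustrated with a small Python analogue. This is only a sketch of the idea: the real supervisor uses the SDK's own query() function, and the class below is a stand-in whose method names (`push`, `close`) are assumptions rather than the actual PromptQueue API.

```python
import asyncio


class PromptQueue:
    """Async iterable backed by a queue; new turns can be pushed at any time."""

    _CLOSED = object()  # sentinel marking the end of the stream

    def __init__(self) -> None:
        self._queue: asyncio.Queue = asyncio.Queue()

    def push(self, text: str) -> None:
        self._queue.put_nowait(text)

    def close(self) -> None:
        self._queue.put_nowait(self._CLOSED)

    def __aiter__(self):
        return self

    async def __anext__(self) -> str:
        item = await self._queue.get()  # waits until a turn is available
        if item is self._CLOSED:
            raise StopAsyncIteration
        return item


async def consume(queue: PromptQueue) -> list[str]:
    """Stand-in for the SDK consumer: collect each user turn in order."""
    turns = []
    async for prompt in queue:
        turns.append(prompt)
    return turns


async def demo() -> list[str]:
    queue = PromptQueue()
    queue.push("rendered implementer playbook")          # initial prompt
    queue.push("rebase onto the latest feature branch")  # mid-task directive
    queue.close()
    return await consume(queue)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

Because the consumer waits on the queue instead of reading a fixed list, a directive pushed while the session is running is delivered as the next user turn.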

Each supervisor listens on a Unix domain socket for host-side commands:

Command            Behavior
send_user_message  Injects a new user turn into the running session
interrupt          Gracefully cancels the current query
status             Returns session ID, turn count, idle time
shutdown           Closes the prompt queue and exits cleanly

Socket path: /logs/messages/{instance}/supervisor.sock (agents) or /logs/messages/_coordinator/supervisor.sock (coordinator).
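A host-side client for this socket might look like the sketch below. The wire format is not described here, so the example assumes one JSON object per line with a "cmd" field. That framing, the field name, and the helper names are all assumptions for illustration. Only the command names and the socket path come from the text above.

```python
import json
import socket

COMMANDS = {"send_user_message", "interrupt", "status", "shutdown"}


def socket_path(instance: str) -> str:
    """Socket path for an agent instance, or "_coordinator" for the coordinator."""
    return f"/logs/messages/{instance}/supervisor.sock"


def encode_request(command: str, **params) -> bytes:
    """Encode a command as one JSON line (assumed wire format)."""
    if command not in COMMANDS:
        raise ValueError(f"unknown supervisor command: {command}")
    return (json.dumps({"cmd": command, **params}) + "\n").encode()


def send_command(instance: str, command: str, **params) -> bytes:
    """Connect to the supervisor socket, send one request, return the raw reply."""
    with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as sock:
        sock.connect(socket_path(instance))
        sock.sendall(encode_request(command, **params))
        return sock.recv(65536)
```

In practice, cspace send wraps this kind of exchange, so a client like this would mainly be useful for debugging.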

All inter-agent messaging uses cspace send through the Unix socket:

  • Coordinator → agent: cspace send <instance> "directive text" — arrives as a new user turn in the agent’s conversation
  • Agent → coordinator: cspace send _coordinator "Worker issue-42 complete. Status: success. PR: ..." — the agent’s final step, reporting completion
  • Human → anyone: cspace send <instance> "message" or cspace send _coordinator "message" from the host

The coordinator’s session is multi-turn — it stays alive after dispatching workers, waiting for their completion messages to arrive as new user turns. Each worker completion triggers the coordinator to update its status table and start follow-up work.

The supervisor writes structured NDJSON event logs to /logs/events/{instance}/:

session-2026-04-10T12-00-00-000Z-<session-id>.ndjson

Each line is a self-describing envelope:

{
  "ts": "2026-04-10T12:00:00.000Z",
  "instance": "mercury",
  "role": "agent",
  "sdk": { ... }
}
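Since each log line is a standalone JSON object, a reader can process the file one line at a time. The sketch below parses envelopes with the fields shown above. The helper name `parse_events` is hypothetical, and the sample line uses an empty "sdk" object as a placeholder because the payload's contents are not specified here.

```python
import json
from datetime import datetime


def parse_events(lines):
    """Parse NDJSON envelope lines, skipping blank ones.

    The "ts" field is converted to a timezone-aware datetime.
    """
    events = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        event = json.loads(line)
        # Older Python versions do not accept a trailing "Z" in fromisoformat.
        event["ts"] = datetime.fromisoformat(event["ts"].replace("Z", "+00:00"))
        events.append(event)
    return events


if __name__ == "__main__":
    sample = [
        '{"ts": "2026-04-10T12:00:00.000Z", "instance": "mercury", '
        '"role": "agent", "sdk": {}}',
    ]
    for event in parse_events(sample):
        print(event["instance"], event["role"], event["ts"].isoformat())
```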

The diagnostics server (cspace diagnostics-server) tails these logs in real time and exposes agent state over WebSocket for dashboards and over MCP for coordinator diagnostic tools (agent_health, agent_recent_activity, read_agent_stream).

If the SDK emits no events for 10 minutes (configurable via --idle-timeout-ms), the supervisor assumes the agent is stuck, typically on a hung MCP tool call (e.g., a crashed browser sidecar). For agents, it calls interrupt() on the query to unwind gracefully and exits with idle_timeout status. For coordinators, an idle timeout means no worker messages arrived within the window, so the supervisor closes the prompt queue for a clean shutdown.
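The idle check can be sketched as a small watchdog that records the time of the last SDK event. This is an illustration under stated assumptions, not the supervisor's code: the class `IdleWatchdog`, the function `on_idle`, and the action labels it returns are hypothetical. The 10-minute default and the agent/coordinator split come from the description above.

```python
import time


class IdleWatchdog:
    """Tracks time since the last event; the clock is injectable for testing."""

    def __init__(self, idle_timeout_ms: int = 600_000, clock=time.monotonic):
        self._timeout_s = idle_timeout_ms / 1000
        self._clock = clock
        self._last_event = clock()

    def record_event(self) -> None:
        """Call whenever the SDK emits an event."""
        self._last_event = self._clock()

    def is_idle(self) -> bool:
        return self._clock() - self._last_event >= self._timeout_s


def on_idle(role: str) -> str:
    """Pick the timeout action: interrupt agents, close the queue for coordinators."""
    return "interrupt" if role == "agent" else "close_queue"
```

Using a monotonic clock avoids false timeouts if the system wall clock is adjusted while an agent is running.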

The agent system prompt instructs agents to handle errors pragmatically:

  • Persistent tool errors — investigate briefly (2–3 attempts), then exit cleanly with a diagnostic summary
  • Environmental failures (MCP unreachable, browser hung, repeated identical errors) — do not retry indefinitely
  • Final message — always include enough diagnostic context for the coordinator to decide next steps

The coordinator can restart a stuck agent with cspace restart-supervisor <instance> --reason "...", which preserves the workspace and launches a fresh session.