Concord design

Revenue anomaly investigation swarm

A coordinator-led agentic swarm investigates a flagged date range: it pulls metrics through governed SQL, retrieves prior incidents, and optionally spawns a deeper analyst. It produces one finance-review artifact. The workflow is read-only with respect to external systems and sends nothing externally.

Trigger: external_webhook | api_request
Plan: agentic swarm
Cancellation: graceful
Approvals: 1
Side effects: 0 (read-only)
02

User journey

An upstream anomaly detector flags a suspicious revenue window and pings this workflow. A coordinator agent investigates, optionally delegates to an analyst sub-agent for deeper root-cause work, and drafts a finance-grade summary artifact. The finance director reviews. On approval, the artifact is marked published in Concord and visible to the finance team in-app. Nothing is emailed or pushed externally.

User journey
flowchart LR
    Detector(["Anomaly detector<br/>flags date range"]) --> Webhook[/"external_webhook<br/>or api_request"/]
    Webhook --> Coord["Coordinator agent<br/>investigates"]
    Coord --> Analyst{"Deeper analysis<br/>needed?"}
    Analyst -->|yes| Sub["Analyst sub-agent<br/>focused dive"]
    Analyst -->|no| Draft
    Sub --> Draft["Draft summary artifact"]
    Draft --> Review["Finance director<br/>human review"]
    Review --> Published(["Published artifact<br/>visible to finance team"])
    style Coord fill:#EFE6F0,stroke:#7A5560
    style Sub fill:#EFE6F0,stroke:#7A5560
    style Review fill:#F5E0D2,stroke:#D97757
    style Published fill:#F1F2EC,stroke:#6B7B5A
03

Trigger / ingress

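Two supported ingress paths, both producing the same command type: external_webhook (the anomaly detector) and api_request (a manual request from an analyst).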

Idempotency

Ingress-level idempotency_key template:

idempotency_key = "revenue_investigation:{ingress}:{event_id_or_action_id}"
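As a minimal sketch of how the template above could be filled in and used for deduplication: the function name `build_idempotency_key` and the `InMemoryCommandStore` class are hypothetical, and the in-memory dict only stands in for whatever durable uniqueness check Concord actually applies.

```python
INGRESS_KINDS = frozenset({"external_webhook", "api_request"})


def build_idempotency_key(ingress: str, source_id: str) -> str:
    """Fill the template revenue_investigation:{ingress}:{event_id_or_action_id}."""
    if ingress not in INGRESS_KINDS:
        raise ValueError(f"unsupported ingress: {ingress!r}")
    if not source_id:
        raise ValueError("event_id or action_id is required")
    return f"revenue_investigation:{ingress}:{source_id}"


class InMemoryCommandStore:
    """Hypothetical stand-in for a durable store that enforces key uniqueness."""

    def __init__(self) -> None:
        self._command_ids: dict[str, str] = {}

    def create_once(self, key: str, command_id: str) -> tuple[str, bool]:
        """Return (command_id, created); a replayed key maps to the first command."""
        if key in self._command_ids:
            return self._command_ids[key], False
        self._command_ids[key] = command_id
        return command_id, True
```

With this shape, a webhook redelivered with the same event id resolves to the command that was created the first time instead of starting a second investigation.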

Payload schema (key fields)

field             type     notes
date_range_start  date     required; inclusive
date_range_end    date     required; inclusive; max 31 days
metric_scope      enum     gross | net | recognized | bookings
anomaly_score     float    0-1, from detector; null on manual
requested_by      user_id  analyst on manual; system:detector on webhook

Implicit context: workspace_id, tenant_id, trace_id.
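The payload rules above could be checked with something like the following sketch. The `InvestigationPayload` dataclass and `validate_payload` function are illustrative names, and counting the 31-day limit with both endpoints included is an assumption drawn from the "inclusive" notes.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

METRIC_SCOPES = frozenset({"gross", "net", "recognized", "bookings"})
MAX_WINDOW_DAYS = 31


@dataclass(frozen=True)
class InvestigationPayload:
    date_range_start: date
    date_range_end: date
    metric_scope: str
    anomaly_score: Optional[float]  # None on manual requests
    requested_by: str


def validate_payload(payload: InvestigationPayload) -> list[str]:
    """Return a list of validation errors; an empty list means the payload is valid."""
    errors: list[str] = []
    if payload.date_range_end < payload.date_range_start:
        errors.append("date_range_end precedes date_range_start")
    else:
        # Assumption: both endpoints are inclusive, so a same-day window is 1 day.
        window_days = (payload.date_range_end - payload.date_range_start).days + 1
        if window_days > MAX_WINDOW_DAYS:
            errors.append(f"window is {window_days} days; max is {MAX_WINDOW_DAYS}")
    if payload.metric_scope not in METRIC_SCOPES:
        errors.append(f"unknown metric_scope: {payload.metric_scope!r}")
    score = payload.anomaly_score
    if score is not None and not 0.0 <= score <= 1.0:
        errors.append("anomaly_score must be between 0 and 1")
    if not payload.requested_by:
        errors.append("requested_by is required")
    return errors
```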

04

Command(s)

One command type, which parents the entire swarm.

field              value
command_type       investigate_revenue_anomaly
ingress            external_webhook | api_request
cancellation_mode  graceful
idempotency_key    revenue_investigation:{ingress}:{event_id_or_action_id}
status path        created → validated → queued → running → succeeded | failed | cancelled | expired

The standard status machine applies. compensated is not reachable: no external writes, so no compensation chain exists.
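One way to picture the status path is as a transition table. The sketch below is an assumption-laden reading, not the standard machine itself: it routes cancellation through the `cancelling` state described in the cancellation model, and it assumes a `created → failed` edge for policy denial at validation. Other early-exit edges are left out because this document does not specify them.

```python
# Assumed transition table; only the linear path and the terminal states
# come from the status path above.
TRANSITIONS: dict[str, frozenset[str]] = {
    "created": frozenset({"validated", "failed"}),  # failed edge is assumed (policy_denied)
    "validated": frozenset({"queued"}),
    "queued": frozenset({"running"}),
    "running": frozenset({"succeeded", "failed", "cancelling", "expired"}),
    "cancelling": frozenset({"cancelled"}),
}
TERMINAL = frozenset({"succeeded", "failed", "cancelled", "expired"})


def can_transition(current: str, target: str) -> bool:
    """True if the table permits moving from current to target."""
    return target in TRANSITIONS.get(current, frozenset())


def advance(current: str, target: str) -> str:
    """Return the new status, or raise if the move is not in the table."""
    if not can_transition(current, target):
        raise ValueError(f"illegal transition: {current} -> {target}")
    return target
```

Because no state lists `compensated` as a target, the table itself shows that status is unreachable for this command.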

05

Policy stack

Composed at validate time and before each tool call:

permission
cost
data_safety
connector_scope
agent_risk
memory_consent
approval_requirement
external_sharing
rate_limit
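The composition step could look roughly like the sketch below. The combination rule is an assumption, since this document only lists the policies: a deny from any policy wins immediately, a missing policy fails closed, and approval requirements accumulate. The `Decision` enum and `evaluate_stack` function are hypothetical names.

```python
from enum import Enum
from typing import Callable, Mapping


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    DENY = "deny"


POLICY_ORDER = (
    "permission", "cost", "data_safety", "connector_scope", "agent_risk",
    "memory_consent", "approval_requirement", "external_sharing", "rate_limit",
)

Policy = Callable[[Mapping[str, object]], Decision]


def evaluate_stack(
    policies: Mapping[str, Policy], context: Mapping[str, object]
) -> tuple[Decision, list[str]]:
    """Return the combined decision and the policy names responsible for it."""
    needs_approval: list[str] = []
    for name in POLICY_ORDER:
        policy = policies.get(name)
        # Assumption: an unregistered policy fails closed rather than allowing.
        decision = policy(context) if policy is not None else Decision.DENY
        if decision is Decision.DENY:
            return Decision.DENY, [name]
        if decision is Decision.REQUIRE_APPROVAL:
            needs_approval.append(name)
    if needs_approval:
        return Decision.REQUIRE_APPROVAL, needs_approval
    return Decision.ALLOW, []
```

The same function can be called once at validate time and again before each tool call, with the context carrying the tool name and arguments in the second case.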
06

Execution plan

Async, agentic swarm. The coordinator runs on agent_runs; the optional analyst runs on swarm_children. SQL goes through connector_calls.

Execution plan
flowchart TB
    Ingress([Trigger]) --> Validate["Validate payload<br/>sync_function"]
    Validate --> Policy{"Policy stack<br/>allow?"}
    Policy -->|deny| Fail(["Command failed:<br/>policy_denied"])
    Policy -->|allow| Enqueue["Enqueue SwarmRun<br/>queue: agent_runs"]
    Enqueue --> Coord["Coordinator AgentRun<br/>agent_tool_call loop"]
    Coord --> SQL["run_sql via Snowflake<br/>connector_call<br/>queue: connector_calls"]
    Coord --> Mem["retrieve_memory<br/>agent_tool_call"]
    Coord --> Decide{"Need deeper<br/>analysis?"}
    Decide -->|yes| Analyst["Analyst sub-AgentRun<br/>queue: swarm_children"]
    Decide -->|no| Draft
    Analyst --> Draft["Draft artifact<br/>sync_function"]
    Draft --> Approval["Approval: finance director<br/>human_task"]
    Approval --> Publish["Mark artifact published<br/>sync_function"]
    Publish --> Done([succeeded])
    style Policy fill:#F5E0D2,stroke:#D97757
    style Decide fill:#F5E0D2,stroke:#D97757
    style Coord fill:#EFE6F0,stroke:#7A5560
    style Analyst fill:#EFE6F0,stroke:#7A5560
    style Approval fill:#F5E0D2,stroke:#D97757
    style Done fill:#F1F2EC,stroke:#6B7B5A

Execution modes used

sync_function
async_task
connector_call
agent_tool_call
human_task
07

Effects table

Read-only investigation. No CoreEffect rows are emitted to domain_effects. The artifact is an Artifact, not an effect.

Note

Effects are side-effect plans on the outside world. This workflow performs no external writes — Snowflake access is SELECT-only, memory writes are internal Concord state, and the artifact is in-app. Therefore the effects table is empty by design.

08

Memory

Reads

The coordinator's retrieve_memory tool searches prior incidents at scope = project (the finance-investigations project). Hit shape: incident summary, date range, root cause, resolution.

Writes

On a confirmed root cause, the coordinator proposes a memory candidate:

field                value
scope                project (finance-investigations)
type                 fact
commit_policy        candidate → memory_consent policy → require_approval → commit
conflict_resolution  supersede with audit
Rule

No memory write commits without explicit consent from the requester. The candidate is durable; the commit is gated.
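The candidate-then-commit gate can be illustrated with a small sketch. The `MemoryCandidate` class and `commit_candidate` function are hypothetical, and the `candidate` and `committed` status names are borrowed from the memory.candidate and memory.committed audit events rather than from a documented schema.

```python
from dataclasses import dataclass


@dataclass
class MemoryCandidate:
    scope: str
    memory_type: str
    content: str
    status: str = "candidate"  # durable as soon as it is proposed


def commit_candidate(candidate: MemoryCandidate, requester_consented: bool) -> str:
    """Commit only with explicit consent; return the audit event name that applies."""
    if candidate.status != "candidate":
        raise ValueError(f"cannot commit from status {candidate.status!r}")
    if not requester_consented:
        # The candidate stays stored but is never promoted without consent.
        return "memory.candidate"
    candidate.status = "committed"
    return "memory.committed"
```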

09

Artifacts

field          value
artifact_type  report
status path    draft → created → validated → published
storage        Postgres pointer; body in object storage (location ref)
versioning     new artifact per command run; supersedes prior via parent_artifact_id
archival       90-day visibility; archived to cold storage thereafter
contents       flagged window, metric breakdowns, retrieved prior incidents, analyst notes (if spawned), coordinator's hypothesis, confidence band

The artifact also references a query_result sub-artifact per SQL run, for traceability.

10

Approvals

One approval. Triggered by the approval_requirement policy when the artifact transitions from validated to published.

Review packet

Approver: finance director (role-resolved, not a hardcoded user).

On rejection: artifact stays validated, command transitions to succeeded (the investigation ran), and the rejection reason becomes an audit event. No alternate publish path.

On expiration: artifact stays validated; a follow-up approval_callback may re-request after edits.
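The three approval outcomes can be summarized as a lookup, sketched below. The `ApprovalResult` type and `resolve_approval` function are illustrative names. The command status on expiration is left as `None` because this document does not state it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ApprovalResult:
    artifact_status: str
    command_status: Optional[str]  # None where this document does not say
    audit_event: str


def resolve_approval(outcome: str) -> ApprovalResult:
    """Map an approval outcome to the resulting artifact and command state."""
    if outcome == "approved":
        return ApprovalResult("published", "succeeded", "approval.approved")
    if outcome == "rejected":
        # The investigation still ran, so the command succeeds; nothing is published.
        return ApprovalResult("validated", "succeeded", "approval.rejected")
    if outcome == "expired":
        return ApprovalResult("validated", None, "approval.expired")
    raise ValueError(f"unknown approval outcome: {outcome!r}")
```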

Approval flow
sequenceDiagram
    participant Swarm as Swarm
    participant Concord as Concord
    participant Director as Finance Director
    participant FinanceTeam as Finance Team
    Swarm->>Concord: artifact.status = validated
    Concord->>Concord: policy: external_sharing + approval_requirement
    Concord->>Director: approval.pending<br/>review_packet
    Director->>Concord: approve | reject
    alt approved
        Concord->>Concord: artifact.status = published
        Concord-->>FinanceTeam: in-app visibility
    else rejected
        Concord->>Concord: artifact stays validated
        Concord->>Concord: audit: approval.rejected
    else expired (72h)
        Concord->>Concord: approval.status = expired
    end
11

Agent / swarm config

SwarmRun

field            value
objective        Investigate flagged revenue window; produce finance-grade summary
execution_mode   hierarchical
join_strategy    coordinator_synthesis
max_agents       2 (coordinator + at most one analyst)
max_depth        2
max_total_steps  40
max_cost_units   30

Coordinator AgentRun

field               value
goal                investigate window, decide if delegation needed, draft artifact
allowed_tools       run_sql, retrieve_memory, propose_memory_write, spawn_analyst, create_artifact
allowed_connectors  snowflake (reader), postgres (Concord state), vector_db (memory)
memory_scope        project (finance-investigations)
context_scope       workspace_id, command_id, payload
max_steps           20
spawn_depth         1 (may spawn one child)

Analyst sub-AgentRun (optional)

field               value
goal                deeper root-cause dive on a sub-window or metric dimension
allowed_tools       run_sql, retrieve_memory (subset of parent)
allowed_connectors  snowflake (reader), vector_db (subset of parent)
memory_scope        project (same as parent, no widening)
max_steps           20
spawn_depth         0 (cannot spawn further; max_depth=2 reached)
Rule

child_scope ⊆ parent_scope for tools, connectors, and memory. The analyst's allowed_tools and allowed_connectors are each a strict subset of the coordinator's; memory scope is identical (project) and may not widen.
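The subset rule lends itself to a mechanical check at spawn time. The sketch below uses the tool and connector lists from the tables above. The `AgentScope` type and `scope_violations` function are hypothetical names, and treating any change of memory scope as a violation is a simplification of "may not widen".

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AgentScope:
    tools: frozenset[str]
    connectors: frozenset[str]
    memory_scope: str


COORDINATOR = AgentScope(
    tools=frozenset({"run_sql", "retrieve_memory", "propose_memory_write",
                     "spawn_analyst", "create_artifact"}),
    connectors=frozenset({"snowflake", "postgres", "vector_db"}),
    memory_scope="project",
)
ANALYST = AgentScope(
    tools=frozenset({"run_sql", "retrieve_memory"}),
    connectors=frozenset({"snowflake", "vector_db"}),
    memory_scope="project",
)


def scope_violations(parent: AgentScope, child: AgentScope) -> list[str]:
    """List every way the child scope exceeds the parent scope."""
    violations: list[str] = []
    extra_tools = child.tools - parent.tools
    if extra_tools:
        violations.append(f"tools not granted to parent: {sorted(extra_tools)}")
    extra_connectors = child.connectors - parent.connectors
    if extra_connectors:
        violations.append(f"connectors not granted to parent: {sorted(extra_connectors)}")
    # Simplification: any memory scope other than the parent's counts as widening.
    if child.memory_scope != parent.memory_scope:
        violations.append("memory_scope differs from parent")
    return violations
```

Refusing to start a child run whenever `scope_violations` returns a non-empty list would keep a misconfigured analyst from ever receiving a tool such as `http_post`.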

Swarm hierarchy
flowchart TB
    Swarm["SwarmRun<br/>objective: investigate window<br/>execution_mode: hierarchical<br/>join_strategy: coordinator_synthesis"]
    Swarm --> Coord["Coordinator AgentRun<br/>max_steps: 20<br/>spawn_depth: 1"]
    Coord -->|optional spawn| Analyst["Analyst sub-AgentRun<br/>max_steps: 20<br/>spawn_depth: 0<br/>scope ⊆ coordinator"]
    Coord -->|synthesizes| Join["Join: coordinator_synthesis<br/>draft artifact"]
    Analyst -->|returns findings| Join
    style Swarm fill:#EFE6F0,stroke:#7A5560
    style Coord fill:#EFE6F0,stroke:#7A5560
    style Analyst fill:#EFE6F0,stroke:#7A5560
    style Join fill:#F1F2EC,stroke:#6B7B5A

Allowed tools

run_sql
retrieve_memory
propose_memory_write
spawn_analyst
create_artifact

Forbidden tools

send_email
slack_post
http_post
write_sql
file_upload_external
commit_memory_direct

Allowed connectors

snowflake (reader)
postgres (Concord state)
vector_db (memory)
12

Audit events

All events are written to domain_events, each tagged with one of three purpose values.

purpose = audit

command.created
policy.decision
approval.requested
approval.approved
approval.rejected
approval.expired
artifact.published
memory.candidate
memory.committed

purpose = agent_step

Per agent step: step_index, tool_name, latency_ms, token_usage, cost_units. Carries agent_run_id for the coordinator and analyst separately.

purpose = event

investigation.started
investigation.completed
analyst.spawned
analyst.returned
13

Cancellation model

Mode: graceful. Transitions: running → cancelling → cancelled.

Each agent step's prelude checks the cancellation flag. In-flight SQL is allowed to finish (read-only, bounded); the next step is skipped and the swarm exits.

propagate_cancellation = true: cancelling the parent command cancels the coordinator AgentRun, which cancels the analyst sub-AgentRun if active.
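A minimal sketch of the cooperative check and the parent-to-child cascade follows. The `AgentRun` class here is a toy model rather than Concord's runtime type, and it assumes each step is a plain callable that runs to completion once started.

```python
import threading
from typing import Callable


class AgentRun:
    """Toy model of a run that checks a cancellation flag before each step."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.status = "running"
        self.children: list["AgentRun"] = []
        self._cancel_requested = threading.Event()

    def spawn_child(self, name: str) -> "AgentRun":
        child = AgentRun(name)
        self.children.append(child)
        return child

    def cancel(self) -> None:
        """Request cancellation and propagate it to every child run."""
        self._cancel_requested.set()
        if self.status == "running":
            self.status = "cancelling"
        for child in self.children:
            child.cancel()

    def run(self, steps: list[Callable[[], None]]) -> list[int]:
        """Run steps in order; return the indices of the steps that completed."""
        completed: list[int] = []
        for index, step in enumerate(steps):
            if self._cancel_requested.is_set():  # step prelude check
                break
            step()  # an in-flight step is allowed to finish
            completed.append(index)
        cancelled = self._cancel_requested.is_set()
        self.status = "cancelled" if cancelled else "succeeded"
        return completed
```

In this model a cancel signal that arrives during step 1 lets step 1 finish, skips step 2, and leaves both the coordinator and any spawned analyst in `cancelled`.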

Cancellation cascade
flowchart TB
    Cancel([Cancel signal]) --> Cmd["Command<br/>running → cancelling"]
    Cmd --> Coord["Coordinator AgentRun<br/>cancelling"]
    Coord --> Analyst["Analyst sub-AgentRun<br/>cancelling (if active)"]
    Coord --> SQL["In-flight SQL<br/>allowed to finish"]
    Analyst --> Done["All children<br/>cancelled"]
    SQL --> Done
    Done --> Final["Command<br/>cancelled"]
    style Cmd fill:#F5E0D2,stroke:#D97757
    style Coord fill:#EFE6F0,stroke:#7A5560
    style Analyst fill:#EFE6F0,stroke:#7A5560

Not chosen: compensate_then_stop. There are no external writes to undo.

14

Compensation graph

Skipped — not applicable. Read-only investigation: zero CoreEffect rows, no external mutations, no inverse operations to declare. The compensation graph validator passes trivially (empty manifest).

Note

If a future iteration adds an external write (e.g., publish to a finance Slack channel or file a Jira ticket on root-cause confirmation), revisit this section and declare the inverse per side effect.

15

Runtime config

Adapter

DBOS. Required capabilities: DURABLE_WORKFLOWS, DURABLE_STEPS, QUEUES, SIGNALS, SUBWORKFLOWS. SAGA_COMPENSATION_NATIVE not needed (no effects).

Queues

agent_runs
swarm_children
connector_calls

RetryPolicy

operation        retryable                       max_attempts  backoff_s
run_sql          yes (transient + rate_limited)  3             5, 15, 45
retrieve_memory  yes (transient)                 3             2, 6, 18
spawn_analyst    no                              1
create_artifact  yes (transient db)              3             2, 6, 18
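The retry table could be applied by a wrapper like the sketch below. This is not the DBOS retry API: `TransientError`, `RetryPolicy`, and `call_with_retry` are hypothetical names. It also assumes that the wait after failed attempt n is the nth backoff value, so with three attempts only the first two values are ever used.

```python
import time
from dataclasses import dataclass
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures the table treats as retryable."""


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int
    backoff_s: tuple[float, ...] = ()


RETRY_POLICIES = {
    "run_sql": RetryPolicy(3, (5, 15, 45)),
    "retrieve_memory": RetryPolicy(3, (2, 6, 18)),
    "spawn_analyst": RetryPolicy(1),  # not retryable: a single attempt
    "create_artifact": RetryPolicy(3, (2, 6, 18)),
}


def call_with_retry(
    operation: str,
    fn: Callable[[], T],
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn under the operation's policy, sleeping between failed attempts."""
    policy = RETRY_POLICIES[operation]
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == policy.max_attempts:
                raise  # attempts exhausted; surface the last error
            sleep(policy.backoff_s[attempt - 1])
    raise AssertionError("unreachable: max_attempts is at least 1")
```

Passing `sleep` as a parameter keeps the wrapper testable without real delays.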

Constraints

16

Test plan

Unit

Integration

Workflow (end-to-end)

Safety

Boundary discipline

17

Open questions / risks