A coordinator-led agentic swarm investigates a flagged date range (it pulls metrics through governed SQL, retrieves prior incidents, and optionally spawns a deeper analyst) and produces one finance-review artifact. The workflow is read-only toward external systems and sends nothing externally.
An upstream anomaly detector flags a suspicious revenue window and pings this workflow. A coordinator agent investigates, optionally delegates to an analyst sub-agent for deeper root-cause work, and drafts a finance-grade summary artifact. The finance director reviews. On approval, the artifact is marked published in Concord and visible to the finance team in-app. Nothing is emailed or pushed externally.
Two supported ingress paths, same command type:
- external_webhook: from the anomaly detector. Payload carries a provider event_id.
- api_request: manual kickoff by a finance analyst via internal UI. Payload carries a UI-generated action_id.

Ingress-level idempotency_key template:
idempotency_key = "revenue_investigation:{ingress}:{event_id_or_action_id}"
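As a minimal sketch of how the template above could be filled in, the helper below builds the key from the ingress name and the source identifier. The function name `build_idempotency_key` and its validation rules are assumptions for illustration, not part of the spec.

```python
# Hypothetical helper: builds the ingress-level idempotency key from the template.
SUPPORTED_INGRESS = ("external_webhook", "api_request")


def build_idempotency_key(ingress: str, source_id: str) -> str:
    """source_id is the provider event_id (webhook) or the UI action_id (manual)."""
    if ingress not in SUPPORTED_INGRESS:
        raise ValueError(f"unsupported ingress: {ingress!r}")
    if not source_id:
        raise ValueError("event_id or action_id is required")
    return f"revenue_investigation:{ingress}:{source_id}"
```

Because the key embeds the ingress, a webhook event and a manual action with the same raw identifier would not collide under this sketch.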
| field | type | notes |
|---|---|---|
| date_range_start | date | required; inclusive |
| date_range_end | date | required; inclusive; max 31 days |
| metric_scope | enum | gross \| net \| recognized \| bookings |
| anomaly_score | float | 0-1, from detector; null on manual |
| requested_by | user_id | analyst on manual; system:detector on webhook |

Implicit context: workspace_id, tenant_id, trace_id.
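The payload rules in the table can be sketched as a validation function. This is an illustration only: the class and function names are hypothetical, and it assumes "max 31 days" counts both inclusive endpoints.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

METRIC_SCOPES = {"gross", "net", "recognized", "bookings"}
MAX_WINDOW_DAYS = 31  # assumption: counted inclusively on both ends


@dataclass(frozen=True)
class InvestigationPayload:
    date_range_start: date
    date_range_end: date
    metric_scope: str
    anomaly_score: Optional[float]  # None on manual kickoff
    requested_by: str


def validate_payload(p: InvestigationPayload) -> List[str]:
    """Return a list of validation errors; an empty list means the payload is valid."""
    errors = []
    if p.date_range_end < p.date_range_start:
        errors.append("date_range_end precedes date_range_start")
    elif (p.date_range_end - p.date_range_start).days + 1 > MAX_WINDOW_DAYS:
        errors.append("date range exceeds 31 days")
    if p.metric_scope not in METRIC_SCOPES:
        errors.append(f"unknown metric_scope: {p.metric_scope!r}")
    if p.anomaly_score is not None and not 0.0 <= p.anomaly_score <= 1.0:
        errors.append("anomaly_score must be within 0-1 or null")
    return errors
```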
One command type parents the entire swarm.
| field | value |
|---|---|
| command_type | investigate_revenue_anomaly |
| ingress | external_webhook \| api_request |
| cancellation_mode | graceful |
| idempotency_key | revenue_investigation:{ingress}:{event_id_or_action_id} |
| status path | created → validated → queued → running → succeeded \| failed \| cancelled \| expired |

The standard status machine applies. compensated is not reachable: no external writes, so no compensation chain exists.
Composed at validate time and before each tool call:
- permission: only members of finance_investigators or the detector service account can trigger.
- cost: hard cap max_cost_units = 30 for the SwarmRun.
- data_safety: SQL tool runs under a row-level-secured view; raw PII columns are denied at the warehouse.
- connector_scope: Snowflake reader role; SELECT only, no INSERT/UPDATE/DELETE.
- agent_risk: max_steps = 20 per AgentRun, spawn_depth ≤ 2, max_agents = 2.
- memory_consent: every memory write is a candidate; require_approval from the requester before commit.
- approval_requirement: artifact transition to published requires finance director sign-off.
- external_sharing: set to deny for all egress connectors; the only "sharing" is in-app artifact visibility.
- rate_limit: bucket on connector_calls queue; cap SQL calls per minute per tenant.

Async, agentic swarm. The coordinator runs on agent_runs; the optional analyst runs on swarm_children. SQL goes through connector_calls.
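To illustrate how policies from the set above might be composed before a tool call, here is a rough sketch covering three of them (permission, cost, agent_risk). The `CallContext` shape, the function names, and the treatment of each limit as "deny once reached" are assumptions, not the documented engine behavior.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, List, Optional


@dataclass(frozen=True)
class CallContext:  # hypothetical snapshot taken before each tool call
    requester_groups: FrozenSet[str]
    is_detector_service: bool
    cost_units_used: float
    agent_steps_used: int
    active_agents: int


# A policy returns None to allow, or a denial reason.
Policy = Callable[[CallContext], Optional[str]]


def permission(ctx: CallContext) -> Optional[str]:
    allowed = ctx.is_detector_service or "finance_investigators" in ctx.requester_groups
    return None if allowed else "permission: requester may not trigger"


def cost(ctx: CallContext) -> Optional[str]:
    # Assumption: the next call is denied once 30 units have been consumed.
    return "cost: max_cost_units reached" if ctx.cost_units_used >= 30 else None


def agent_risk(ctx: CallContext) -> Optional[str]:
    if ctx.agent_steps_used >= 20:
        return "agent_risk: max_steps reached"
    if ctx.active_agents > 2:
        return "agent_risk: max_agents exceeded"
    return None


def evaluate(policies: List[Policy], ctx: CallContext) -> List[str]:
    """Run every policy and collect denial reasons; an empty list means allowed."""
    denials = []
    for policy in policies:
        reason = policy(ctx)
        if reason is not None:
            denials.append(reason)
    return denials
```

Collecting every denial, rather than stopping at the first, is a design choice made here so that the audit trail can show all violated policies at once.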
Read-only investigation. No CoreEffect rows are emitted to domain_effects. The artifact is an Artifact, not an effect.
Effects are side-effect plans on the outside world. This workflow performs no external writes — Snowflake access is SELECT-only, memory writes are internal Concord state, and the artifact is in-app. Therefore the effects table is empty by design.
The coordinator's retrieve_memory tool searches prior incidents at scope = project (the finance-investigations project). Hit shape: incident summary, date range, root cause, resolution.
On a confirmed root cause, the coordinator proposes a memory candidate:
| field | value |
|---|---|
| scope | project (finance-investigations) |
| type | fact |
| commit_policy | candidate → memory_consent policy → require_approval → commit |
| conflict_resolution | supersede with audit |

No memory write commits without explicit consent from the requester. The candidate is durable; the commit is gated.
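The gated commit described above can be sketched as a two-state record: the candidate is stored immediately, and only the requester's explicit consent moves it to committed. The names `MemoryCandidate`, `commit_candidate`, and the status strings are hypothetical.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class MemoryCandidate:  # hypothetical shape of a durable candidate row
    scope: str
    type: str
    content: str
    requested_by: str  # the requester whose consent gates the commit
    status: str = "candidate"


def commit_candidate(candidate: MemoryCandidate, approved_by: str) -> MemoryCandidate:
    """Commit only when the original requester has explicitly approved."""
    if candidate.status != "candidate":
        raise ValueError(f"cannot commit from status {candidate.status!r}")
    if approved_by != candidate.requested_by:
        raise PermissionError("memory_consent: approval must come from the requester")
    return replace(candidate, status="committed")
```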
| field | value |
|---|---|
| artifact_type | report |
| status path | draft → created → validated → published |
| storage | Postgres pointer; body in object storage (location ref) |
| versioning | new artifact per command run; supersedes prior via parent_artifact_id |
| archival | 90-day visibility; archived to cold storage thereafter |
| contents | flagged window, metric breakdowns, retrieved prior incidents, analyst notes (if spawned), coordinator's hypothesis, confidence band |

The artifact also references a query_result sub-artifact per SQL run, for traceability.
One approval. Triggered by the approval_requirement policy when the artifact transitions from validated to published.
- requested_action: publish revenue investigation artifact to finance team view
- requester: command's requested_by
- reason: anomaly score and flagged window
- affected_data: artifact draft preview + linked query_result snapshots
- triggering_policy: external_sharing + approval_requirement
- expected_outcome: artifact visible in-app to finance_team group
- risk_level: medium (financial reporting)
- expiration: 72 hours

Approver: finance director (role-resolved, not a hardcoded user).
On rejection: artifact stays validated, command transitions to succeeded (the investigation ran), and the rejection reason becomes an audit event. No alternate publish path.
On expiration: artifact stays validated; a follow-up approval_callback may re-request after edits.
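The approval outcomes above can be summarized in a small resolver. This is a sketch under stated assumptions: the `ApprovalOutcome` type, the note strings, and the choice to treat a request as expired exactly at the 72-hour mark are all illustrative.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

APPROVAL_TTL = timedelta(hours=72)


@dataclass(frozen=True)
class ApprovalOutcome:  # hypothetical result type
    artifact_status: str
    note: Optional[str]


def resolve_approval(
    decision: Optional[str], requested_at: datetime, now: datetime, reason: str = ""
) -> ApprovalOutcome:
    """decision is 'approved', 'rejected', or None while the request is pending."""
    if decision == "approved":
        return ApprovalOutcome("published", None)
    if decision == "rejected":
        # The artifact stays validated; the reason is kept for the audit event.
        return ApprovalOutcome("validated", f"approval_rejected: {reason}")
    if decision is None:
        expired = now - requested_at >= APPROVAL_TTL
        return ApprovalOutcome("validated", "approval_expired" if expired else None)
    raise ValueError(f"unknown decision: {decision!r}")
```

In every non-approved branch the artifact status remains validated, which matches the rule that there is no alternate publish path.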
SwarmRun:

| field | value |
|---|---|
| objective | Investigate flagged revenue window; produce finance-grade summary |
| execution_mode | hierarchical |
| join_strategy | coordinator_synthesis |
| max_agents | 2 (coordinator + at most one analyst) |
| max_depth | 2 |
| max_total_steps | 40 |
| max_cost_units | 30 |

Coordinator AgentRun:

| field | value |
|---|---|
| goal | investigate window, decide if delegation needed, draft artifact |
| allowed_tools | run_sql, retrieve_memory, propose_memory_write, spawn_analyst, create_artifact |
| allowed_connectors | snowflake (reader), postgres (Concord state), vector_db (memory) |
| memory_scope | project (finance-investigations) |
| context_scope | workspace_id, command_id, payload |
| max_steps | 20 |
| spawn_depth | 1 (may spawn one child) |

Analyst AgentRun:

| field | value |
|---|---|
| goal | deeper root-cause dive on a sub-window or metric dimension |
| allowed_tools | run_sql, retrieve_memory (subset of parent) |
| allowed_connectors | snowflake (reader), vector_db (subset of parent) |
| memory_scope | project (same as parent, no widening) |
| max_steps | 20 |
| spawn_depth | 0 (cannot spawn further; max_depth=2 reached) |

child_scope ⊆ parent_scope for tools, connectors, and memory. The analyst's allowed_tools and allowed_connectors are each a strict subset of the coordinator's; memory scope is identical (project) and may not widen.
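The containment rule above lends itself to a mechanical check at spawn time. The sketch below uses set comparison for tools and connectors and equality for memory scope; `AgentScope` and `validate_child_scope` are hypothetical names.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class AgentScope:  # hypothetical container for an agent's grants
    tools: FrozenSet[str]
    connectors: FrozenSet[str]
    memory_scope: str


def validate_child_scope(parent: AgentScope, child: AgentScope) -> List[str]:
    """Return violations of child_scope ⊆ parent_scope; empty means the spawn is allowed."""
    violations = []
    extra_tools = child.tools - parent.tools
    if extra_tools:
        violations.append(f"tools not granted to parent: {sorted(extra_tools)}")
    extra_connectors = child.connectors - parent.connectors
    if extra_connectors:
        violations.append(f"connectors not granted to parent: {sorted(extra_connectors)}")
    if child.memory_scope != parent.memory_scope:
        violations.append("memory_scope must match the parent and may not widen")
    return violations


coordinator = AgentScope(
    tools=frozenset({"run_sql", "retrieve_memory", "propose_memory_write",
                     "spawn_analyst", "create_artifact"}),
    connectors=frozenset({"snowflake", "postgres", "vector_db"}),
    memory_scope="project",
)
analyst = AgentScope(
    tools=frozenset({"run_sql", "retrieve_memory"}),
    connectors=frozenset({"snowflake", "vector_db"}),
    memory_scope="project",
)
```

With the values from the two agent tables, `validate_child_scope(coordinator, analyst)` returns an empty list.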
All written to domain_events with appropriate purpose.
Recorded per agent step: step_index, tool_name, latency_ms, token_usage, cost_units. Each record carries the agent_run_id, so coordinator and analyst steps are attributed separately.
Mode: graceful. Transitions: running → cancelling → cancelled.
Each agent step's prelude checks the cancellation flag. In-flight SQL is allowed to finish (read-only, bounded); the next step is skipped and the swarm exits.
propagate_cancellation = true: cancelling the parent command cancels the coordinator AgentRun, which cancels the analyst sub-AgentRun if active.
Not chosen: compensate_then_stop. There are no external writes to undo.
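The graceful mode described above (a flag checked in each step's prelude, propagated from parent to child) can be sketched as follows. The `AgentRun` class here is a simplified in-memory stand-in, not the durable runtime object.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class AgentRun:  # simplified, hypothetical stand-in for a durable AgentRun
    name: str
    children: List["AgentRun"] = field(default_factory=list)
    cancel_requested: bool = False
    steps_run: int = 0

    def request_cancel(self) -> None:
        """Set the flag here and on every descendant (propagate_cancellation = true)."""
        self.cancel_requested = True
        for child in self.children:
            child.request_cancel()

    def run_step(self, step: Callable[[], None]) -> bool:
        """Prelude check: skip the step if cancellation was requested."""
        if self.cancel_requested:
            return False
        step()  # an in-flight step is allowed to finish
        self.steps_run += 1
        return True
```

A step that has already started is never interrupted in this sketch; only the next step's prelude observes the flag, which mirrors the rule that in-flight SQL is allowed to finish.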
Skipped — not applicable. Read-only investigation: zero CoreEffect rows, no external mutations, no inverse operations to declare. The compensation graph validator passes trivially (empty manifest).
If a future iteration adds an external write (e.g., publish to a finance Slack channel or file a Jira ticket on root-cause confirmation), revisit this section and declare the inverse per side effect.
DBOS. Required capabilities: DURABLE_WORKFLOWS, DURABLE_STEPS, QUEUES, SIGNALS, SUBWORKFLOWS. SAGA_COMPENSATION_NATIVE not needed (no effects).
| operation | retryable | max_attempts | backoff_s |
|---|---|---|---|
| run_sql | yes (transient + rate_limited) | 3 | 5, 15, 45 |
| retrieve_memory | yes (transient) | 3 | 2, 6, 18 |
| spawn_analyst | no | 1 | n/a |
| create_artifact | yes (transient db) | 3 | 2, 6, 18 |

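The retry table can be expressed as data plus a small wrapper. This is a sketch, not the DBOS retry mechanism: `TransientError`, `RetryPolicy`, and `call_with_retry` are hypothetical, and it assumes max_attempts counts total calls, so with three attempts only the first two delays are ever slept.

```python
import time
from dataclasses import dataclass
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for retryable failures (transient or rate limited)."""


@dataclass(frozen=True)
class RetryPolicy:
    max_attempts: int
    backoff_s: Tuple[int, ...] = ()


RETRY_POLICIES = {
    "run_sql": RetryPolicy(3, (5, 15, 45)),
    "retrieve_memory": RetryPolicy(3, (2, 6, 18)),
    "spawn_analyst": RetryPolicy(1),
    "create_artifact": RetryPolicy(3, (2, 6, 18)),
}


def call_with_retry(
    operation: str, fn: Callable[[], T], sleep: Callable[[float], None] = time.sleep
) -> T:
    policy = RETRY_POLICIES[operation]
    for attempt in range(1, policy.max_attempts + 1):
        try:
            return fn()
        except TransientError:
            if attempt == policy.max_attempts:
                raise  # attempts exhausted: surface the failure
            sleep(policy.backoff_s[attempt - 1])
    raise RuntimeError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the wrapper testable without real delays.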
- max_cost_units = 30 per SwarmRun.
- workspace_id on every read.
- metric_scope.
- SELECT passes, INSERT/UPDATE/DELETE denied at connector.
- coordinator_synthesis → artifact reflects analyst findings.
- On rejection: artifact stays validated, command succeeded, audit event recorded.
- On expiration: artifact stays validated, approval marked expired.
- Non-finance_investigators trigger → policy_denied, no SwarmRun row.
- send_email → blocked at policy, audit event, agent step recorded.
- agent_risk policy denial.
- max_cost_units = 30 → terminates with agent_failed.
- concord_boundary_check.py rejects dbos / temporalio imports outside the runtime adapter file.
- rate_limit bucket needs to be sized: current cap is TBD per-tenant per-hour. Wire to alerting if rejection rate climbs.
- parent_artifact_id linkage is manual.
- config_version. Adapter capability WORKFLOW_VERSIONING is not in DBOS today; flag for Temporal if it becomes load-bearing.
- anomaly_score = null. The coordinator's prompt should handle this gracefully; confirm in evals.
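For the import-boundary check mentioned above, one possible approach is to parse each source file and flag top-level imports of the runtime packages. This is an illustrative sketch using the standard `ast` module; it is not the contents of concord_boundary_check.py, and `forbidden_imports` is a hypothetical name.

```python
import ast
from typing import List

FORBIDDEN_ROOTS = {"dbos", "temporalio"}


def forbidden_imports(source: str) -> List[str]:
    """Return imported module names whose root package is a forbidden runtime."""
    found = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            names = [node.module]  # absolute "from x import y" only
        else:
            continue
        found.extend(n for n in names if n.split(".")[0] in FORBIDDEN_ROOTS)
    return found
```

A checker built on this would run it over every file except the runtime adapter and fail when the returned list is non-empty.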