Sub-Agent Orchestration & Delegation

How a NetStacks agent delegates work to specialist or ephemeral child agents, the workflow graph, parking/await, and the fail-closed guardrails that bound a delegation tree.

Overview

A NetStacks agent is an autonomous, background task runner. It executes a ReAct loop (reason → act → observe) against your devices using a set of tools, and runs to completion without an interactive chat. Sub-agent orchestration lets a running agent delegate a self-contained sub-job to another agent — either a user-declared specialist or an ephemeral child it spins up on the fly — then wait for that child's result and continue.

Delegation exists for three reasons:

Specialization — hand a sub-job to an agent that has a sharper system prompt and tighter focus (for example a "BGP Auditor" or a "Config Diff Reviewer").
Context scoping — isolate a noisy sub-task (a large show tech-support, a long log scan) in a child so the parent's context window stays clean.
Parallelism & auditability — each child is a first-class run with its own transcript, correlated to the parent and independently auditable in the workflow graph.

Where this lives

The two delegation tools (list_specialists and delegate_to_agent) are registered for every agent task automatically — you do not enable them. Specialist agents are declared in the Agents panel; child runs appear in the Workflow view of the parent's run tab. Everything runs locally in the bundled Local Agent.

Core Concepts

Every agent run is an agent_task row. Delegation adds linkage fields so a tree of runs can be reconstructed, bounded, and cancelled as a unit:

parent_task_id: The immediate parent run. None means a top-level (root) task.
root_task_id: The top of the delegation tree, denormalized onto every descendant so the whole subtree can be fetched and counted cheaply.
depth: 0 for a root; incremented by one at each delegation level. Enforced against the depth cap.
spawned_by_agent_definition_id: The specialist definition this child was delegated to. None marks an ephemeral child. This is the field the UI reads to label a node Specialist vs. Ephemeral.
delegation_label: The short, human-readable label the parent gave the sub-job (for example "check BGP on core"), shown on the node in the workflow graph.

A child task carries status through the same lifecycle as any task — pending → running → one of completed, failed, or cancelled (the three terminal states). The parent blocks until the child reaches a terminal state, then reads its result.
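As a rough illustration of the linkage fields and lifecycle described above, the sketch below models a task row as a Python dataclass. The field names and status values come from this page; the class itself, the helper names, and the choice to treat a root as belonging to its own tree are assumptions made for the example, not the actual NetStacks schema.

```python
from dataclasses import dataclass
from typing import List, Optional

# The three terminal states named in this section.
TERMINAL_STATES = {"completed", "failed", "cancelled"}


@dataclass
class AgentTask:
    """Hypothetical in-memory stand-in for an agent_task row."""
    id: str
    status: str = "pending"
    parent_task_id: Optional[str] = None   # None means a root task
    root_task_id: Optional[str] = None     # denormalized onto descendants
    depth: int = 0                         # 0 for a root
    spawned_by_agent_definition_id: Optional[str] = None  # None means ephemeral
    delegation_label: Optional[str] = None

    def is_root(self) -> bool:
        return self.parent_task_id is None

    def is_terminal(self) -> bool:
        return self.status in TERMINAL_STATES


def subtree_of(tasks: List[AgentTask], root_id: str) -> List[AgentTask]:
    """Fetch a whole delegation tree with one filter on root_task_id.

    Assumption: the root row may or may not carry its own id in
    root_task_id, so the root is matched by id as well.
    """
    return [t for t in tasks if t.id == root_id or t.root_task_id == root_id]
```

Because root_task_id is stored on every descendant, counting or cancelling a tree needs no recursive parent walk, only this single filter.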

Specialists vs. Ephemeral Children

Aspect	Specialist child	Ephemeral child
Created by	Passing `agent_id` to `delegate_to_agent`	Omitting `agent_id`
System prompt	The specialist definition's saved system prompt	The default agent system prompt
Model / temperature / limits	The specialist definition's overrides	The global agent config
Workflow label	Specialist	Ephemeral
Reusable	Yes — declared once, used by name	No — exists only for this sub-job

When a specialist is named, the child is created with that definition written to both agent_definition_id (so the child executes with the specialist's prompt and settings) and spawned_by_agent_definition_id (the delegation marker). An ephemeral child leaves both None and runs with the default agent configuration.
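The way a child row is populated can be sketched as a small function. This is a minimal sketch under stated assumptions: tasks are plain dicts, the helper name new_child_fields is hypothetical, and a root is assumed to have no root_task_id of its own, so the child falls back to the parent's id.

```python
from typing import Optional


def new_child_fields(parent: dict,
                     specialist_id: Optional[str] = None,
                     label: Optional[str] = None) -> dict:
    """Build the linkage fields for a delegated child (illustrative only)."""
    return {
        "parent_task_id": parent["id"],
        # Every descendant points at the same root, however deep it sits.
        "root_task_id": parent.get("root_task_id") or parent["id"],
        "depth": parent["depth"] + 1,
        # A named specialist is written to both fields; an ephemeral
        # child leaves both as None.
        "agent_definition_id": specialist_id,
        "spawned_by_agent_definition_id": specialist_id,
        "delegation_label": label,
        "status": "pending",
    }
```

Calling it with a specialist id yields a Specialist child; calling it without one yields an Ephemeral child that runs on the default agent configuration.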

The Two Delegation Tools

Orchestration is exposed to the model as two tools. The intended flow is discover, then delegate:

Call list_specialists to see what declared specialists exist.
If one matches the sub-job, call delegate_to_agent with its id. If none matches, call delegate_to_agent without an id to spin up an ephemeral child.

The default Agent Operating Guide that is appended to every agent's system prompt states this explicitly, so a well-behaved model already knows the pattern:

To delegate a self-contained sub-task, call list_specialists, then
delegate_to_agent with a specialist's id (or omit the id for an
ephemeral child) and use its result.

Editable operating guide

The operating guide is editable in Settings → Prompts (the "Agent Operating Guide", stored at the ai.agent_operating_guide setting). It falls back to the built-in default when unset. This is where you reinforce when your agents should delegate versus do the work themselves.

list_specialists

Returns the enabled user-declared specialist agents the caller may delegate to. Disabled definitions are filtered out. It takes no input.

tool definition (json)

{
  "name": "list_specialists",
  "input_schema": { "type": "object", "properties": {}, "required": [] }
}

The result is a flat list of id, name, and description:

list_specialists result (json)

{
  "specialists": [
    {
      "id": "a1f3...",
      "name": "BGP Auditor",
      "description": "Reviews BGP session health and flags AS/prefix mismatches."
    },
    {
      "id": "b7c2...",
      "name": "Config Diff Reviewer",
      "description": "Compares running vs. startup config and summarizes drift."
    }
  ]
}

Call it first

The tool's own description instructs the model to call list_specialists before delegate_to_agent: if a specialist matches, pass its id; if none matches, delegate without an id. Write a good description when you declare a specialist, because it is the only thing the model sees when deciding whether to route to it.
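The discover-then-delegate flow can be sketched as a function that turns a list_specialists result into delegate_to_agent arguments. In practice the model reads the descriptions and decides; the exact-name match here is only a stand-in for that judgment, and build_delegate_args is a hypothetical helper, not part of NetStacks.

```python
from typing import Optional


def build_delegate_args(listing: dict, wanted_name: str, prompt: str,
                        label: Optional[str] = None) -> dict:
    """Pick a specialist by name if one exists, else go ephemeral."""
    match = next(
        (s for s in listing["specialists"]
         if s["name"].lower() == wanted_name.lower()),
        None,
    )
    args = {"prompt": prompt}          # prompt is the only required field
    if match is not None:
        args["agent_id"] = match["id"]  # delegate to the specialist
    # With no match, agent_id is simply omitted: an ephemeral child.
    if label:
        args["label"] = label
    return args
```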

delegate_to_agent

Creates a linked child run, spawns it, blocks until the child's result is available, and returns that result so the parent can continue. Only prompt is required.

tool definition (json)

{
  "name": "delegate_to_agent",
  "input_schema": {
    "type": "object",
    "properties": {
      "prompt": {
        "type": "string",
        "description": "Full, self-contained instructions for the child agent."
      },
      "agent_id": {
        "type": "string",
        "description": "Specialist agent id from list_specialists. Omit for an ephemeral child."
      },
      "label": {
        "type": "string",
        "description": "Short label for this sub-job (e.g. 'check BGP on core'), shown in the workflow graph."
      },
      "timeout_seconds": {
        "type": "integer",
        "description": "Max seconds to wait for the child (default 300)."
      }
    },
    "required": ["prompt"],
    "additionalProperties": false
  }
}

On success, the tool returns the child's terminal state and result:

delegate_to_agent result, child finished (json)

{
  "delegated": true,
  "child_id": "9d4e...",
  "specialist_id": "a1f3...",
  "status": "completed",
  "result": "{ ... the child's result_json ... }",
  "error": null
}

If the child has not finished by timeout_seconds (default 300s), the tool returns control to the parent but leaves the child running in the background — its result is still reachable from the workflow graph:

delegate_to_agent result, wait timed out (json)

{
  "delegated": true,
  "child_id": "9d4e...",
  "status": "running",
  "note": "Child still running after 300s; it continues in the background — check the workflow for its result."
}
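The bounded wait can be illustrated with asyncio. This is a sketch of the general technique, not the NetStacks implementation (which this page does not show): asyncio.wait with a timeout returns without cancelling the awaited task, so the child keeps running after the parent gets control back. The function name await_child and the demo values are assumptions.

```python
import asyncio


async def await_child(child: asyncio.Task, child_id: str,
                      timeout_seconds: float = 300) -> dict:
    """Wait up to timeout_seconds for a child without cancelling it."""
    done, _pending = await asyncio.wait({child}, timeout=timeout_seconds)
    base = {"delegated": True, "child_id": child_id}
    if child not in done:
        # Timed out: hand control back, leave the child running.
        return {**base, "status": "running",
                "note": f"Child still running after {timeout_seconds}s; "
                        "it continues in the background."}
    if child.cancelled():
        return {**base, "status": "cancelled", "result": None,
                "error": "cancelled"}
    exc = child.exception()
    if exc is not None:
        return {**base, "status": "failed", "result": None, "error": str(exc)}
    return {**base, "status": "completed", "result": child.result(),
            "error": None}


async def demo():
    async def slow_child():
        await asyncio.sleep(0.2)
        return "audit finished"

    child = asyncio.create_task(slow_child())
    first = await await_child(child, "child-1", timeout_seconds=0.05)
    final = await child  # the child was never cancelled by the timeout
    return first, final
```

Running asyncio.run(demo()) returns a "running" payload first and the child's real result afterwards, mirroring the two result shapes above.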

Refusals are not errors

When a guardrail trips (depth, fan-out, tree size) or a named specialist is missing or disabled, delegate_to_agent returns a structured refusal — { "delegated": false, "reason": "..." } — rather than throwing. The model reads the reason and adapts (does the work itself, combines sub-jobs) instead of retrying blindly. An empty prompt is the one case that is a hard input error.
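A caller consuming these payloads has three shapes to tell apart: a refusal, a finished child, and a child still running. The sketch below shows one way to branch on them; the function name next_step and the returned strings are illustrative assumptions, while the keys (delegated, status, reason, error) are the ones shown in this section.

```python
def next_step(result: dict) -> str:
    """Decide what the parent should do with a delegate_to_agent result."""
    if not result.get("delegated"):
        # Structured refusal: adapt instead of retrying blindly.
        return f"do it yourself: {result.get('reason', 'no reason given')}"
    status = result.get("status")
    if status == "completed":
        return "use result"
    if status == "running":
        # The wait timed out; the child continues in the background.
        return "check workflow later"
    # failed or cancelled: surface the error to the parent's reasoning.
    return f"child {status}: {result.get('error')}"
```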

Parking & Await (the Permit Model)

Agent tasks run under a concurrency cap: the executor holds a semaphore, and the default Local Agent allows 3 concurrent tasks. Every running task holds one permit. This creates an obvious hazard: if a parent holds a permit while it waits for its child, and the cap is full, the child can never acquire a permit to run — the parent waits forever for a child that can never start. That is a deadlock.

NetStacks avoids this with parking tools. A tool that blocks inside its own execution waiting on something external declares parks() = true. Two tools park:

delegate_to_agent — parks while awaiting a delegated child agent.
ask_user — parks while awaiting a human's answer to a clarifying question.

When the ReAct loop dispatches a parking tool, it releases the permit before the tool blocks and re-acquires it (cancellation-aware) after the tool returns. While a parent is parked awaiting its child, the slot is free for the child to take: no starvation, no deadlock, and long waits never tie up a slot in the cap.

Why the parent never holds a slot its child needs

This is the whole point of parking: a parent awaiting a child releases its concurrency slot for the duration of the wait. Per-call approval prompts for mutating tools release the slot the same way — a slow human decision will not starve the cap-3 pool.
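The permit model can be demonstrated with an asyncio.Semaphore. To make the hazard visible the sketch uses a cap of 1 rather than the default 3, so a parent that kept its permit would block its own child forever. Everything here (function names, the log, the cap of 1) is an assumption for illustration, not the NetStacks executor.

```python
import asyncio


async def child_job(permits: asyncio.Semaphore, log: list) -> str:
    async with permits:                 # the child needs a permit to run
        log.append("child running")
        await asyncio.sleep(0)
        return "child result"


async def parent_job(permits: asyncio.Semaphore, log: list) -> str:
    await permits.acquire()
    try:
        log.append("parent running")
        child = asyncio.create_task(child_job(permits, log))
        permits.release()               # park: free the slot before blocking
        log.append("parent parked")
        try:
            result = await child
        finally:
            await permits.acquire()     # take a slot again before resuming
        log.append(f"parent resumed with {result!r}")
        return result
    finally:
        permits.release()


async def main():
    permits = asyncio.Semaphore(1)      # cap of 1 to force the contention
    log: list = []
    # wait_for is only a guard: a deadlock would surface as a timeout.
    result = await asyncio.wait_for(parent_job(permits, log), timeout=2)
    return result, log
```

If the release and re-acquire pair were removed, the child could never obtain the single permit and the wait_for guard would time out instead of returning.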

Cancellation propagates down the tree

Cancelling a run cancels its entire sub-agent subtree breadth-first, so a parent awaiting a child never hangs and no orphaned children keep running. A running child is signalled to stop; a child still pending is marked cancelled before it starts. Driving the child to a terminal state is also what ends the parent's wait.
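A breadth-first cancel over the task tree might look like the sketch below. It works on an in-memory dict of tasks and only flips statuses; signalling a live process to stop is out of scope. The function name and data layout are assumptions for the example.

```python
from collections import deque
from typing import Dict, List


def cancel_subtree(tasks: Dict[str, dict], root_id: str) -> List[str]:
    """Cancel root_id and all descendants breadth-first.

    tasks maps task id -> {"parent_task_id": ..., "status": ...}.
    Returns the ids that were cancelled, in visiting order.
    """
    # Index children by parent once so the walk is linear.
    children: Dict[str, List[str]] = {}
    for task_id, task in tasks.items():
        children.setdefault(task["parent_task_id"], []).append(task_id)

    cancelled = []
    queue = deque([root_id])
    while queue:
        task_id = queue.popleft()
        # Running tasks are stopped, pending ones never start;
        # tasks already in a terminal state are left untouched.
        if tasks[task_id]["status"] in ("pending", "running"):
            tasks[task_id]["status"] = "cancelled"
            cancelled.append(task_id)
        queue.extend(children.get(task_id, []))
    return cancelled
```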

Guardrails: Depth, Fan-Out & Tree Size

Delegation is fail-closed and bounded so a tree cannot recurse or fan out without limit. Every delegation is checked against the calling task's own depth and its root's tree size, so the bounds hold at every level of the tree. The hard caps are:

Limit	Value	Meaning
Max depth	3	How deep delegation can nest below the root before it is refused.
Max children per task	5	Direct fan-out from any single agent.
Max descendants per root	25	Total runs in one delegation tree.
Default wait timeout	300s	How long a parent blocks on a child before resuming; the child keeps running in the background.

When a cap is hit, the model gets a plain-language refusal it can act on rather than an opaque failure:

structured refusals (json, annotated with // comments)

// depth cap
{ "delegated": false, "reason": "Delegation depth limit reached (max 3). Do this sub-job yourself." }

// fan-out cap
{ "delegated": false, "reason": "Child limit reached for this agent (max 5). Combine sub-jobs or do them yourself." }

// whole-tree cap
{ "delegated": false, "reason": "This workflow has reached its total sub-agent limit (max 25)." }

// named a disabled specialist
{ "delegated": false, "reason": "Specialist '<id>' is disabled. Omit agent_id for an ephemeral child." }

// named a non-existent specialist
{ "delegated": false, "reason": "No specialist with id '<id>'. Call list_specialists, or omit agent_id." }

These caps are not configurable

Depth (3), children per task (5), and descendants per root (25) are compile-time constants in the orchestration tool, as is the default wait (300s), although a call can set its own wait with timeout_seconds. The caps exist to keep an autonomous tree from exploding into an unbounded fan-out. If you need more parallel work, restructure the job: broader, shallower trees with well-scoped child prompts beat deep recursion.
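The three caps can be expressed as a single fail-closed check that returns a refusal payload or nothing. The constants and reason strings are taken from this page; the function name, the order of the checks, and the exact boundary conditions (a child's depth is the parent's depth plus one, and counts refuse once they have reached the cap) are assumptions.

```python
from typing import Optional

MAX_DEPTH = 3
MAX_CHILDREN_PER_TASK = 5
MAX_DESCENDANTS_PER_ROOT = 25


def check_delegation(parent_depth: int, direct_children: int,
                     tree_descendants: int) -> Optional[dict]:
    """Return a structured refusal, or None if delegation may proceed."""
    if parent_depth + 1 > MAX_DEPTH:
        return {"delegated": False,
                "reason": "Delegation depth limit reached (max 3). "
                          "Do this sub-job yourself."}
    if direct_children >= MAX_CHILDREN_PER_TASK:
        return {"delegated": False,
                "reason": "Child limit reached for this agent (max 5). "
                          "Combine sub-jobs or do them yourself."}
    if tree_descendants >= MAX_DESCENDANTS_PER_ROOT:
        return {"delegated": False,
                "reason": "This workflow has reached its total sub-agent "
                          "limit (max 25)."}
    return None
```

For example, a task at depth 3 is refused however few children it has, while a task at depth 2 with four children in a tree of ten may still delegate.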

Defining Specialist Agents

A specialist is just an Agent Definition — a named, saved agent configuration. Any enabled definition is automatically discoverable via list_specialists and usable as a delegation target. Create one in the Agents panel: open the Agents list and click Create new agent (the + button). The form fields are:

Name *: Required. The display name (e.g. Network Auditor). Shown in list_specialists and on the workflow node.
Description: A brief description of what the agent does. This is the routing hint the delegating model reads — write it for the model, not just for humans.
System Prompt *: Required. The instructions that define the specialist's behavior. The non-interactive Agent Operating Guide and the non-negotiable safety rules are layered on automatically.
Provider / Model: Optional overrides. Leave Model as "Use default" to inherit the global agent model.
Temperature: Optional. Defaults to 0.7 when blank.
Max Iterations: ReAct loop cap for this agent. Default 15; range 1–50.
Max Tokens: Per-response token cap. Default 4096; range 256–32768 in steps of 256.
Enabled: Shown when editing. Only enabled definitions appear in list_specialists and can be delegated to. Delegating to a disabled specialist returns a refusal.
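The defaults and ranges listed above can be captured in a small validator. This is a sketch assuming snake_case keys for the form fields and a hypothetical function name; it shows how the documented limits compose and is not the form's actual validation code.

```python
def normalize_definition(form: dict) -> dict:
    """Apply documented defaults and range checks to a definition form."""
    errors = []
    name = (form.get("name") or "").strip()
    system_prompt = (form.get("system_prompt") or "").strip()
    if not name:
        errors.append("name is required")
    if not system_prompt:
        errors.append("system_prompt is required")

    # Blank fields fall back to the documented defaults.
    temperature = form.get("temperature")
    if temperature is None:
        temperature = 0.7
    max_iterations = form.get("max_iterations")
    if max_iterations is None:
        max_iterations = 15
    max_tokens = form.get("max_tokens")
    if max_tokens is None:
        max_tokens = 4096

    if not 1 <= max_iterations <= 50:
        errors.append("max_iterations must be between 1 and 50")
    if not (256 <= max_tokens <= 32768 and max_tokens % 256 == 0):
        errors.append("max_tokens must be 256 to 32768 in steps of 256")

    if errors:
        raise ValueError("; ".join(errors))
    return {
        "name": name,
        "description": (form.get("description") or "").strip(),
        "system_prompt": system_prompt,
        "temperature": temperature,
        "max_iterations": max_iterations,
        "max_tokens": max_tokens,
    }
```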

The same definition can be run directly (type a prompt under the agent in the Agents panel and press Run) or used as a delegation target. There is no separate "sub-agent" type: a specialist is an ordinary Agent Definition that happens to be a delegation target.

Write tight specialist prompts

The best specialists are narrow. A "BGP Auditor" whose system prompt says only "review BGP session state and report mismatches, never reconfigure" will produce sharper, more auditable child runs than a do-everything agent. Keep the description specific so the parent model routes to it correctly.

The Workflow Graph

When a run delegates to at least one child, its run tab gains a view toggle with Activity, Both, and Workflow. The Workflow view is a pan/zoom canvas (styled like the Topology view) where each run is a box and edges wire a parent to the children it delegated.

Each node shows:

Status — Queued, Running (with a live dot), Done, Failed, or Cancelled.
Kind — Root, Specialist, or Ephemeral (derived from spawned_by_agent_definition_id).
Title — the delegation_label if the parent provided one, otherwise the child's prompt.
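How a node's three labels could be derived from a task row is sketched below. The Kind and Title rules follow the descriptions above; the mapping from task statuses to display labels (pending to Queued, completed to Done) is inferred from the two lists on this page and should be read as an assumption, as should the function name.

```python
# Assumed mapping from task status to the label shown on a node.
STATUS_LABELS = {
    "pending": "Queued",
    "running": "Running",
    "completed": "Done",
    "failed": "Failed",
    "cancelled": "Cancelled",
}


def node_view(task: dict) -> dict:
    """Derive the status, kind, and title shown on a workflow node."""
    if task.get("parent_task_id") is None:
        kind = "Root"
    elif task.get("spawned_by_agent_definition_id") is not None:
        kind = "Specialist"
    else:
        kind = "Ephemeral"
    # Prefer the parent's label; fall back to the task's own prompt.
    title = task.get("delegation_label") or task["prompt"]
    return {"status": STATUS_LABELS[task["status"]], "kind": kind,
            "title": title}
```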

Interactions:

Expand a node (chevron) to see its live activity feed inline.
Drag the header to move a node; resize from the corner.
Double-click a node to open that child run as its own tab.
Ctrl/⌘+wheel (or trackpad pinch) zooms around the cursor; a plain wheel scrolls an expanded node's feed. Reset restores the view and auto-layout.

Before any delegation happens, the Workflow view shows a placeholder:

This run hasn’t delegated to any sub-agents. When it calls
delegate_to_agent, the workflow appears here.

Every child is independently auditable

A delegated child is a full run with its own durable transcript (the glass-box thought/command/tool_result/error steps), result, and status. It is not a hidden sub-call — you can open it, replay it, and audit it exactly like a top-level run.

Worked Examples

1. Delegate to a named specialist

The parent discovers specialists, finds a BGP auditor, and hands it a scoped sub-job with a label for the graph:

specialist delegation (javascript)

// Step 1 — discover
list_specialists()
// → { "specialists": [ { "id": "a1f3...", "name": "BGP Auditor", ... } ] }

// Step 2 — delegate by id
delegate_to_agent({
  "agent_id": "a1f3...",
  "label": "audit BGP on core-1",
  "prompt": "On device core-1, review all BGP sessions. For any neighbor not in Established, report the neighbor IP, the configured vs. received AS, and the most likely cause. Read-only only."
})
// → blocks (permit released) until the child finishes, then:
// { "delegated": true, "child_id": "9d4e...", "specialist_id": "a1f3...",
//   "status": "completed", "result": "...", "error": null }

2. Spin up an ephemeral child (no specialist matched)

ephemeral delegation (javascript)

list_specialists()
// → { "specialists": [] }   // nothing relevant

// Omit agent_id → ephemeral child on the default agent config
delegate_to_agent({
  "label": "scan syslog for interface flaps",
  "prompt": "Collect the last 200 log lines from dist-2 and summarize any interface up/down flaps in the last hour with timestamps and interface names. Read-only."
})

3. Fan out, then summarize

A parent can delegate several scoped children (up to 5 direct children) and combine their results. Each delegate_to_agent call blocks for that child, so the parent collects results one at a time and stays within the cap:

fan-out pattern (Python-style pseudocode)

results = {}
for device in ["core-1", "core-2", "dist-1"]:
    # Each call blocks until this child finishes (or the wait times out).
    results[device] = delegate_to_agent({
        "label": f"BGP audit {device}",
        "agent_id": "a1f3...",            # BGP Auditor
        "prompt": f"Audit BGP sessions on {device}; report any non-Established neighbors."
    })
# Parent then writes a combined report from results and saves it with save_document.

Make child prompts fully self-contained

A child does not inherit the parent's conversation. Put everything the child needs in the prompt — target device, exact task, and any constraints (for example "read-only only"). The tool's own guidance is to use delegation for self-contained sub-tasks a specialist is better at, not for trivial steps the parent could do itself.

Safety, Approvals & Sanitization

Delegation does not weaken any safety control — a child agent is governed by the same machinery as a top-level run:

Non-negotiable safety rules are injected into every agent's system prompt and cannot be configured or overridden — never guess device state, never run destructive commands (write erase, reload, format, delete startup-config, zeroize) without explicit human approval, always sanitize credentials, never exfiltrate data, and always identify as AI.
Per-call approval for mutating tools. Every mutating tool call — in a parent or a child — pauses for explicit user consent. While parked on that decision the task releases its concurrency permit, exactly like parking on a child.
Output-side validation. Each LLM-emitted tool call is re-validated against per-tool policy before dispatch; a blocked call returns a reason the model must act on rather than executing.
Credential sanitization. A sanitizing layer scrubs passwords, SNMP communities, API keys, VPN/routing secrets, and private keys (and optionally network identifiers) before anything reaches the configured LLM provider — for parents and children alike.
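To give a feel for what a sanitizing layer does, the sketch below scrubs a few secret-bearing patterns from device output with regular expressions. The patterns (Cisco IOS-style password, secret, and SNMP community lines, PEM private key blocks, and key=value API keys) and the function name are assumptions chosen for illustration; the actual NetStacks rules are not described on this page and are certainly broader.

```python
import re

# (pattern, replacement) pairs; illustrative, not exhaustive.
_RULES = [
    # "password 7 0822..." or "secret MySecret": keep the keyword and the
    # optional single-digit encoding type, drop the value.
    (re.compile(r"\b(password|secret)( \d)? \S+"), r"\1\2 <redacted>"),
    # "snmp-server community s3cr3t RO": drop the community string.
    (re.compile(r"(snmp-server community )\S+"), r"\1<redacted>"),
    # "api_key = abc123" or "api-key: abc123".
    (re.compile(r"(?i)\b(api[-_]?key\s*[=:]\s*)\S+"), r"\1<redacted>"),
    # Whole PEM private key blocks, across lines.
    (re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----.*?"
                r"-----END [A-Z ]*PRIVATE KEY-----", re.DOTALL),
     "<redacted private key>"),
]


def sanitize(text: str) -> str:
    """Scrub known secret patterns before text leaves the machine."""
    for pattern, replacement in _RULES:
        text = pattern.sub(replacement, text)
    return text
```

Applied to a config snippet, the surrounding structure (hostnames, keywords, access modes such as RO) survives while the secret values are replaced, which keeps the output useful to the model.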

A child cannot escalate privilege

Delegation is a way to scope and parallelize work, not a way around approvals or safety rules. A child running a destructive command still triggers a human approval prompt, and a child still cannot exfiltrate configs or credentials. Cancel the parent and the entire subtree is cancelled with it.

Q&A

How does an agent decide whether to delegate?: The Agent Operating Guide tells it to call list_specialists first, then delegate_to_agent with a matching specialist's id or without an id for an ephemeral child. The tool descriptions add that delegation is for self-contained sub-tasks a specialist is better at, or to parallelize/scope context — not for trivial steps the agent can do itself. You tune this behavior by editing the operating guide in Settings → Prompts and by writing good specialist descriptions.
What is the difference between a specialist and an ephemeral child?: A specialist is a saved Agent Definition with its own system prompt and settings, delegated to by agent_id. An ephemeral child runs with the default agent config and is created by omitting agent_id. The workflow graph labels them Specialist and Ephemeral respectively.
Can a child delegate to its own children?: Yes, up to the depth cap of 3 levels below the root. Each level increments depth; once it would exceed 3, delegate_to_agent returns a refusal telling the child to do the sub-job itself.
How many sub-agents can one workflow spawn?: At most 5 direct children per agent and 25 total runs across the whole delegation tree. These caps are fixed in the orchestration tool.
What happens if a child takes too long?: delegate_to_agent waits up to timeout_seconds (default 300s). If the child is still running, the tool returns control to the parent with a note that the child continues in the background; its result remains reachable from the Workflow view.
Does a parent block a concurrency slot while waiting?: No. delegate_to_agent is a parking tool: the ReAct loop releases the parent's permit while it awaits the child and re-acquires it (cancellation-aware) afterward. With the default cap of 3 concurrent tasks, this is what prevents a parent from starving the child that needs a slot.
How many agent tasks run at once?: The default Local Agent runs up to 3 concurrent tasks, enforced by a semaphore. Parked tasks (awaiting a child or a human) do not count against the cap while parked.
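I delegated to a specialist but got a refusal. Why?: Read the reason field. If it names the specialist, either the id does not exist or the specialist is disabled; only enabled definitions are delegatable, so enable it in the Agents panel or omit agent_id to use an ephemeral child instead. Otherwise a depth, fan-out, or tree-size cap was hit, and the sub-job has to be combined with others or done by the calling agent.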
Are delegation safety rules different from normal agents?: No. Children inherit the same non-negotiable safety rules, per-call mutating-tool approvals, output validation, and credential sanitization as any run. Delegation scopes work; it does not relax controls.
How do I audit what a sub-agent did?: Open the parent run, switch to the Workflow view, and double-click the child node to open it as its own tab — complete with its durable transcript, result, and status. See NOC Agents for the run/transcript model.

NOC Agents — the agent task runner, the ReAct loop, runs, and transcripts.
AI Modes & Prompt Overrides — edit the Agent Operating Guide and per-feature prompts.
LLM Configuration — choose providers and models for agents and specialists.
MCP Servers — add external tools that agents and their children can call.
Vendor Knowledge Packs — the vendor expertise layered into agent system prompts.
AI Chat — the interactive counterpart to autonomous agents.
Method of Procedures — structured, approval-gated change workflows agents can drive.