What are good patterns for human-in-the-loop approvals and escalation in agent workflows (especially for compliance)?

Human-in-the-loop approvals are where agent workflows either pass compliance—or become an unreviewable black box. In a regulated environment, your agents must know when to stop, who can approve, and how to route escalations without leaking data or bypassing controls. With AutoGen, those behaviors live in the runtime and workflow design, not just in prompts.

Quick Answer: The most reliable patterns for human-in-the-loop approvals in agent workflows combine explicit “pending approval” stages, topic-based routing, and escalation policies encoded in your runtime rather than scattered in prompts. In AutoGen, that translates to GraphFlow nodes or Team roles that pause on stop_reason="approval_required", route to a human reviewer via topics/subscriptions, and resume or escalate based on structured responses.

Why This Matters

If you’re in a bank, insurer, or healthcare org, “let the agent just decide” is not an acceptable design. Regulators expect:

  • Clear decision points with human sign-off.
  • Audit trails showing who approved what, with what context.
  • Predictable escalation when an approver is unavailable, conflicted, or the risk is too high.

Agent workflows fail compliance when approvals are ad hoc (“just send a Slack message”) or embedded purely in natural language (“ask a human if this looks risky”). You want approvals to be explicit states in your workflow with deterministic routing and observable state transitions.

Key Benefits:

  • Provable control: Approvals and escalations are encoded in workflow nodes and TaskResult(stop_reason=...), not hidden in prompt text.
  • Auditable history: Every approval, rejection, and escalation becomes an event you can log and replay.
  • Safer automation: Agents can run at full speed while still “hard-stopping” on high‑risk actions, reducing both manual toil and compliance exposure.

Core Concepts & Key Points

| Concept | Definition | Why it's important |
| --- | --- | --- |
| Approval Gate | A workflow step where the agent must obtain explicit human approval before proceeding. Implemented as a node (GraphFlow) or a specialized Team agent that pauses with stop_reason="approval_required". | Makes “approval required” a first-class state, not a vague instruction. Easier to test, audit, and enforce. |
| Escalation Path | A deterministic routing rule for who handles an approval when primary reviewers are unavailable, conflicted, or the risk level is high. | Prevents “stuck” workflows and ensures high‑risk decisions go to the right level (e.g., second-line risk, legal). |
| Topic-Based Routing | Using Core’s topics/subscriptions (e.g., approvals/claims, escalation/high_risk) to route approval requests to the correct human or agent group. | Decouples workflows from hard-coded IDs; makes your approval patterns portable and easier to evolve. |

How It Works (Step-by-Step)

A robust human-in-the-loop approval pattern in AutoGen usually follows this sequence:

  1. Risk assessment and gating:
    A “policy” or “triage” agent decides whether a step requires approval (and at what level) and emits a structured message, not just free text.

  2. Approval request and routing:
    The workflow sends an approval task (with minimal necessary context) to an approval topic. Human reviewers (or a thin UI) subscribe to that topic and respond with structured decisions.

  3. Decision handling and escalation:
    The runtime interprets approval responses. On approval, it resumes the workflow path; on rejection or timeout, it either stops or escalates to a different topic and approver group.
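
The first two steps above can be sketched without any framework. This is a minimal illustration, not AutoGen code: the thresholds, the ApprovalRequest fields, and the topic names are all assumptions chosen for the example.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional

# Hypothetical thresholds, chosen only for illustration.
SUPERVISOR_LIMIT = 1_000
COMMITTEE_LIMIT = 5_000


@dataclass(frozen=True)
class ApprovalRequest:
    """Structured message emitted by triage (step 1) and routed in step 2."""
    case_id: str
    proposed_action: str
    risk_level: str  # "medium" or "high"
    topic: str       # where the request should be published


def triage(case_id: str, action: str, amount: float) -> Optional[ApprovalRequest]:
    """Return None when no approval is needed, else a structured request."""
    if amount < SUPERVISOR_LIMIT:
        return None
    if amount < COMMITTEE_LIMIT:
        return ApprovalRequest(case_id, action, "medium", "approvals/fee_waivers")
    return ApprovalRequest(case_id, action, "high", "escalation/high_risk")


request = triage("case-42", "waive fee", 7_500)
# Serialize with only the fields the approver needs, nothing else.
payload = json.dumps(asdict(request), sort_keys=True)
print(payload)
```

Because triage returns a typed object rather than free text, the routing decision can be unit-tested independently of any model output.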


Below are patterns I’ve found durable in regulated environments using AutoGen’s stack.


Pattern 1: Explicit Approval Gates in GraphFlow (Core + AgentChat)

Use this when you need strict control over when humans must sign off and how the workflow proceeds after.

Note: GraphFlow is labeled experimental in the docs and is subject to change. Use it when you need deterministic control of multi-step workflows; otherwise start with Teams patterns like SelectorGroupChat.

Installation

pip install -U "autogen-core" "autogen-agentchat" "autogen-ext[openai]"

You’ll also need your model provider configured (e.g., OPENAI_API_KEY in the environment) for OpenAIChatCompletionClient.

Concept: Approval Node

Treat an approval as its own node in the graph:

  • Input: risk assessment and proposed action from prior agents.
  • Output: a decision object, e.g., { "status": "approved" | "rejected" | "escalate", "reason": "...", "approver_id": "..." }.
  • Behavior: reports stop_reason="approval_required" while waiting, and emits a new message carrying the decision when resumed.
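
Since the node's output is a decision object, it helps to validate it before anything downstream acts on it. The following is a minimal sketch that assumes the three-field JSON shape shown above; the Decision class and parse_decision helper are hypothetical names, not part of AutoGen.

```python
import json
from dataclasses import dataclass

ALLOWED_STATUSES = {"approved", "rejected", "escalate"}
REQUIRED_FIELDS = {"status", "reason", "approver_id"}


@dataclass(frozen=True)
class Decision:
    status: str
    reason: str
    approver_id: str


def parse_decision(raw: str) -> Decision:
    """Parse and validate a decision; raise ValueError on anything unexpected."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"decision is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("decision must be a JSON object")
    missing = REQUIRED_FIELDS - data.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    if data["status"] not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {data['status']!r}")
    return Decision(data["status"], data["reason"], data["approver_id"])


decision = parse_decision(
    '{"status": "approved", "reason": "within policy", "approver_id": "u-17"}'
)
print(decision)
```

Rejecting malformed input outright means a garbled response can never be mistaken for an approval.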

Minimal Example: Draft → Approve → Execute

Below is a skeleton showing the pattern, not production-ready code:

import asyncio
import json

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.teams import DiGraphBuilder, GraphFlow
from autogen_ext.models.openai import OpenAIChatCompletionClient

# GraphFlow creates its own embedded runtime when none is passed in.
model_client = OpenAIChatCompletionClient(model="gpt-4o-mini")

writer = AssistantAgent(
    "writer",
    model_client=model_client,
    system_message="Draft an email proposing a fee waiver. Do not send; just draft.",
)

# A model stands in for the human approver here; see the notes after the code.
approval_agent = AssistantAgent(
    "approval_gate",
    model_client=model_client,
    system_message=(
        "You are an approval controller. "
        "You NEVER send the email. Instead you output ONLY JSON:\n"
        '{"status": "approved" | "rejected" | "escalate", "reason": "..."}'
    ),
)

executor = AssistantAgent(
    "executor",
    model_client=model_client,
    system_message="You are an execution placeholder. You only log that an email would be sent.",
)


def is_approved(msg) -> bool:
    """Edge condition that fails closed: unparseable output is never an approval."""
    try:
        return json.loads(msg.to_model_text()).get("status") == "approved"
    except (json.JSONDecodeError, AttributeError):
        return False


# Build the graph: writer -> approval_gate -> (approved only) -> executor.
builder = DiGraphBuilder()
builder.add_node(writer).add_node(approval_agent).add_node(executor)
builder.add_edge(writer, approval_agent)
# Assumes a release that accepts callable conditions; string conditions
# (substring match on the message) are the alternative.
builder.add_edge(approval_agent, executor, condition=is_approved)
# Add edges for reject / escalate as needed.

flow = GraphFlow(participants=builder.get_participants(), graph=builder.build())


async def run() -> None:
    # The flow starts at the node with no incoming edges (writer).
    result = await flow.run(task="Prepare an email offering a fee waiver above $5,000.")
    print(result.messages[-1].content)
    await model_client.close()


if __name__ == "__main__":
    asyncio.run(run())

To make this truly human-in-the-loop:

  • Replace approval_agent with a bridging component that exposes approval tasks to a UI.
  • Have that component set TaskResult(stop_reason="approval_required") and wait until a human posts a message to the appropriate approval topic (see Pattern 3).
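
One way to picture the bridging component is a small object that parks each pending task on an asyncio future until a human decision arrives. This is a framework-free sketch under stated assumptions: ApprovalBridge, wait_for_decision, and submit are hypothetical names, and a real bridge would persist pending tasks rather than hold them in memory.

```python
import asyncio
from typing import Dict


class ApprovalBridge:
    """Holds one pending future per task until a human decision arrives."""

    def __init__(self) -> None:
        self._pending: Dict[str, asyncio.Future] = {}

    async def wait_for_decision(self, task_id: str, timeout: float) -> str:
        future = asyncio.get_running_loop().create_future()
        self._pending[task_id] = future
        try:
            return await asyncio.wait_for(future, timeout)
        except asyncio.TimeoutError:
            # No answer in time: escalate, never auto-approve.
            return "escalate"
        finally:
            self._pending.pop(task_id, None)

    def submit(self, task_id: str, decision: str) -> bool:
        """Called by the UI; returns False if nothing is waiting on task_id."""
        future = self._pending.get(task_id)
        if future is None or future.done():
            return False
        future.set_result(decision)
        return True


async def demo() -> tuple:
    bridge = ApprovalBridge()
    waiter = asyncio.create_task(bridge.wait_for_decision("case-1", timeout=1.0))
    await asyncio.sleep(0)  # let the waiter register its future
    accepted = bridge.submit("case-1", "approved")
    answered = await waiter
    timed_out = await bridge.wait_for_decision("case-2", timeout=0.01)
    return accepted, answered, timed_out


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

The timeout branch is the important design choice: silence from an approver leads to escalation, so the workflow can never proceed by default.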

When to use this pattern

  • You need deterministic “only proceed on approval” guarantees.
  • You must model different paths: approved → execute, rejected → stop, escalate → new approver.
  • You care about replaying full workflows for audit.

Pattern 2: Human Reviewer as a First-Class Team Member (AgentChat)

If GraphFlow is more structure than you need, but you still want approvals, treat the human as a Team member with a clear role.

Idea

  • You run a Team (e.g., SelectorGroupChat) including:
    • A domain agent (e.g., TravelAgent, ClaimsAgent).
    • A human “proxy” agent that stands in for a real person.
  • When the domain agent detects a high‑risk action (refunds over threshold, policy edge cases), it hands off to the human agent.

Minimal Example: Compliance Reviewer in a Team

pip install -U "autogen-agentchat" "autogen-ext[openai]"

import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import MaxMessageTermination
from autogen_agentchat.teams import SelectorGroupChat
from autogen_ext.models.openai import OpenAIChatCompletionClient

model_client = OpenAIChatCompletionClient(model="gpt-4o-mini")

travel_agent = AssistantAgent(
    "travel_agent",
    model_client=model_client,
    system_message=(
        "You handle travel refund requests. "
        "If refund amount > 1000 USD or user mentions 'fraud'/'chargeback', "
        "you MUST hand off to human_reviewer."
    ),
)

# In practice this is your UI bridge, not a model.
human_reviewer = AssistantAgent(
    "human_reviewer",
    model_client=model_client,
    system_message=(
        "You represent a human compliance officer. "
        "You do not auto-approve; you explain what a human should consider. "
        "In production this agent is replaced by a UI."
    ),
)

# Participants are passed as a list; the model client picks the next speaker.
team = SelectorGroupChat(
    [travel_agent, human_reviewer],
    model_client=model_client,
    # Bound the conversation so the demo always terminates.
    termination_condition=MaxMessageTermination(6),
)


async def run() -> None:
    result = await team.run(task="Customer requests a $1500 refund, citing fraud.")
    for msg in result.messages:
        print(f"[{msg.source}] {msg.content}")
    await model_client.close()


if __name__ == "__main__":
    asyncio.run(run())

To go from “demo” to “compliance-ready”:

  • Swap human_reviewer for a custom agent that:
    • Emits events to your approval UI.
    • Pauses the conversation (stop_reason="approval_required").
    • Resumes when a user submits an approve/reject decision.

When to use this pattern

  • You want approval logic but don’t need a full graph yet.
  • You already use Teams like SelectorGroupChat or RoundRobinGroupChat.
  • Most tasks are low-risk and only some need a human; the handoff is conditional and conversational.

Pattern 3: Topic-Based Approval Queues and Escalations (Core)

This is the pattern that carries the most compliance weight: approvals are modeled as queues that any authorized reviewer can service, not as direct messages to a named person.

Key Primitive: Topic

In AutoGen Core, you can model message routing via topics and subscriptions. Conceptually:

  • Topic = (Topic Type, Topic Source)
  • String form: "Topic_Type/Topic_Source"

Example topics:

  • approvals/claims
  • approvals/high_risk
  • escalation/legal
  • escalation/risk_committee

You then subscribe agents (or UI bridges) using TypeSubscription(topic_type="approvals", agent_type="approval_ui").

Pattern

  1. Business agent decides approval is needed.
    • Emits a message to a topic like approvals/claims.
  2. Approval UI agent subscribes to that topic.
    • Displays the task to a human approver.
  3. Human approves/rejects/escalates through the UI.
    • UI sends a structured message back on a correlated topic, e.g., approvals_result/claims/<task_id>.
  4. Workflow runtime resumes or escalates.
    • On decision="escalate", the runtime posts a new message to escalation/high_risk, which might be handled by a second-line risk team or legal.
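
The four steps above can be simulated with a tiny in-memory publish/subscribe bus. This sketch does not use AutoGen Core; TopicBus is a hypothetical stand-in that, like a type-based subscription, matches on the topic type and treats everything after the first "/" as the source. The stand-in "human" simply escalates high-risk tasks.

```python
from collections import defaultdict
from typing import Callable, Dict, List, Tuple

Handler = Callable[[str, dict], None]


class TopicBus:
    """Routes messages by topic type; the source only tags the message."""

    def __init__(self) -> None:
        self._subs: Dict[str, List[Handler]] = defaultdict(list)
        self.log: List[Tuple[str, dict]] = []  # every published message, in order

    def subscribe(self, topic_type: str, handler: Handler) -> None:
        self._subs[topic_type].append(handler)

    def publish(self, topic: str, message: dict) -> None:
        topic_type, _, source = topic.partition("/")
        self.log.append((topic, message))
        for handler in list(self._subs.get(topic_type, [])):
            handler(source, message)


bus = TopicBus()
seen_by_risk_team: List[dict] = []


def approval_ui(source: str, msg: dict) -> None:
    # Stand-in for a human: escalate high risk, approve the rest.
    decision = "escalate" if msg["risk_level"] == "high" else "approved"
    result_topic = f"approvals_result/{source}/{msg['task_id']}"
    bus.publish(result_topic, {**msg, "decision": decision})


def on_result(source: str, msg: dict) -> None:
    # Step 4: the runtime reposts escalated decisions to the escalation topic.
    if msg["decision"] == "escalate":
        bus.publish("escalation/high_risk", msg)


bus.subscribe("approvals", approval_ui)
bus.subscribe("approvals_result", on_result)
bus.subscribe("escalation", lambda source, msg: seen_by_risk_team.append(msg))

bus.publish("approvals/claims", {"task_id": "t-1", "risk_level": "high"})
print([topic for topic, _ in bus.log])
```

Changing who handles escalations means changing one subscribe call; the publishing code stays untouched.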

Why this matters for compliance

  • You separate what needs approval (topics) from who is currently on call (subscribers).
  • You log and replay all approval messages as events.
  • You can evolve your escalation policy by changing subscriptions, not rewriting agent prompts.

Pro Tip: Default to topics/subscriptions (TypeSubscription) instead of hard-coded agent IDs for approvals and escalations. You will change org structures and approver mappings far more often than you change your underlying agents.


Pattern 4: Approval Outcomes Encoded in TaskResult

Human-in-the-loop only works if your runtime knows why a task stopped and what to do.

When your approval bridge agent pauses a workflow, have it return a TaskResult with a clear stop_reason. The values below are a naming convention for your own code, not built-in AutoGen constants:

  • "approval_required"
  • "approval_rejected"
  • "approval_escalated"

Then make your orchestrator (GraphFlow or a custom Core driver) treat stop_reason as a state machine trigger:

  • stop_reason="approval_required" → wait for human input on a topic.
  • stop_reason="approval_rejected" → terminate workflow with status “blocked”.
  • stop_reason="approval_escalated" → reroute to a different graph segment or topic.

This gives you:

  • Machine-readable states for monitoring and alerts.
  • Clear metrics: how many workflows are pending approval, time to approval, escalation rates.
  • Reproducibility: you can reconstruct what happened without parsing freeform text.
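
The stop_reason state machine described above can be sketched as a lookup table. The NextAction enum and next_action function are hypothetical names for illustration; the one deliberate choice is that an unrecognized stop_reason blocks the workflow instead of letting it continue.

```python
from enum import Enum
from typing import Optional


class NextAction(Enum):
    CONTINUE = "continue"
    WAIT_FOR_HUMAN = "wait_for_human"
    TERMINATE_BLOCKED = "terminate_blocked"
    REROUTE = "reroute"


TRANSITIONS = {
    "approval_required": NextAction.WAIT_FOR_HUMAN,
    "approval_rejected": NextAction.TERMINATE_BLOCKED,
    "approval_escalated": NextAction.REROUTE,
}


def next_action(stop_reason: Optional[str]) -> NextAction:
    """Map a stop_reason to what the orchestrator should do next."""
    if stop_reason is None:
        return NextAction.CONTINUE
    # Unknown reasons fail closed: block rather than guess.
    return TRANSITIONS.get(stop_reason, NextAction.TERMINATE_BLOCKED)


for reason in (None, "approval_required", "approval_rejected", "approval_escalated", "???"):
    print(reason, "->", next_action(reason).value)
```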

Pattern 5: Message Filtering Around Approval Steps

Compliance teams care about data minimization. Approval UIs should see only what’s needed.

AutoGen’s message filtering lets you:

  • “Reduce hallucinations”
  • “Control memory load”
  • “Focus agents only on relevant information”

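MessageFilterAgent, configured with PerSourceFilter rules, controls which messages a wrapped agent receives, selected by source and position. It does not rewrite message content, so redaction is a separate step you add yourself. Combining the two, you can:

  • Strip PII in a redaction step before messages go to approvals/* topics.
  • Hide internal reasoning from approvers by filtering out internal sources, while still showing the relevant facts and suggested decisions.
  • Apply different filters based on source: internal notes vs. customer data.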

For example:

  • Before posting to approvals/claims, run a filter that removes raw customer identifiers and attaches a case ID only.
  • Keep the full, unredacted context within the internal workflow for audit, but not in the approval UI.
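
A redaction step like the one in the example above might look like the following minimal sketch. It combines a field allow-list with two regular expressions; the field names and patterns are assumptions for illustration, and pattern matching alone is not a complete PII detector.

```python
import re
from typing import Dict

# Illustrative patterns only; real deployments need a vetted PII detector.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
SSN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")

# Only these fields ever reach the approval UI (hypothetical schema).
ALLOWED_FIELDS = ("case_id", "summary", "proposed_action", "risk_level")


def redact_text(text: str) -> str:
    """Replace email addresses and SSN-shaped numbers with placeholders."""
    text = EMAIL.sub("[EMAIL]", text)
    return SSN.sub("[SSN]", text)


def to_approval_view(record: Dict[str, str]) -> Dict[str, str]:
    """Drop non-allow-listed fields, then redact what remains."""
    return {k: redact_text(str(record[k])) for k in ALLOWED_FIELDS if k in record}


full_record = {
    "case_id": "case-42",
    "customer_name": "Jane Doe",
    "summary": "Customer jane.doe@example.com (SSN 123-45-6789) asks for a waiver.",
    "proposed_action": "waive fee",
    "risk_level": "high",
}
print(to_approval_view(full_record))
```

The allow-list does most of the work here: a field that is never copied cannot leak, whatever the patterns miss.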

Common Mistakes to Avoid

  • Putting approval rules only in prompts:
    How to avoid it: Encode approvals as explicit nodes, topics, and stop_reason values. Use prompts to explain why approval is needed, not whether it is needed.

  • Hard-coding user IDs instead of topics:
    How to avoid it: Route approval tasks through topics like approvals/loans and use TypeSubscription to connect the appropriate approver UI or group. This keeps your workflows portable across org changes.


Real-World Example

In our environment, we have an agent workflow that evaluates high-value fee waiver requests:

  1. A TriageAgent classifies the request’s risk level and required approval tier (no approval, supervisor, risk committee).
  2. The workflow sends a message to approvals/fee_waivers with:
    • Case ID
    • Summary
    • Proposed action
    • Risk level
  3. A custom “Approval UI Agent” subscribes to approvals/*, exposing tasks in a web dashboard. Human approvers choose approve/reject/escalate.
  4. The UI posts the decision back on approvals_result/fee_waivers/<case_id> with structured JSON.
  5. The orchestrator resumes the workflow:
    • Approved → executes the change (through a tool or downstream system).
    • Rejected → stops; notifies the original requester.
    • Escalated → posts a new message to escalation/high_risk, which is subscribed to by a risk committee UI.

We capture each step as an event with timestamps and actor identity. When auditors ask “Who approved this waiver and what did they see?”, we replay the event stream, not screenshots or chat logs.

Pro Tip: Design your approval messages as immutable events with explicit fields (case_id, risk_level, proposed_action, decision, approver_id, timestamp). Avoid having approvers rely on freeform chat context; instead, present a structured summary plus the original documents as attachments.
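
As a rough sketch of that tip, approval events can be frozen dataclasses in an append-only log. The field names follow the list above; EventLog and its methods are hypothetical, and a production log would live in durable storage, not in memory.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class ApprovalEvent:
    """Immutable record of one approval step."""
    case_id: str
    risk_level: str
    proposed_action: str
    decision: str
    approver_id: str
    timestamp: str  # ISO 8601, UTC


class EventLog:
    """Append-only log; events are never edited or removed."""

    def __init__(self) -> None:
        self._events: List[ApprovalEvent] = []

    def append(self, event: ApprovalEvent) -> None:
        self._events.append(event)

    def replay(self, case_id: str) -> List[ApprovalEvent]:
        """Return the events for one case in the order they were recorded."""
        return [e for e in self._events if e.case_id == case_id]

    def final_decision(self, case_id: str) -> Optional[str]:
        history = self.replay(case_id)
        return history[-1].decision if history else None


log = EventLog()
log.append(ApprovalEvent("case-42", "high", "waive fee", "escalate",
                         "supervisor-3", "2024-01-05T10:00:00Z"))
log.append(ApprovalEvent("case-42", "high", "waive fee", "approved",
                         "risk-committee-1", "2024-01-05T14:30:00Z"))

for event in log.replay("case-42"):
    print(event.timestamp, event.approver_id, event.decision)
```

Answering "who approved this and when" then becomes a call to replay, with no dependence on chat transcripts.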


Summary

Human-in-the-loop approvals and escalation in agent workflows are compliance features, not UX sugar. With AutoGen’s stack:

  • Use GraphFlow or Team patterns to model approvals as explicit workflow steps.
  • Use topics/subscriptions (TypeSubscription) to route approval tasks and escalations, rather than hard-coded agent IDs.
  • Use TaskResult(stop_reason=...) and message filtering to make approvals observable, auditable, and data-minimizing.

If you encode approvals at the runtime layer, you get predictable control and clear audit trails—while still letting your agents automate the boring parts of the process.
