
How can we model a regulated business process with an AI agent (triage → execute → approval → audit trail) without it becoming an unpredictable chat?
Most regulated workflows break the moment you treat them like a free-form chat between a user and a clever assistant. Triage turns into brainstorming, “execute” blurs into “negotiate,” and you end up with an audit trail that’s just a wall of tokens instead of a traceable business process. The fix is to model the process as a controlled agentic workflow with explicit states, not as an open conversation.
Quick Answer: Use AutoGen’s AgentChat and Core layers to model triage → execute → approval → audit trail as a deterministic workflow: distinct agents for each role, topics/subscriptions for routing, and termination conditions (surfaced as `TaskResult(stop_reason=...)`) to bound each run and record how it ended. That keeps the AI agent inside your business process, instead of you chasing a creative “chat” with no guarantees.
Why This Matters
If you’re in a regulated environment, “the model responded” is not good enough. You need to prove:
- Who decided to do what (triage)
- What was actually done (execute)
- Who approved it, under what criteria (approval)
- What the final, immutable record is (audit trail)
Doing this as a loose chat creates three problems:
- No deterministic flow: You can’t easily guarantee the order triage → execute → approval → archive.
- Weak separation of duties: One agent can “helpfully” do everything, violating your control design.
- No structured audit trail: You get a blob of conversation, not a structured, queryable record.
By modeling the process using AutoGen’s event-driven runtime and AgentChat patterns, you can express the workflow in code as topics, agents, and conditions. That gives you reliability, traceability, and the ability to reason about the system like any other critical service.
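To make the ordering guarantee concrete, here is a minimal sketch of the process as an explicit state machine in plain Python. It does not use AutoGen at all; `Stage`, `ALLOWED`, and `Workflow` are hypothetical names chosen for illustration, and the rework loop from approval back to execution is an assumption about how a rejection might be handled.

```python
from enum import Enum


class Stage(Enum):
    TRIAGE = "triage"
    EXECUTE = "execute"
    APPROVAL = "approval"
    AUDIT = "audit"
    DONE = "done"


# Allowed transitions; anything not listed here is rejected.
ALLOWED = {
    Stage.TRIAGE: {Stage.EXECUTE},
    Stage.EXECUTE: {Stage.APPROVAL},
    Stage.APPROVAL: {Stage.AUDIT, Stage.EXECUTE},  # approve, or send back for rework
    Stage.AUDIT: {Stage.DONE},
    Stage.DONE: set(),
}


class Workflow:
    def __init__(self) -> None:
        self.stage = Stage.TRIAGE
        self.history: list[tuple[str, str]] = []  # (from, to) pairs, in order

    def advance(self, target: Stage) -> None:
        """Move to `target`, or raise if the transition is not allowed."""
        if target not in ALLOWED[self.stage]:
            raise ValueError(f"illegal transition {self.stage.value} -> {target.value}")
        self.history.append((self.stage.value, target.value))
        self.stage = target


wf = Workflow()
wf.advance(Stage.EXECUTE)
wf.advance(Stage.APPROVAL)
wf.advance(Stage.AUDIT)
print(wf.history)
```

Skipping a stage, for example calling `advance(Stage.AUDIT)` straight after execution, raises a `ValueError`, which is the property a free-form chat cannot give you.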
Key Benefits:
- Deterministic control over who acts next: The AutoGen Core runtime and AgentChat Teams (like `GraphFlow`) let you define which agent can act in which state, instead of letting the LLM improvise.
- Provable separation of duties: Separate `triage_agent`, `executor_agent`, and `approver_agent` types with distinct subscriptions, so execution and approval are always performed by different roles.
- Structured, queryable audit trails: Use `TaskResult(messages=..., stop_reason=...)` and filtered message logs as your audit record, rather than unstructured chats.
Core Concepts & Key Points
| Concept | Definition | Why it's important |
|---|---|---|
| Topic | In AutoGen Core, a logical routing key: Topic = (Topic Type, Topic Source) with string form Topic_Type/Topic_Source. | Lets you model workflow stages (triage/execute/approve) and tenants without hard-coding agent IDs, improving isolation and portability. |
| TypeSubscription | A subscription that routes messages based on topic_type and agent_type rather than specific IDs. | Decouples agents from concrete instances so you can scale and swap implementations without rewriting flows. |
| TaskResult(stop_reason=...) | The structured result of a team run in AgentChat, containing messages and a termination reason. | Turns your “chat” into a finished task with a clear stop reason (approved, rejected, timeout) that you can store as an audit event. |
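
The routing idea behind topics and type-based subscriptions can be sketched in a few lines of plain Python. This is a toy model, not AutoGen's implementation: `Topic`, `TypeSub`, and `route` are hypothetical stand-ins, and the sketch assumes that a type-based subscription delivers a topic to the agent instance whose key equals the topic source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Topic:
    type: str    # workflow stage, e.g. "triage"
    source: str  # tenant or request key, e.g. "tenant123"

    def __str__(self) -> str:
        return f"{self.type}/{self.source}"


@dataclass(frozen=True)
class TypeSub:
    topic_type: str
    agent_type: str


def route(topic: Topic, subs: list[TypeSub]) -> list[str]:
    """Return the agent instance IDs ("agent_type/source") that receive the topic."""
    return [f"{s.agent_type}/{topic.source}" for s in subs if s.topic_type == topic.type]


subs = [
    TypeSub("triage", "triage_agent"),
    TypeSub("execute", "executor_agent"),
    TypeSub("approve", "approver_agent"),
]
print(route(Topic("approve", "tenant123"), subs))  # ['approver_agent/tenant123']
```

Because the subscription names only types, adding a second tenant changes the topic source and nothing else.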
How It Works (Step-by-Step)
At a high level, you’ll:
- Define separate agents for triage, execution, approval, and audit.
- Use a structured workflow pattern (graph-style) instead of a free group chat.
- Persist the structured results and filtered message history as your audit trail.
Below I’ll walk through a minimal version using AgentChat’s GraphFlow. You can later move this logic deeper into AutoGen Core if you need fully custom runtimes or distributed topologies.
1. Install the right packages
Python 3.10 or later is required.
```bash
pip install -U "autogen-agentchat" "autogen-core" "autogen-ext[openai]"
```
You’ll also need an OpenAI-compatible API key in your environment, for example:
```bash
export OPENAI_API_KEY="your-key"
```
2. Model roles as separate agents
Start in AgentChat, which is a high-level API built on top of autogen-core. We’ll create four agents:
- `triage_agent`: classifies and normalizes the request.
- `executor_agent`: proposes concrete actions but does not approve them.
- `approver_agent`: evaluates the plan against policies and either approves or rejects.
- `audit_agent`: composes a structured audit summary.
```python
from autogen_agentchat.agents import AssistantAgent
from autogen_ext.models.openai import OpenAIChatCompletionClient

client = OpenAIChatCompletionClient(model="gpt-4o-mini")

triage_agent = AssistantAgent(
    "triage_agent",
    model_client=client,
    system_message=(
        "You are a triage analyst in a regulated environment. "
        "Classify the request, extract key fields, and decide if it is in-scope. "
        "Output JSON with fields: {\"in_scope\": bool, \"category\": str, \"reason\": str}."
    ),
)

executor_agent = AssistantAgent(
    "executor_agent",
    model_client=client,
    system_message=(
        "You propose execution steps but do not approve them. "
        "Given a triaged request, produce a concrete action plan as JSON "
        "with fields: {\"steps\": [...], \"risks\": [...]}."
    ),
)

approver_agent = AssistantAgent(
    "approver_agent",
    model_client=client,
    system_message=(
        "You are a compliance approver. Evaluate the proposed plan against policy. "
        "Respond with JSON: {\"decision\": \"approve\"|\"reject\", \"justification\": str, "
        "\"required_changes\": [str]}."
    ),
)

audit_agent = AssistantAgent(
    "audit_agent",
    model_client=client,
    system_message=(
        "You are an audit logger. Given the triage output, execution plan, and approval decision, "
        "produce a concise, immutable record as JSON: "
        "{"
        "\"request_id\": str,"
        "\"category\": str,"
        "\"in_scope\": bool,"
        "\"actions\": [str],"
        "\"approval_decision\": str,"
        "\"approval_justification\": str"
        "}."
    ),
)
```
Note how each agent’s system message encodes scope and output schema. This is your first line of defense against “unpredictable chat.”
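A prompt alone cannot guarantee the schema, so a second line of defense is to validate each agent's output before the next stage consumes it. The following is a minimal sketch using only the standard library; `TRIAGE_SCHEMA` and `parse_triage` are hypothetical helpers, not part of AutoGen, and the strict "no extra fields" rule is an assumption you may want to relax.

```python
import json

# Expected fields and types, mirroring the triage agent's system message.
TRIAGE_SCHEMA = {"in_scope": bool, "category": str, "reason": str}


def parse_triage(raw: str) -> dict:
    """Parse and validate triage output; raise ValueError on any deviation."""
    data = json.loads(raw)  # json.JSONDecodeError is a subclass of ValueError
    if not isinstance(data, dict):
        raise ValueError("triage output must be a JSON object")
    for field, expected in TRIAGE_SCHEMA.items():
        if field not in data:
            raise ValueError(f"missing field: {field}")
        if not isinstance(data[field], expected):
            raise ValueError(f"field {field!r} must be of type {expected.__name__}")
    extra = set(data) - set(TRIAGE_SCHEMA)
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    return data


ok = parse_triage('{"in_scope": true, "category": "limit_change", "reason": "retail account"}')
print(ok["category"])  # limit_change
```

Rejecting malformed output at the stage boundary keeps a chatty or off-schema reply from silently flowing into execution or approval.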
3. Turn the roles into a structured workflow
Rather than dropping all agents into a GroupChat, we’ll use a graph-based workflow. In AutoGen AgentChat 0.4, GraphFlow is designed for deterministic, multi-step flows.
Note: `GraphFlow` is labeled experimental in the docs and is subject to change. Treat it as a pattern to model, not a guaranteed-stable API.
Example: triage → execute → approve → audit, with a loop if the plan is rejected.
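```python
from autogen_agentchat.conditions import MaxMessageTermination
from autogen_agentchat.teams import DiGraphBuilder, GraphFlow

builder = DiGraphBuilder()

# Add one node per role (nodes are identified by agent name).
builder.add_node(triage_agent)
builder.add_node(executor_agent)
builder.add_node(approver_agent)
builder.add_node(audit_agent)

# Define edges: triage → execute → approve → (audit or execute).
# The executor has two incoming edges, so each gets its own activation group;
# otherwise it would wait for both parents before running. Activation groups
# need a recent release, so check the API of your installed version.
builder.add_edge(triage_agent, executor_agent, activation_group="initial")
builder.add_edge(executor_agent, approver_agent)


# approve → audit on approval; approve → execute on rejection.
# The decision logic stays in the approver's prompt; these conditions only match
# a simple message pattern. In a stricter system, you'd implement custom edge
# conditions that parse the approver's JSON.
def is_approved(msg) -> bool:
    return '"approve"' in msg.to_model_text()


builder.add_edge(approver_agent, audit_agent, condition=is_approved)  # on approve
builder.add_edge(
    approver_agent,
    executor_agent,
    condition=lambda msg: not is_approved(msg),  # on reject / required changes
    activation_group="rework",
)
builder.set_entry_point(triage_agent)

team = GraphFlow(
    participants=builder.get_participants(),
    graph=builder.build(),
    termination_condition=MaxMessageTermination(max_messages=20),
)
```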
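The comments above mention stricter edge conditions. One way to approach that, sketched here with only the standard library, is to parse the approver's JSON and fail closed: anything that is not a well-formed approval counts as a rejection. `approval_decision` is a hypothetical helper, not an AutoGen function.

```python
import json


def approval_decision(text: str) -> str:
    """Return "approve" or "reject"; anything unparseable counts as "reject"."""
    try:
        data = json.loads(text)
    except json.JSONDecodeError:
        return "reject"
    if isinstance(data, dict) and data.get("decision") == "approve":
        return "approve"
    return "reject"


print(approval_decision('{"decision": "approve", "justification": "within policy", "required_changes": []}'))
print(approval_decision("Looks fine to me, I approve!"))
```

Assuming your AutoGen version accepts callable edge conditions, a function like this could back both outgoing edges of the approver node, for example `condition=lambda msg: approval_decision(msg.to_model_text()) == "approve"` on the edge to the audit node and the negation on the edge back to the executor.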
Now you can run a single task through this regulated flow:
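```python
import asyncio

# The flow starts at the graph's entry node (triage); the task is plain text.
# To stream output to the terminal instead, use
# `await Console(team.run_stream(task=...))` from `autogen_agentchat.ui`.
result = asyncio.run(
    team.run(
        task="Process this request: 'Increase daily transfer limit for account 12345 to $50,000.'"
    )
)

print("Stop reason:", result.stop_reason)
print("Messages:")
for m in result.messages:
    print(m)
```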
The key is that this is no longer an open-ended “chat.” It’s a task with:
- A defined starting node (`triage`).
- A bounded number of messages (`MaxMessageTermination`).
- A final `TaskResult(stop_reason=...)` you can persist.
4. Filter what each agent sees to avoid “chatty” bleed-through
Without controls, agents see the whole message history and start riffing. AutoGen’s MessageFilterAgent lets you restrict what’s visible to each role so they only act on relevant context.
Message filtering helps:
- Reduce hallucinations
- Control memory load
- Focus agents only on relevant information
Example: make the audit_agent see only the first user request, the triage output, the final plan, and the approval decision—not every intermediate back-and-forth.
```python
from autogen_agentchat.agents import (
    MessageFilterAgent,
    MessageFilterConfig,
    PerSourceMessageFilter,
)

audit_filter_config = MessageFilterConfig(
    per_source=[
        # Always include the original user request (source "user").
        PerSourceMessageFilter(source="user", position="first", count=1),
        # Include only the latest triage, execute, and approve outputs.
        PerSourceMessageFilter(source="triage_agent", position="last", count=1),
        PerSourceMessageFilter(source="executor_agent", position="last", count=1),
        PerSourceMessageFilter(source="approver_agent", position="last", count=1),
    ]
)

# Reuse the wrapped agent's name so its messages keep the same source in the graph.
filtered_audit_agent = MessageFilterAgent(
    name="audit_agent",
    wrapped_agent=audit_agent,
    filter=audit_filter_config,
)
```
Swap this into your graph in place of the original audit_agent, and your audit summary will be stable, focused, and less likely to pick up noise.
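To see what per-source filtering does to a message history, here is a toy re-implementation in plain Python. It is only an illustration of the idea, not the library's code: `Msg` and `filter_messages` are hypothetical names, and the rule format (source mapped to a position and a count) is an assumption made for the sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Msg:
    source: str
    content: str


def filter_messages(messages: list[Msg], rules: dict[str, tuple[str, int]]) -> list[Msg]:
    """Keep the first/last N messages per source, preserving the original order."""
    keep: set[int] = set()
    for source, (position, count) in rules.items():
        if count <= 0:
            continue
        idx = [i for i, m in enumerate(messages) if m.source == source]
        keep.update(idx[:count] if position == "first" else idx[-count:])
    return [m for i, m in enumerate(messages) if i in keep]


log = [
    Msg("user", "request"),
    Msg("triage_agent", "triage"),
    Msg("executor_agent", "plan v1"),
    Msg("approver_agent", "reject"),
    Msg("executor_agent", "plan v2"),
    Msg("approver_agent", "approve"),
]
rules = {
    "user": ("first", 1),
    "triage_agent": ("last", 1),
    "executor_agent": ("last", 1),
    "approver_agent": ("last", 1),
}
print([m.content for m in filter_messages(log, rules)])
# ['request', 'triage', 'plan v2', 'approve']
```

The rejected first plan and its rejection never reach the audit step, which is why the summary stays focused on the final, approved state.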
5. Persist the audit trail as a first-class artifact
When the run finishes, you get a `TaskResult`:
```python
# Example shape; actual attributes may differ slightly across versions.
print(result.stop_reason)  # free-text reason from the termination condition, or None

# Match the audit agent by message source, whether or not it is wrapped in a filter.
audit_messages = [m for m in result.messages if m.source.startswith("audit_agent")]
audit_payload = audit_messages[-1].content if audit_messages else None

record = {
    "request": "Increase daily transfer limit for account 12345 to $50,000.",
    "workflow_stop_reason": result.stop_reason,
    "audit_json": audit_payload,
    # Messages are Pydantic models, so dump them to JSON-friendly dicts.
    "raw_log": [m.model_dump(mode="json") for m in result.messages],
}

# Persist to your store of choice; pseudocode:
# audit_store.save(record)
```
This is your audit trail:
- It’s structured, not just text.
- It’s coupled to a stop reason you define.
- It contains the filtered audit summary plus the full message log for deep investigations.
At this point, your AI workflow looks like any other workflow engine with events, states, and logs—except the “transition logic” is powered by LLMs.
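If the record is meant to be immutable, one common technique is to make it tamper-evident by chaining hashes, so that editing any stored record breaks verification. The sketch below uses only the standard library; `seal`, `verify`, and `GENESIS` are hypothetical names, and it assumes records are JSON-serializable dicts that do not already use the keys `hash` or `prev_hash`. It is an illustration of the idea, not a substitute for your store's own integrity controls.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first record in a chain


def _digest(body: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so the hash is reproducible.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def seal(record: dict, prev_hash: str = GENESIS) -> dict:
    """Return a copy of `record` linked to the previous record and hashed."""
    body = {**record, "prev_hash": prev_hash}
    return {**body, "hash": _digest(body)}


def verify(chain: list[dict]) -> bool:
    """Check every link and every hash, starting from GENESIS."""
    prev = GENESIS
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "hash"}
        if body.get("prev_hash") != prev or _digest(body) != entry.get("hash"):
            return False
        prev = entry["hash"]
    return True


first = seal({"request_id": "r-1", "approval_decision": "approve"})
second = seal({"request_id": "r-2", "approval_decision": "reject"}, prev_hash=first["hash"])
print(verify([first, second]))  # True

tampered = {**first, "approval_decision": "reject"}
print(verify([tampered, second]))  # False
```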
Common Mistakes to Avoid
- Letting group chat drive the process: Use `GraphFlow` or a Core-level workflow instead of a free `GroupChat`. Group chats are great for ideation, bad for regulated approvals.
- Hard-coding agent IDs everywhere: In AutoGen Core, prefer `TypeSubscription(topic_type="triage", agent_type="triage_agent")` style subscriptions. That way, you can swap or scale agent instances without editing your workflow code.
- Skipping termination and stop reasons: Always define termination conditions (like `MaxMessageTermination`) and inspect `TaskResult.stop_reason`. This protects you from runaway loops and gives you explicit audit semantics (e.g., “terminated by policy,” “approved,” “timed out”).
Real-World Example
At my shop, we had a production prototype where a single “smart” agent handled intake, did execution planning, and drafted approvals in one conversation thread. It worked until an internal audit asked, “How do you prove that approval decisions didn’t come from the same logic that proposed the plan?” We didn’t have a good answer.
We refactored to AutoGen 0.4 with:
- A `triage_agent` type subscribed to topics like `triage/tenant123`.
- An `executor_agent` type subscribed to `execute/tenant123`.
- An `approver_agent` that only ever consumed from `approve/tenant123`.
- An `audit_agent` that subscribed to `audit/tenant123` and saw a filtered subset of messages.
We wrapped this in an AgentChat GraphFlow pattern for local testing, then ported the graph logic into an autogen-core workflow using a SingleThreadedAgentRuntime locally and a distributed runtime (host servicer + workers + gateways) in production. Because we used topics and TypeSubscription, scaling the execution step to multiple workers was a routing change, not an app rewrite.
Pro Tip: Design your topics first (e.g., `triage/<tenant>`, `execute/<tenant>`, `approve/<tenant>`, `audit/<tenant>`), and then hang agents and workflows off those topics. If you start from agent IDs, you’ll paint yourself into a corner when compliance demands tenant isolation or new approval layers.
Summary
You can absolutely model a regulated triage → execute → approval → audit trail with AI agents—without it devolving into an unpredictable chat—if you:
- Treat the process as a workflow with explicit states, not a conversation.
- Use AutoGen AgentChat’s structured Teams (`GraphFlow`) and conditions to control who acts next.
- Apply Core concepts like topics and `TypeSubscription` to keep routing clean and portable.
- Enforce message filtering and termination conditions to reduce hallucinations, control memory load, and keep agents within strict role boundaries.
- Persist `TaskResult` and filtered messages as your authoritative audit trail.
If you do that, the LLM becomes an implementation detail behind a deterministic, inspectable runtime—exactly what you want in a regulated environment.