How do teams implement deterministic agent workflows with branching and loops (like a state machine) instead of prompt spaghetti?

Most teams hit the same wall with agentic apps: the first prototype is a handful of prompts and a “smart” coordinator agent, and the third prototype is unreadable prompt spaghetti that nobody wants to debug. The fix is to stop encoding workflow logic inside prompts, and instead model it explicitly as a deterministic graph: nodes are agents, edges are allowed transitions, and conditions on those edges decide the next step—just like a state machine.

Quick Answer: Use AutoGen AgentChat’s GraphFlow team (built on autogen-core) to define your workflow as a directed graph where each node is an agent and each edge is a valid transition. You get a deterministic execution order, explicit branching and loops, and clear stop conditions instead of implicit control flow hidden in prompts. When GraphFlow doesn’t fit, you can use Core directly and wire your own workflow on top of topics and subscriptions, publishing your own completion message with a stop reason (TaskResult(stop_reason=...) itself is an AgentChat type, not a Core one).

Why This Matters

Once you move beyond a single LLM call, the hard parts are no longer “the right prompt” but “the right control flow.” If the runtime is just “agent talks to agent until something looks done,” you get:

Non-reproducible behavior (different paths on each run)
Hidden dependencies between agents and prompts
No clean way to branch, loop, or bail out safely

By expressing workflows as graphs on top of AutoGen Core, you separate what each agent does (its role + tools) from when it’s allowed to do it (graph edges and conditions). That gives you traceable runs, production-friendly guardrails, and a migration path from simple chats to proper state machines.

Key Benefits:

Deterministic control flow: You control which agent runs next, under what conditions, instead of hoping an LLM “hands off” correctly in its prose.
Safe branching and loops: Explicit conditions on graph edges let you implement branches and loops with clear exit criteria, reducing infinite loops and runaway costs.
Operational observability: Graph-based workflows pair well with AutoGen’s event logging and with the TaskResult that AgentChat teams return, so you can reason about stop reasons, retries, and failure modes in logs and metrics.
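To make the split between what an agent does and when it may run concrete, here is a minimal framework-free sketch in plain Python. It is not how GraphFlow is implemented; the Workflow class, the stub node functions, and the stop-reason strings are all illustrative assumptions.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A node is any callable that maps the current text to a new text.
Node = Callable[[str], str]
# An edge is (target node name, predicate over the source node's output).
Edge = Tuple[str, Callable[[str], bool]]


@dataclass
class Workflow:
    nodes: Dict[str, Node]
    edges: Dict[str, List[Edge]] = field(default_factory=dict)

    def run(self, start: str, text: str, max_steps: int = 10) -> Tuple[List[str], str]:
        """Walk the graph deterministically; return (path, stop_reason)."""
        path: List[str] = []
        current = start
        for _ in range(max_steps):
            path.append(current)
            text = self.nodes[current](text)
            # The first edge whose predicate holds wins; no match means we are done.
            nxt = next((t for t, cond in self.edges.get(current, []) if cond(text)), None)
            if nxt is None:
                return path, "no_edge_matched"
            current = nxt
        return path, "max_steps_reached"


# Stub "agents": plain functions standing in for LLM calls.
def triage(text: str) -> str:
    return "ROUTE: code" if "script" in text else "ROUTE: research"


wf = Workflow(
    nodes={
        "triage": triage,
        "research": lambda t: "research notes",
        "code": lambda t: "print('hello')",
    },
    edges={
        "triage": [
            ("research", lambda out: out == "ROUTE: research"),
            ("code", lambda out: out == "ROUTE: code"),
        ],
    },
)

path, reason = wf.run("triage", "Write a script to dedupe a CSV")
print(path, reason)
```

The node functions know nothing about ordering, and the edges know nothing about prompts, which is the property the rest of this article relies on.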

Core Concepts & Key Points

| Concept | Definition | Why it's important |
| --- | --- | --- |
| Workflow Graph | A directed graph where each node is an agent and edges define possible execution paths between agents. | Gives you a structural, inspectable representation of the workflow instead of burying flow control inside prompts. |
| GraphFlow (AgentChat) | An AgentChat team that executes agents according to a `DiGraph`, supporting sequential, parallel, conditional, and looping behaviors. | Provides out-of-the-box deterministic workflows while still using familiar AgentChat agents. |
| TaskResult & Stop Reasons | A structured result object (e.g., `TaskResult(messages=..., stop_reason=...)`) that summarizes execution and why it stopped. | Lets you build reliable state-machine-like behavior with clear termination conditions and predictable error handling. |

How It Works (Step-by-Step)

At a high level, implementing deterministic workflows in AutoGen looks like this:

1. Choose your layer:
   - Start with AgentChat + GraphFlow if you want a directed, state-machine-like workflow with minimal plumbing.
   - Drop to autogen-core when you need full control over topics, subscriptions, and runtime topologies (e.g., distributed workers).
2. Model the workflow as a graph:
   - Define each step as an agent (e.g., triage_agent, executor_agent, review_agent).
   - Connect them with edges representing allowed transitions. Add conditions where branching is needed.
3. Run and observe the flow:
   - Kick off the workflow with an initial message.
   - Let GraphFlow (or your own core-driven logic) drive which agent acts next, until a stop condition is reached and a TaskResult is returned.

Below is a concrete, code-first walkthrough.

Installation

Python 3.10 or later is required.

For a GraphFlow-based workflow using OpenAI models:

pip install -U "autogen-agentchat" "autogen-ext[openai]"

Set your OpenAI key:

export OPENAI_API_KEY="sk-..."

If you’re planning to integrate with Azure OpenAI or other providers, use the corresponding extras in autogen-ext and follow their auth requirements.

Defining a Deterministic Workflow with GraphFlow

1. Create a Simple Sequential + Branching + Looping Graph

In AgentChat, GraphFlow is the team that executes a directed graph (DiGraph) of agents. It supports:

Sequential chains
Parallel fan-outs
Conditional branching
Loops with safe exit conditions

Here’s a minimal example that:

Triages each request as “research” or “code”
Routes it to the corresponding specialist
Loops through a reviewer until quality is acceptable or a retry limit is hit

import asyncio

from autogen_agentchat.agents import AssistantAgent
from autogen_agentchat.conditions import MaxMessageTermination
from autogen_agentchat.teams import DiGraphBuilder, GraphFlow
from autogen_ext.models.openai import OpenAIChatCompletionClient

# GraphFlow is an experimental API. The builder calls, activation groups, and
# callable edge conditions below follow recent autogen-agentchat releases;
# verify them against the version you have installed.

# 1. Model client (reads OPENAI_API_KEY from the environment)
model_client = OpenAIChatCompletionClient(model="gpt-4o-mini")

# 2. Define agents (nodes in the graph)
triage_agent = AssistantAgent(
    name="triage_agent",
    model_client=model_client,
    system_message=(
        "You triage user requests. "
        "Classify each request as 'research' or 'code' and respond with: "
        "ROUTE: research or ROUTE: code. Do not add extra text."
    ),
)

research_agent = AssistantAgent(
    name="research_agent",
    model_client=model_client,
    system_message="You are a concise research assistant.",
)

code_agent = AssistantAgent(
    name="code_agent",
    model_client=model_client,
    system_message="You write and refactor Python code with explanations.",
)

# The reviewer names the specialist in its tag so that exactly one loop-back
# edge matches; a bare REVISE tag would activate both loop-back edges.
review_agent = AssistantAgent(
    name="review_agent",
    model_client=model_client,
    system_message=(
        "You review the latest specialist response for quality.\n"
        "Reply with exactly one of:\n"
        "- APPROVE\n"
        "- REVISE_RESEARCH: <short feedback> (if research_agent answered)\n"
        "- REVISE_CODE: <short feedback> (if code_agent answered)"
    ),
)

# Explicit end node for the APPROVE path, so the graph has a terminal node.
final_agent = AssistantAgent(
    name="final_agent",
    model_client=model_client,
    system_message="Restate the approved answer for the user without changing its substance.",
)


# 3. Edge conditions: predicates over the text of the source agent's last message
def tagged(prefix: str):
    return lambda msg: msg.to_model_text().strip().startswith(prefix)


# 4. Build the workflow graph
builder = DiGraphBuilder()
for agent in (triage_agent, research_agent, code_agent, review_agent, final_agent):
    builder.add_node(agent)
builder.set_entry_point(triage_agent)

# Branch at triage
builder.add_edge(triage_agent, research_agent, condition=tagged("ROUTE: research"))
builder.add_edge(triage_agent, code_agent, condition=tagged("ROUTE: code"))

# Either specialist hands off to review ("any": only one of them runs per pass)
builder.add_edge(research_agent, review_agent, activation_condition="any")
builder.add_edge(code_agent, review_agent, activation_condition="any")

# Loop: a REVISE tag goes back to the matching specialist; APPROVE exits.
# Loop-back edges get their own activation group so that a specialist does not
# wait for both triage and review before running again.
builder.add_edge(review_agent, research_agent, condition=tagged("REVISE_RESEARCH"), activation_group="revise")
builder.add_edge(review_agent, code_agent, condition=tagged("REVISE_CODE"), activation_group="revise")
builder.add_edge(review_agent, final_agent, condition=tagged("APPROVE"))

# 5. Wrap into a GraphFlow team
team = GraphFlow(
    participants=builder.get_participants(),
    graph=builder.build(),
    # Safety net: stop after 12 messages even if the reviewer never approves
    termination_condition=MaxMessageTermination(12),
)


# 6. Run the workflow
async def run_workflow(user_request: str) -> None:
    result = await team.run(task=user_request)
    # result is a TaskResult; inspect its messages and stop_reason
    print("Stop reason:", result.stop_reason)
    print("Final messages:")
    for m in result.messages:
        print(f"{m.source}: {m.to_text()[:300]}")
    await model_client.close()


# Run it from a script with:
# asyncio.run(run_workflow("Write a script to deduplicate rows in a CSV and explain it."))

Notes:

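Edge conditions are evaluated against the last message produced by the source agent. In recent AgentChat releases a condition can be a substring or a callable over that message; the exact shape may evolve, so check the current docs.
A hard cap on the run is a critical safety guard: it forces the loop to exit even if your conditions are buggy. In current AgentChat releases the cap is expressed as a termination condition such as MaxMessageTermination.
When no further node is ready to run, or the termination condition fires, GraphFlow stops and returns a TaskResult whose stop_reason says why.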

When to Use GraphFlow vs Group Chats

Use GraphFlow when you need strict control over the order in which agents act, or when different outcomes must lead to different next steps. This is the “state machine” scenario: onboarding flows, approval workflows, multi-step incident response, etc.
Start with RoundRobinGroupChat or SelectorGroupChat if ad-hoc conversation flow is sufficient and you just want a team to collaborate without fixed structure.
Transition to a structured workflow when your task requires deterministic control, conditional branching, or complex multi-step behavior.

Going Deeper: Core, Topics, and TaskResult

If you’re building serious platform-like systems (multi-tenant, distributed, heavy workloads), you’ll likely work directly with autogen-core underneath AgentChat.

Core gives you:

Event-driven runtimes like SingleThreadedAgentRuntime (local) and distributed runtimes (host servicer + workers + gateways).
Topics and subscriptions as your routing primitive instead of hard-coded agent IDs.
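Message passing only: Core has no built-in TaskResult (that type lives in AgentChat), so at this layer you define your own completion message that carries a stop reason describing how the workflow ended.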

The pattern I use in production:

Model each “workflow instance” as a topic:
Topic = (Topic Type, Topic Source) → string form workflow/user_123_incident_456
Agents subscribe via TypeSubscription(topic_type="workflow", agent_type="triage_agent") or similar.
The runtime ensures messages are delivered to the right agents; you manage branching and loops via message content and stop reasons.

A simplified local setup:

pip install -U "autogen-core"

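import asyncio
from dataclasses import dataclass

from autogen_core import (
    MessageContext,
    RoutedAgent,
    SingleThreadedAgentRuntime,
    TopicId,
    TypeSubscription,
    message_handler,
)

# A sketch against the autogen-core API; check the current core docs for exact signatures.


@dataclass
class WorkflowMessage:
    content: str


class TriageAgent(RoutedAgent):
    def __init__(self) -> None:
        super().__init__("Classifies incoming workflow requests")

    @message_handler
    async def on_request(self, message: WorkflowMessage, ctx: MessageContext) -> None:
        # Branching lives here: inspect the message, then publish the next
        # message (or a completion event) to the same workflow topic.
        print(f"[{self.id.key}] triaging: {message.content}")


async def main() -> None:
    runtime = SingleThreadedAgentRuntime()

    # Register the agent type and subscribe it to the "workflow" topic type
    await TriageAgent.register(runtime, "triage_agent", lambda: TriageAgent())
    await runtime.add_subscription(
        TypeSubscription(topic_type="workflow", agent_type="triage_agent")
    )
    # Similar for research_agent, code_agent, review_agent

    # Start a workflow by publishing an initial message to a topic
    runtime.start()
    await runtime.publish_message(
        WorkflowMessage(content="User request here"),
        topic_id=TopicId(type="workflow", source="user_123_incident_456"),
    )
    # Core does not return a TaskResult; to record a stop reason, have an agent
    # publish your own completion message (e.g., a dataclass with a stop_reason field).
    await runtime.stop_when_idle()


# asyncio.run(main())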

Using topics instead of direct agent IDs makes your workflows more portable:

You can move an agent implementation from single-process (SingleThreadedAgentRuntime) to a distributed runtime without rewriting the workflow logic.
You get natural multi-tenant isolation by encoding tenant or task IDs in the topic source, e.g. workflow/tenantA_case123.
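The naming convention above can be pinned down in a small helper so that every caller builds and parses topic strings the same way. This is a sketch under assumptions: WorkflowTopic, topic_for, and tenant_of are hypothetical names, not part of autogen-core, and the rule that tenant IDs contain no underscore is a choice made here to keep parsing unambiguous.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class WorkflowTopic:
    """Hypothetical value object mirroring the (topic type, topic source) pair."""

    topic_type: str
    topic_source: str

    def __str__(self) -> str:
        return f"{self.topic_type}/{self.topic_source}"

    @classmethod
    def parse(cls, value: str) -> "WorkflowTopic":
        # Split on the first "/" only; the source may contain underscores.
        topic_type, sep, topic_source = value.partition("/")
        if not (sep and topic_type and topic_source):
            raise ValueError(f"expected '<type>/<source>', got {value!r}")
        return cls(topic_type, topic_source)


def topic_for(tenant: str, case_id: str, topic_type: str = "workflow") -> WorkflowTopic:
    """Encode tenant and case into the topic source, e.g. tenantA_case123."""
    if not tenant or "_" in tenant or "/" in tenant:
        raise ValueError("tenant must be non-empty and contain no '_' or '/'")
    if not case_id or "/" in case_id:
        raise ValueError("case_id must be non-empty and contain no '/'")
    return WorkflowTopic(topic_type, f"{tenant}_{case_id}")


def tenant_of(topic: WorkflowTopic) -> str:
    """Recover the tenant: everything before the first underscore in the source."""
    return topic.topic_source.partition("_")[0]


topic = topic_for("tenantA", "case123")
print(topic)             # workflow/tenantA_case123
print(tenant_of(topic))  # tenantA
```

Keeping the encoding in one place means a later change to the convention (for example, a different separator) touches one function instead of every publisher.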

In practice, I define the logical workflow in terms of:

Topic naming conventions
Per-agent responsibilities
Conditions for publishing new messages or emitting a completion event

…and leave the routing mechanics to Core.

Common Mistakes to Avoid

Encoding state machine logic inside prompts:
If you rely on “Now ask the reviewer if it’s good, and if not, try again” inside instructions, you’re letting the LLM invent its own control flow. Instead, encode:
- Which agent runs next (graph edge or core logic)
- Under what condition (predicate on message content)
- With what safety limits (a cap on messages or turns, a timeout)
Hard-coding agent IDs instead of using topics/subscriptions:
When you hard-wire “send to agent X,” your workflow and deployment topology get entangled. Use topic types and type-based subscriptions (TypeSubscription) so you can refactor, scale out, or swap implementations without rewriting every call site.
Infinite or unbounded loops:
Loops are necessary, but every loop should have:
- A maximum number of iterations (a termination condition such as MaxMessageTermination in GraphFlow; a counter or timeout that you enforce yourself in Core)
- A clear exit condition (e.g., reviewer says APPROVE)
- A fallback stop reason you can monitor (e.g., the stop_reason reported when the cap is hit)
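The three loop properties listed above fit in a few lines of plain Python. This is a minimal sketch with stubbed callables standing in for the specialist and reviewer agents; review_loop, LoopOutcome, and the stop-reason strings are hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class LoopOutcome:
    stop_reason: str  # "approved" or "max_attempts_reached"
    attempts: int
    draft: str


def review_loop(
    produce: Callable[[str], str],
    review: Callable[[str], str],
    max_attempts: int = 3,
) -> LoopOutcome:
    """Run produce -> review until APPROVE or until the attempt cap is hit."""
    feedback = ""
    draft = ""
    for attempt in range(1, max_attempts + 1):  # hard cap on iterations
        draft = produce(feedback)
        verdict = review(draft).strip()
        if verdict == "APPROVE":  # clear exit condition
            return LoopOutcome("approved", attempt, draft)
        feedback = verdict.removeprefix("REVISE:").strip()
    # Fallback stop reason that monitoring can alert on
    return LoopOutcome("max_attempts_reached", max_attempts, draft)


# Stubs standing in for LLM-backed agents
def produce(feedback: str) -> str:
    return f"draft addressing: {feedback}" if feedback else "first draft"


def review(draft: str) -> str:
    return "APPROVE" if "tests" in draft else "REVISE: add tests"


outcome = review_loop(produce, review)
print(outcome.stop_reason, outcome.attempts)  # approved 2
```

A reviewer that never approves ends the loop with stop_reason "max_attempts_reached", which is the case to route to a human.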

Real-World Example

We replaced a “smart coordinator” prompt that tried to orchestrate four agents (triage, researcher, coder, reviewer) in one free-form group chat. The coordinator would sometimes:

Skip the reviewer to “save time”
Loop between researcher and coder indefinitely
Produce different sequences for the same input

By moving to GraphFlow:

We defined a sequential chain (triage → specialist → review) with explicit edges.
We added conditional branching at triage based on ROUTE: research vs ROUTE: code.
We implemented a loop between review and the specialist, with an iteration cap of 5 and an APPROVE exit condition.

Operationally:

We can now explain and replay runs from logs by listing nodes visited and decisions taken.
We alert on workflows whose stop_reason shows that the iteration cap was hit and treat them as “needs human review.”
Migrating from SingleThreadedAgentRuntime to a distributed runtime was just a runtime config change; the graph and agent definitions didn’t change.

Pro Tip: Design your agent messages to carry machine-readable signals for branching—short tags like ROUTE: code or APPROVE—and keep explanation text separate. That makes your edge conditions reliable, testable, and less brittle than parsing free-form prose.
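One way to apply this tip is to reserve the first line of each message for the tag and treat everything after it as explanation. The sketch below assumes that convention; parse_signal, Signal, and the default tag set are hypothetical and should be adapted to whatever tags your agents actually emit.

```python
import re
from typing import FrozenSet, NamedTuple, Optional


class Signal(NamedTuple):
    tag: str          # e.g. "ROUTE", "APPROVE", "REVISE"
    value: str        # text after the colon on the tag line, may be empty
    explanation: str  # everything after the first line


# First line must be an upper-case tag, optionally followed by ": value".
_TAG_LINE = re.compile(r"^([A-Z_]+)(?::\s*(.*))?$")

DEFAULT_TAGS: FrozenSet[str] = frozenset({"ROUTE", "APPROVE", "REVISE"})


def parse_signal(text: str, known_tags: FrozenSet[str] = DEFAULT_TAGS) -> Optional[Signal]:
    """Return the machine-readable signal, or None if the message has none."""
    lines = text.strip().splitlines()
    if not lines:
        return None
    match = _TAG_LINE.match(lines[0].strip())
    if match is None or match.group(1) not in known_tags:
        return None
    explanation = "\n".join(lines[1:]).strip()
    return Signal(match.group(1), (match.group(2) or "").strip(), explanation)


print(parse_signal("ROUTE: code\nThe user asked for a script."))
print(parse_signal("I think we should approve this."))
```

Because unknown or malformed first lines return None, an edge condition built on this parser fails closed rather than matching on a stray word in free-form prose.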

Summary

If your agent system feels like prompt spaghetti, you’re probably missing a proper workflow layer. AutoGen’s stack gives you several levels of control:

AgentChat + GraphFlow: Define workflows as directed graphs with sequential, parallel, conditional, and looping paths.
Core runtimes: Use topics and subscriptions for platform-grade routing, and carry stop reasons in your own completion messages (TaskResult(stop_reason=...) is what AgentChat teams return).
Message-level conventions: Use machine-readable tags and stop conditions instead of burying logic inside prompts.

Treat your agents like services in a state machine, not like a group chat of humans you hope will self-organize. The runtime—not the prompt—should own who acts next, under what conditions, and when the task is done.
