
AutoGen Studio: how do I prototype a multi-agent workflow in the UI and then move it into a Python repo for production?
Quick Answer: Use AutoGen Studio’s Team Builder to design and test your multi-agent workflow, then export the configuration as JSON and rehydrate it in Python using AgentChat (and optionally Core/GraphFlow) for production. You define agents, tools, and termination conditions in the UI, validate behavior in the Playground, then load the same declarative spec into your repo to wire it into a real runtime and CI/CD.
Why This Matters
If you’re serious about multi-agent systems, your biggest risks rarely come from prompts or model choice—they come from runtime behavior: brittle routing, uncontrolled context growth, and workflows that can’t be moved from a laptop prototype to a production topology. AutoGen Studio plus AgentChat gives you a single declarative spec you can iterate on visually, then re-use in code with proper runtimes (SingleThreadedAgentRuntime locally, distributed runtimes for scale), message filtering, and observability.
Key Benefits:
- Faster iteration: Prototype team structures, roles, and termination conditions in Studio’s UI before touching your Python code.
- Declarative portability: Export the same team definition as JSON and load it into AgentChat, so your “design” and “prod” agents stay in sync.
- Production controls: Attach real model clients, tools, runtimes, and message filters in Python to get reliable TaskResult(stop_reason=...) outcomes.
Core Concepts & Key Points
| Concept | Definition | Why it's important |
|---|---|---|
| AutoGen Studio (UI) | A web-based UI (built on AgentChat) for prototyping agents, tools, and teams without writing code. | Lets you experiment with multi-agent designs quickly, with visible conversations and metrics before committing to a Python implementation. |
| Declarative Team Spec (JSON) | A JSON representation of teams, agents, tools, models, and termination conditions created in AgentChat/Studio. | Acts as the “contract” between Studio and your Python repo—no more drifting prompts or hand-copied configs. |
| AgentChat & Core Runtimes | The autogen-agentchat high-level API, built on autogen-core’s event-driven runtime (SingleThreadedAgentRuntime, distributed runtimes). | Turns Studio prototypes into real applications with explicit routing, lifecycle, and result handling. |
How It Works (Step-by-Step)
At a high level, you’ll:
- Prototype a multi-agent workflow in AutoGen Studio’s Team Builder.
- Export the team config as a declarative JSON spec.
- Import that JSON into your Python repo and run it with AgentChat/Core.
1. Installation & Running AutoGen Studio
First, install Studio (Python 3.10+ required):
pip install -U autogenstudio
Start the UI (local, single process):
autogenstudio ui --port 8080 --appdir ./myapp
Then open http://localhost:8080 in your browser.
Note: Studio is local tooling, not a hosted service. Costs come from whatever model providers you configure (e.g., OpenAI, Azure OpenAI).
2. Prototype Your Multi-Agent Workflow in the UI
In Studio, you’ll primarily work in Team Builder and Playground.
2.1 Create a new Team
- Go to Team Builder.
- Click New Team.
- Give it a name like research_summary_flow.
You’ll now define:
- Agents (e.g., “Researcher”, “Critic”, “Summarizer”)
- Tools (e.g., web search, code executor)
- Models
- Termination conditions
2.2 Define agents and roles
Inside the team:
- Add an agent for each role:
  - researcher_agent
  - critic_agent
  - summarizer_agent
- For each agent, set:
  - Model: pick a default from the Studio model list.
  - System prompt: e.g., “You are a detailed researcher…”.
  - Termination / max turns: how long it should talk.
This maps directly to AgentChat’s agent definitions under the hood, which is what you’ll re-use later.
2.3 Attach tools and models
Still in Team Builder:
- Attach tools to specific agents (e.g., a search tool to researcher_agent).
- Configure model parameters (temperature, max tokens) if Studio exposes them in your build.
Why this matters: You’re defining behavior declaratively. That means later you can dump this configuration to JSON and load it in Python with minimal drift.
2.4 Set termination conditions
Studio lets you define team-level termination conditions—for example:
- “Stop when the summarizer_agent outputs a message that includes FINAL ANSWER.”
- “Stop after N total turns.”
These map to AgentChat termination configs, which you’ll retrieve from the JSON spec in your repo.
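To make the semantics of those two example conditions concrete, here is a plain-Python sketch of the stopping logic. It is an illustration only, not AgentChat's implementation: the `Message` class and the `should_stop` helper are hypothetical names invented for this example, and the default marker, agent name, and turn limit are placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class Message:
    """Hypothetical stand-in for a chat message (not an AgentChat type)."""
    source: str
    content: str


def should_stop(
    messages: Sequence[Message],
    mention: str = "FINAL ANSWER",
    mention_source: str = "summarizer_agent",
    max_turns: int = 10,
) -> Optional[str]:
    """Return a human-readable stop reason, or None to keep going."""
    # Condition 1: the designated agent produced the marker text.
    for msg in messages:
        if msg.source == mention_source and mention in msg.content:
            return f"'{mention}' mentioned by {mention_source}"
    # Condition 2: the conversation used up its turn budget.
    if len(messages) >= max_turns:
        return f"maximum of {max_turns} turns reached"
    return None


history = [
    Message("researcher_agent", "Found three relevant releases."),
    Message("critic_agent", "The second source looks outdated."),
]
print(should_stop(history))  # no condition met yet
history.append(Message("summarizer_agent", "FINAL ANSWER: two breaking changes."))
print(should_stop(history))  # marker condition met
```

AgentChat ships comparable building blocks (TextMentionTermination and MaxMessageTermination) that can be combined with the `|` operator; check the versioned docs for their exact parameters.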
2.5 Test in Playground
After defining your team:
- Click Attach to Session or send it to Playground.
- In Playground, start a new session with your team.
- Run a few tasks—e.g., “Research the last 3 AutoGen releases and summarize the breaking changes.”
Here you can:
- Inspect the “inner monologue” between agents.
- See artifacts (code snippets, tool outputs).
- Monitor turn counts and token usage.
When you’re satisfied with behavior, you’re ready to export.
3. Export the Team Configuration from Studio
AutoGen Studio builds on AgentChat’s declarative component specification. Practically, that means:
- Your team, agents, models, tools, and termination conditions can be represented as JSON.
- You can define them in Python and dump to JSON, or define in Studio and export JSON.
In Studio:
- Go to your team’s configuration view.
- Use the Export or Download JSON option (exact UI label may vary by version).
- Save the file as something like ./configs/research_summary_team.json
This JSON is your single source of truth for the workflow.
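Before wiring the export into code, it can help to see what it actually contains. The sketch below walks a spec and lists every nested component. It assumes each component is a JSON object with "provider", "component_type", and "config" keys. The `iter_components` helper and the trimmed `SAMPLE_SPEC` are hypothetical and only illustrate the shape, so adjust the key names if your export differs.

```python
import json
from typing import Any, Iterator, Tuple


def iter_components(node: Any, path: str = "team") -> Iterator[Tuple[str, str, str]]:
    """Yield (path, component_type, provider) for every nested component.

    Assumes a component is a dict carrying both "provider" and "config" keys.
    """
    if isinstance(node, dict):
        if "provider" in node and "config" in node:
            yield path, str(node.get("component_type", "unknown")), str(node["provider"])
            node = node["config"]  # descend into the component's own settings
        for key, value in node.items():
            yield from iter_components(value, f"{path}.{key}")
    elif isinstance(node, list):
        for index, item in enumerate(node):
            yield from iter_components(item, f"{path}[{index}]")


# Hypothetical, heavily trimmed example of an exported team spec.
SAMPLE_SPEC = json.loads("""
{
  "provider": "autogen_agentchat.teams.RoundRobinGroupChat",
  "component_type": "team",
  "config": {
    "participants": [
      {
        "provider": "autogen_agentchat.agents.AssistantAgent",
        "component_type": "agent",
        "config": {"name": "researcher_agent", "system_message": "You are a detailed researcher."}
      }
    ],
    "termination_condition": {
      "provider": "autogen_agentchat.conditions.MaxMessageTermination",
      "component_type": "termination",
      "config": {"max_messages": 10}
    }
  }
}
""")

for path, kind, provider in iter_components(SAMPLE_SPEC):
    print(f"{path}: {kind} <- {provider}")
```

To run it against a real export, replace `SAMPLE_SPEC` with the result of `json.load` on ./configs/research_summary_team.json.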
4. Move the Workflow into a Python Repo
Now we’ll rehydrate that JSON into a Python-based app.
4.1 Install AgentChat and Extensions
In your repo:
pip install -U "autogen-agentchat" "autogen-core" "autogen-ext[openai]"
- autogen-agentchat: high-level multi-agent API.
- autogen-core: event-driven runtime foundation.
- autogen-ext[openai]: maintained model client integrations (e.g., OpenAIChatCompletionClient).
Set your model provider credentials as environment variables (e.g., OPENAI_API_KEY).
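A small preflight check makes a missing credential fail fast with a clear message instead of surfacing mid-run. This is a minimal sketch: `missing_env_vars` and `require_env_vars` are hypothetical helper names, and the list of required variables depends on the providers your team spec references.

```python
import os
from typing import Iterable, List, Mapping


def missing_env_vars(names: Iterable[str], environ: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names that are unset or blank in the given environment."""
    return [name for name in names if not environ.get(name, "").strip()]


def require_env_vars(names: Iterable[str], environ: Mapping[str, str] = os.environ) -> None:
    """Raise RuntimeError listing every required variable that is missing."""
    missing = missing_env_vars(names, environ)
    if missing:
        raise RuntimeError("Missing required environment variables: " + ", ".join(missing))


# Demonstration against a fake environment, so it does not depend on your shell.
print(missing_env_vars(["OPENAI_API_KEY"], {"OPENAI_API_KEY": ""}))
```

In an application entry point you would call `require_env_vars(["OPENAI_API_KEY"])` once at startup, before loading the team.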
4.2 Minimal local runtime + JSON-loading pattern
Below is a minimal pattern I’d use to run a Studio-defined team in a local, single-process runtime. A team that is given no explicit runtime runs on an embedded SingleThreadedAgentRuntime, so nothing extra is needed for the local case.

    import asyncio
    import json

    from autogen_agentchat.teams import BaseGroupChat

    CONFIG_PATH = "configs/research_summary_team.json"


    async def main():
        # 1. Load the Studio-exported JSON
        with open(CONFIG_PATH, "r", encoding="utf-8") as f:
            team_config = json.load(f)

        # 2. Recreate the team (agents, model clients, tools, termination
        #    conditions) from the declarative spec. Model client settings live
        #    in the JSON; the OpenAI client reads OPENAI_API_KEY from the
        #    environment when the config carries no api_key.
        team = BaseGroupChat.load_component(team_config)

        # 3. Run a task through the team
        user_input = "Research the latest AutoGen releases and summarize breaking changes for 0.2.x to 0.4.x."
        result = await team.run(task=user_input)

        # 4. Inspect the structured result, not just text
        print("Stop reason:", result.stop_reason)
        print("Final messages:")
        for msg in result.messages:
            print(f"[{msg.source}] {msg.content}")


    if __name__ == "__main__":
        asyncio.run(main())

Note: The exact entry point (BaseGroupChat.load_component here) and the JSON structure may vary by release; rely on the versioned docs and migration guide. The important idea is: treat the Studio JSON as a declarative configuration and deserialize it into concrete agents/teams bound to a runtime.
5. Add Runtime Controls & Production Concerns
Once you’ve validated that the Studio-exported team runs in Python, extend it with runtime-level controls.
5.1 Use a distributed runtime when you scale
For heavier workloads or tenant isolation, move from SingleThreadedAgentRuntime to a distributed topology (host servicer + workers + gateways). At a high level:
- You keep the same declarative team spec.
- You change where the runtime lives and how agents are scheduled.
- Use topics/subscriptions rather than hard-coded agent IDs for routing so you can move teams between runtimes without rewriting your graph.
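The idea behind type-based subscriptions can be shown with a toy router. This is a plain-Python sketch under stated assumptions, not the autogen-core API: `Topic` and `TypeRouter` are hypothetical classes. The point is that a subscription maps a topic type to an agent type, while the topic source selects which agent instance handles the message, so no concrete agent ID is hard-coded.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Topic:
    """Hypothetical stand-in for a (topic type, topic source) pair."""
    type: str
    source: str

    def __str__(self) -> str:
        return f"{self.type}/{self.source}"


class TypeRouter:
    """Toy router: maps a topic type to the agent types subscribed to it."""

    def __init__(self) -> None:
        self._subscriptions: Dict[str, List[str]] = {}

    def subscribe(self, topic_type: str, agent_type: str) -> None:
        self._subscriptions.setdefault(topic_type, []).append(agent_type)

    def resolve(self, topic: Topic) -> List[Tuple[str, str]]:
        """Return (agent_type, instance_key) pairs; the topic source picks the instance."""
        return [(agent_type, topic.source) for agent_type in self._subscriptions.get(topic.type, [])]


router = TypeRouter()
router.subscribe("research_request", "researcher_agent")
router.subscribe("research_request", "critic_agent")

topic = Topic(type="research_request", source="tenant-a")
print(str(topic))
print(router.resolve(topic))
```

Because routing depends only on types and sources, the same subscriptions keep working when the agents move from a local runtime to worker processes.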
From experience, I start with:
- Local: SingleThreadedAgentRuntime for dev & CI.
- Prod: a distributed runtime with:
  - host service
  - worker runtimes per node
  - gateway(s) per tenant or app boundary.
5.2 Message filtering to control context
Your Studio config defines who talks; in code, you decide what they see.
Use message filtering (e.g., MessageFilterAgent, PerSourceFilter) to:
- Reduce hallucinations
- Control memory load
- Focus agents only on relevant information
Pattern:
- Subscribe agents to a topic with a TypeSubscription.
- Wrap the team with a filter that limits history by source, topic, or last N messages.
Format reminder:
- Topic = (Topic Type, Topic Source)
- String form: "Topic_Type/Topic_Source"
This runtime-level control doesn’t exist in Studio’s UI yet, so it’s one of the first “production-only” upgrades I add.
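The filtering idea itself is simple: keep only the last few messages from each source an agent should see. The sketch below shows that logic in plain Python. It does not use MessageFilterAgent or PerSourceFilter; the `Message` class and `keep_last_per_source` helper are hypothetical names for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Sequence


@dataclass
class Message:
    """Hypothetical stand-in for a chat message (not an AgentChat type)."""
    source: str
    content: str


def keep_last_per_source(messages: Sequence[Message], limits: Dict[str, int]) -> List[Message]:
    """Keep only the last limits[source] messages from each listed source.

    Messages from unlisted sources are dropped; the original order is preserved.
    """
    remaining = dict(limits)
    kept_reversed: List[Message] = []
    # Walk backwards so the most recent messages claim the quota first.
    for msg in reversed(messages):
        if remaining.get(msg.source, 0) > 0:
            kept_reversed.append(msg)
            remaining[msg.source] -= 1
    return list(reversed(kept_reversed))


history = [
    Message("triage_agent", "t1"),
    Message("researcher_agent", "r1"),
    Message("researcher_agent", "r2"),
    Message("triage_agent", "t2"),
    Message("researcher_agent", "r3"),
    Message("researcher_agent", "r4"),
]
view = keep_last_per_source(history, {"triage_agent": 1, "researcher_agent": 3})
print([m.content for m in view])
```

In a real deployment the same policy would be expressed through the library's filter configuration rather than a hand-rolled function; the sketch is only meant to make the intended behavior testable.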
5.3 Treat TaskResult(stop_reason=...) as a contract
Don’t just read the final message—use the structured result:
- stop_reason tells you why the workflow ended (termination condition, error, max turns).
- messages gives you a typed history for logging/observability.
Wire TaskResult into your app logic and monitoring; it’s how you tell whether a Studio-designed workflow behaves correctly under real load.
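One way to wire this in is to map the stop reason onto a small set of application-level statuses that dashboards and alerts can key on. The sketch below assumes the stop reason is a free-text string whose wording depends on the termination condition that fired. `RunOutcome` and `classify_stop` are hypothetical names, and the substring checks are placeholders to adapt to the reasons your own team emits.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RunOutcome:
    """Hypothetical stand-in exposing the two fields discussed above."""
    stop_reason: Optional[str]
    messages: List[str] = field(default_factory=list)


def classify_stop(outcome: RunOutcome, success_marker: str = "FINAL ANSWER") -> str:
    """Map a free-text stop reason onto a small set of app-level statuses."""
    reason = (outcome.stop_reason or "").lower()
    if not reason:
        return "unknown"           # the run ended without reporting why
    if success_marker.lower() in reason:
        return "completed"         # the expected marker ended the run
    if "max" in reason:
        return "budget_exhausted"  # a turn or message limit was hit
    return "needs_review"          # anything else deserves a human look


print(classify_stop(RunOutcome("Text 'FINAL ANSWER' mentioned")))
print(classify_stop(RunOutcome("Maximum number of messages 10 reached")))
```

Emitting the resulting status as a metric makes it easy to alert when the share of non-completed runs rises under load.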
Common Mistakes to Avoid
- Treating Studio configs as throwaway prototypes:
  How to avoid it: From day one, export your team JSON and check it into your repo under configs/. Treat it like code and version it alongside your Python modules.
- Hard-coding routing and agent IDs in Python:
  How to avoid it: Prefer topics and TypeSubscription over direct agent IDs. That way, Studio-defined teams remain portable across local and distributed runtimes.
Real-World Example
In our regulated environment, we started with a Studio-defined “triage → researcher → summarizer” team to help analysts scan long incident reports. We:
- Built the team in AutoGen Studio, including termination conditions like “stop when summarizer outputs FINAL SUMMARY:”.
- Tested multiple prompt strategies by just tweaking the system messages in Team Builder and using the Playground’s metrics to watch turn counts and token usage.
- Once the team was stable, we exported its JSON, checked it into our Git repo, and wired it into a SingleThreadedAgentRuntime for a CLI tool our analysts could run.
- As usage grew, we moved the same JSON into a distributed runtime and added message filters so the summarizer only saw the latest triage decision plus the last three research messages.
We never rewrote the team logic—just swapped runtimes and added filters. Studio stayed the design surface; AgentChat/Core handled production behavior.
Pro Tip: Treat the Studio JSON as your interface and your Python runtime as an implementation detail. If you keep prompts, teams, and termination conditions declarative, you can swap models, runtimes, and even execution topologies without redoing the workflow design.
Summary
AutoGen Studio is an efficient front-end for designing multi-agent workflows, but production reliability lives in AgentChat and Core’s runtimes. The clean path is:
- Prototype agents, tools, and teams visually in Studio’s Team Builder.
- Validate behavior in Playground with real prompts and termination conditions.
- Export the team configuration as JSON and commit it to your repo.
- Rehydrate that spec in Python using AgentChat on top of a SingleThreadedAgentRuntime for local use.
- Evolve to distributed runtimes, message filtering, and topic-based routing as you move toward production.
With that pattern, you keep a single source of truth for your workflows and avoid the usual “works in the demo, fails in production” gap that kills most agentic apps.