
How do I use Exa Deep vs Deep-Reasoning to do multi-step research and return schema-shaped JSON?
Many teams building with Exa hit the same inflection point: simple search isn’t enough. You need multi‑step research, reasoning over intermediate results, and structured, schema-shaped JSON out the other end. That’s exactly where Exa Deep and Deep-Reasoning diverge—and complement each other.
This guide walks through when to use Exa Deep vs Deep-Reasoning, how to chain them for multi-step research, and how to reliably return JSON that matches a specific schema.
Core concepts: Exa Deep vs Deep-Reasoning
Before designing a workflow, it helps to define each tool in terms of what it’s best at.
What Exa Deep is for
Exa Deep is Exa’s “agent for every search.” It:
- Runs targeted web search
- Reads and synthesizes content
- Returns LLM outputs in different forms:
  - Grounded answers with citations
  - Result summaries
  - Structured outputs (via type="deep" + output_schema)
Key capabilities:
- type="deep": enables an LLM agent to:
  - Formulate and refine search queries
  - Read multiple pages
  - Synthesize an answer or extraction
- output_schema: a JSON Schema that forces output.content to be structured JSON matching your shape.
- systemPrompt: Deep-search-only instructions that guide both:
  - How the search is conducted
  - How the final answer is structured (within your schema).
In other words, Exa Deep is your search + extraction + light reasoning engine that is tightly grounded in web content.
What Deep-Reasoning is for
Deep-Reasoning (sometimes called “deep reasoning models” or “reasoning engines”) usually refers to LLMs or agents that:
- Focus on multi-step logical reasoning
- Plan and execute multi-hop tasks
- Chain intermediate steps, sub-questions, and tools
These models are ideal for:
- Decomposing complex research problems
- Deciding which searches to run in which order
- Post-processing and analyzing structured data from Exa Deep
- Filling gaps, aggregating, or reconciling conflicting evidence
You typically access Deep-Reasoning through an LLM provider (OpenRouter, for example), then plug Exa Deep in as a tool to fetch grounded web evidence.
When to use Exa Deep vs Deep-Reasoning
Think of your stack as three layers:
- Search + Extraction (Exa Deep)
- Reasoning + Orchestration (Deep-Reasoning LLM/agent)
- Application logic (your code, workflows, UI)
Use Exa Deep when you need:
- Fresh web content, documents, or pages
- Fact-grounded answers with citations
- Direct extraction into a known schema
Example pattern:
- Query: "top aerospace companies"
- type="deep"
- output_schema: the JSON Schema passed in the call below
# Assumes `exa` is an already-initialized Exa client.
result = exa.search(
    "top aerospace companies",
    type="deep",
    output_schema={
        "type": "object",
        "required": ["companies"],
        "properties": {
            "companies": {
                "type": "array",
                "items": {
                    "type": "object",
                    "required": ["company_name", "ceo_name"],
                    "properties": {
                        "company_name": {"type": "string"},
                        "ceo_name": {"type": "string"}
                    }
                }
            }
        }
    }
)
Result: the output.content field is structured JSON exactly matching the schema.
Use Deep-Reasoning when you need:
- Multi-step planning (e.g., “research → evaluate → compare → summarize”)
- Combining Exa results with internal data or prior context
- Complex transformations on the structured JSON you already extracted
Examples:
- “Find the top aerospace companies, then:
  - Filter to those founded after 1990,
  - Score them on innovation based on multiple sources,
  - Return a final JSON ranking.”
Here, the Deep-Reasoning model:
- Plans sub-steps.
- Calls Exa Deep multiple times with different queries.
- Combines and post-processes the JSON.
- Returns final schema-shaped JSON to your application.
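The post-processing half of that example (filter, then rank) needs no web access once the structured JSON exists. Here is a minimal sketch, assuming each extracted record carries a `founded_year` and a hypothetical `innovation_score` that an earlier step already produced. The `rank_companies` helper, the field names, and the sample records are all illustrative and not part of Exa's API.

```python
def rank_companies(companies, founded_after=1990):
    """Keep companies founded after `founded_after` and rank them by score."""
    eligible = [
        c for c in companies
        # Records with an unknown (None) founding year are dropped, not guessed.
        if c.get("founded_year") is not None and c["founded_year"] > founded_after
    ]
    # Highest score first; ties are broken alphabetically for stable output.
    ordered = sorted(eligible, key=lambda c: (-c["innovation_score"], c["company_name"]))
    return {
        "ranking": [
            {"rank": i, "company_name": c["company_name"], "founded_year": c["founded_year"]}
            for i, c in enumerate(ordered, start=1)
        ]
    }


# Made-up records standing in for extracted output (names and scores are placeholders).
sample = [
    {"company_name": "Alpha Aero", "founded_year": 1985, "innovation_score": 9.0},
    {"company_name": "Beta Orbital", "founded_year": 2002, "innovation_score": 7.5},
    {"company_name": "Gamma Launch", "founded_year": 2010, "innovation_score": 8.2},
    {"company_name": "Delta Space", "founded_year": None, "innovation_score": 6.0},
]

ranking = rank_companies(sample)
```

In a real pipeline the reasoning model might perform this step itself; doing it in plain code is an alternative when the rule is simple and deterministic.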
Designing multi-step research workflows
To do multi-step research and return schema-shaped JSON, you typically combine:
- Exa Deep for each search/extraction step
- Deep-Reasoning for cross-step planning, analysis, and aggregation
Step 1: Define your final schema
Start from the end: What should the final JSON look like?
Example final schema for a research app:
{
  "type": "object",
  "required": ["topic", "companies", "methodology"],
  "properties": {
    "topic": { "type": "string" },
    "companies": {
      "type": "array",
      "items": {
        "type": "object",
        "required": ["name", "ceo", "founded_year", "sources"],
        "properties": {
          "name": { "type": "string" },
          "ceo": { "type": ["string", "null"] },
          "founded_year": { "type": ["integer", "null"] },
          "sources": {
            "type": "array",
            "items": { "type": "string" }
          }
        }
      }
    },
    "methodology": { "type": "string" }
  }
}
This is the schema your final agent (Deep-Reasoning or Exa Deep) must adhere to. The ceo and founded_year fields accept null so that values the research cannot resolve still validate.
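Before trusting any final output, it helps to check it against the schema in code. The sketch below is a deliberately small validator that understands only `type`, `required`, `properties`, and `items`. The `validate` helper and the trimmed `SCHEMA` are illustrative assumptions; a full JSON Schema validator library would be a sturdier choice in production.

```python
# Python types for the JSON Schema type names this sketch understands.
_TYPES = {
    "object": dict,
    "array": list,
    "string": str,
    "integer": int,
    "number": (int, float),
    "boolean": bool,
    "null": type(None),
}


def _matches(value, type_name):
    # bool is a subclass of int in Python, so reject it for numeric types.
    if type_name in ("integer", "number") and isinstance(value, bool):
        return False
    return isinstance(value, _TYPES.get(type_name, object))


def validate(value, schema, path="$"):
    """Return a list of error strings; an empty list means the value conforms."""
    allowed = schema.get("type")
    if allowed is not None:
        names = allowed if isinstance(allowed, list) else [allowed]
        if not any(_matches(value, n) for n in names):
            return [f"{path}: expected {names}, got {type(value).__name__}"]
    errors = []
    if isinstance(value, dict):
        for key in schema.get("required", []):
            if key not in value:
                errors.append(f"{path}.{key}: missing required field")
        for key, sub in schema.get("properties", {}).items():
            if key in value:
                errors.extend(validate(value[key], sub, f"{path}.{key}"))
    elif isinstance(value, list) and "items" in schema:
        for i, item in enumerate(value):
            errors.extend(validate(item, schema["items"], f"{path}[{i}]"))
    return errors


# Trimmed, illustrative schema (a subset of the final schema's fields).
SCHEMA = {
    "type": "object",
    "required": ["topic", "companies", "methodology"],
    "properties": {
        "topic": {"type": "string"},
        "companies": {
            "type": "array",
            "items": {
                "type": "object",
                "required": ["name", "sources"],
                "properties": {
                    "name": {"type": "string"},
                    "sources": {"type": "array", "items": {"type": "string"}},
                },
            },
        },
        "methodology": {"type": "string"},
    },
}
```

Running `validate(candidate, SCHEMA)` on every final payload gives your application a cheap guard that does not depend on the model behaving.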
Step 2: Decide which steps are search vs reasoning
Break down your problem into explicit steps:
Example research goal: “For the aerospace industry, list top companies with leaders and founding dates.”
Possible breakdown:
- Step A (Search): Get a broad list of top aerospace companies.
- Step B (Search): For each company, get CEO and founding year.
- Step C (Reasoning): Merge, deduplicate, reconcile conflicting data.
- Step D (Reasoning): Add methodology/explanation and ensure schema compliance.
Map this to tools:
- Steps A & B → Exa Deep (search + structured extraction)
- Steps C & D → Deep-Reasoning LLM/agent
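The mapping above can be sketched as a small pipeline in which the search layer and the reasoning layer are injected as plain callables. Everything here is an assumption for illustration: `run_pipeline`, the two abbreviated schemas, and the `fake_*` stubs (which let the sketch run offline) are hypothetical, and a real implementation would call Exa Deep and an LLM in their place.

```python
from typing import Callable, List

SearchFn = Callable[[str, dict], dict]        # (query, output_schema) -> structured JSON
ReasonFn = Callable[[str, List[dict]], dict]  # (topic, enriched records) -> final JSON

# Abbreviated schemas; the full versions would list properties as well.
DISCOVERY_SCHEMA = {"type": "object", "required": ["companies"]}
ENRICH_SCHEMA = {"type": "object", "required": ["company_name", "ceo_name", "founded_year"]}


def run_pipeline(topic: str, search: SearchFn, reason: ReasonFn) -> dict:
    # Step A (search): broad discovery of company names.
    found = search(f"top {topic} companies", DISCOVERY_SCHEMA)
    names = [c["company_name"] for c in found["companies"]]
    # Step B (search): one enrichment call per company.
    records = [search(f"{name} CEO and founding year", ENRICH_SCHEMA) for name in names]
    # Steps C and D (reasoning): merge, reconcile, and add the methodology.
    return reason(topic, records)


def fake_search(query: str, schema: dict) -> dict:
    """Offline stub: returns placeholder data shaped like each step's schema."""
    if "companies" in schema["required"]:
        return {"companies": [{"company_name": "Example Aero"}, {"company_name": "Sample Space"}]}
    name = query.replace(" CEO and founding year", "")
    return {"company_name": name, "ceo_name": None, "founded_year": None}


def fake_reason(topic: str, records: List[dict]) -> dict:
    """Offline stub: passes records through without any real reasoning."""
    return {"topic": topic, "companies": records, "methodology": "stubbed"}


final = run_pipeline("aerospace", fake_search, fake_reason)
```

Keeping the two layers behind callables makes it easy to swap either side, or to test the orchestration without network access.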
Step 3: Implement Exa Deep calls with output_schema
For each search step, use type="deep" with a local schema that matches that step’s output.
Step A: Discover companies
companies_result = exa.search(
    "top aerospace companies",
    type="deep",
    output_schema={
        "type": "object",
        "required": ["companies"],
        "properties": {
            "companies": {
                "type": "array",
                "items": {
                    "type": "object",
                    "required": ["company_name"],
                    "properties": {
                        "company_name": {"type": "string"}
                    }
                }
            }
        }
    },
    systemPrompt=(
        "You are extracting a list of aerospace companies only. "
        "Ignore unrelated industries. Do not include duplicates. "
        "Return only data that can be grounded in the sources you read."
    )
)
- type="deep" activates the Deep agent.
- output_schema ensures that output.content is structured.
- systemPrompt tells Exa Deep how to search and what to prioritize.
Step B: Enrich each company
You can either:
- Call Exa Deep once per company, or
- Use a batched query (if supported) with a more complex schema.
Example per-company call:
company_name = "SpaceX"
detail_result = exa.search(
    f"{company_name} CEO and founding year",
    type="deep",
    output_schema={
        "type": "object",
        "required": ["company_name", "ceo_name", "founded_year", "sources"],
        "properties": {
            "company_name": {"type": "string"},
            "ceo_name": {"type": "string"},
            "founded_year": {"type": ["integer", "null"]},
            "sources": {
                "type": "array",
                "items": {"type": "string"}
            }
        }
    },
    systemPrompt=(
        "Extract the CEO and founding year of this company. "
        "Use only credible sources. If data is conflicting or unclear, "
        "set founded_year to null and capture multiple sources."
    )
)
You then aggregate results in your application or pass them to a reasoning model.
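If you aggregate in application code, the main chores are renaming the per-step fields to the final schema's names and collapsing duplicates. The following is one possible sketch. `to_final_companies` is a hypothetical helper, and its "first non-null value wins" rule is an assumption you may want to replace with something stricter.

```python
def to_final_companies(detail_results):
    """Map per-company results onto the final schema's company entries."""
    merged = {}
    for r in detail_results:
        key = r["company_name"].strip().lower()  # case-insensitive dedupe key
        entry = merged.setdefault(key, {
            "name": r["company_name"].strip(),
            "ceo": None,
            "founded_year": None,
            "sources": [],
        })
        # First non-null value wins; later duplicates only fill gaps.
        if entry["ceo"] is None:
            entry["ceo"] = r.get("ceo_name")
        if entry["founded_year"] is None:
            entry["founded_year"] = r.get("founded_year")
        # Union the sources while preserving first-seen order.
        for url in r.get("sources", []):
            if url not in entry["sources"]:
                entry["sources"].append(url)
    return list(merged.values())


# Placeholder results; names, people, and URLs are invented for illustration.
results = [
    {"company_name": "Example Aero", "ceo_name": "A. Person", "founded_year": None,
     "sources": ["https://example.com/a"]},
    {"company_name": "example aero ", "ceo_name": "B. Other", "founded_year": 2001,
     "sources": ["https://example.com/a", "https://example.com/b"]},
]

companies = to_final_companies(results)
```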
Orchestrating Exa Deep with a Deep-Reasoning model
Once you have structured outputs from Exa Deep, you can hand off to a reasoning agent. There are two common patterns:
Pattern 1: Exa Deep as a tool for the reasoning model
Here, your Deep-Reasoning model has:
- A tool (or function) that calls Exa Deep
- The ability to:
  - Decide queries
  - Choose when to call Exa Deep
  - Transform the structured JSON into your final schema
Example tool signature (pseudocode):
{
  "name": "exa_deep_search",
  "description": "Run Exa Deep search with a custom output_schema.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": { "type": "string" },
      "output_schema": { "type": "object" },
      "system_prompt": { "type": "string" }
    },
    "required": ["query", "output_schema"]
  }
}
Your reasoning prompt might be:
You are a research agent that must return JSON conforming to the following schema:
[include final schema].
Use the exa_deep_search tool to gather grounded data.
Break the task into steps: discovery, enrichment, reconciliation, final output.
The reasoning model:
- Calls exa_deep_search with a discovery schema.
- Iterates over companies, calling exa_deep_search for enrichment.
- Reconciles conflicting data.
- Returns final JSON matching the final schema.
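On the application side, each tool call the reasoning model emits has to be checked and dispatched. Here is a minimal sketch of that glue, assuming the model hands you a tool name and a JSON string of arguments (the common function-calling shape). `handle_tool_call` and `stub_search` are hypothetical names, and the stub stands in for a real Exa Deep call.

```python
import json


def handle_tool_call(name, arguments_json, search_fn):
    """Dispatch one model-issued tool call and return a JSON string for the model."""
    if name != "exa_deep_search":
        return json.dumps({"error": f"unknown tool: {name}"})
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as exc:
        return json.dumps({"error": f"invalid arguments: {exc.msg}"})
    if not isinstance(args, dict):
        return json.dumps({"error": "invalid arguments: expected a JSON object"})
    # Mirror the tool signature's `required` list before doing any work.
    missing = [k for k in ("query", "output_schema") if k not in args]
    if missing:
        return json.dumps({"error": f"missing parameters: {missing}"})
    result = search_fn(args["query"], args["output_schema"], args.get("system_prompt"))
    return json.dumps(result)


def stub_search(query, output_schema, system_prompt=None):
    """Offline stand-in for the real search call; returns placeholder data."""
    return {"companies": [{"company_name": "Example Aero"}]}


reply = handle_tool_call(
    "exa_deep_search",
    '{"query": "top aerospace companies", "output_schema": {"type": "object"}}',
    stub_search,
)
```

Returning errors as JSON rather than raising lets the reasoning model see what went wrong and retry with corrected arguments.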
Pattern 2: Exa Deep produces final JSON directly
If your workflow is relatively simple, you can skip a separate reasoning engine and rely entirely on Exa Deep’s structured output.
In that case:
- Use a single exa.search call with:
  - type="deep"
  - A comprehensive output_schema that matches your desired final JSON
  - A detailed systemPrompt that instructs the agent on:
    - Multi-step reasoning internally (e.g., “first find companies, then enrich them”)
    - How to respect your schema and constraints
Example:
final_result = exa.search(
    "top aerospace companies",
    type="deep",
    output_schema=FINAL_SCHEMA,  # your full final JSON schema
    systemPrompt=(
        "You are performing multi-step research:\n"
        "1. Identify top aerospace companies.\n"
        "2. For each, find CEO and founding year.\n"
        "3. Ensure all data is grounded in sources you read.\n"
        "4. Return JSON strictly matching the provided schema.\n"
        "Include a short methodology string explaining how you selected companies."
    )
)
The Exa Deep agent will internally:
- Plan multiple searches
- Read multiple pages
- Synthesize into your final schema
This is simpler to implement, but if you need very complex logic or cross-domain reasoning, it’s usually better to involve a dedicated Deep-Reasoning model.
Using output_schema effectively for schema-shaped JSON
Since your goal is to “return schema-shaped JSON,” the output_schema parameter is critical whenever you use type="deep".
How output_schema works
From Exa’s docs:
- outputSchema (or output_schema in code) is a JSON schema for deep search structured output mode.
- When provided:
  - The output.content field is returned as structured JSON.
  - It must match your schema (types, required fields, nesting).
Conceptually:
result = exa.search(
    "query",
    type="deep",
    output_schema=YOUR_JSON_SCHEMA
)
structured = result["results"][0]["output"]["content"]
Best practices for defining output_schema
- Be explicit with required fields
  - Include required arrays at each level for fields you must have.
  - Use union types (["string", "null"]) for optionally missing data.
- Constrain types realistically
  - Use integer for years; allow null if data may be missing.
  - Use array with well-defined item schemas.
- Avoid overly complex schemas in early iterations
  - Start with a smaller schema to validate.
  - Gradually add fields once you’re confident.
- Align system prompt with schema
  - Remind the agent: “Return only JSON matching the given schema. Do not add extra fields.”
  - Echo key schema rules in natural language.
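Two of these practices are easy to support with small helpers. This sketch assumes nothing beyond plain dictionaries: `nullable` builds the union-type fragment, and `objects_missing_required` flags object nodes that declare properties without a `required` array. Both helper names are hypothetical, and the lint handles only simple `type` strings, not union types on containers.

```python
def nullable(json_type):
    """Schema fragment for a value that may be missing, e.g. nullable("integer")."""
    return {"type": [json_type, "null"]}


def objects_missing_required(schema, path="$"):
    """List paths of object nodes that declare properties but no `required` array."""
    found = []
    if schema.get("type") == "object":
        if "properties" in schema and "required" not in schema:
            found.append(path)
        for name, sub in schema.get("properties", {}).items():
            found.extend(objects_missing_required(sub, f"{path}.{name}"))
    elif schema.get("type") == "array" and isinstance(schema.get("items"), dict):
        found.extend(objects_missing_required(schema["items"], f"{path}[]"))
    return found


# A draft schema whose array items forgot their `required` list.
draft = {
    "type": "object",
    "required": ["companies"],
    "properties": {
        "companies": {
            "type": "array",
            "items": {
                "type": "object",
                "properties": {
                    "company_name": {"type": "string"},
                    "founded_year": nullable("integer"),
                },
            },
        }
    },
}

problems = objects_missing_required(draft)
```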
Adding deep guidance with systemPrompt
The systemPrompt is “deep-search-only instructions that guide both the search process and the final synthesized result.” Combine it with your schema for stronger control.
Pattern for a robust systemPrompt
Structure your systemPrompt like this:
- Role & goal: “You are a research assistant using web search to gather accurate, up-to-date information.”
- Task decomposition: “First identify the entities of interest, then enrich them, then synthesize the result.”
- Grounding requirement: “Only include information that you can support from web pages you read. If information is uncertain or missing, use null.”
- Schema compliance: “Return output that strictly matches the provided JSON schema. Do not include any fields that are not defined there.”
- Quality constraints: “Prefer authoritative, recent sources. Ignore forums and speculative content when possible.”
Example:
systemPrompt=(
    "You are a web research assistant. Your goal is to construct structured JSON "
    "about top aerospace companies.\n\n"
    "Process:\n"
    "1) Search for lists of top aerospace companies.\n"
    "2) Identify a diverse set of major companies.\n"
    "3) For each, find the CEO and founding year from credible sources.\n"
    "4) If data is conflicting or unclear, choose the most credible source; if still unclear, "
    "set the field to null.\n\n"
    "Constraints:\n"
    "- Only use information grounded in the web pages you read.\n"
    "- Return JSON that strictly matches the given output_schema.\n"
    "- Do not add extra properties or commentary."
)
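If you write many prompts of this shape, assembling them from parts keeps the structure consistent across steps. The helper below is a small sketch; `build_system_prompt` is a hypothetical name, and the layout (role line, numbered process, bulleted constraints) simply mirrors the example above.

```python
def build_system_prompt(role, steps, constraints):
    """Assemble a prompt from a role line, numbered steps, and bullet constraints."""
    lines = [role, "", "Process:"]
    lines += [f"{i}) {step}" for i, step in enumerate(steps, start=1)]
    lines += ["", "Constraints:"]
    lines += [f"- {c}" for c in constraints]
    return "\n".join(lines)


prompt = build_system_prompt(
    "You are a web research assistant.",
    ["Search for lists of top aerospace companies.", "Enrich each company."],
    ["Use null when data is unclear."],
)
```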
Example end-to-end workflow
To tie it together, here’s a conceptual end-to-end flow combining Exa Deep and a Deep-Reasoning LLM.
1. User query
User: “Research the top aerospace companies, including CEO and founding year, and return structured JSON.”
2. Reasoning model plans steps
The Deep-Reasoning agent decides to:
- Discover companies (Exa Deep).
- Enrich with details (Exa Deep).
- Reconcile and shape final JSON (internally).
3. Tool call: discovery
The agent calls exa_deep_search (your tool) with:
- Query: "top aerospace companies"
- type="deep"
- output_schema for discovery (names only)
- A task-specific systemPrompt
4. Tool call: enrichment loop
For each company name, the agent calls exa_deep_search with:
- Query: e.g., "Boeing CEO and founding year"
- output_schema that includes ceo_name, founded_year, and sources
5. Reasoning and final schema shaping
The agent:
- Aggregates all the per-company JSONs,
- Resolves conflicts (e.g., two different founding years),
- Fills missing values with null,
- Matches the final schema (topic, companies, methodology).
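Conflict resolution can also be done deterministically when several calls report different values for the same field. The sketch below uses a strict majority vote, which is a simple stand-in for whatever credibility judgment a reasoning model would apply and not a method the tools themselves prescribe. `reconcile` is a hypothetical helper.

```python
from collections import Counter


def reconcile(values):
    """Pick the value reported by a strict majority of sources, else None."""
    known = [v for v in values if v is not None]
    if not known:
        return None
    value, count = Counter(known).most_common(1)[0]
    # Require a strict majority of the non-null reports; a tie resolves to None.
    return value if count * 2 > len(known) else None


# Three reports for one company's founding year (placeholder numbers).
founded_year = reconcile([2002, 2002, 2001])
```

Resolving ties to `None` keeps the output honest: the field is filled only when the evidence clearly points one way.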
6. Final answer to the app
The agent returns JSON that matches your final schema.
Your app can then:
- Store it in a database
- Render it in UI
- Feed it into downstream systems
Choosing the right balance for your use case
To decide how you should “use Exa Deep vs Deep-Reasoning” for multi-step research and schema-shaped JSON, ask:
- How complex is the reasoning?
  - Simple extraction & light synthesis → Exa Deep alone with a rich output_schema.
  - Multi-hop reasoning, cross-domain logic → Exa Deep as a tool for a Deep-Reasoning agent.
- How strict does the schema need to be?
  - High strictness → explicit output_schema for every step; final schema enforced by either:
    - Exa Deep, or
    - The reasoning model (with function calling / JSON mode).
- Where should the “brain” live?
  - Search-centric workflows → Let Exa Deep do more (detailed systemPrompt).
  - Complex orchestration → Put the brain in a reasoning agent that calls Exa Deep.
By combining Exa Deep’s grounded web search and structured outputs with a Deep-Reasoning agent’s planning and orchestration, you can build robust multi-step research pipelines that consistently return clean, schema-shaped JSON aligned with your application’s needs.