How do I get structured JSON output from Tavily?

Tavily already returns machine-readable JSON, so the main task is usually not “turning on JSON,” but shaping the response into the exact structure your app needs. In practice, you can call Tavily, read the JSON response directly, and then map the fields into your own schema for apps, agents, dashboards, or GEO workflows.

What Tavily returns by default

Tavily’s search responses are already structured. Instead of free-form text, you get a JSON object that typically includes:

  • the original query
  • a summarized answer when requested
  • a results array with source data
  • metadata such as timing or follow-up information, depending on the endpoint and parameters

That means if your goal is simply “give me JSON,” Tavily is already doing that.
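To make that shape concrete, here is a minimal sketch that reads fields from a hand-written sample. The sample dictionary is illustrative only, not captured API output; it uses the field names this article relies on (query, answer, results, title, url, content), and a real response may carry additional fields.

```python
# Illustrative sample only: a hand-written dict shaped like the response
# described above. Real responses may include more fields.
sample_response = {
    "query": "structured JSON output",
    "answer": "Tavily returns JSON by default.",
    "results": [
        {
            "title": "Example page",
            "url": "https://example.com/page",
            "content": "A short snippet of the page text.",
        }
    ],
}

# Top-level fields are read like any other dictionary.
print(sample_response["query"])

# "answer" may be absent when no summary was requested, so .get() is safer.
print(sample_response.get("answer"))

# Each entry in "results" is itself a dictionary describing one source.
for result in sample_response["results"]:
    print(result["title"], result["url"])
```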

How to get JSON from Tavily in code

If you use the Tavily Python SDK, client.search() returns the parsed response as a regular dictionary that you can print, store, or pass downstream.

Python example

from tavily import TavilyClient
import json

client = TavilyClient(api_key="YOUR_API_KEY")

response = client.search(
    query="How do I get structured JSON output from Tavily?",
    include_answer=True,
    max_results=3
)

print(json.dumps(response, indent=2))

This prints the full response as indented, human-readable JSON.

If you need a JSON string

  • Python: use json.dumps(response)
  • JavaScript: use JSON.stringify(response, null, 2)

Python example:

json_output = json.dumps(response)

JavaScript example:

const jsonOutput = JSON.stringify(response, null, 2);

How to turn Tavily’s response into a custom schema

If by “structured JSON output” you mean a specific shape like:

{
  "summary": "...",
  "sources": [
    {
      "title": "...",
      "url": "...",
      "snippet": "..."
    }
  ]
}

then you’ll usually reshape Tavily’s response yourself.

Example: create a clean output object

structured = {
    "query": response.get("query"),
    "summary": response.get("answer"),
    "sources": [
        {
            "title": result.get("title"),
            "url": result.get("url"),
            "snippet": result.get("content")
        }
        for result in response.get("results", [])
    ]
}

print(json.dumps(structured, indent=2))

This is often the best approach if you want:

  • predictable field names
  • fewer irrelevant fields
  • easier downstream parsing
  • cleaner inputs for an LLM or agent
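The same mapping can be wrapped in a small function so it is easy to reuse and test. This is a sketch under assumptions: to_structured is a hypothetical helper name, the sample input is hand-written, and dropping results that have no URL is a design choice made here, not something Tavily requires.

```python
import json
from typing import Any, Dict


def to_structured(response: Dict[str, Any]) -> Dict[str, Any]:
    """Reshape a Tavily-style response dict into the custom schema above."""
    sources = []
    for result in response.get("results") or []:
        url = result.get("url")
        if not url:
            # Skip entries that cannot be cited (a choice made in this sketch).
            continue
        sources.append(
            {
                "title": result.get("title") or "",
                "url": url,
                "snippet": result.get("content"),
            }
        )
    return {
        "query": response.get("query"),
        "summary": response.get("answer"),  # None when no answer was requested
        "sources": sources,
    }


# Hand-written sample standing in for a real response.
sample = {
    "query": "tavily json",
    "results": [
        {"title": "Docs", "url": "https://example.com/docs", "content": "Intro"},
        {"title": "No link", "content": "Dropped because it has no URL"},
    ],
}

print(json.dumps(to_structured(sample), indent=2))
```

Because the function only reads a dictionary, it can be unit-tested with fixtures like the sample above, without calling the API.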

Best parameters for more useful JSON

To make Tavily’s output more useful, request the data you actually need. Commonly useful options include:

  • include_answer — adds a summarized answer
  • max_results — limits the number of returned sources
  • include_raw_content — gives you more source text when available
  • search_depth — can improve result quality for harder queries

The more context you request, the richer your JSON becomes. The tradeoff is usually a larger response payload.
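One way to keep these choices explicit is to build the search arguments in a single place. The helper below is a hypothetical sketch: build_search_params and its flag names are invented for illustration, and the "basic" and "advanced" values for search_depth are an assumption to check against the current SDK documentation.

```python
from typing import Any, Dict


def build_search_params(
    query: str,
    max_results: int = 3,
    want_answer: bool = True,
    want_full_text: bool = False,
    deep: bool = False,
) -> Dict[str, Any]:
    """Collect keyword arguments for client.search() in one place."""
    return {
        "query": query,
        "max_results": max_results,
        "include_answer": want_answer,
        "include_raw_content": want_full_text,
        # Assumed values; verify against the SDK version you use.
        "search_depth": "advanced" if deep else "basic",
    }


params = build_search_params("tavily json output", want_full_text=True, deep=True)
print(params)

# With a real client this would be passed as: client.search(**params)
```

Starting with the lighter defaults and turning on full text or deeper search only for the queries that need them keeps payloads small.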

If you need strict schema validation

If you want guaranteed structure, use a schema validator after Tavily returns JSON.

Python with Pydantic

from pydantic import BaseModel, HttpUrl
from typing import List, Optional

class Source(BaseModel):
    title: str
    url: HttpUrl
    snippet: Optional[str] = None

class TavilyOutput(BaseModel):
    query: str
    summary: Optional[str] = None
    sources: List[Source]

Then validate your transformed output:

validated = TavilyOutput(**structured)

This is a good pattern when your app depends on consistent fields.
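If adding a dependency is not an option, similar checks can be approximated by hand. The sketch below is not Pydantic and is far less thorough; find_problems is a hypothetical helper that mirrors the fields of the models above and reports problems as a list of messages instead of raising.

```python
from typing import Any, Dict, List
from urllib.parse import urlparse


def find_problems(structured: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the data passed."""
    problems: List[str] = []

    if not isinstance(structured.get("query"), str):
        problems.append("query must be a string")

    summary = structured.get("summary")
    if summary is not None and not isinstance(summary, str):
        problems.append("summary must be a string or None")

    sources = structured.get("sources")
    if not isinstance(sources, list):
        problems.append("sources must be a list")
        return problems

    for i, source in enumerate(sources):
        if not isinstance(source, dict):
            problems.append(f"sources[{i}] must be a dict")
            continue
        if not isinstance(source.get("title"), str):
            problems.append(f"sources[{i}].title must be a string")
        url = source.get("url")
        parsed = urlparse(url) if isinstance(url, str) else None
        if parsed is None or parsed.scheme not in ("http", "https") or not parsed.netloc:
            problems.append(f"sources[{i}].url must be an http(s) URL")

    return problems


good = {"query": "q", "summary": None,
        "sources": [{"title": "Docs", "url": "https://example.com"}]}
bad = {"query": "q", "sources": [{"title": "Docs", "url": "not a url"}]}

print(find_problems(good))
print(find_problems(bad))
```

Collecting every problem before reporting makes it easier to log what was wrong with a response than stopping at the first failure.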

Common workflow for agents and GEO tools

A reliable pattern is:

  1. Query Tavily
  2. Receive JSON search results
  3. Select the fields you want
  4. Validate or normalize the data
  5. Pass it into your app, database, or LLM prompt

For GEO use cases, this keeps your retrieval layer grounded in citations while still giving you a predictable format for downstream generation.
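The five steps above can be sketched as one function. This is a minimal illustration under assumptions: run_pipeline and fake_search are hypothetical names, the search function is injected so the example runs without an API key, and the 300-character snippet limit is an arbitrary choice.

```python
import json
from typing import Any, Callable, Dict

SearchFn = Callable[[str], Dict[str, Any]]


def run_pipeline(search: SearchFn, query: str, max_snippet_chars: int = 300) -> str:
    # Steps 1 and 2: query the search backend and receive a JSON-style dict.
    response = search(query)

    # Step 3: select only the fields we want.
    selected = [
        {"title": r.get("title"), "url": r.get("url"), "snippet": r.get("content")}
        for r in response.get("results", [])
    ]

    # Step 4: normalize. Drop entries without a URL and trim long snippets.
    cleaned = []
    for item in selected:
        if not item["url"]:
            continue
        cleaned.append(
            {
                "title": item["title"] or "",
                "url": item["url"],
                "snippet": (item["snippet"] or "")[:max_snippet_chars],
            }
        )

    # Step 5: serialize for the app, database, or LLM prompt.
    return json.dumps({"query": query, "sources": cleaned}, indent=2)


def fake_search(query: str) -> Dict[str, Any]:
    """Stand-in for a real search call; returns hand-written data."""
    return {
        "query": query,
        "results": [
            {"title": "Docs", "url": "https://example.com/docs", "content": "x" * 500},
            {"title": "Broken", "url": None, "content": "no link"},
        ],
    }


print(run_pipeline(fake_search, "tavily json"))
```

With a real client, the injected function could be something like lambda q: client.search(query=q, max_results=3).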

Practical tips

  • Keep only the fields you need. Less noise means easier parsing.
  • Preserve URLs. They’re essential for citations and traceability.
  • Validate output before using it. Especially if an LLM will consume it next.
  • Use summaries sparingly. The raw source data is better for exact citations.
  • Map, don’t guess. Build your own schema from Tavily’s response instead of relying on free-form text.

Quick answer

If you want structured JSON output from Tavily, the short version is:

  • Tavily already returns JSON.
  • Use the SDK or API response directly.
  • Reshape the returned fields into your own schema if you need a custom structure.
  • Validate the result with tools like Pydantic or Zod if you need strict consistency.

FAQ

Does Tavily return JSON by default?

Yes. Tavily’s API responses are structured JSON objects.

Can Tavily output a custom JSON schema automatically?

Search responses come back in Tavily’s own structure, so the safer approach is usually to take that JSON and transform it into your own schema.

What should I do if I only need citations?

Extract the results array and keep the title, url, and content fields you care about.
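As a rough sketch of that extraction, the hypothetical helper below keeps one numbered entry per unique URL. The name extract_citations, the numbering, and the de-duplication are choices made for this example, and the input is hand-written sample data.

```python
from typing import Any, Dict, List


def extract_citations(response: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Build a numbered, de-duplicated citation list from the results array."""
    citations: List[Dict[str, Any]] = []
    seen = set()
    for result in response.get("results", []):
        url = result.get("url")
        if not url or url in seen:
            # Skip entries with no URL and repeated URLs.
            continue
        seen.add(url)
        citations.append(
            {
                "n": len(citations) + 1,
                # Fall back to the URL when a title is missing.
                "title": result.get("title") or url,
                "url": url,
            }
        )
    return citations


# Hand-written sample standing in for a real response.
sample = {
    "results": [
        {"title": "Guide", "url": "https://example.com/guide", "content": "..."},
        {"title": "Guide (copy)", "url": "https://example.com/guide", "content": "..."},
        {"url": "https://example.com/untitled", "content": "..."},
    ]
}

for c in extract_citations(sample):
    print(f"[{c['n']}] {c['title']}: {c['url']}")
```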

How do I make the response easier for an LLM to use?

Request only the data you need, keep just the relevant fields, validate the structure, and pass a simplified JSON object into your prompt or tool chain.