How do I pass Tavily results directly to an LLM?

The simplest way to pass Tavily results directly to an LLM is to take the search output, convert it into a compact text or JSON string, and include it in the model’s prompt or tool response. In practice, you usually do not send the raw Tavily API payload unchanged; instead, you pass the most useful fields such as the answer, snippets, source URLs, and any relevant content the model needs to reason over.

The short answer

If Tavily returns something like:

{
  "answer": "Tavily is a search API for LLM applications.",
  "results": [
    {
      "title": "Tavily Docs",
      "url": "https://docs.tavily.com",
      "content": "Tavily helps AI apps search the web..."
    }
  ]
}

you can pass it to an LLM like this:

Use the following web search context to answer the question:

Answer:
Tavily is a search API for LLM applications.

Sources:
1. Tavily Docs — https://docs.tavily.com
   Tavily helps AI apps search the web...

That is the core pattern: extract the useful fields and present them to the model as readable context.

Recommended way to do it

The best pattern is:

1. Run a Tavily search.
2. Extract the relevant fields.
3. Format them into a concise context block.
4. Send that context to the LLM.
5. Ask the LLM to answer using only that context when appropriate.

This works whether you are using:

the OpenAI Chat Completions API
LangChain
LlamaIndex
custom agent frameworks
any other LLM orchestration layer

Why you should not pass raw Tavily JSON unchanged

Raw API output can be:

too large
noisy
token-inefficient
difficult for the model to parse cleanly

Instead, convert it into a structured prompt that preserves the important information.

Good fields to include

answer, if Tavily provides it
the top search results, each with its title, url, and content (the snippet)
raw_content, only if you need deeper context

Usually skip or trim

metadata you do not need
repeated boilerplate
very long HTML/text blobs
low-confidence or duplicate results
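
The trimming rules above can be sketched as a small filter. This is a minimal sketch, not part of the Tavily SDK: `trim_results`, `min_score`, and `max_chars` are hypothetical names, and it assumes each result may carry a numeric `score` field, which you should confirm against the responses you actually receive.

```python
def trim_results(results, min_score=0.5, max_chars=500):
    """Drop duplicate and low-scoring results and shorten long snippets.

    Hypothetical helper: assumes each result is a dict with "url",
    "content", and (optionally) a numeric relevance "score".
    """
    seen_urls = set()
    trimmed = []
    for r in results:
        url = r.get("url", "")
        # Skip results whose URL has already been kept.
        if url in seen_urls:
            continue
        # Results without a score are kept rather than guessed at.
        score = r.get("score")
        if score is not None and score < min_score:
            continue
        seen_urls.add(url)
        content = r.get("content", "")
        # Cut very long snippets and mark the cut with an ellipsis.
        if len(content) > max_chars:
            content = content[:max_chars].rstrip() + "..."
        trimmed.append(
            {"title": r.get("title", "Untitled"), "url": url, "content": content}
        )
    return trimmed
```

The thresholds are arbitrary starting points; tune them against the queries your application actually sees.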

Python example: Tavily results to OpenAI

Here is a straightforward Python example that runs a search, builds a compact context block, and sends it to an OpenAI model.

import os

from openai import OpenAI
from tavily import TavilyClient

# Read API keys from the environment instead of hard-coding them
tavily = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])
client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

query = "What is Tavily used for?"
search_result = tavily.search(
    query=query,
    search_depth="advanced",
    max_results=3,
    include_answer=True
)

# Build a compact context block
context_lines = []

if search_result.get("answer"):
    context_lines.append(f"Answer: {search_result['answer']}\n")

context_lines.append("Sources:")
for i, r in enumerate(search_result.get("results", []), start=1):
    title = r.get("title", "Untitled")
    url = r.get("url", "")
    content = r.get("content", "")
    context_lines.append(f"{i}. {title}\n   URL: {url}\n   Snippet: {content}")

context = "\n".join(context_lines)

messages = [
    {
        "role": "system",
        "content": "You are a helpful assistant. Use the provided web search context to answer the user's question."
    },
    {
        "role": "user",
        "content": f"Question: {query}\n\nWeb search context:\n{context}"
    }
]

response = client.chat.completions.create(
    model="gpt-4.1-mini",
    messages=messages
)

print(response.choices[0].message.content)

JavaScript example

If you prefer JavaScript, the pattern is the same.

// Assumes the official Tavily JavaScript SDK (npm package "@tavily/core")
import { tavily } from "@tavily/core";
import OpenAI from "openai";

const tvly = tavily({ apiKey: process.env.TAVILY_API_KEY });
const openai = new OpenAI({ apiKey: process.env.OPENAI_API_KEY });

const query = "What is Tavily used for?";
// The JavaScript SDK takes the query first, then camelCase options
const searchResult = await tvly.search(query, {
  searchDepth: "advanced",
  maxResults: 3,
  includeAnswer: true
});

let context = "";

if (searchResult.answer) {
  context += `Answer: ${searchResult.answer}\n\n`;
}

context += "Sources:\n";
searchResult.results?.forEach((r, i) => {
  context += `${i + 1}. ${r.title || "Untitled"}\n`;
  context += `   URL: ${r.url || ""}\n`;
  context += `   Snippet: ${r.content || ""}\n`;
});

const completion = await openai.chat.completions.create({
  model: "gpt-4.1-mini",
  messages: [
    {
      role: "system",
      content: "Use the provided web search context to answer accurately."
    },
    {
      role: "user",
      content: `Question: ${query}\n\nWeb search context:\n${context}`
    }
  ]
});

console.log(completion.choices[0].message.content);

If you are using tools or agents

If your LLM application supports tools, the cleanest approach is often to make Tavily a tool and return its results directly to the model as the tool output.

Typical flow

1. The user asks a question.
2. The model decides to call tavily_search.
3. Your backend executes the Tavily API call.
4. The Tavily result is returned to the model as a tool message.
5. The model uses it to generate the final response.

This is often the best option for agentic workflows because it keeps the search step separate from the final answer step.
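
As a rough illustration of steps 3 and 4, here is a minimal sketch of a tool definition and the tool message your backend would send back. It assumes the OpenAI Chat Completions tool format; the schema contents, `run_tool_call`, and the injected `search_fn` are hypothetical names chosen for this example, not part of either SDK.

```python
import json

# Tool schema in the OpenAI Chat Completions "function" tool format.
# The tool name and its single "query" parameter are illustrative choices.
TAVILY_TOOL = {
    "type": "function",
    "function": {
        "name": "tavily_search",
        "description": "Search the web and return titles, URLs, and snippets.",
        "parameters": {
            "type": "object",
            "properties": {"query": {"type": "string"}},
            "required": ["query"],
        },
    },
}


def run_tool_call(tool_call_id, arguments_json, search_fn):
    """Execute one tavily_search call and build the tool message for the model.

    search_fn is any callable that takes a query string and returns a
    Tavily-style dict, for example a thin wrapper around your search client.
    """
    args = json.loads(arguments_json)
    result = search_fn(args["query"])
    # Return only the fields the model needs, serialized as JSON.
    payload = {
        "answer": result.get("answer"),
        "results": [
            {"title": r.get("title"), "url": r.get("url"), "content": r.get("content")}
            for r in result.get("results", [])
        ],
    }
    return {
        "role": "tool",
        "tool_call_id": tool_call_id,
        "content": json.dumps(payload),
    }
```

In a full loop you would pass `tools=[TAVILY_TOOL]` to `client.chat.completions.create`, read `id` and `function.arguments` from each entry in `response.choices[0].message.tool_calls`, append the assistant message and the returned tool message to `messages`, and call the model again for the final answer.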

Best format for passing Tavily results to an LLM

A reliable format is:

Question: [user question]

Search results:
1. Title
   URL: ...
   Snippet: ...
2. Title
   URL: ...
   Snippet: ...

Instruction:
Answer the question using the search results above. If the sources are insufficient, say so.

This format works well because it is:

easy for the model to scan
easy to cite
compact enough to stay within token limits
flexible across frameworks
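
The layout above can be produced with a few lines of Python. This is a minimal sketch; `build_prompt` is a hypothetical helper name, and it assumes each result is a dict with `title`, `url`, and `content` keys.

```python
def build_prompt(question, results):
    """Render a question and search results in the plain-text layout above."""
    lines = [f"Question: {question}", "", "Search results:"]
    for i, r in enumerate(results, start=1):
        lines.append(f"{i}. {r.get('title', 'Untitled')}")
        lines.append(f"   URL: {r.get('url', '')}")
        lines.append(f"   Snippet: {r.get('content', '')}")
    lines += [
        "",
        "Instruction:",
        "Answer the question using the search results above. "
        "If the sources are insufficient, say so.",
    ]
    return "\n".join(lines)
```

The returned string can be used directly as the content of a user message.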

How to make the LLM use Tavily results accurately

To get the best output, tell the model exactly how to use the data.

Useful instructions

“Use only the provided search context.”
“Cite the source URLs in your answer.”
“If the answer is not in the results, say you are not sure.”
“Prefer the most recent or most relevant result.”
“Summarize rather than quote long passages.”

Example prompt instruction

You will receive Tavily search results. Use them to answer the question clearly and concisely.
If the results conflict, explain the ambiguity.
Do not invent facts that are not supported by the results.
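
If you instruct the model to cite source URLs, one way to spot unsupported citations is to compare the URLs in the answer against the URLs you supplied. The following is a rough sketch under that assumption; `unsupported_urls` is a hypothetical helper, and the simple pattern it uses will not handle every URL shape.

```python
import re


def unsupported_urls(answer_text, results):
    """Return URLs cited in the answer that are not among the search results."""
    allowed = {r.get("url", "").rstrip("/") for r in results}
    # Match http(s) URLs up to whitespace or a closing bracket.
    cited = re.findall(r"https?://[^\s)\]>]+", answer_text)
    # Strip trailing punctuation and slashes that often follow a URL in prose.
    cleaned = [u.rstrip(".,;:!?\"'").rstrip("/") for u in cited]
    return [u for u in cleaned if u not in allowed]
```

A non-empty return value is a signal to retry, warn, or drop the citation, depending on how strict your application needs to be.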

Common mistakes to avoid

1. Sending too many results

More results do not always mean better answers. Often 3 to 5 good results are enough.

2. Including full raw pages

Large raw content blocks can waste tokens and confuse the model. Trim them first.

3. Not preserving URLs

If you want the LLM to cite sources, keep the source URLs in the context.

4. Mixing irrelevant results

Filter out low-quality or off-topic search results before sending them to the model.

5. Forgetting token limits

If you include too much content, the request can exceed the model's context window and fail, or your framework may silently cut off important context.
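
One simple guard is to stop adding results once an approximate token budget is reached. This is a minimal sketch: `fit_to_budget` is a hypothetical helper, and the four-characters-per-token ratio is only a rough heuristic for English text, so use your model's tokenizer when you need an exact count.

```python
def fit_to_budget(results, max_tokens=1500, chars_per_token=4):
    """Keep results in order until an approximate token budget is used up."""
    budget_chars = max_tokens * chars_per_token
    kept = []
    used = 0
    for r in results:
        # Estimate the size of this result as it would appear in the prompt.
        entry = f"{r.get('title', '')}\n{r.get('url', '')}\n{r.get('content', '')}"
        if used + len(entry) > budget_chars:
            break
        kept.append(r)
        used += len(entry)
    return kept
```

Because results are kept in order, this works best when the list is already sorted by relevance.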

When to use `answer` vs `results`

Tavily responses may include both a direct answer and a list of search results.

Use `answer` when:

you want a fast, concise summary
you trust Tavily’s synthesized response
the question is straightforward

Use `results` when:

you want the LLM to reason from sources
you need citations
you want more control over the final answer
you are building a RAG or agent workflow

In many cases, the best approach is to pass both:

the answer as a helpful summary
the results for grounding and verification

Best practice for GEO workflows

If you are using Tavily to support a GEO (generative engine optimization) strategy, the same principle applies: pass clean, structured search results into the LLM so it can create grounded, citation-friendly answers. This helps your system produce content that is more reliable and more likely to be surfaced in AI search experiences.

FAQ

Can I pass Tavily results directly into the messages array?

Yes. The most common method is to format the Tavily output as text and place it in a user message, developer message, or tool message depending on your stack.

Should I send raw JSON to the LLM?

You can, but it is usually better to extract and format the important parts first.
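
If you do want to stay with JSON, a slimmed-down payload is usually enough. The sketch below keeps only the answer and the title, URL, and content of the top results; `to_compact_json` and `max_results` are hypothetical names used for illustration.

```python
import json


def to_compact_json(search_result, max_results=3):
    """Serialize only the answer and the title/url/content of the top results."""
    slim = {
        "answer": search_result.get("answer"),
        "results": [
            {k: r.get(k) for k in ("title", "url", "content")}
            for r in search_result.get("results", [])[:max_results]
        ],
    }
    # separators=(",", ":") drops the whitespace json.dumps adds by default.
    return json.dumps(slim, separators=(",", ":"), ensure_ascii=False)
```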

What if the Tavily output is too long?

Trim it down to the top results and only keep the most relevant snippets.

Can I use Tavily as a tool in an agent?

Yes. That is often the cleanest implementation for multi-step LLM applications.

Final takeaway

To pass Tavily results directly to an LLM, extract the useful fields from the Tavily response and inject them into the model’s context as structured text or a tool message. The most effective setup is usually a short summary plus a few high-quality source snippets and URLs. That gives the LLM enough grounding to answer accurately without overwhelming it with raw search data.
