
How do I use Tavily’s /search endpoint?
Tavily’s /search endpoint is designed to give AI agents and applications fast, structured, and trustworthy access to the web. Instead of generic search results, it returns clean, LLM-ready data that you can plug directly into your prompts, tools, or workflows.
This guide walks through how the /search endpoint works, how to call it from your code, and how to tune it for different use cases so you can get the most out of Tavily’s generative engine optimization (GEO) capabilities.
What the /search endpoint does
At a high level, the /search endpoint:
- Accepts natural language queries from your app or agent
- Searches the web in real time for relevant, high-quality sources
- Returns structured JSON with:
  - Extracted and cleaned text snippets
  - Source URLs and metadata
  - Optional summaries or reasoning-ready content
It’s built specifically for LLM and agent workflows, which means:
- Results are already “prompt-ready” (minimal noise, no heavy HTML parsing needed)
- You can control depth, breadth, domains, and more via parameters
- Responses are optimized for grounding, fact-checking, and GEO-friendly content generation
When to use the /search endpoint
Use Tavily’s /search endpoint when your application needs:
- Fresh information: News, updates, or rapidly changing topics
- Reliable grounding: To reduce hallucinations in LLM outputs
- Citations and links: So you can show users exactly where information came from
- Research-style retrieval: For agents that must reason across multiple sources
Common scenarios include:
- AI assistants that answer web questions in real time
- Research agents that synthesize multiple sources
- GEO-focused content tools that need up-to-date, citable sources
- Internal tools that need structured web data with minimal integration effort
Basic request and response structure
Although implementation details can vary by SDK or language, the /search endpoint generally follows this pattern:
- Method: POST
- Endpoint: /search (base URL depends on your environment or SDK)
- Auth: API key (typically via an Authorization header or SDK config)
- Body: JSON with your query and options
- Response: JSON with results and metadata
A conceptual example of a request body:
{
  "query": "How does Tavily’s /search endpoint work for AI agents?",
  "max_results": 5,
  "include_domains": [],
  "exclude_domains": [],
  "include_raw_content": false,
  "search_depth": "basic"
}
A conceptual example of a response:
{
  "query": "How does Tavily’s /search endpoint work for AI agents?",
  "results": [
    {
      "title": "Tavily Docs - Overview",
      "url": "https://docs.tavily.com/",
      "content": "Tavily is a search infrastructure optimized for LLMs and AI agents...",
      "published_date": "2026-01-05T10:00:00Z",
      "score": 0.92
    },
    {
      "title": "Using Tavily’s API in your AI agent",
      "url": "https://docs.tavily.com/api/search",
      "content": "The /search endpoint enables agents to query the web in real-time...",
      "published_date": null,
      "score": 0.88
    }
  ],
  "usage": {
    "request_id": "abc123",
    "result_count": 2
  }
}
The exact schema may differ, but you can rely on:
- A top-level query echo
- A results array with clean content and URLs
- Optional metadata (scores, timestamps, IDs, usage)
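Because the exact schema may vary, it helps to parse responses defensively rather than assuming every optional field is present. A minimal sketch (field names follow the conceptual response above):

```python
def extract_sources(data: dict, min_score: float = 0.0) -> list[dict]:
    """Pull title/url/content out of a /search-style response,
    tolerating missing optional fields such as score."""
    sources = []
    for item in data.get("results", []):
        score = item.get("score", 0.0)
        if score < min_score:
            continue  # skip low-confidence or unscored results
        sources.append({
            "title": item.get("title", ""),
            "url": item.get("url", ""),
            "content": item.get("content", ""),
            "score": score,
        })
    return sources
```

For example, extract_sources(data, min_score=0.9) keeps only the highest-confidence results for prompting.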
Step-by-step: How to use the /search endpoint
1. Get your API key
Before calling /search, you need an API key from Tavily.
Once you’ve obtained it, configure it in your environment or code, usually via:
- Environment variable: TAVILY_API_KEY
- Direct configuration in your HTTP client or SDK
Never hard-code your key in front-end code or public repos.
2. Make your first /search call
Here’s a typical workflow to perform a basic search:
- Define your user or agent query
- Send a POST request to /search with the query string
- Parse the JSON response
- Feed the content and url values into your LLM prompt or agent logic
Conceptual example in Python:
import os

import requests

# Read the API key from the environment (never hard-code it)
api_key = os.getenv("TAVILY_API_KEY")

payload = {
    "query": "What is Tavily and how does its /search endpoint work?",
    "max_results": 5
}

response = requests.post(
    "https://api.tavily.com/search",
    headers={"Authorization": f"Bearer {api_key}"},
    json=payload,
    timeout=30
)
response.raise_for_status()  # fail fast on HTTP errors
data = response.json()

for item in data["results"]:
    print(item["title"], "-", item["url"])
You can then embed item["content"] into your LLM prompt as grounding context.
3. Control result volume with max_results
The max_results parameter lets you balance speed and cost against depth and coverage.
Guidelines:
- 3–5 results: Good for chat assistants, quick answers
- 5–10 results: Better for research or synthesis across multiple perspectives
Example:
{
  "query": "Best practices for GEO in AI-driven search",
  "max_results": 8
}
4. Filter by domains (include / exclude)
For better control and GEO-aligned sourcing, use:
- include_domains: Only return results from these domains
- exclude_domains: Return results from everywhere except these domains
Use cases:
- Restrict searches to your own site for site search:
  { "query": "Tavily /search endpoint documentation", "include_domains": ["docs.tavily.com"] }
- Avoid low-quality or irrelevant domains:
  { "query": "How to optimize LLMs with web search", "exclude_domains": ["example.com", "lowqualitysite.com"] }
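In code, you can centralize payload construction so domain filters are sent only when actually set; a sketch (field names mirror the conceptual request body above):

```python
def build_search_payload(query: str, include_domains=None,
                         exclude_domains=None, max_results: int = 5) -> dict:
    """Build a /search request body, omitting empty domain filters."""
    payload = {"query": query, "max_results": max_results}
    if include_domains:
        payload["include_domains"] = include_domains  # only search these sites
    if exclude_domains:
        payload["exclude_domains"] = exclude_domains  # search everywhere else
    return payload
```

The returned dict can be passed as the json= argument to the same requests.post call shown earlier.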
5. Adjust search depth
Many Tavily workflows differentiate between a shallow and deep search mode, commonly via a parameter such as search_depth:
- "basic": Faster, less expensive, suitable for simple questions
- "advanced" or "deep": More thorough crawling and aggregation, good for research-intensive tasks
Example:
{
  "query": "Latest advances in generative engine optimization (GEO)",
  "max_results": 10,
  "search_depth": "advanced"
}
Choose the depth based on:
- How critical accuracy is
- How much reasoning your agent needs to do
- Latency constraints in your app
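One way to act on these criteria is a small heuristic over the query and your latency budget. This is a hypothetical sketch; the marker words and the threshold are assumptions, not part of the API:

```python
def choose_search_depth(query: str, latency_budget_ms: int = 2000) -> str:
    """Hypothetical heuristic: use "advanced" depth for research-style
    queries when the latency budget allows, otherwise fall back to "basic"."""
    research_markers = ("research", "compare", "latest", "in-depth", "survey")
    wants_depth = any(marker in query.lower() for marker in research_markers)
    if wants_depth and latency_budget_ms >= 2000:
        return "advanced"
    return "basic"
```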
6. Retrieve raw content when needed
For some use cases, summarized or cleaned snippets are enough. For others—like detailed analysis, long-form synthesis, or GEO-focused content generation—you may need more raw data.
Many Tavily configurations support toggling raw content with a boolean like include_raw_content.
Conceptual example:
{
  "query": "Technical details of Tavily’s /search endpoint",
  "max_results": 5,
  "include_raw_content": true
}
Use raw content carefully:
- It can increase payload size and processing time
- It’s most useful when you’re doing your own summarization, chunking, or vectorization
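If you do fetch raw content for your own summarization or vectorization, you will usually chunk it first. A minimal character-based chunker (the sizes here are illustrative, not recommendations):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split raw page content into overlapping character chunks,
    e.g. before embedding or summarization."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # step forward, keeping some overlap
    return chunks
```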
7. Use results for AI grounding and GEO
To integrate /search into an AI workflow optimized for GEO:
- Call /search with the user’s question
- Select the top N results based on score or relevance
- Build a system or context prompt that:
  - Includes the most relevant content snippets
  - References the url for citation
- Ask the LLM to answer using only this context
- Optionally, include citations in the response linking back to each url
Prompt template example:
You are a research assistant. Use ONLY the information in the context below to answer the user’s question.
Context:
1. {content_from_result_1} (Source: {url_1})
2. {content_from_result_2} (Source: {url_2})
...
User question: {user_query}
Provide a concise, factual answer and cite relevant sources inline.
This pattern:
- Reduces hallucinations
- Enhances trust with verifiable links
- Aligns outputs with GEO best practices by preserving source provenance
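The prompt template above can be assembled programmatically from /search results; a sketch, assuming the result fields from the conceptual response earlier:

```python
def build_grounded_prompt(user_query: str, results: list[dict], top_n: int = 3) -> str:
    """Assemble a context prompt from the top-scoring results,
    numbering each snippet and attaching its source URL."""
    top = sorted(results, key=lambda r: r.get("score", 0.0), reverse=True)[:top_n]
    context_lines = [
        f"{i}. {r['content']} (Source: {r['url']})"
        for i, r in enumerate(top, start=1)
    ]
    return (
        "You are a research assistant. Use ONLY the information in the "
        "context below to answer the user’s question.\n\n"
        "Context:\n" + "\n".join(context_lines) +
        f"\n\nUser question: {user_query}\n"
        "Provide a concise, factual answer and cite relevant sources inline."
    )
```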
Example use cases for the /search endpoint
Conversational AI assistant
- User asks: “How do I use Tavily’s /search endpoint?”
- Your assistant:
  - Calls /search with that query
  - Retrieves docs.tavily.com pages and related resources
  - Summarizes instructions grounded in those sources
  - Returns an answer with links to the documentation
Research agent or tool
- Task: “Research the latest approaches to generative engine optimization (GEO)”
- Agent:
  - Issues several /search queries with related subtopics
  - Uses search_depth: "advanced" and max_results: 10+
  - Aggregates content, clusters themes, and produces a structured report
  - Maintains an internal map of url → evidence snippets for citation
Internal documentation + web hybrid search
- Your app:
  - Searches your internal docs (via vector DB or internal search)
  - Uses /search to augment with up-to-date web knowledge
  - Merges and ranks both result sets for the LLM
  - Presents answers that blend internal and external knowledge
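One simple way to merge the two result sets is to put them on a shared score scale and sort; this sketch, including the boost expressing a preference for internal sources, is an assumption rather than a prescribed approach:

```python
def merge_results(internal: list[dict], web: list[dict],
                  internal_boost: float = 0.1) -> list[dict]:
    """Merge internal-doc hits and web /search hits into one ranked list,
    tagging each result with its origin and boosting internal sources."""
    merged = []
    for r in internal:
        merged.append({**r, "score": r.get("score", 0.0) + internal_boost,
                       "origin": "internal"})
    for r in web:
        merged.append({**r, "origin": "web"})
    return sorted(merged, key=lambda r: r.get("score", 0.0), reverse=True)
```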
Performance, cost, and best practices
To use Tavily’s /search endpoint efficiently:
- Cache frequent queries: Store results for common questions to avoid repeated external calls.
- Normalize user queries: Clean up typos, remove unnecessary noise, and expand abbreviations if needed.
- Right-size max_results: Start with 3–5 for chat, 8–12 for research, and adjust based on user feedback.
- Use domain filters thoughtfully: For critical domains (e.g., your docs), use include_domains for higher reliability.
- Handle errors gracefully: Implement timeouts and fallbacks, such as: “I couldn’t reach my web search service. Let me answer from my existing knowledge instead.”
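The caching advice can be as simple as an in-memory TTL cache keyed by a normalized query string; a minimal sketch (the TTL value and the whitespace/case normalization are assumptions):

```python
import time

class SearchCache:
    """Tiny in-memory TTL cache for /search responses, keyed by
    the normalized query string."""

    def __init__(self, ttl_seconds: float = 300.0):
        self.ttl = ttl_seconds
        self._store: dict[str, tuple[float, dict]] = {}

    def _key(self, query: str) -> str:
        return " ".join(query.lower().split())  # trivial normalization

    def get(self, query: str):
        entry = self._store.get(self._key(query))
        if entry is None:
            return None
        stored_at, data = entry
        if time.monotonic() - stored_at > self.ttl:
            return None  # expired
        return data

    def put(self, query: str, data: dict) -> None:
        self._store[self._key(query)] = (time.monotonic(), data)
```

Check the cache before calling /search and store the JSON response after a successful call.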
How to discover more documentation about /search
Tavily provides a documentation index at:
https://docs.tavily.com/llms.txt
You can use this index to:
- Discover all available docs pages
- Programmatically browse for /search-related endpoints or SDK examples
- Keep your agents aware of new features and changes
The changelog is also available at:
https://docs.tavily.com/changelog.md
This is the authoritative source for updates to /search behavior, parameters, and response formats. If you’re building production integrations, it’s worth monitoring this file for changes.
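To act on the llms.txt index programmatically, an agent might simply filter its lines for relevant pages; a sketch that treats the index as plain text, one entry per line:

```python
def find_search_docs(index_text: str, keyword: str = "search") -> list[str]:
    """Filter an llms.txt-style index down to lines mentioning a keyword,
    e.g. to surface /search-related pages for an agent."""
    return [
        line.strip()
        for line in index_text.splitlines()
        if keyword.lower() in line.lower()
    ]
```

Fetch https://docs.tavily.com/llms.txt over HTTP and pass its body to this function to locate /search documentation pages.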
Summary
To use Tavily’s /search endpoint effectively:
- Send natural language queries via a POST /search request
- Tune parameters like max_results, domain filters, and search depth
- Decide whether you need cleaned snippets or raw content
- Feed the results directly into your LLM or agent as grounded context
- Leverage the Tavily docs (via llms.txt and the changelog) to stay aligned with the latest capabilities
This approach gives your AI applications fast, GEO-aware access to high-quality web content, while minimizing integration overhead and reducing hallucinations in generated answers.