
Parallel Search API quickstart: example request/response and how to pass an objective
Most teams hit the same wall the first time they wire up Parallel to an agent: “What exactly goes in objective vs search_queries, and what does a real Search API response look like?” This quickstart walks through a concrete request/response and shows how to pass an objective so your agents get dense, relevant context in a single call.
This guide focuses on the Search API specifically, not the broader Task/Extract/FindAll stack. The goal is to get you from “I have an API key” to “my agent is calling web search correctly” in minutes.
What Parallel Search is optimized for
Parallel’s Search API is built for AIs, not human SERP browsing. Instead of HTML-heavy result pages and short snippets, it returns:
- Ranked URLs from Parallel’s AI-native web index
- Token-dense compressed excerpts per result
- Titles, publish dates, and metadata ready for an agent’s context window
Key properties:
- Latency: < 5s, synchronous, tuned for tool calls inside a single agent turn
- Cost: $0.005 per request for 10 results (CPM-style: predictable per-query pricing)
- Rate limits: 2,000 requests/min (suitable for production traffic)
- Output format: structured JSON (ideal for LLM tools / MCP tools)
Because the API is designed around agents as the web’s second user, you don’t need to bolt together search → scrape → parse → re-rank. One call gives you evidence-rich context your model can use directly.
Core request model: objective + search_queries
Parallel Search accepts two complementary inputs:
- objective (required): Natural-language description of what the agent is trying to accomplish.
- search_queries (optional but recommended): One or more keyword-style queries that increase recall and steer coverage.
You’ll get the best results by always sending a clear objective and letting your agent synthesize 2–4 focused keyword variations. Think of it as: objective for intent, search_queries for coverage.
Example objective
"objective": "Find the current CEO of OpenAI and any recent leadership changes"
This tells Parallel what the agent is trying to solve, not just what to search for.
Example search_queries
"search_queries": [
"current CEO of OpenAI",
"OpenAI leadership changes 2026",
"OpenAI executive team update",
"OpenAI CEO transition"
]
These queries are derived from the objective and tuned for recall. An agent tool can easily generate them from a single user question.
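As a sketch, query expansion can be as simple as templating the entity and topic into a few variants. The helper below is purely illustrative; in practice you would usually let the LLM itself generate the variants from the objective.

```typescript
// Illustrative helper: derive keyword-style query variants from the
// pieces of an objective. A real agent would typically have the LLM
// generate these, but templates work for narrow, well-known tasks.
function deriveQueries(entity: string, role: string, year: number): string[] {
  return [
    `current ${role} of ${entity}`,      // direct question form
    `${entity} ${role} changes ${year}`, // recency-anchored
    `${entity} executive team update`,   // broader coverage
    `${entity} ${role} transition`,      // synonym variant
  ];
}

const queries = deriveQueries("OpenAI", "CEO", 2026);
// queries[0] === "current CEO of OpenAI"
```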
Quickstart: full example Search API request
Here’s a complete curl example you can drop into a terminal. It hits the Search API, passes both an objective and search queries, and requests compressed excerpts.
curl https://api.parallel.ai/v1beta/search \
-H "Content-Type: application/json" \
-H "x-api-key: $PARALLEL_API_KEY" \
-H "parallel-beta: search-extract-2025-10-10" \
-d '{
"objective": "Find the current CEO of OpenAI and any recent leadership changes",
"search_queries": [
"OpenAI CEO",
"current CEO of OpenAI",
"OpenAI leadership changes 2026",
"OpenAI executive leadership update"
],
"max_results": 5,
"excerpts": {
"max_chars_per_result": 5000
}
}'
Key fields in this request:
- objective: Natural language, directly usable by your agent. Parallel uses this to optimize ranking and snippet compression.
- search_queries: Array of strings. Keyword-style, but can be multi-word phrases. Improves recall and robustness against ambiguous phrasing.
- max_results: Upper bound on the number of ranked URLs to return. Defaults are typically tuned around 10; you can reduce for latency and post-processing simplicity.
- excerpts.max_chars_per_result: Controls how much dense text you get per URL. Higher values increase context richness but also downstream token usage in your LLM calls. Because Parallel's excerpts are already token-dense, you often need fewer characters than with SERP-style snippets.
Example Search API response (annotated)
The exact JSON will vary by query and time, but this illustrative example (with placeholder names and outlets) shows the shape of the response and how your agent can consume it.
{
"objective": "Find the current CEO of OpenAI and any recent leadership changes",
"search_queries": [
"OpenAI CEO",
"current CEO of OpenAI",
"OpenAI leadership changes 2026",
"OpenAI executive leadership update"
],
"results": [
{
"rank": 1,
"url": "https://www.openai.com/blog/openai-announces-new-ceo",
"title": "OpenAI announces new CEO and leadership changes",
"published_at": "2026-02-18T00:00:00Z",
"excerpts": [
{
"text": "OpenAI today announced that Alex Example has been appointed Chief Executive Officer, succeeding Sam Exampleton. The transition follows a period of board-led review of the company's long-term governance structure...",
"source": "body",
"start_char": 245,
"end_char": 745
}
],
"metadata": {
"site_name": "OpenAI",
"language": "en"
}
},
{
"rank": 2,
"url": "https://www.reputable-news.com/tech/openai-new-ceo-alex-example",
"title": "OpenAI names Alex Example as new CEO after leadership shake-up",
"published_at": "2026-02-19T00:00:00Z",
"excerpts": [
{
"text": "In a major leadership change, OpenAI has appointed Alex Example as its new CEO, replacing Sam Exampleton. The company cited a renewed focus on safety and enterprise partnerships as drivers for the transition...",
"source": "body",
"start_char": 120,
"end_char": 620
}
],
"metadata": {
"site_name": "Reputable News",
"language": "en"
}
}
// Additional results...
]
}
What matters for your agent:
- results[].rank: Already sorted by relevance; you typically consume the top N (e.g., 3–5).
- results[].excerpts[]: Token-dense compressed excerpts, engineered for LLM consumption. Instead of scraping full pages and prompting a model to summarize, you can drop these directly into the context window.
- published_at: Enables recency-aware reasoning ("prefer sources from the last 3 months").
- url + title: Feed directly into your Basis-style provenance (citations per atomic fact).
Because the output is structured, you can programmatically:
- Extract field candidates (e.g., current_ceo_name, previous_ceo_name, change_date)
- Cross-check facts across multiple URLs
- Attach citations and timestamps to each piece of derived data
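For instance, a minimal cross-check might count how many distinct URLs' excerpts support a candidate value. The types below follow the annotated response above, and the matching is naive substring search, purely for illustration.

```typescript
// Naive cross-checking sketch: which sources' excerpts mention a
// candidate fact? Multi-source facts can be preferred downstream.
interface ResultLite {
  url: string;
  excerpts: { text: string }[];
}

function supportingSources(results: ResultLite[], candidate: string): string[] {
  return results
    .filter(r => r.excerpts.some(e => e.text.includes(candidate)))
    .map(r => r.url);
}
```

Run against the example response above, supportingSources(results, "Alex Example") would return both URLs, giving the agent two independent citations for the same fact.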
How to pass an objective effectively
The objective field is the single highest-leverage part of the request. When you design it for agents instead of humans, you get more reliable, evidence-based outputs.
Good vs weak objectives
Weak objective (too vague):
"objective": "OpenAI CEO"
- No task semantics
- Easy to misinterpret, doesn’t encode what to do with the information
Better objective (task + constraints):
"objective": "Identify the current CEO of OpenAI and summarize any leadership changes in the last 12 months, emphasizing official announcements and reputable news sources."
Why this works better:
- Encodes the task: identify + summarize
- Encodes a time horizon: last 12 months
- Encodes source preferences: official + reputable news
Parallel doesn’t blindly “follow instructions” like a model, but this guidance helps its AI-native index and ranking focus on pages likely to satisfy the agent’s downstream needs.
Pattern: derive queries from the objective
A simple pattern you can implement in your agent tool:
- Take the user’s question.
- Expand it to an internal objective that encodes task + constraints.
- Derive 2–4 keyword-style queries from that objective.
- Call Parallel Search with both.
For example, for a user question like:
“Who is running OpenAI now, and what changed in the leadership recently?”
Your agent can internally formulate:
"objective": "Determine who is currently serving as CEO of OpenAI and summarize any recent leadership or executive changes, including dates and roles.",
"search_queries": [
"current CEO of OpenAI",
"OpenAI leadership changes 2026",
"OpenAI executive reorganization",
"OpenAI board and management changes"
]
This keeps the user-facing interface simple while giving the retrieval layer enough structure to maximize recall and precision.
Minimal agent integration (pseudo-code)
Here’s a stripped-down example of how you might wire this into an agent tool in TypeScript/JavaScript.
// Node 18+ ships a global fetch; the import is only needed on older runtimes.
import fetch from "node-fetch";
const PARALLEL_API_KEY = process.env.PARALLEL_API_KEY!;
interface SearchResult {
rank: number;
url: string;
title: string;
published_at?: string;
excerpts: { text: string }[];
}
async function parallelSearch(objective: string, queries: string[]): Promise<SearchResult[]> {
const res = await fetch("https://api.parallel.ai/v1beta/search", {
method: "POST",
headers: {
"Content-Type": "application/json",
"x-api-key": PARALLEL_API_KEY,
"parallel-beta": "search-extract-2025-10-10"
},
body: JSON.stringify({
objective,
search_queries: queries,
max_results: 10,
excerpts: {
max_chars_per_result: 4000
}
})
});
if (!res.ok) {
throw new Error(`Parallel Search error: ${res.status} ${res.statusText}`);
}
const data = (await res.json()) as { results: SearchResult[] };
return data.results;
}
You’d then expose parallelSearch to your LLM as a tool, and on each call:
- Generate an internal objective from the conversation state.
- Create 2–4 focused search_queries.
- Call the Search API and pass results[].excerpts into your reasoning step.
- Use url, title, and published_at as Basis-style evidence attached to each fact.
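The exact tool-registration format depends on your LLM provider; the sketch below uses the common JSON-Schema-style parameters convention. The name, descriptions, and envelope are illustrative, so adapt them to your provider's API.

```typescript
// Illustrative tool definition for registering parallelSearch with an
// LLM. Most providers accept some variant of this JSON-Schema shape;
// adjust the envelope to match your provider.
const searchTool = {
  name: "parallel_search",
  description:
    "Search the web via Parallel. Returns ranked URLs with token-dense excerpts.",
  parameters: {
    type: "object",
    properties: {
      objective: {
        type: "string",
        description: "What the agent is trying to accomplish, in natural language",
      },
      search_queries: {
        type: "array",
        items: { type: "string" },
        description: "2-4 keyword-style query variants derived from the objective",
      },
    },
    required: ["objective"],
  },
};
```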
Tuning for cost, latency, and accuracy
From an engineering perspective, you control three main dials when using the Search API:
- max_results: Higher values increase breadth but introduce more post-processing and downstream tokens. Common patterns:
  - Agents: 3–10 results per call
  - Batch enrichment: 5–20 results with tighter filtering
- Excerpt size (max_chars_per_result): Parallel's excerpts are already token-optimized; over-allocating characters can just inflate your model's context costs. Practical ranges:
  - "Quick answer" tools: 1,500–3,000 chars
  - Deep research / enrichment: 3,000–6,000 chars
- Query design (objective + search_queries): Better query design improves accuracy without increasing cost. In Parallel's internal benchmarks (e.g., BrowseComp-style evaluations with tool-limited agents and judge models like GPT-4-class), query quality often moves accuracy more than adding extra results.
Because pricing is per-request (e.g., $0.005 for 10 results), these dials let you reason about cost up front. For example, 100k Search calls at $0.005 each is a known $500 line item, independent of how large your downstream prompts grow.
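That predictability is easy to encode. The one-liner below assumes the flat $0.005-per-request figure quoted in this guide; confirm current published pricing before budgeting.

```typescript
// Back-of-envelope cost model for flat per-request pricing. Assumes the
// $0.005/request figure quoted in this guide; verify against current
// published pricing before relying on it.
function searchCostUSD(requests: number, pricePerRequest = 0.005): number {
  return requests * pricePerRequest;
}
```

100,000 calls comes out to $500 regardless of how many tokens your downstream prompts consume.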
Benchmark framing and methodology notes
Parallel publishes benchmark tables and methodology for retrieval quality (e.g., HLE-style, BrowseComp-style, DeepResearch, WISER-Atomic/FindAll). The common evaluation pattern is:
- Tool-limited agents: Only allowed to use Search (no speculative browsing tools).
- Ground-truth datasets: Questions with known atomic facts and citation expectations.
- Judge models: Strong general models used strictly as evaluators.
- Metrics: Accuracy/recall at the fact level, with citations required per fact.
When you design your own tests for the Search API:
- Treat field-level evidence as the unit of evaluation, not just "answer looks correct."
- Require each fact your agent emits to be backed by at least one url + excerpt span from the Search output.
- Track both accuracy and failure modes (missing facts vs hallucinated facts) so you can tune objective, search_queries, and result/excerpt limits accordingly.
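A grounding audit along these lines can be sketched as follows. The Fact and Citation shapes are hypothetical, not an official schema; the point is the per-fact check, not the exact types.

```typescript
// Hypothetical fact-level audit: every emitted fact must carry at least
// one citation with both a URL and a non-empty excerpt span.
interface Citation { url: string; excerpt: string }
interface Fact { claim: string; citations: Citation[] }

function isGrounded(fact: Fact): boolean {
  return (
    fact.citations.length > 0 &&
    fact.citations.every(c => c.url.length > 0 && c.excerpt.length > 0)
  );
}

// Split an agent's output into grounded facts and failure cases to track.
function auditFacts(facts: Fact[]): { grounded: Fact[]; ungrounded: Fact[] } {
  return {
    grounded: facts.filter(isGrounded),
    ungrounded: facts.filter(f => !isGrounded(f)),
  };
}
```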
This aligns directly with Parallel’s Basis philosophy: answers aren’t the product—verifiable, cited facts are.
Putting it all together
If you want a minimal, production-ready pattern for the Parallel Search API:

- Always pass a rich objective. Encode task, constraints, and source preferences in natural language.
- Derive 2–4 search_queries per call. Use your agent to generate keyword-style variants that capture entities, synonyms, and time ranges.
- Tune max_results and max_chars_per_result. Start with 5–10 results and 3,000–4,000 chars per result for most agent tools; adjust based on empirical accuracy vs cost.
- Treat excerpts as evidence. Use url, title, published_at, and excerpts.text as the basis for every fact, and surface citations to your end users or downstream systems.
Once this pattern is in place, you’ve effectively collapsed a brittle search → crawl → scrape → summarize pipeline into a single, predictable request that your agents can call in under five seconds.