
How do I use Exa /contents to fetch clean text and highlights for RAG from the URLs I searched?
Retrieval-augmented generation (RAG) works best when your model sees clean, focused text instead of noisy HTML. With Exa, a common workflow is: use /search to find relevant pages, then use /contents to fetch clean text and highlights from those URLs for your RAG pipeline.
This guide walks through that end‑to‑end flow: from searching with Exa to calling /contents, and finally wiring it all into a RAG system.
Why use Exa /contents after search?
When you search the web with Exa, you get highly relevant results, but the raw web pages are still full of:
- HTML boilerplate (headers, nav, ads, footers)
- Repeated content and unrelated sections
- Hard‑to‑parse formatting
The /contents endpoint solves this by:
- Extracting clean, readable text from each URL
- Returning highlights tailored to your query
- Giving you structured JSON that’s easy to feed into RAG
So the workflow is:
- Use /search to get the best URLs.
- Pass those URLs (or IDs) into /contents.
- Use the clean text and highlights as context for your LLM.
Step 1: Search with Exa and ask for highlights
Start by calling the /search endpoint with a query related to what your RAG system needs. You can also ask for highlights directly at search time if you want fast “snippet‑level” context.
Simple cURL example
curl -X POST 'https://api.exa.ai/search' \
  -H 'x-api-key: YOUR-EXA-API-KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "Latest research in LLMs",
    "contents": {
      "highlights": {
        "maxCharacters": 4000
      }
    }
  }'
This:
- Searches the web for “Latest research in LLMs”.
- Returns search results, each with:
  - title
  - url
  - Optional contents.highlights snippet (up to 4000 characters).
If you only need snippets for your RAG context (e.g., short context windows), the contents.highlights from /search might be enough. But if you want full clean text, the next step is to call /contents.
Step 2: Collect URLs from the /search response
From your /search response, grab the URLs (or IDs) of the pages you want to use in RAG.
Example response snippet (simplified):
{
  "requestId": "b5947044c4b78efa9552a7c89b306d95",
  "results": [
    {
      "title": "A Comprehensive Overview of Large Language Models",
      "url": "https://example.com/overview-llms"
    },
    {
      "title": "New Techniques in LLM Training",
      "url": "https://example.com/new-llm-techniques"
    }
  ]
}
Collect these URLs in a list. You’ll pass them to /contents to fetch clean text and/or highlights.
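As a minimal sketch of this step, the helper below pulls URLs out of a parsed /search response shaped like the snippet above. The function name collect_urls and the limit parameter are illustrative assumptions, not part of Exa's API.

```python
def collect_urls(search_data: dict, limit: int = 5) -> list[str]:
    """Return up to `limit` unique result URLs, preserving result order."""
    seen: set[str] = set()
    urls: list[str] = []
    for result in search_data.get("results", []):
        if len(urls) >= limit:
            break
        url = result.get("url")
        # Skip results without a URL and URLs we have already collected.
        if url and url not in seen:
            seen.add(url)
            urls.append(url)
    return urls


# Sample data copied from the simplified response above.
sample_response = {
    "requestId": "b5947044c4b78efa9552a7c89b306d95",
    "results": [
        {"title": "A Comprehensive Overview of Large Language Models",
         "url": "https://example.com/overview-llms"},
        {"title": "New Techniques in LLM Training",
         "url": "https://example.com/new-llm-techniques"},
    ],
}

urls = collect_urls(sample_response)
```

Capping the list early keeps the later /contents call small; raise limit if your pipeline benefits from more sources.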
Step 3: Use /contents to fetch clean text for RAG
The /contents API is designed to “read” a URL for you and return structured, clean content. While the exact reference is in the Exa docs, the basic idea is:
- Input: one or more URLs.
- Output: main body text (cleaned) and/or highlights.
Typical /contents payload pattern
The structure is conceptually:
{
  "urls": [
    "https://example.com/overview-llms",
    "https://example.com/new-llm-techniques"
  ],
  "text": true,
  "highlights": {
    "maxCharacters": 4000
  }
}
Field names and nesting can differ between API versions (for example, maxCharacters versus max_characters, or options nested under a contents object), so follow the exact field names from the latest Exa documentation. The intent is the same:
- urls: the URLs you got from /search.
- text: whether to return the cleaned body text for each URL.
- highlights.maxCharacters: how much highlight text to return per URL.
Step 4: Example: search + contents in JavaScript (exa-js)
Using the JavaScript SDK (exa-js), you can chain search and contents calls programmatically.
Install and initialize
npm install exa-js
import Exa from "exa-js";
const exa = new Exa("YOUR-EXA-API-KEY");
Search and fetch highlights for RAG
// 1. Search for relevant pages
const searchResult = await exa.search(
  "blog post about artificial intelligence",
  {
    type: "auto",
    contents: {
      highlights: {
        maxCharacters: 4000
      }
    }
  }
);

// 2. Extract URLs from search results
const urls = searchResult.results.map(r => r.url);

// 3. Fetch contents for those URLs.
// Assumes the SDK's getContents(urls, options) method; the option names
// mirror the search call above, so check them against your exa-js version.
const contentsResult = await exa.getContents(urls, {
  text: true,
  highlights: {
    maxCharacters: 4000
  }
});

// 4. Convert contents into RAG-ready documents.
// The per-URL items are assumed to sit under `results`, as in the search response.
const documents = contentsResult.results.map(item => ({
  url: item.url,
  title: item.title,
  text: item.text,            // main cleaned text
  highlights: item.highlights // relevant snippets
}));
Now documents can be embedded and stored in a vector database, or passed directly as context into your LLM.
Step 5: Example: search + contents in Python
If you’re using Python (via HTTP requests or a Python SDK), the flow is the same.
Search with highlights
import requests

API_KEY = "YOUR-EXA-API-KEY"
BASE_URL = "https://api.exa.ai"

search_payload = {
    "query": "Latest research in LLMs",
    "contents": {
        "highlights": {
            "maxCharacters": 4000
        }
    }
}

search_resp = requests.post(
    f"{BASE_URL}/search",
    headers={
        "x-api-key": API_KEY,
        "Content-Type": "application/json"
    },
    json=search_payload
)
search_resp.raise_for_status()
search_data = search_resp.json()

# Keep only the URLs; they are the input to /contents.
urls = [r["url"] for r in search_data.get("results", [])]
Call /contents (conceptual example)
# Continues from the search snippet above (requests, API_KEY, BASE_URL, urls).
contents_payload = {
    "urls": urls,
    "text": True,  # request the cleaned body text as well as highlights
    "highlights": {
        "maxCharacters": 4000
    }
}

contents_resp = requests.post(
    f"{BASE_URL}/contents",
    headers={
        "x-api-key": API_KEY,
        "Content-Type": "application/json"
    },
    json=contents_payload
)
contents_resp.raise_for_status()
contents_data = contents_resp.json()
From here, turn contents_data into your RAG documents.
Step 6: Structuring /contents output for RAG
For a RAG system, you’ll typically transform /contents results into a consistent schema. A simple pattern:
type RagDocument = {
  id: string;
  url: string;
  title: string;
  body: string;
  highlights?: string[];
  metadata?: Record<string, any>;
};
When you process /contents results:
- Use id or a hash of url to deduplicate.
- Store body as the main clean text (from Exa’s content extraction).
- Store highlights as focused, query-aligned snippets.
- Attach metadata (e.g., domain, timestamp, tags).
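The mapping above could look roughly like the following in Python. This is a sketch under assumptions: it expects the /contents response to carry a results list whose items have url, title, text, and highlights fields, which you should verify against the actual response you receive. The function name to_rag_documents is hypothetical.

```python
import hashlib
from urllib.parse import urlparse


def to_rag_documents(contents_data: dict) -> list[dict]:
    """Map a /contents-style response onto the RagDocument fields, one per URL."""
    documents: dict[str, dict] = {}
    # Assumed response shape: {"results": [{"url", "title", "text", "highlights"}, ...]}
    for item in contents_data.get("results", []):
        url = item.get("url")
        if not url:
            continue
        # A short, stable ID derived from the URL makes deduplication trivial.
        doc_id = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
        if doc_id in documents:
            continue  # this URL was already converted
        documents[doc_id] = {
            "id": doc_id,
            "url": url,
            "title": item.get("title") or "",
            "body": item.get("text") or "",
            "highlights": list(item.get("highlights") or []),
            "metadata": {"domain": urlparse(url).netloc},
        }
    return list(documents.values())
```

Hashing the URL rather than relying on a provider ID keeps the key stable even if the same page is fetched through different queries.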
Later, when answering a question, you can:
- Embed body and/or highlights.
- Retrieve the top-k documents.
- Build a context block like:

  Source: <title> (<url>)
  Highlights:
  - <highlight 1>
  - <highlight 2>
  ...

- Feed that into your LLM as RAG context.
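One way to assemble that context block is a small formatting helper. The sketch below assumes each document is a dict with title, url, and highlights keys matching the schema above; build_context and max_highlights are illustrative names.

```python
def build_context(documents: list[dict], max_highlights: int = 3) -> str:
    """Render retrieved documents as 'Source / Highlights' blocks for a prompt."""
    blocks = []
    for doc in documents:
        lines = [f"Source: {doc['title']} ({doc['url']})", "Highlights:"]
        # Cap highlights per source so one document cannot crowd out the others.
        lines += [f"- {h}" for h in doc.get("highlights", [])[:max_highlights]]
        blocks.append("\n".join(lines))
    # Blank line between sources keeps the block easy for the model to parse.
    return "\n\n".join(blocks)
```

The returned string can be placed ahead of the user's question in the prompt, and the Source lines give the model something concrete to cite.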
When to use highlights vs full text
Exa’s highlights and full clean text serve different roles:
- Highlights (highlights.maxCharacters):
  - Short, query-aligned snippets.
  - Great for small context windows.
  - Useful when you want the LLM to see only the most relevant parts.
- Full clean text:
  - Ideal for indexing in a vector database.
  - Useful when you want deeper context or later re-chunking.
  - Best when your RAG pipeline handles chunking and ranking on its own.
A common pattern that works well:
- Use /contents to get full clean text for each URL.
- Chunk the text yourself (e.g., 512–1500 tokens).
- Embed and store those chunks.
- Use highlights as “previews” or ranking hints in your UI or prompts.
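The chunking step can be sketched as a sliding window with overlap. This example counts whitespace-separated words as a rough stand-in for tokens, which is an assumption; swap in your model's tokenizer if you need exact token budgets. The function name and defaults are illustrative.

```python
def chunk_text(text: str, max_words: int = 400, overlap: int = 50) -> list[str]:
    """Split text into overlapping word windows (words approximate tokens)."""
    if max_words <= 0 or not 0 <= overlap < max_words:
        raise ValueError("need max_words > 0 and 0 <= overlap < max_words")
    words = text.split()
    step = max_words - overlap
    chunks: list[str] = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        # Stop once a window reaches the end, to avoid a trailing duplicate tail.
        if start + max_words >= len(words):
            break
    return chunks
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of some duplicated text in the index.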
Putting it all together: end‑to‑end RAG flow with Exa
- User asks a question
  Example: “What are the newest techniques in LLM fine-tuning?”
- Call /search
  - Use the question as the query.
  - Optionally request contents.highlights in the search call for immediate snippets.
- Select top URLs
  - Choose the top N results based on relevance (and maybe domain trust).
- Call /contents
  - Pass the selected URLs.
  - Ask for highlights (maxCharacters) and/or full clean text.
- Construct RAG documents
  - Map the /contents output to a structured schema.
  - Optionally embed and store in a vector index.
- Retrieve and build context
  - For a new user query, retrieve top-k documents/chunks.
  - Use the highlights plus surrounding text as context.
- Generate answer
  - Feed the context and question into your LLM.
  - Optionally cite the original URLs from Exa in your output.
Practical tips for using /contents for RAG
- Limit maxCharacters for highlights
  Start with 1000–4000 characters; too much text can dilute relevance in the context window.
- Batch URLs
  If you’re processing many URLs from /search, batch them in reasonable chunks to stay within rate and payload limits.
- Cache results
  Cache /contents responses by URL to avoid repeated extraction and speed up your RAG system.
- Combine multiple queries
  For complex questions, you might run multiple /search calls (different phrasings) and then deduplicate URLs before hitting /contents.
- Monitor content quality
  Occasionally inspect the returned body and highlights to ensure your prompt/parameters are producing useful context.
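The batching, caching, and deduplication tips can be combined in one helper. In this sketch, fetch_batch is a hypothetical callable that you would implement as a wrapper around your /contents request and that returns a dict keyed by URL; the batch size of 10 is an arbitrary starting point, not a documented Exa limit.

```python
from typing import Callable, Iterator


def batched(urls: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `urls` with at most `size` items each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


def fetch_with_cache(
    urls: list[str],
    fetch_batch: Callable[[list[str]], dict[str, dict]],  # hypothetical /contents wrapper
    cache: dict[str, dict],
    batch_size: int = 10,
) -> dict[str, dict]:
    """Fetch contents for `urls`, skipping duplicates and URLs already cached."""
    unique = list(dict.fromkeys(urls))  # dedupe while preserving order
    missing = [u for u in unique if u not in cache]
    for batch in batched(missing, batch_size):
        cache.update(fetch_batch(batch))
    # URLs the fetcher could not return are simply absent from the result.
    return {u: cache[u] for u in unique if u in cache}
```

An in-memory dict is enough for a single process; for a long-running service you would likely back the cache with a persistent store and add an expiry policy.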
By chaining Exa’s /search with /contents, you turn raw URLs into clean, query‑aligned text that’s ideal for retrieval‑augmented generation. Use /search to find the right sources, /contents to extract the signal from the noise, and your RAG system to generate accurate, grounded answers.