How do I use Exa /contents to fetch clean text and highlights for RAG from the URLs I searched?

Retrieval-augmented generation (RAG) works best when your model sees clean, focused text instead of noisy HTML. With Exa, a common workflow is: use /search to find relevant pages, then use /contents to fetch clean text and highlights from those URLs for your RAG pipeline.

This guide walks through that end‑to‑end flow: from searching with Exa to calling /contents, and finally wiring it all into a RAG system.


Why use Exa /contents after search?

Exa's search returns relevant pages, but the raw pages themselves are still full of:

  • HTML boilerplate (headers, nav, ads, footers)
  • Repeated content and unrelated sections
  • Hard‑to‑parse formatting

The /contents endpoint solves this by:

  • Extracting clean, readable text from each URL
  • Returning highlights tailored to your query
  • Giving you structured JSON that’s easy to feed into RAG

So the workflow is:

  1. Use /search to get the best URLs.
  2. Pass those URLs (or IDs) into /contents.
  3. Use the clean text and highlights as context for your LLM.

Step 1: Search with Exa and ask for highlights

Start by calling the /search endpoint with a query related to what your RAG system needs. You can also ask for highlights directly at search time if you want fast “snippet‑level” context.

Simple cURL example

curl -X POST 'https://api.exa.ai/search' \
  -H 'x-api-key: YOUR-EXA-API-KEY' \
  -H 'Content-Type: application/json' \
  -d '{
    "query": "Latest research in LLMs",
    "contents": {
      "highlights": {
        "maxCharacters": 4000
      }
    }
  }'

This:

  • Searches the web for “Latest research in LLMs”.
  • Returns search results, each with:
    • title
    • url
    • highlights, if requested through contents.highlights (capped here at 4000 characters).

If you only need snippets for your RAG context (e.g., short context windows), the highlights returned by /search might be enough. But if you want full clean text, the next step is to call /contents.
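As a rough sketch of the snippet-only path, the helper below pulls the title, URL, and highlights out of an already-parsed /search response. It assumes each result carries its highlights as a list of strings under a `highlights` key; `extract_snippets` is a hypothetical name, and the field layout should be checked against the response you actually receive.

```python
from typing import Any


def extract_snippets(search_data: dict[str, Any]) -> list[dict[str, Any]]:
    """Collect title, URL, and highlight snippets from a parsed /search response."""
    snippets = []
    for result in search_data.get("results", []):
        # Assumption: highlights arrive as a list of strings on each result.
        highlights = result.get("highlights") or []
        if not highlights:
            continue  # skip results that came back without snippets
        snippets.append(
            {
                "title": result.get("title") or "",
                "url": result.get("url") or "",
                "highlights": list(highlights),
            }
        )
    return snippets
```

Results without highlights are dropped here because they add nothing to a snippet-only context; keep them if you plan to call /contents for those URLs afterwards.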


Step 2: Collect URLs from the /search response

From your /search response, grab the URLs (or IDs) of the pages you want to use in RAG.

Example response snippet (simplified):

{
  "requestId": "b5947044c4b78efa9552a7c89b306d95",
  "results": [
    {
      "title": "A Comprehensive Overview of Large Language Models",
      "url": "https://example.com/overview-llms"
    },
    {
      "title": "New Techniques in LLM Training",
      "url": "https://example.com/new-llm-techniques"
    }
  ]
}

Collect these URLs in a list. You’ll pass them to /contents to fetch clean text and/or highlights.
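One way to collect the URLs, sketched under the assumption that the response has the `results` shape shown above: keep the original ranking, drop duplicates and missing URLs, and cap the list. `collect_urls` and its `limit` parameter are illustrative names, not part of Exa's API.

```python
def collect_urls(search_data: dict, limit: int = 5) -> list[str]:
    """Return up to `limit` unique URLs from a /search response, in ranked order."""
    seen = set()
    urls = []
    for result in search_data.get("results", []):
        url = result.get("url")
        if not url or url in seen:
            continue  # ignore missing URLs and repeats
        seen.add(url)
        urls.append(url)
    return urls[:limit]
```

The same helper works when you merge results from several searches first, since repeated URLs are filtered out before the cap is applied.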


Step 3: Use /contents to fetch clean text for RAG

The /contents API is designed to “read” a URL for you and return structured, clean content. While the exact reference is in the Exa docs, the basic idea is:

  • Input: one or more URLs.
  • Output: main body text (cleaned) and/or highlights.

Typical /contents payload pattern

The structure is conceptually:

{
  "urls": [
    "https://example.com/overview-llms",
    "https://example.com/new-llm-techniques"
  ],
  "text": true,
  "highlights": {
    "maxCharacters": 4000
  }
}

Field names can vary between API and SDK versions (for example, maxCharacters versus max_characters under contents.highlights), so follow the exact names in the latest Exa documentation. The intent is the same:

  • urls: the URLs you got from /search.
  • text: set to true to return the cleaned body text for each URL.
  • highlights.maxCharacters: how much highlight text to return per URL.
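A small payload builder can keep these fields in one place. This is a sketch, not Exa's reference schema: it places `text` and `highlights` at the top level of the request body, as the later JavaScript and Python examples in this guide do, and `build_contents_payload` is a hypothetical helper name.

```python
from typing import Any, Optional


def build_contents_payload(
    urls: list[str],
    want_text: bool = True,
    highlight_chars: Optional[int] = 4000,
) -> dict[str, Any]:
    """Build a JSON-serializable request body for /contents."""
    if not urls:
        raise ValueError("at least one URL is required")
    payload: dict[str, Any] = {"urls": list(urls)}
    if want_text:
        # Assumed field name for requesting the cleaned body text.
        payload["text"] = True
    if highlight_chars:
        # Field name taken from this guide's examples; verify against the docs.
        payload["highlights"] = {"maxCharacters": highlight_chars}
    return payload
```

Passing `highlight_chars=None` requests text only, which suits pipelines that do their own chunking and ranking.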

Step 4: Example: search + contents in JavaScript (exa-js)

Using the JavaScript SDK (exa-js), you can chain search and contents calls programmatically.

Install and initialize

npm install exa-js

import Exa from "exa-js";

const exa = new Exa("YOUR-EXA-API-KEY");

Search and fetch highlights for RAG

// 1. Search for relevant pages
const searchResult = await exa.search(
  "blog post about artificial intelligence",
  {
    type: "auto",
    contents: {
      highlights: {
        maxCharacters: 4000
      }
    }
  }
);

// 2. Extract URLs from search results
const urls = searchResult.results.map(r => r.url);

// 3. Fetch clean text and highlights for those URLs.
// exa-js exposes this as getContents(urls, options); confirm the
// signature against the SDK version you have installed.
const contentsResult = await exa.getContents(urls, {
  text: true,                     // request the cleaned body text
  highlights: {
    maxCharacters: 4000
  }
});

// 4. Convert contents into RAG‑ready documents.
// The per-URL items live in the response's results array.
const documents = contentsResult.results.map(item => ({
  url: item.url,
  title: item.title,
  text: item.text,                // main cleaned text
  highlights: item.highlights     // relevant snippets
}));

Now documents can be embedded and stored in a vector database, or passed directly as context into your LLM.


Step 5: Example: search + contents in Python

If you’re using Python (via HTTP requests or a Python SDK), the flow is the same.

Search with highlights

import requests

API_KEY = "YOUR-EXA-API-KEY"
BASE_URL = "https://api.exa.ai"

search_payload = {
    "query": "Latest research in LLMs",
    "contents": {
        "highlights": {
            "maxCharacters": 4000
        }
    }
}

search_resp = requests.post(
    f"{BASE_URL}/search",
    headers={
        "x-api-key": API_KEY,
        "Content-Type": "application/json"
    },
    json=search_payload
)
search_resp.raise_for_status()
search_data = search_resp.json()

urls = [r["url"] for r in search_data.get("results", [])]

Call /contents (conceptual example)

contents_payload = {
    "urls": urls,
    "text": True,  # request the cleaned body text as well as highlights
    "highlights": {
        "maxCharacters": 4000
    }
}

contents_resp = requests.post(
    f"{BASE_URL}/contents",
    headers={
        "x-api-key": API_KEY,
        "Content-Type": "application/json"
    },
    json=contents_payload
)
contents_resp.raise_for_status()
contents_data = contents_resp.json()

From here, turn contents_data into your RAG documents.
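A minimal sketch of that conversion, assuming `contents_data` has a `results` list whose items carry `url`, `title`, and, when requested, `text` and `highlights`. `to_rag_documents` is a hypothetical helper; adjust the keys if your response differs.

```python
from typing import Any


def to_rag_documents(contents_data: dict[str, Any]) -> list[dict[str, Any]]:
    """Map a parsed /contents response onto plain RAG document dicts."""
    documents = []
    for item in contents_data.get("results", []):
        text = (item.get("text") or "").strip()
        highlights = list(item.get("highlights") or [])
        if not text and not highlights:
            continue  # nothing usable was extracted for this URL
        documents.append(
            {
                "url": item.get("url") or "",
                "title": item.get("title") or "",
                "text": text,
                "highlights": highlights,
            }
        )
    return documents
```

Items with neither text nor highlights are skipped so that empty extractions never reach the embedding step.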


Step 6: Structuring /contents output for RAG

For a RAG system, you’ll typically transform /contents results into a consistent schema. A simple pattern:

type RagDocument = {
  id: string;
  url: string;
  title: string;
  body: string;
  highlights?: string[];
  metadata?: Record<string, any>;
};

When you process /contents results:

  • Use id or hash of url to deduplicate.
  • Store body as the main clean text (from Exa’s content extraction).
  • Store highlights as focused, query‑aligned snippets.
  • Attach metadata (e.g., domain, timestamp, tags).
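The same schema can be sketched in Python, with the id derived from a hash of the URL so that repeated fetches of one page collapse into a single document. The 16-character truncated SHA-256 id and the `domain` metadata key are illustrative choices, not something Exa prescribes.

```python
import hashlib
from dataclasses import dataclass, field
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class RagDocument:
    id: str
    url: str
    title: str
    body: str
    highlights: list[str] = field(default_factory=list)
    metadata: dict[str, str] = field(default_factory=dict)


def make_document(
    url: str, title: str, body: str, highlights: Optional[list[str]] = None
) -> RagDocument:
    """Build a document whose id is a stable hash of its URL."""
    doc_id = hashlib.sha256(url.encode("utf-8")).hexdigest()[:16]
    return RagDocument(
        id=doc_id,
        url=url,
        title=title,
        body=body,
        highlights=list(highlights or []),
        metadata={"domain": urlsplit(url).netloc},
    )


def deduplicate(documents: list[RagDocument]) -> list[RagDocument]:
    """Keep the first document seen for each id, preserving order."""
    unique: dict[str, RagDocument] = {}
    for doc in documents:
        unique.setdefault(doc.id, doc)
    return list(unique.values())
```

Hashing the raw URL treats `http://` and `https://` variants, or URLs with tracking parameters, as different pages; normalize URLs before hashing if that matters for your sources.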

Later, when answering a question, you can:

  1. Embed body and/or highlights.

  2. Retrieve the top‑k documents.

  3. Build a context block like:

    Source: <title> (<url>)
    Highlights:
    - <highlight 1>
    - <highlight 2>
    ...
    
  4. Feed that into your LLM as RAG context.
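The context block above can be assembled with a few lines of string formatting. This sketch assumes each retrieved document is a dict with `title`, `url`, and `highlights` keys; `build_context` and `max_highlights` are hypothetical names.

```python
def build_context(documents: list[dict], max_highlights: int = 3) -> str:
    """Render retrieved documents as 'Source / Highlights' blocks for a prompt."""
    blocks = []
    for doc in documents:
        lines = [f"Source: {doc['title']} ({doc['url']})", "Highlights:"]
        # Cap the snippets per source so one page cannot crowd out the others.
        for highlight in doc.get("highlights", [])[:max_highlights]:
            lines.append(f"- {highlight}")
        blocks.append("\n".join(lines))
    # A blank line between sources keeps them visually distinct for the model.
    return "\n\n".join(blocks)
```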


When to use highlights vs full text

Exa’s highlights and full clean text serve different roles:

  • Highlights (highlights.maxCharacters):
    • Short, query‑aligned snippets.
    • Great for small context windows.
    • Useful when you want the LLM to see only the most relevant parts.
  • Full clean text:
    • Ideal for indexing in a vector database.
    • Useful when you want deeper context or later re‑chunking.
    • Best when your RAG pipeline handles chunking and ranking on its own.

A common pattern that works well:

  1. Use /contents to get full clean text for each URL.
  2. Chunk the text yourself (e.g., 512–1500 tokens).
  3. Embed and store those chunks.
  4. Use highlights as “previews” or ranking hints in your UI or prompts.
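For the chunking step, a simple overlapping window is often enough to start with. The sketch below counts whitespace-separated words as a rough stand-in for tokens, which is an assumption; swap in your model's tokenizer if you need exact token budgets. `chunk_words` and its default sizes are illustrative.

```python
def chunk_words(text: str, chunk_size: int = 400, overlap: int = 40) -> list[str]:
    """Split text into overlapping chunks of at most `chunk_size` words."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be >= 0 and smaller than chunk_size")

    words = text.split()
    step = chunk_size - overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the window already reached the end of the text
    return chunks
```

The overlap repeats a few words across chunk boundaries so that a sentence split between two chunks still appears whole in at least one of them.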

Putting it all together: end‑to‑end RAG flow with Exa

  1. User asks a question
    • Example: “What are the newest techniques in LLM fine‑tuning?”
  2. Call /search
    • Use the question as the query.
    • Optionally request contents.highlights in the search call for immediate snippets.
  3. Select top URLs
    • Choose the top N results based on relevance (and maybe domain trust).
  4. Call /contents
    • Pass the selected URLs.
    • Ask for highlights (maxCharacters) and/or full clean text.
  5. Construct RAG documents
    • Map the /contents output to a structured schema.
    • Optionally embed and store in a vector index.
  6. Retrieve and build context
    • For a new user query, retrieve top‑k documents/chunks.
    • Use the highlights plus surrounding text as context.
  7. Generate answer
    • Feed the context and question into your LLM.
    • Optionally cite the original URLs from Exa in your output.
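The flow above can be sketched as one function that takes the two API calls as parameters, which keeps the orchestration testable without network access. `build_prompt`, `search_fn`, and `contents_fn` are hypothetical names; in practice the two callables would wrap your /search and /contents requests and return the parsed JSON. The embedding and vector-store steps are left out to keep the sketch short.

```python
from typing import Callable

SearchFn = Callable[[str], dict]
ContentsFn = Callable[[list], dict]


def build_prompt(
    question: str, search_fn: SearchFn, contents_fn: ContentsFn, top_n: int = 3
) -> str:
    """Run search -> contents -> prompt assembly using injected API callables."""
    results = search_fn(question).get("results", [])
    urls = [r["url"] for r in results[:top_n]]
    if not urls:
        return f"Question: {question}\n\nNo sources were found."

    sources = []
    items = contents_fn(urls).get("results", [])
    for number, item in enumerate(items, start=1):
        snippets = " ".join(item.get("highlights") or [])
        # Number each source so the model can cite it in the answer.
        sources.append(f"[{number}] {item['url']}\n{snippets}")

    context = "\n\n".join(sources)
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

Passing the API calls in as arguments also makes it easy to add caching or retries around them later without touching the prompt logic.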

Practical tips for using /contents for RAG

  • Limit maxCharacters for highlights
    Start with 1000–4000 characters; too much text can dilute relevance in the context window.

  • Batch URLs
    If you’re processing many URLs from /search, batch them in reasonable chunks to stay within rate and payload limits.

  • Cache results
    Cache /contents responses by URL to avoid repeated extraction and speed up your RAG system.

  • Combine multiple queries
    For complex questions, you might run multiple /search calls (different phrasings) and then deduplicate URLs before hitting /contents.

  • Monitor content quality
    Occasionally inspect the returned body and highlights to ensure your prompt/parameters are producing useful context.
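The batching and caching tips can be combined in a small wrapper. This is a sketch under assumptions: `fetch_batch` is a hypothetical callable that takes a list of URLs and returns a list of items that each include a `url` key, the cache is an in-memory dict, and the default batch size of 10 is arbitrary rather than a documented Exa limit.

```python
from typing import Callable, Iterator


def batched(urls: list[str], size: int) -> Iterator[list[str]]:
    """Yield consecutive slices of `urls` with at most `size` entries each."""
    if size <= 0:
        raise ValueError("size must be positive")
    for start in range(0, len(urls), size):
        yield urls[start:start + size]


class ContentsCache:
    """Fetch contents in batches and remember results by URL."""

    def __init__(self, fetch_batch: Callable[[list[str]], list[dict]], batch_size: int = 10):
        self._fetch_batch = fetch_batch
        self._batch_size = batch_size
        self._store: dict[str, dict] = {}

    def get_many(self, urls: list[str]) -> list[dict]:
        # Only request URLs that are not cached yet, without duplicates.
        missing = [u for u in dict.fromkeys(urls) if u not in self._store]
        for batch in batched(missing, self._batch_size):
            for item in self._fetch_batch(batch):
                self._store[item["url"]] = item
        # URLs the fetcher returned nothing for are silently omitted.
        return [self._store[u] for u in urls if u in self._store]
```

For a long-running service, replace the dict with a store that supports expiry, since cached page contents go stale as the source pages change.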


By chaining Exa’s /search with /contents, you turn raw URLs into clean, query‑aligned text that’s ideal for retrieval‑augmented generation. Use /search to find the right sources, /contents to extract the signal from the noise, and your RAG system to generate accurate, grounded answers.