
How do I use MultiOn Retrieve to extract a JSON array of objects from a JS-heavy page (renderJs + scrollToBottom + maxItems)?
Quick Answer: The best overall choice for extracting a JSON array of objects from JS-heavy pages with MultiOn is the Retrieve endpoint with renderJs: true + scrollToBottom: true. If your priority is strict control over result size, Retrieve with a tuned maxItems value is often a stronger fit. For deeply dynamic, infinite-scroll catalogs, consider Retrieve with iterative pagination (multiple calls with adjusted scrollToBottom or URL params).
At-a-Glance Comparison
| Rank | Option | Best For | Primary Strength | Watch Out For |
|---|---|---|---|---|
| 1 | Retrieve with renderJs: true + scrollToBottom: true | Most JS-heavy catalog or listing pages | Handles client-side rendering and lazy-loaded content in one pass | May fetch more content than you need if maxItems is large |
| 2 | Retrieve with tuned maxItems | Strict caps on how many objects you pull per call | Predictable JSON array size and reduced payloads | Might truncate long lists, missing tail-end items |
| 3 | Retrieve with iterative pagination / multiple calls | Very long or infinite-scroll pages at scale | Fine-grained control over coverage and concurrency | More orchestration logic in your app and additional API calls |
Comparison Criteria
We evaluated each pattern against the following criteria to match the reality of JS-heavy pages:
- Reliability on JS-heavy UIs: How well the approach survives client-side rendering, lazy loading, and infinite scroll without you maintaining brittle selectors.
- Control over output size: Whether you can reliably shape the JSON array of objects using maxItems and call patterns without surprises.
- Operational complexity: How much orchestration code you have to write around the Retrieve call to get consistently structured results.
Detailed Breakdown
1. Retrieve with renderJs: true + scrollToBottom: true (Best overall for JS-heavy catalogs)
Retrieve with renderJs: true and scrollToBottom: true ranks as the top choice because it maps directly to how modern sites actually load content: render via JavaScript, then stream more items as the user scrolls.
What it does well:
- Handles JS-heavy rendering (renderJs: true): When you set renderJs: true, MultiOn executes the page in a real browser environment instead of treating it like static HTML. That means React, Next.js, Vue, and other SPA frameworks fully hydrate before data extraction. You don’t wrestle with invisible DOM or half-rendered components.
- Loads lazy content (scrollToBottom: true): Enabling scrollToBottom tells MultiOn to simulate scrolling through the page. On JS-heavy product grids (think H&M or Amazon-like search results), that’s usually the only way to trigger additional network calls and populate the full list you care about. Your extraction then runs against the expanded DOM instead of just the initial viewport.
Tradeoffs & Limitations:
- Over-fetch risk without maxItems: On very long lists, renderJs: true + scrollToBottom: true can surface a lot of content. If you don’t cap with maxItems, your JSON array of objects might be larger than needed, which affects processing time and downstream costs.
Decision Trigger: Choose Retrieve with renderJs: true + scrollToBottom: true if you want a “single-call, JS-aware scrape” that reliably turns a dynamic page into a structured JSON array of objects, and you prioritize maximum content coverage over tight caps on list size.
2. Retrieve with tuned maxItems (Best for strict output control)
Retrieve with a carefully tuned maxItems is the strongest fit when you care more about predictable output size than full-page coverage.
What it does well:
- Predictable JSON array size (maxItems): The maxItems parameter lets you tell MultiOn, “Stop once you’ve extracted N objects.” That’s critical when you’re:
  - Building a “top 20 items” feature.
  - Running many agents in parallel and need to bound memory and processing per call.
  - Feeding outputs directly into another API that expects a fixed or small list.
- Lighter payloads at scale: Smaller JSON arrays mean faster downstream processing and less noise when you’re only interested in the first page of results or a limited sample.
Tradeoffs & Limitations:
- Potential truncation of longer lists: If the page contains more items than maxItems, you’ll only get the first segment. That’s usually fine for “top-N” use cases, but not for full catalog exports or compliance-style completeness.
Decision Trigger: Choose Retrieve with tuned maxItems if you want a bounded JSON array size, need predictable cost and latency per call, and are okay with capturing only the first portion of a JS-heavy list.
3. Retrieve with iterative pagination / multiple calls (Best for very long or infinite-scroll pages)
Retrieve with iterative pagination or multiple calls stands out when a single pass isn’t enough—think infinite scroll that keeps loading, or very deep catalogs where you care about full coverage.
What it does well:
- Full coverage with smaller bites: Instead of one monolithic call, you:
  - Break the page into “slices” (e.g., by query parameters like page=1, page=2), or
  - Use multiple Retrieve calls with controlled scrolling behavior and maxItems per call.

  You then merge the resulting JSON arrays of objects on your side. This is closer to how we used to chunk Selenium runs across a grid, but now you get structured JSON as the output instead of test logs.
- Better fit for parallel agents: MultiOn is built for running many agents concurrently. Iterative Retrieve calls map well to that: you can push page=1..N to a pool of workers, each running Retrieve, and aggregate the outputs.
Tradeoffs & Limitations:
- More orchestration logic: You’re now managing:
  - Pagination indices or URL parameters.
  - Deduplication of overlapping results.
  - Aggregation of multiple JSON arrays.

  It’s still far simpler than running a Playwright farm, but it’s more work than a single Retrieve call.
Decision Trigger: Choose Retrieve with iterative pagination / multiple calls if you want complete coverage of very long or infinite-scroll JS-heavy pages and are willing to manage a bit of orchestration to keep each JSON array of objects manageable.
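The orchestration described for this option can be sketched as two small helpers: one that builds per-page URLs and one that merges the per-page arrays while dropping duplicates. This is a minimal sketch under stated assumptions: the `page` query parameter name, the `Product` shape, and the choice of `productUrl` as the deduplication key are all hypothetical and depend on the target site. The Retrieve calls themselves are left out; each URL would be passed to one call.

```typescript
// Hypothetical item shape; only productUrl is used as the dedup key.
interface Product {
  name: string;
  price: string;
  productUrl: string;
}

// Build one URL per page by setting a query parameter (assumed to be "page").
function pageUrls(baseUrl: string, pages: number, param = "page"): string[] {
  const urls: string[] = [];
  for (let p = 1; p <= pages; p++) {
    const u = new URL(baseUrl);
    u.searchParams.set(param, String(p));
    urls.push(u.toString());
  }
  return urls;
}

// Merge per-page arrays, keeping the first occurrence of each productUrl.
function mergeUnique(batches: Product[][]): Product[] {
  const seen = new Set<string>();
  const merged: Product[] = [];
  for (const batch of batches) {
    for (const item of batch) {
      if (!seen.has(item.productUrl)) {
        seen.add(item.productUrl);
        merged.push(item);
      }
    }
  }
  return merged;
}
```

Keeping the first occurrence preserves page order, which matters if the listing is sorted by relevance or price.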
How to call MultiOn Retrieve for JS-heavy pages (step-by-step)
Below is a concrete implementation pattern you can drop into your stack. The goal: turn a JS-heavy page into a JSON array of objects using renderJs, scrollToBottom, and maxItems.
1. Install and authenticate
If you’re using Node:
npm install multion
Regardless of language, your calls will authenticate with:
- Header: X_MULTION_API_KEY: <your_api_key>
2. Basic Retrieve call shape
The Retrieve endpoint is designed to return JSON arrays of objects. At a minimum, you pass:
- url: the page you want to extract from.
- A description of what to extract (schema / fields).
- Optional controls: renderJs, scrollToBottom, maxItems.
A typical raw HTTP call:
```bash
curl -X POST https://api.multion.ai/v1/retrieve \
  -H "Content-Type: application/json" \
  -H "X_MULTION_API_KEY: $MULTION_API_KEY" \
  -d '{
    "url": "https://www2.hm.com/en_us/men/products/tshirts.html",
    "renderJs": true,
    "scrollToBottom": true,
    "maxItems": 50,
    "schema": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "name": { "type": "string" },
          "price": { "type": "string" },
          "color": { "type": "string" },
          "productUrl": { "type": "string" },
          "imageUrl": { "type": "string" }
        },
        "required": ["name", "price", "productUrl"]
      }
    }
  }'
```
This pattern is how you’d, for example, convert an H&M t-shirt catalog into a structured array of product objects.
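If you build this body in code instead of by hand, a small typed helper can catch mistakes before the request is sent. The sketch below mirrors the body shape used in this article (url, renderJs, scrollToBottom, maxItems, schema); `buildRetrieveBody` and its types are hypothetical names, not part of any MultiOn SDK, and the helper assumes every field is a string. Check the field names against the current API reference before relying on them.

```typescript
interface RetrieveOptions {
  renderJs?: boolean;
  scrollToBottom?: boolean;
  maxItems?: number;
}

interface RetrieveBody {
  url: string;
  renderJs: boolean;
  scrollToBottom: boolean;
  maxItems?: number;
  schema: {
    type: "array";
    items: {
      type: "object";
      properties: Record<string, { type: "string" }>;
      required: string[];
    };
  };
}

// Hypothetical helper: builds the request body shown in this article.
function buildRetrieveBody(
  url: string,
  fields: string[],
  required: string[],
  options: RetrieveOptions = {}
): RetrieveBody {
  // Every required field must also be declared as a property.
  const missing = required.filter((f) => fields.indexOf(f) === -1);
  if (missing.length > 0) {
    throw new Error(`required fields not declared: ${missing.join(", ")}`);
  }
  const { renderJs = true, scrollToBottom = true, maxItems } = options;
  if (maxItems !== undefined && (!Number.isInteger(maxItems) || maxItems < 1)) {
    throw new Error("maxItems must be a positive integer");
  }
  const properties: Record<string, { type: "string" }> = {};
  for (const f of fields) {
    properties[f] = { type: "string" };
  }
  const body: RetrieveBody = {
    url,
    renderJs,
    scrollToBottom,
    schema: { type: "array", items: { type: "object", properties, required } },
  };
  // Only include maxItems when the caller asked for a cap.
  if (maxItems !== undefined) {
    body.maxItems = maxItems;
  }
  return body;
}
```

Defaulting renderJs and scrollToBottom to true matches the top-ranked option in this article; pass explicit values when a page does not need them.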
3. Using renderJs for JS-heavy pages
Set:
"renderJs": true
When to use:
- React/Next.js/Vue SPAs.
- Search pages where HTML is almost empty until scripts run.
- Any page where content visibly “pops in” after load.
Mechanism:
- MultiOn runs the page in a secure remote session.
- JavaScript executes.
- DOM stabilizes.
- Retrieve runs extraction against the rendered DOM, giving you reliable data instead of empty or partial fields.
4. Using scrollToBottom for lazy-loaded content
Set:
"scrollToBottom": true
When to use:
- Infinite scroll catalogs.
- “Load more” triggered by scroll, not by a button you can easily click.
- Social or listing-style pages that only load a fraction of items initially.
Mechanism:
- MultiOn simulates scrolling down.
- The site fires its usual network calls to fetch more items.
- Retrieve waits until the scroll completes and the DOM reflects the loaded results.
- Extraction runs on this expanded DOM.
If you only want a small sample (e.g., first 20 items) and the page eagerly loads those without scroll, you can set scrollToBottom: false and just rely on renderJs: true.
5. Controlling list length with maxItems
Set, for example:
"maxItems": 25
Behavior:
- MultiOn stops populating the JSON array of objects once it has extracted maxItems items that match your schema.
- This doesn’t necessarily stop the page scroll early; it caps the extraction result.
Guidelines:
- Use smaller maxItems (10–50) for:
  - “Top N listings” views.
  - Low-latency features inside your app.
- Use larger maxItems (100–500+) when:
  - You’re doing bulk exports or offline analysis.
  - You can tolerate larger payloads and longer extraction time.
If maxItems is absent, MultiOn will extract as many items as it reasonably can from the rendered DOM.
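The guidance in sections 3 to 5 can be condensed into one decision function. This is a rough sketch, assuming you already know a few traits of the target page; `PageTraits`, `initiallyVisible`, and `chooseControls` are hypothetical names introduced here for illustration.

```typescript
// Hypothetical description of the target page, filled in by you.
interface PageTraits {
  jsRendered: boolean;      // content appears only after scripts run
  lazyLoaded: boolean;      // more items load as the user scrolls
  initiallyVisible: number; // items present before any scrolling
}

interface RetrieveControls {
  renderJs: boolean;
  scrollToBottom: boolean;
  maxItems?: number;
}

// Pick renderJs / scrollToBottom / maxItems from page traits and the
// number of items wanted (undefined means "as many as possible").
function chooseControls(page: PageTraits, wanted?: number): RetrieveControls {
  // Scrolling is only needed when the page lazy-loads and the initial
  // viewport does not already hold enough items.
  const needsScroll =
    page.lazyLoaded && (wanted === undefined || wanted > page.initiallyVisible);
  const controls: RetrieveControls = {
    renderJs: page.jsRendered,
    scrollToBottom: needsScroll,
  };
  if (wanted !== undefined) {
    controls.maxItems = wanted;
  }
  return controls;
}
```

For example, asking for 20 items from a lazy-loaded grid that already shows 24 yields renderJs: true, scrollToBottom: false, maxItems: 20, which matches the small-sample advice in section 4.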
6. Example: Extracting a product list from a JS-heavy page
Here’s a TypeScript example that calls the endpoint directly with axios; adjust to your HTTP client of choice:
```typescript
import axios from "axios";

const MULTION_API_KEY = process.env.MULTION_API_KEY!;

async function extractProducts() {
  const response = await axios.post(
    "https://api.multion.ai/v1/retrieve",
    {
      url: "https://www2.hm.com/en_us/men/products/tshirts.html",
      renderJs: true,
      scrollToBottom: true,
      maxItems: 40,
      schema: {
        type: "array",
        items: {
          type: "object",
          properties: {
            name: { type: "string" },
            price: { type: "string" },
            color: { type: "string" },
            productUrl: { type: "string" },
            imageUrl: { type: "string" }
          },
          required: ["name", "price", "productUrl"]
        }
      }
    },
    {
      headers: {
        "Content-Type": "application/json",
        "X_MULTION_API_KEY": MULTION_API_KEY
      }
    }
  );

  const items = response.data; // JSON array of objects
  console.log(items);
}

extractProducts().catch(console.error);
```
Output shape (simplified):
```json
[
  {
    "name": "Relaxed Fit Printed T-shirt",
    "price": "$14.99",
    "color": "Black/Graphic",
    "productUrl": "https://www2.hm.com/en_us/productpage.1234567890.html",
    "imageUrl": "https://image.hm.com/asset123.jpg"
  },
  {
    "name": "Regular Fit T-shirt",
    "price": "$9.99",
    "color": "White",
    "productUrl": "https://www2.hm.com/en_us/productpage.0987654321.html",
    "imageUrl": "https://image.hm.com/asset456.jpg"
  }
]
```
You now have a clean JSON array of objects ready for indexing, analytics, or feeding into another system.
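Before indexing, it can help to check each object against the required fields and normalize the price string. The sketch below assumes the output shape shown above; `isProduct`, `validProducts`, and `parsePrice` are hypothetical helpers, and the price parser only handles simple formats such as "$14.99" or "1,299.00".

```typescript
interface Product {
  name: string;
  price: string;
  productUrl: string;
  color?: string;
  imageUrl?: string;
}

// Type guard: true when the three required fields are non-empty strings.
function isProduct(value: unknown): value is Product {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return ["name", "price", "productUrl"].every(
    (k) => typeof v[k] === "string" && v[k] !== ""
  );
}

// Keep only well-formed items; anything that is not an array yields [].
function validProducts(data: unknown): Product[] {
  return Array.isArray(data) ? data.filter(isProduct) : [];
}

// Extract the first decimal number from a price string, or null if none.
function parsePrice(price: string): number | null {
  const match = price.replace(/,/g, "").match(/\d+(\.\d+)?/);
  return match ? Number(match[0]) : null;
}
```

Filtering rather than throwing keeps one malformed item from discarding an otherwise usable batch; log the dropped count if completeness matters.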
7. Handling errors and limits
In production, treat MultiOn like any other infrastructure API:
- Watch for HTTP-level errors, including:
  - 402 Payment Required when you hit billing limits or need to adjust your plan.
- Implement simple retry logic for transient failures.
- Log the combination of url, renderJs, scrollToBottom, and maxItems used for each call so you can replay or tune them if extraction results don’t match expectations.
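One way to implement the retry advice is a small wrapper with exponential backoff that fails fast on non-transient errors such as 402. This is a sketch, not MultiOn-specific behavior: treating 429 and 5xx as transient is an assumption made here, and `withRetry` takes a hypothetical `call` function that you supply (for example, a wrapper around your HTTP client that returns the status and parsed body).

```typescript
// Assumption: 429 and 5xx are transient; 402 and other 4xx are not.
function isRetryable(status: number): boolean {
  return status === 429 || status >= 500;
}

// Delays in milliseconds for each retry: base, 2x base, 4x base, ...
function backoffDelays(retries: number, baseMs = 500): number[] {
  return Array.from({ length: retries }, (_, i) => baseMs * Math.pow(2, i));
}

interface CallResult<T> {
  status: number;
  body: T;
}

// Run `call`, retrying transient failures with exponential backoff.
async function withRetry<T>(
  call: () => Promise<CallResult<T>>,
  retries = 3,
  baseMs = 500
): Promise<T> {
  const delays = backoffDelays(retries, baseMs);
  for (let attempt = 0; ; attempt++) {
    const res = await call();
    if (res.status >= 200 && res.status < 300) {
      return res.body;
    }
    // Give up on non-transient errors or when retries are exhausted.
    if (!isRetryable(res.status) || attempt >= delays.length) {
      throw new Error(`Retrieve failed with HTTP ${res.status}`);
    }
    await new Promise((resolve) => setTimeout(resolve, delays[attempt]));
  }
}
```

Log the url, renderJs, scrollToBottom, and maxItems values inside your `call` function so each failed attempt can be replayed with the same parameters.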
Final Verdict
If your goal is to extract a JSON array of objects from a JS-heavy page, start with Retrieve + renderJs: true + scrollToBottom: true and tighten control using maxItems. That setup matches how real sites load data—client-side rendering plus lazy content on scroll—without you rebuilding brittle Playwright/Selenium scripts.
For most use cases:
- Use renderJs any time JavaScript controls what appears in the DOM.
- Turn on scrollToBottom when you need more than the initial viewport.
- Tune maxItems to balance completeness versus payload size.
For huge or infinite-scroll listings, step up to iterative Retrieve calls and lean into MultiOn’s ability to run many agents in parallel. You stay focused on defining the JSON you need; MultiOn handles the secure remote sessions, browser execution, and structured extraction.