
MultiOn quick start: what’s the fastest way to run my first web/browse call from Node.js using the SDK?
If you want the fastest possible path to a working web/browse call from Node.js, you don’t need a full framework or a giant boilerplate repo. You need three things: install the SDK, set your X_MULTION_API_KEY, and send a cmd + url to the Agent API (V1 Beta).
Below is the minimal Node.js quick start I’d use myself to confirm that MultiOn is wired up correctly and actually operating a real browser session on your behalf.
1. One-minute setup: Node + SDK + API key
Step 1: Create a fresh project (optional but clean)
mkdir multion-quick-start
cd multion-quick-start
npm init -y
Step 2: Install the MultiOn SDK
npm install multion
This SDK is just a thin wrapper around the Agent API (V1 Beta) so you don’t have to hand-roll fetch calls and headers.
Step 3: Set your X_MULTION_API_KEY
Get your API key from your MultiOn account, then export it as an environment variable. This keeps secrets out of source control.
macOS / Linux:
export MULTION_API_KEY="YOUR_API_KEY_HERE"
Windows (PowerShell):
$env:MULTION_API_KEY="YOUR_API_KEY_HERE"
We’ll read this inside Node and pass it straight through as X_MULTION_API_KEY.
2. Fastest working example: a single web/browse call
The goal is simple: hit a real website, execute an instruction, and see a concrete response with a session_id you can reuse.
Create browse-quick-start.mjs (ESM) or browse-quick-start.js (CommonJS). I’ll show ESM; if you’re on CJS, I’ll note the small change afterward.
// browse-quick-start.mjs
import MultiOn from "multion";

const apiKey = process.env.MULTION_API_KEY;
if (!apiKey) {
  throw new Error("MULTION_API_KEY is not set. Export it in your environment first.");
}

// Initialize the client
const client = new MultiOn({
  apiKey, // this becomes X_MULTION_API_KEY under the hood
});

async function run() {
  try {
    // Minimal end-to-end example: search Amazon for a product
    const response = await client.browse({
      url: "https://www.amazon.com",
      cmd: "Search for 'wireless mouse' and summarize the first 3 results.",
    });

    console.log("=== Raw response ===");
    console.dir(response, { depth: null });

    // Common useful fields
    console.log("\n=== Session details ===");
    console.log("session_id:", response.session_id);
    console.log("status:", response.status);

    // Some SDK versions return result/summary-like fields; adapt to what you see
    if (response.summary) {
      console.log("\n=== Summary ===");
      console.log(response.summary);
    }
  } catch (err) {
    console.error("Error from MultiOn:", err?.response?.data || err.message || err);
  }
}

run();
Run it:
node browse-quick-start.mjs
If everything is wired correctly, you’ll see:
- A session_id indicating an active secure remote session.
- A structured response containing what the agent did in a real browser.
- No brittle selectors, no Playwright/Selenium scaffolding: just intent in, actions executed.
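Because field names can differ between SDK versions, reading them defensively can make the first run easier to interpret. The sketch below is an illustration only: describeBrowseResponse is a hypothetical helper, and the camelCase sessionId fallback is an assumption about how some SDK versions might name the field, not something this quick start confirms.

```javascript
// Hypothetical helper: pull out the fields this quick start cares about
// without assuming one exact response shape.
function describeBrowseResponse(response) {
  const r = response ?? {};
  return {
    // Assumption: some SDK versions may expose `sessionId` instead of `session_id`.
    sessionId: r.session_id ?? r.sessionId ?? null,
    status: r.status ?? null,
    // `summary` is optional; report null when it is absent.
    summary: r.summary ?? null,
  };
}

// Example with a hand-written object standing in for a real response.
const described = describeBrowseResponse({ session_id: "sess_12345", status: "success" });
console.log(described);
```

If sessionId comes back as null here, inspect the raw response before building anything on top of it.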
CommonJS variant
If your project is using CommonJS ("type": "module" is not set in package.json), use:
// browse-quick-start.js
const MultiOn = require("multion");

const apiKey = process.env.MULTION_API_KEY;
if (!apiKey) {
  throw new Error("MULTION_API_KEY is not set. Export it in your environment first.");
}

const client = new MultiOn({ apiKey });

async function run() {
  try {
    const response = await client.browse({
      url: "https://www.amazon.com",
      cmd: "Search for 'wireless mouse' and summarize the first 3 results.",
    });

    console.log("=== Raw response ===");
    console.dir(response, { depth: null });

    console.log("\n=== Session details ===");
    console.log("session_id:", response.session_id);
    console.log("status:", response.status);
  } catch (err) {
    console.error("Error from MultiOn:", err?.response?.data || err.message || err);
  }
}

run();
3. Understanding the web/browse call shape
Under the SDK, you’re effectively hitting:
POST https://api.multion.ai/v1/web/browse
X_MULTION_API_KEY: <your key>
Content-Type: application/json
With a minimal body like:
{
  "url": "https://www.amazon.com",
  "cmd": "Search for 'wireless mouse' and summarize the first 3 results."
}
Key points:
- url: the starting page where the secure remote browser will load.
- cmd: natural-language instruction describing the workflow. This is where you replace brittle selectors with intent.
- The agent runs the workflow in a real browser environment, not a synthetic HTTP client. That handles logins, dynamic UIs, bot protection, and multi-step flows that would normally require a fragile Playwright/Selenium suite.
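For reference, the same request can be sketched without the SDK. This is a minimal illustration using Node's built-in fetch (Node 18+), built only from the endpoint, header, and body shown above. buildBrowseRequest and browseRaw are hypothetical helper names, and the response is assumed to be JSON.

```javascript
const BROWSE_ENDPOINT = "https://api.multion.ai/v1/web/browse";

// Hypothetical helper: assemble the fetch arguments for a web/browse call.
function buildBrowseRequest(apiKey, { url, cmd }) {
  if (!apiKey) throw new Error("API key is required");
  return {
    endpoint: BROWSE_ENDPOINT,
    init: {
      method: "POST",
      headers: {
        X_MULTION_API_KEY: apiKey,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ url, cmd }),
    },
  };
}

// Hypothetical helper: send the request and parse the body (assumed to be JSON).
async function browseRaw(apiKey, params) {
  const { endpoint, init } = buildBrowseRequest(apiKey, params);
  const res = await fetch(endpoint, init);
  if (!res.ok) throw new Error(`web/browse failed with HTTP ${res.status}`);
  return res.json();
}
```

Calling browseRaw(process.env.MULTION_API_KEY, { url, cmd }) would then play the same role as client.browse({ url, cmd }). Splitting out buildBrowseRequest keeps the request shape inspectable without touching the network.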
4. Keeping the workflow alive with session_id (Step mode)
The fastest quick start is a single call, but real automation needs continuity: add to cart → checkout, open X → post, etc. That’s where Sessions + Step mode comes in.
Assume your first call returned:
{
  "session_id": "sess_12345",
  "status": "success",
  "...": "..."
}
You can continue that same browser session in Node by passing session_id into a follow-up browse call.
import MultiOn from "multion";

const client = new MultiOn({ apiKey: process.env.MULTION_API_KEY });

async function stepModeExample() {
  // Step 1: Go to Amazon and search
  const first = await client.browse({
    url: "https://www.amazon.com",
    cmd: "Search for 'wireless mouse' and open the first organic result in a new tab.",
  });

  const sessionId = first.session_id;
  console.log("Reusing session:", sessionId);

  // Step 2: In the same session, add the opened product to cart
  const second = await client.browse({
    session_id: sessionId,
    cmd: "Add the currently open product to the cart, then stop.",
  });

  console.log("Cart step status:", second.status);
}

stepModeExample().catch(console.error);
This is the unit of reliability I care about as a former Playwright/Selenium owner:
- You’re not re-solving login or CAPTCHAs on every call.
- You’re not chasing DOM changes with new selectors.
- You treat a live browser session as a first-class primitive.
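When a workflow grows past two steps, the session_id threading can be pulled into a small loop. The following is a rough sketch, not part of the SDK: runInOneSession is a hypothetical helper, and browse is any function with the same call shape as the client.browse calls above, which also makes it easy to try with a stub.

```javascript
// Hypothetical helper: run a list of commands in one session by threading
// session_id from the first response into every later call.
async function runInOneSession(browse, startUrl, cmds) {
  const results = [];
  let sessionId;
  for (const cmd of cmds) {
    // The first call starts from a URL; later calls reuse the session instead.
    const params = sessionId
      ? { session_id: sessionId, cmd }
      : { url: startUrl, cmd };
    const res = await browse(params);
    // Keep the id from the first response; stop if none was returned.
    sessionId = sessionId ?? res?.session_id;
    if (!sessionId) {
      throw new Error("No session_id returned; cannot continue the session");
    }
    results.push(res);
  }
  return { sessionId, results };
}
```

With a real client, this could be called as runInOneSession((p) => client.browse(p), "https://www.amazon.com", [firstCmd, secondCmd]). Passing browse in as a parameter keeps the loop independent of how the client is constructed.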
5. Error handling and “402 Payment Required”
For a quick-start script, you don’t need complex retry logic, but you should at least surface API-level errors clearly.
Typical categories:
- Auth issues: invalid or missing X_MULTION_API_KEY → 401/403-style errors.
- Usage/billing: if you hit a quota or billing limit, the API will return a 402 Payment Required. You’ll see that in err.response.status and the body.
- Command/flow issues: the agent couldn’t complete the requested workflow due to page changes, access denial, or other runtime constraints. Handle these as application-level errors (e.g., log and alert) instead of silent failures.
Example lightweight handler:
try {
  const res = await client.browse({ url, cmd });
  // use res
} catch (err) {
  const status = err?.response?.status;
  const data = err?.response?.data;

  if (status === 402) {
    console.error("MultiOn returned 402 Payment Required. Check your plan/billing.", data);
    process.exit(1);
  }

  console.error("MultiOn browse error:", status, data || err.message || err);
}
Surfacing 402 Payment Required explicitly keeps you from misdiagnosing billing issues as “agent flakiness.”
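The same idea can be extended to the auth category with a small classifier. This is a sketch under assumptions: classifyBrowseError is a hypothetical helper, and it assumes the HTTP status is available on err.response.status as in the handler above, with err.statusCode checked as a guessed fallback for clients that shape errors differently.

```javascript
// Hypothetical helper: map an error thrown by a browse call to a coarse category.
function classifyBrowseError(err) {
  // Assumption: the status lives on err.response.status, or on err.statusCode
  // in clients that shape their errors differently.
  const status = err?.response?.status ?? err?.statusCode;
  if (status === 401 || status === 403) return "auth";
  if (status === 402) return "billing";
  if (typeof status === "number") return "api";
  // No HTTP status at all: network failure, bug in calling code, and so on.
  return "unknown";
}

console.log(classifyBrowseError({ response: { status: 402 } }));
```

Returning a plain string keeps the decision about what to do (exit, alert, retry) in the calling code rather than inside the helper.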
6. Next steps after your first web/browse call
Once the basic call is working, the fastest way to make it useful is to plug it into a real workflow:
- Amazon ordering flow:
  - browse to Amazon with a cmd to search and pick a product.
  - Reuse session_id to add to cart.
  - Reuse again to navigate checkout and confirm.
- Posting on X:
  - browse to https://x.com and log in (with appropriate credentials or logged-in state).
  - Reuse session_id to compose and post a tweet via a follow-up cmd.
For data-heavy pages where you want structured output, pivot to the Retrieve function instead of trying to scrape by hand. Retrieve is designed to return JSON arrays of objects with controls like renderJs, scrollToBottom, and maxItems. That’s how you’d turn something like an H&M catalog page into clean fields: name, price, colors, urls, images.
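As a loose illustration of what a Retrieve request might carry, the sketch below only assembles an options object. The control names renderJs, scrollToBottom, and maxItems come from the paragraph above; buildRetrieveOptions is a hypothetical helper, the fields parameter name and the default of 10 items are assumptions, and the URL is a placeholder. Check the SDK reference for the exact call before relying on any of it.

```javascript
// Hypothetical helper: assemble options for a Retrieve call.
// Assumption: output keys are requested through a `fields` array.
function buildRetrieveOptions({ url, cmd, fields, maxItems = 10 }) {
  if (!Array.isArray(fields) || fields.length === 0) {
    throw new Error("fields must be a non-empty array of key names");
  }
  return {
    url,
    cmd,
    fields,
    renderJs: true,       // render client-side JavaScript before extracting
    scrollToBottom: true, // load lazily rendered items further down the page
    maxItems,             // cap on how many objects come back
  };
}

const retrieveOptions = buildRetrieveOptions({
  url: "https://example.com/catalog", // placeholder URL, not a real catalog page
  cmd: "Get the products listed on this page.",
  fields: ["name", "price", "colors", "urls", "images"],
});
console.log(retrieveOptions);
```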
But for a pure quick start focused on web/browse from Node.js, you only need:
- npm install multion
- Export MULTION_API_KEY
- Call client.browse({ url, cmd })
- Inspect session_id and status to confirm the agent ran in a real browser.
Once that works, you can treat MultiOn as part of your backend: intent in via Node, actions executed via secure remote sessions, and structured output back as JSON that your application can safely consume.