
MultiOn quick start: what’s the fastest way to run my first web/browse call from Node.js using the SDK?
If you want to see MultiOn actually click around a real website from Node.js, the fastest path is: install the SDK, put your API key in the environment (the SDK sends it as X_MULTION_API_KEY), send a single web.browse command with a cmd and url, and log the response. You can get from zero to a live browser-operating agent in under 5 minutes.
Below is the minimal path I’d use as an automation engineer who just wants to see a successful POST https://api.multion.ai/v1/web/browse run from Node, with a clean upgrade path to sessions, step mode, and Retrieve later.
Prerequisites
You only need three things:
- A MultiOn API key (X_MULTION_API_KEY)
- Node.js 18+ (or any recent LTS)
- A new or existing Node project where you can install the MultiOn SDK
If you don’t have an API key yet, create one from your MultiOn account, copy it, and keep it somewhere safe. You’ll store it in your environment as MULTION_API_KEY, and the SDK sends it to the API as X_MULTION_API_KEY.
Step 1: Initialize a bare-bones Node project
From an empty folder:
mkdir multion-quick-start && cd multion-quick-start
npm init -y
You now have a minimal package.json to work with.
Step 2: Install the MultiOn Node SDK
Install the SDK directly:
npm install multion
This gives you a thin client around the core Agent API (V1 Beta), including access to the web.browse surface that actually drives a real browser session.
Step 3: Configure your X_MULTION_API_KEY securely
Never hardcode the API key in source. Use environment variables and a simple loader like dotenv:
npm install dotenv
Create a .env file at the project root:
touch .env
Add your key (replace with your actual value):
MULTION_API_KEY=sk-YOUR_MULTION_KEY_HERE
Make sure .env is ignored by Git:
echo ".env" >> .gitignore
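If you end up reading more than one secret from the environment, a tiny guard keeps the failure message readable. The sketch below is a hypothetical helper of my own (requireEnv is not part of the MultiOn SDK or dotenv); it only assumes the key lives in MULTION_API_KEY as configured above.

```javascript
// Hypothetical helper (not part of the MultiOn SDK): read a required
// environment variable and fail early with a readable message.
function requireEnv(name, env = process.env) {
  const value = env[name];
  if (typeof value !== 'string' || value.trim() === '') {
    throw new Error(`Missing ${name}. Add it to .env or export it in your shell.`);
  }
  // Trim stray whitespace that often sneaks in when pasting keys.
  return value.trim();
}

// Usage (after dotenv has loaded .env):
// const apiKey = requireEnv('MULTION_API_KEY');
```

Passing env as a second argument is only there so the helper can be exercised without touching the real process environment.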
Step 4: Create the fastest possible web.browse script
Create index.mjs (using ES modules for simplicity):
touch index.mjs
Paste this minimal quick start code:
import 'dotenv/config';
import MultiOn from 'multion';

async function main() {
  const apiKey = process.env.MULTION_API_KEY;
  if (!apiKey) {
    throw new Error('Missing MULTION_API_KEY in .env');
  }

  // 1. Initialize the MultiOn client
  const client = new MultiOn({
    apiKey, // sent as X_MULTION_API_KEY under the hood
  });

  // 2. Fire a single web.browse call
  const response = await client.web.browse({
    cmd: 'Open the H&M men section and list 5 products with name and price.',
    url: 'https://www2.hm.com/',
    // Optional controls: renderJs, scrollToBottom, etc.
    renderJs: true,
    scrollToBottom: true,
  });

  // 3. Inspect the raw response so you see the real artifact
  console.dir(response, { depth: null });
}

main().catch((err) => {
  console.error('Error during web.browse:', err);
  process.exit(1);
});
What this does:
- Authenticates via X_MULTION_API_KEY from process.env.MULTION_API_KEY
- Calls the Agent API V1 Beta through client.web.browse (which maps to POST https://api.multion.ai/v1/web/browse)
- Sends a natural-language cmd plus a url
- Enables renderJs and scrollToBottom so H&M’s dynamic catalog can load
- Logs the full response object so you can see both the agent’s narrative and the session metadata
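To make the mapping to the HTTP endpoint concrete, here is a rough sketch of the request the SDK call corresponds to. The endpoint and the X_MULTION_API_KEY header name come from this article; the JSON body shape and the buildBrowseRequest helper are my own assumptions for illustration, so check them against the API reference before relying on them.

```javascript
// Endpoint and header name are taken from this article.
// The JSON body shape ({ cmd, url }) is an assumption, not confirmed API.
const BROWSE_ENDPOINT = 'https://api.multion.ai/v1/web/browse';

// Hypothetical helper: builds the request without sending it.
function buildBrowseRequest(apiKey, { cmd, url }) {
  if (!apiKey) throw new Error('apiKey is required');
  if (!cmd) throw new Error('cmd is required');
  return {
    endpoint: BROWSE_ENDPOINT,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        X_MULTION_API_KEY: apiKey,
      },
      body: JSON.stringify({ cmd, url }),
    },
  };
}

// Usage (Node 18+ ships a global fetch):
// const { endpoint, init } = buildBrowseRequest(apiKey, { cmd: '...', url: '...' });
// const res = await fetch(endpoint, init);
```

Building the request separately from sending it makes it easy to log or inspect exactly what would go over the wire.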
Step 5: Run your first web.browse call
From the project root:
node index.mjs
On a successful call you should see a JSON-like object printed. At a minimum, expect:
- A description of the actions taken in the browser (pages visited, clicks, scrolling)
- Extracted content matching your cmd intent
- A session_id you can reuse to continue the workflow in step mode
If billing limits are hit or you’re not provisioned correctly, you may see an error pattern such as 402 Payment Required in the response. Treat that as a signal to check your plan or quota, not a coding issue.
Understanding the web.browse response (without overcomplicating)
For a quick sanity check, log just the key pieces:
console.log('Session:', response.session_id);
console.log('Agent output:', response.output_text || response.result || response.output);
Different SDK builds may expose slightly different fields, but the important things to look for are:
- session_id: the browser session handle; this is the unit of continuity for multi-step workflows
- Output text / structured result: the agent’s interpretation of your cmd against the given url
If you only see high-level narrative and want more structured data later, that’s where Retrieve comes in (covered at the end).
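Since the exact field names can vary, I find it handy to funnel the response through one small function instead of sprinkling fallbacks around. This is a sketch under the same assumption as the logging snippet above: the output may arrive as output_text, result, or output. The summarizeResponse name is hypothetical.

```javascript
// Candidate field names mirror the ones logged above. Which one a given
// SDK build actually populates is an assumption to verify on a real response.
function summarizeResponse(response) {
  const candidates = ['output_text', 'result', 'output'];
  // Pick the first candidate that is present (not null/undefined).
  const field = candidates.find((name) => response?.[name] != null);
  return {
    sessionId: response?.session_id ?? null,
    outputField: field ?? null,
    output: field ? response[field] : null,
  };
}

// Usage:
// const { sessionId, output } = summarizeResponse(response);
```

Returning the name of the matched field as well makes it obvious, on the first real run, which shape your SDK build uses.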
Step 6: Turn the quick start into a multi-step session (optional but recommended)
The fastest way to feel the difference between brittle Playwright/Selenium flows and MultiOn is to keep a single session alive across steps: think “add to cart” → “checkout” or “compose post” → “publish”.
Extend your script to reuse session_id:
import 'dotenv/config';
import MultiOn from 'multion';

async function main() {
  const client = new MultiOn({ apiKey: process.env.MULTION_API_KEY });

  // Step 1: Open Amazon and search
  const step1 = await client.web.browse({
    cmd: 'Search for "usb c hub" and open the first product.',
    url: 'https://www.amazon.com/',
  });
  console.log('Step 1 session:', step1.session_id);

  // Step 2: Continue in the same session (step mode)
  const step2 = await client.web.browse({
    session_id: step1.session_id,
    cmd: 'Add the current product to cart and stop before checkout.',
  });
  console.log('Step 2 actions complete. Same session:', step2.session_id);
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
Here you’re:
- Using session_id to maintain a secure remote session in a real browser
- Letting the agent handle all the dynamic selectors, login walls, and rendering
- Avoiding the usual “one broken selector breaks everything” failure mode
This step-by-step use is what MultiOn calls Sessions + Step mode: intent in, actions executed, then another intent continuing the same session.
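Once you have more than two steps, the session threading is worth pulling into a loop. The sketch below is a hypothetical helper (runSteps is my own name, not an SDK function) that takes the browse function as an argument, so it assumes only what the example above shows: each response carries a session_id that the next call can pass back.

```javascript
// `browse` is injected so the helper is independent of any particular client.
// With the SDK as used in this article you might pass:
//   (params) => client.web.browse(params)
async function runSteps(browse, firstStep, followUpCmds) {
  const results = [];
  let result = await browse(firstStep);
  results.push(result);

  for (const cmd of followUpCmds) {
    if (!result.session_id) {
      throw new Error('No session_id returned; cannot continue the session.');
    }
    // Reuse the session from the previous step; no url needed to continue.
    result = await browse({ session_id: result.session_id, cmd });
    results.push(result);
  }
  return results;
}

// Usage:
// const results = await runSteps(
//   (params) => client.web.browse(params),
//   { cmd: 'Search for "usb c hub" and open the first product.', url: 'https://www.amazon.com/' },
//   ['Add the current product to cart and stop before checkout.'],
// );
```

Injecting browse also means you can test your flow logic with a fake function before spending real agent calls.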
Step 7: Add Retrieve when you need structured JSON
The quick start web.browse example shows narrative output. When you want “JSON arrays of objects” (e.g., for catalog extraction), use MultiOn’s Retrieve surface.
A common pattern I use:
- web.browse to navigate and get the page into the right state (e.g., scrolled, filters applied)
- Retrieve to pull out structured data with render/scroll controls
Example using H&M again:
import 'dotenv/config';
import MultiOn from 'multion';

async function main() {
  const client = new MultiOn({ apiKey: process.env.MULTION_API_KEY });

  const result = await client.web.retrieve({
    url: 'https://www2.hm.com/en_us/men/products/view-all.html',
    renderJs: true,
    scrollToBottom: true,
    maxItems: 10,
    fields: ['name', 'price', 'url', 'image'],
  });

  // Expect a JSON array of objects: [{ name, price, url, image }, ...]
  console.dir(result.items || result, { depth: null });
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});
Key points:
- renderJs: true handles JavaScript-heavy catalogs
- scrollToBottom: true loads lazy content
- maxItems controls how much to extract
- Output is a structured JSON array ready for your app/backend, not just scraped HTML
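Before handing extracted rows to a backend, I would usually filter out incomplete ones. The following is a minimal sketch, assuming the payload is either a bare array or an object with an items array (the same fallback the example above uses); normalizeItems is a hypothetical helper, not part of the SDK.

```javascript
// Assumption: the payload is an array, or an object with an `items` array,
// matching the `result.items || result` fallback in the example above.
function normalizeItems(result, fields) {
  const items = Array.isArray(result) ? result : result?.items;
  if (!Array.isArray(items)) return [];
  return items
    // Keep only rows where every requested field is present and non-empty.
    .filter((item) => item && fields.every((f) => item[f] != null && item[f] !== ''))
    // Drop any extra keys so downstream code sees a predictable shape.
    .map((item) => Object.fromEntries(fields.map((f) => [f, item[f]])));
}

// Usage:
// const rows = normalizeItems(result, ['name', 'price', 'url', 'image']);
```

Dropping partial rows is a design choice; if you would rather keep them and fill gaps later, replace the filter with a step that records which fields were missing.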
Handling common errors fast
A few patterns you might hit on that first web.browse run:
- Missing API key
  Error: Missing MULTION_API_KEY in .env (from our own guard).
  Fix: Double-check .env and that you ran node index.mjs from the same folder.
- Auth / quota issues
  You might see an error payload with a status like 401 Unauthorized or 402 Payment Required.
  Fix: Confirm your key is valid, hasn’t been rotated, and that your account has capacity.
- Network / timeout
  Treat this as an infrastructure issue: retry with backoff, or check for outbound network restrictions in your environment.
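The retry-with-backoff advice can be sketched as a small wrapper. This is an illustration under assumptions, not documented SDK behavior: withRetry is a hypothetical helper, and it guesses that failed calls throw an error carrying the HTTP status on status or statusCode. Auth and quota failures (401, 402) are rethrown immediately because retrying will not fix them.

```javascript
// Statuses that signal an account problem rather than a transient failure.
const NON_RETRYABLE = new Set([401, 402]);

async function withRetry(fn, { retries = 3, baseDelayMs = 500 } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await fn();
    } catch (err) {
      // Assumption: the error exposes an HTTP status on `status` or `statusCode`.
      const status = err?.status ?? err?.statusCode;
      if (NON_RETRYABLE.has(status) || attempt >= retries) throw err;
      // Exponential backoff: baseDelayMs, then 2x, 4x, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}

// Usage:
// const response = await withRetry(() => client.web.browse({ cmd, url }));
```

Keep in mind that a browse call can have side effects on the target site, so only wrap steps that are safe to repeat.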
How this compares to rolling your own Playwright/Selenium
To run the same “Amazon search + add to cart” flow with Playwright or Selenium, you’d usually:
- Stand up a browser farm or use a cloud provider
- Handle selectors that break every time Amazon shuffles the DOM
- Bolt on proxies and cookie jars to get through bot protection
- Write a custom extraction layer for every page type
With MultiOn’s Agent API V1 Beta from Node:
- You send a cmd + url to web.browse
- MultiOn runs against a secure remote session with native proxy support tuned for real sites
- You keep a session_id for continuity instead of juggling driver instances
- You can scale to “millions of concurrent AI Agents” as a backend capability rather than nursing brittle scripts
The quick start script you just ran is intentionally small, but it’s sitting on top of that infrastructure.
Next: wire this into your own app
Once you’re comfortable with a single web.browse call from Node.js, the natural next steps are:
- Wrap client.web.browse in your own service layer with per-feature commands, e.g., amazonService.addToCart({ query: 'usb c hub' })
- Use session_id to keep flows stateful in your backend (store it against user or task IDs)
- Invoke Retrieve where you need structured JSON instead of free-form text
- Run calls in parallel for fan-out workloads (e.g., many H&M category pages in parallel agents)
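For the fan-out case, it is usually worth capping how many calls are in flight at once so you stay inside your plan’s limits. Below is a minimal sketch of a bounded-concurrency mapper in plain JavaScript; mapWithConcurrency is a hypothetical helper and nothing in it is MultiOn-specific.

```javascript
// Runs `worker` over `items` with at most `limit` calls in flight,
// returning results in the same order as the input.
async function mapWithConcurrency(items, limit, worker) {
  const results = new Array(items.length);
  let next = 0;

  async function runner() {
    // Each runner pulls the next unclaimed index until none remain.
    while (next < items.length) {
      const index = next;
      next += 1;
      results[index] = await worker(items[index], index);
    }
  }

  const runnerCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: runnerCount }, () => runner()));
  return results;
}

// Usage (categoryUrls is a hypothetical array of page URLs):
// const pages = await mapWithConcurrency(categoryUrls, 5, (url) =>
//   client.web.retrieve({ url, renderJs: true, scrollToBottom: true }),
// );
```

If any worker rejects, the whole call rejects; wrap the worker in your own try/catch if you would rather collect per-item errors.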
You’ve already done the hardest part: confirming your first web.browse call works from Node using the SDK. Everything else is layering proper orchestration on top of that same command-and-session pattern.