MultiOn quick start: what’s the fastest way to run my first web/browse call from Node.js using the SDK?

If you want to see MultiOn actually click around a real website from Node.js, the fastest path is: install the SDK, put your API key in the environment as MULTION_API_KEY (the SDK sends it as X_MULTION_API_KEY), send a single web.browse call with a cmd and url, and log the response. You can get from zero to a live browser-operating agent in under 5 minutes.

Below is the minimal path I’d use as an automation engineer who just wants to see a successful POST https://api.multion.ai/v1/web/browse run from Node, with a clean upgrade path to sessions, step mode, and Retrieve later.


Prerequisites

You only need three things:

  • A MultiOn API key (the SDK sends it as X_MULTION_API_KEY)
  • Node.js 18 or later (any current LTS release works)
  • A new or existing Node project where you can install the MultiOn SDK

If you don’t have an API key yet, create one from your MultiOn account, copy it, and keep it somewhere safe. You’ll store it in your environment as MULTION_API_KEY, and the SDK sends it to the API as X_MULTION_API_KEY.


Step 1: Initialize a bare-bones Node project

From an empty folder:

mkdir multion-quick-start && cd multion-quick-start
npm init -y

You now have a minimal package.json to work with.


Step 2: Install the MultiOn Node SDK

Install the SDK directly:

npm install multion

This gives you a thin client around the core Agent API (V1 Beta), including access to the web.browse surface that actually drives a real browser session.


Step 3: Store your API key securely (sent as X_MULTION_API_KEY)

Never hardcode the API key in source. Use environment variables and a simple loader like dotenv:

npm install dotenv

Create a .env file at the project root:

touch .env

Add your key (replace with your actual value):

MULTION_API_KEY=sk-YOUR_MULTION_KEY_HERE

Make sure .env is ignored by Git:

echo ".env" >> .gitignore

Step 4: Create the fastest possible web.browse script

Create index.mjs (using ES modules for simplicity):

touch index.mjs

Paste this minimal quick start code:

import 'dotenv/config';
import MultiOn from 'multion';

async function main() {
  const apiKey = process.env.MULTION_API_KEY;
  if (!apiKey) {
    throw new Error('Missing MULTION_API_KEY in .env');
  }

  // 1. Initialize the MultiOn client
  const client = new MultiOn({
    apiKey, // sent as X_MULTION_API_KEY under the hood
  });

  // 2. Fire a single web.browse call
  const response = await client.web.browse({
    cmd: 'Open the H&M men section and list 5 products with name and price.',
    url: 'https://www2.hm.com/',
    // Optional controls: renderJs, scrollToBottom, etc.
    renderJs: true,
    scrollToBottom: true,
  });

  // 3. Inspect the raw response so you see the real artifact
  console.dir(response, { depth: null });
}

main().catch((err) => {
  console.error('Error during web.browse:', err);
  process.exit(1);
});

What this does:

  • Reads the key from process.env.MULTION_API_KEY and authenticates with it as X_MULTION_API_KEY
  • Calls the Agent API V1 Beta through client.web.browse (which maps to POST https://api.multion.ai/v1/web/browse)
  • Sends a natural-language cmd plus a url
  • Enables renderJs and scrollToBottom so H&M’s dynamic catalog can load
  • Logs the full response object so you can see both the agent’s narrative and the session metadata
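
If you want to see what that mapping looks like without the SDK, here is a rough sketch of the same call as a plain HTTP request. The endpoint and header name come from this guide; the JSON body shape (cmd and url at the top level) and the helper names buildBrowseRequest and browseOverHttp are my assumptions, so check the API reference before relying on them.

```javascript
// Sketch only: builds the request the SDK is described as sending.
// Assumption: the body is JSON with `cmd` and `url` at the top level.
const BROWSE_ENDPOINT = 'https://api.multion.ai/v1/web/browse';

function buildBrowseRequest(apiKey, { cmd, url }) {
  if (!apiKey) throw new Error('Missing API key');
  if (!cmd) throw new Error('cmd is required');
  return {
    endpoint: BROWSE_ENDPOINT,
    init: {
      method: 'POST',
      headers: {
        'Content-Type': 'application/json',
        X_MULTION_API_KEY: apiKey, // header name as given in this guide
      },
      body: JSON.stringify({ cmd, url }),
    },
  };
}

// Sends the request with Node 18+'s built-in fetch and parses the JSON reply.
// `fetchImpl` is injectable so the function can be exercised without the network.
async function browseOverHttp(apiKey, params, fetchImpl = fetch) {
  const { endpoint, init } = buildBrowseRequest(apiKey, params);
  const res = await fetchImpl(endpoint, init);
  if (!res.ok) throw new Error(`web.browse failed with HTTP ${res.status}`);
  return res.json();
}
```

Keeping the request builder separate from the sender makes it easy to log exactly what would go over the wire before you spend a real call.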

Step 5: Run your first web.browse call

From the project root:

node index.mjs

On a successful call you should see a JSON-like object printed. At a minimum, expect:

  • A description of the actions taken in the browser (pages visited, clicks, scrolling)
  • Extracted content matching your cmd intent
  • A session_id you can reuse to continue the workflow in step mode

If you hit a billing limit or your account isn’t provisioned correctly, you may see an error such as 402 Payment Required. Treat that as a signal to check your plan or quota, not as a bug in your code.


Understanding the web.browse response (without overcomplicating it)

For a quick sanity check, log just the key pieces:

console.log('Session:', response.session_id);
console.log('Agent output:', response.output_text || response.result || response.output);

Different SDK builds may expose slightly different fields, but the important things to look for are:

  • session_id – the browser session handle; this is the unit of continuity for multi-step workflows
  • Output text / structured result – the agent’s interpretation of your cmd against the given url

If you only see high-level narrative and want more structured data later, that’s where Retrieve comes in (covered at the end).
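
Since field names can vary, one option is to funnel every response through a small normalizer so the rest of your code reads a single shape. This is a sketch: summarizeBrowseResponse is a hypothetical helper, the output fields are the same fallbacks used in the snippet above, and the camelCase sessionId fallback is purely an assumption on my part.

```javascript
// Sketch: collapse the possible response shapes into { sessionId, output }.
// Assumption: some SDK builds may use camelCase `sessionId` instead of `session_id`.
function summarizeBrowseResponse(response) {
  const candidates = [response?.output_text, response?.result, response?.output];
  // Take the first field that actually carries a value.
  const output = candidates.find((v) => v !== undefined && v !== null && v !== '');
  return {
    sessionId: response?.session_id ?? response?.sessionId ?? null,
    output: output ?? null,
  };
}
```

A null sessionId or output is then an explicit signal that the response did not match any shape you expected, which is easier to debug than an undefined buried in a log line.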


Step 6: Turn the quick start into a multi-step session (optional but recommended)

The fastest way to feel the difference between brittle Playwright/Selenium flows and MultiOn is to keep a single session alive across steps: think “add to cart” → “checkout” or “compose post” → “publish”.

Extend your script to reuse session_id:

import 'dotenv/config';
import MultiOn from 'multion';

async function main() {
  const client = new MultiOn({ apiKey: process.env.MULTION_API_KEY });

  // Step 1: Open Amazon and search
  const step1 = await client.web.browse({
    cmd: 'Search for "usb c hub" and open the first product.',
    url: 'https://www.amazon.com/',
  });

  console.log('Step 1 session:', step1.session_id);

  // Step 2: Continue in the same session (step mode)
  const step2 = await client.web.browse({
    session_id: step1.session_id,
    cmd: 'Add the current product to cart and stop before checkout.',
  });

  console.log('Step 2 actions complete. Same session:', step2.session_id);
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});

Here you’re:

  • Using session_id to maintain a secure remote session in a real browser
  • Letting the agent handle all the dynamic selectors, login walls, and rendering
  • Avoiding the usual “one broken selector breaks everything” failure mode

This step-by-step use is what MultiOn calls Sessions + Step mode: intent in, actions executed, then another intent continuing the same session.
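
Once you have more than two steps, threading session_id by hand gets repetitive. Here is a minimal sketch of a helper that does it for you. runSteps is a hypothetical name, and it assumes only what the script above already shows: client.web.browse accepts cmd plus url on the first call, cmd plus session_id afterwards, and returns a session_id.

```javascript
// Sketch: run a list of commands in one session by threading session_id through.
// `client` is any object exposing web.browse the way this guide uses it.
async function runSteps(client, startUrl, cmds) {
  const results = [];
  let sessionId;
  for (const cmd of cmds) {
    // The first call opens the session with a url; later calls reuse the session.
    const params = sessionId
      ? { session_id: sessionId, cmd }
      : { cmd, url: startUrl };
    const res = await client.web.browse(params);
    sessionId = res.session_id ?? sessionId;
    if (!sessionId) {
      // Without a session handle the next step would silently start over.
      throw new Error(`No session_id returned for step: ${cmd}`);
    }
    results.push(res);
  }
  return results;
}
```

With that in place, the Amazon flow above becomes one call: runSteps(client, 'https://www.amazon.com/', ['Search for "usb c hub" and open the first product.', 'Add the current product to cart and stop before checkout.']).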


Step 7: Add Retrieve when you need structured JSON

The quick start web.browse example shows narrative output. When you want “JSON arrays of objects” (e.g., for catalog extraction), use MultiOn’s Retrieve surface.

A common pattern I use:

  1. web.browse to navigate and get the page into the right state (e.g., scrolled, filters applied)
  2. Retrieve to pull out structured data with render/scroll controls

Example using H&M again:

import 'dotenv/config';
import MultiOn from 'multion';

async function main() {
  const client = new MultiOn({ apiKey: process.env.MULTION_API_KEY });

  const result = await client.web.retrieve({
    url: 'https://www2.hm.com/en_us/men/products/view-all.html',
    renderJs: true,
    scrollToBottom: true,
    maxItems: 10,
    fields: ['name', 'price', 'url', 'image'],
  });

  // Expect a JSON array of objects: [{ name, price, url, image }, ...]
  console.dir(result.items || result, { depth: null });
}

main().catch((err) => {
  console.error(err);
  process.exit(1);
});

Key points:

  • renderJs: true handles JavaScript-heavy catalogs
  • scrollToBottom: true loads lazy content
  • maxItems controls how much to extract
  • Output is a structured JSON array ready for your app/backend, not just scraped HTML
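
Before handing extracted rows to your backend, it helps to check that each one actually has the fields you asked for. The sketch below accepts either shape handled by the example above (an object with items, or a bare array). pickCompleteItems is a hypothetical helper, not part of the SDK.

```javascript
// Sketch: keep only rows that contain every requested field, and drop extra keys.
// Accepts { items: [...] } or a bare array, mirroring `result.items || result`.
function pickCompleteItems(result, fields) {
  const rows = Array.isArray(result) ? result : result?.items;
  if (!Array.isArray(rows)) return [];
  return rows
    .filter((row) => row && fields.every((f) => row[f] !== undefined && row[f] !== null))
    .map((row) => Object.fromEntries(fields.map((f) => [f, row[f]])));
}
```

Calling pickCompleteItems(result, ['name', 'price', 'url', 'image']) gives you a predictable array even when a lazy-loaded tile came back half rendered.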

Handling common errors fast

A few patterns you might hit on that first web.browse run:

  • Missing API key
    Error: Missing MULTION_API_KEY in .env (from our own guard).
    Fix: Double-check that .env contains MULTION_API_KEY. dotenv reads .env from the current working directory, so run node index.mjs from the project root.

  • Auth / quota issues
    You might see an error payload with status like 401 Unauthorized or 402 Payment Required.
    Fix: Confirm your key is valid, hasn’t been rotated, and that your account has capacity.

  • Network / timeout
    Treat this as an infrastructure issue: retry with backoff, or check for outbound network restrictions in your environment.
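
For the retry-with-backoff case, a small wrapper is usually enough. This sketch retries transient failures with exponential delays and gives up immediately on 401 and 402, since retrying won’t fix a bad key or an exhausted quota. withRetry is a hypothetical helper, and it assumes thrown errors expose the HTTP status as status or statusCode, which may not match your SDK version.

```javascript
// Assumption: thrown errors carry the HTTP status on `status` or `statusCode`.
const NON_RETRYABLE = new Set([401, 402]);

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Calls fn(); on failure waits baseDelayMs * 2^attempt and tries again,
// up to `retries` extra attempts. Auth and quota errors are rethrown at once.
async function withRetry(fn, { retries = 3, baseDelayMs = 500 } = {}) {
  for (let attempt = 0; ; attempt += 1) {
    try {
      return await fn();
    } catch (err) {
      const status = err?.status ?? err?.statusCode;
      if (NON_RETRYABLE.has(status) || attempt >= retries) throw err;
      await sleep(baseDelayMs * 2 ** attempt); // 500 ms, 1 s, 2 s with the defaults
    }
  }
}
```

Usage is a one-line change around the existing call: const response = await withRetry(() => client.web.browse({ cmd, url }));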


How this compares to rolling your own Playwright/Selenium

To run the same “Amazon search + add to cart” flow with Playwright or Selenium, you’d usually:

  • Stand up a browser farm or use a cloud provider
  • Handle selectors that break every time Amazon shuffles the DOM
  • Bolt on proxies and cookie jars to get through bot protection
  • Write a custom extraction layer for every page type

With MultiOn’s Agent API V1 Beta from Node:

  • You send a cmd + url to web.browse
  • MultiOn runs against a secure remote session with native proxy support tuned for real sites
  • You keep a session_id for continuity instead of juggling driver instances
  • You can scale to “millions of concurrent AI Agents” as a backend capability rather than nursing brittle scripts

The quick start script you just ran is intentionally small, but it’s sitting on top of that infrastructure.


Next: wire this into your own app

Once you’re comfortable with a single web.browse call from Node.js, the natural next steps are:

  • Wrap client.web.browse in your own service layer with per-feature commands
    e.g., amazonService.addToCart({ query: 'usb c hub' })
  • Use session_id to keep flows stateful in your backend (store it against user or task IDs)
  • Invoke Retrieve where you need structured JSON instead of free-form text
  • Run calls in parallel for fan-out workloads (e.g., many H&M category pages in parallel agents)
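
For the fan-out case, I’d cap how many calls are in flight rather than firing everything through Promise.all at once. Below is a minimal sketch of a concurrency-limited map. mapWithLimit is a hypothetical helper, and the right limit depends on your plan and quota, which this guide doesn’t specify.

```javascript
// Sketch: run `worker` over `inputs` with at most `limit` calls in flight.
// Each result is { ok: true, value } or { ok: false, error }, in input order.
async function mapWithLimit(inputs, limit, worker) {
  const results = new Array(inputs.length);
  let next = 0;

  // Each lane pulls the next unclaimed index until the inputs run out.
  async function lane() {
    while (next < inputs.length) {
      const i = next++;
      try {
        results[i] = { ok: true, value: await worker(inputs[i], i) };
      } catch (error) {
        results[i] = { ok: false, error };
      }
    }
  }

  const laneCount = Math.max(1, Math.min(limit, inputs.length));
  await Promise.all(Array.from({ length: laneCount }, () => lane()));
  return results;
}
```

For example, mapWithLimit(categoryUrls, 5, (url) => client.web.browse({ cmd: 'List 5 products with name and price.', url })) works through a list of category pages five at a time, and the per-result ok flag means one failed page doesn’t sink the whole batch.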

You’ve already done the hardest part: confirming your first web.browse call works from Node using the SDK. Everything else is layering proper orchestration on top of that same command-and-session pattern.
