How do I use MultiOn sessions so my agent can continue a workflow using the same session_id across multiple calls?

Most teams run into the same wall with browser agents: you can start a workflow, but you can’t reliably continue it. Every call feels stateless, so logins, carts, and in-progress forms disappear between steps. MultiOn’s Sessions + Step mode is designed to fix exactly that by giving you one core primitive: a session_id you can reuse across multiple calls so your agent keeps operating inside the same secure remote browser session.

This guide shows how to carry one session_id through a workflow, from first call to completion, using patterns that hold up under real-world, login-heavy flows.


How MultiOn sessions work (mental model first)

With the Agent API (V1 Beta), every call to POST https://api.multion.ai/v1/web/browse runs inside a secure remote browser session.

You have two options:

  • Let MultiOn create the session for you (no session_id in the request).
  • Explicitly continue a session by sending a session_id you got from a previous response.

That session_id is your handle to:

  • Stay logged in across steps.
  • Keep cookies, localStorage, and in-page state alive.
  • Chain multiple commands (add to cart → checkout → confirm) as if a single user were driving the browser.

You can think of it as: intent in → remote browser actions in the same session → JSON out, repeated until the workflow finishes.


Basic session flow: one workflow, many calls

At a high level, using sessions with a stable session_id looks like this:

  1. Start the workflow (no session_id yet).
  2. Capture the returned session_id from the response.
  3. Continue the workflow by passing that same session_id in subsequent calls.
  4. Finish when the task is done (or when you decide to drop the session).
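
The four steps above can be sketched as a small loop. This is a minimal Python sketch, not MultiOn's SDK: run_workflow and fake_send are hypothetical names, and the transport is injected so the example runs without network access or an API key. The body fields (url, cmd, session_id) are the ones used throughout this guide.

```python
from typing import Callable, List, Optional, Tuple

# A transport takes the JSON request body and returns the parsed JSON response.
Transport = Callable[[dict], dict]


def run_workflow(steps: List[Tuple[str, str]], send: Transport) -> Optional[str]:
    """Run (url, cmd) steps in order, reusing the session_id from the first response."""
    session_id: Optional[str] = None
    for url, cmd in steps:
        body = {"url": url, "cmd": cmd}
        if session_id is not None:
            body["session_id"] = session_id  # step 3: continue the same session
        response = send(body)
        if session_id is None:
            session_id = response["session_id"]  # step 2: capture the new session
    return session_id


# Hypothetical stand-in for the real HTTP call; it records what was sent.
sent_bodies: List[dict] = []


def fake_send(body: dict) -> dict:
    sent_bodies.append(dict(body))
    return {"session_id": body.get("session_id", "sess_abc123"), "status": "success"}


final_session = run_workflow(
    [
        ("https://www.amazon.com/", "Search for wireless noise-cancelling headphones."),
        ("https://www.amazon.com/cart", "Proceed to the checkout page."),
    ],
    fake_send,
)
```

Only the first body goes out without a session_id; every later body carries the one captured from the first response.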

Let’s break that down with concrete patterns.


Step 1: Start a session and get the session_id

Your first call usually has two jobs: open the target URL and execute the first command. You don’t send a session_id here; MultiOn will create one and return it.

Example: start an Amazon session and search for a product

curl -X POST "https://api.multion.ai/v1/web/browse" \
  -H "Content-Type: application/json" \
  -H "X_MULTION_API_KEY: $MULTION_API_KEY" \
  -d '{
    "url": "https://www.amazon.com/",
    "cmd": "Search for wireless noise-cancelling headphones and open the results page."
  }'

A typical response (simplified) will include something like:

{
  "session_id": "sess_abc123",
  "status": "success",
  "result": {
    "summary": "Amazon search results for wireless noise-cancelling headphones.",
    "url": "https://www.amazon.com/s?k=wireless+noise+cancelling+headphones"
  }
}

Critical step: persist "sess_abc123" (e.g., in your DB, Redis, or in-memory depending on your architecture). Every subsequent call that should “live” in this same browser must include that exact session_id.


Step 2: Continue the workflow using the same session_id

Once you have a session_id, the pattern is:

  • Keep the url pointed at the current or relevant page (MultiOn can navigate from there).
  • Send a new cmd that describes the next action.
  • Include the session_id in the body.

Example: add a specific Amazon result to cart in the same session

curl -X POST "https://api.multion.ai/v1/web/browse" \
  -H "Content-Type: application/json" \
  -H "X_MULTION_API_KEY: $MULTION_API_KEY" \
  -d '{
    "session_id": "sess_abc123",
    "url": "https://www.amazon.com/s?k=wireless+noise+cancelling+headphones",
    "cmd": "Open the first product with at least 4 stars and add it to the cart."
  }'

Because session_id is the same:

  • Any login you performed earlier remains valid.
  • The cart and user preferences are preserved.
  • You’re operating on the same remote browser instance, not a fresh one.

You repeat this pattern for each step until checkout is complete.
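
For backends that don't shell out to curl, the same continuation call might look like the sketch below, using only Python's standard library. The function names and the 120-second timeout are assumptions made for illustration; the endpoint, header name, and body fields are taken from the curl examples above. build_browse_request only constructs the request, so it can be inspected without touching the network.

```python
import json
import urllib.request
from typing import Optional

BROWSE_URL = "https://api.multion.ai/v1/web/browse"


def build_browse_request(url: str, cmd: str, api_key: str,
                         session_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) the POST request for a browse call."""
    body = {"url": url, "cmd": cmd}
    if session_id is not None:
        body["session_id"] = session_id  # omit on the first call, include afterwards
    return urllib.request.Request(
        BROWSE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "X_MULTION_API_KEY": api_key},
        method="POST",
    )


def browse(url: str, cmd: str, api_key: str, session_id: Optional[str] = None) -> dict:
    """Send the request and return the parsed JSON. This performs a real network call."""
    request = build_browse_request(url, cmd, api_key, session_id)
    with urllib.request.urlopen(request, timeout=120) as response:  # timeout is an assumed value
        return json.loads(response.read().decode("utf-8"))


# Inspect the continuation request without sending it ("test-key" is a placeholder).
request = build_browse_request(
    "https://www.amazon.com/s?k=wireless+noise+cancelling+headphones",
    "Open the first product with at least 4 stars and add it to the cart.",
    api_key="test-key",
    session_id="sess_abc123",
)
```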


Step 3: Use Step mode for fine-grained control (optional but recommended)

For long or sensitive workflows (e.g., checkout, posting on X), you may want to:

  • See intermediate state.
  • Correct or adjust commands mid-flow.
  • Confirm that a specific detail (like price or shipping) is correct before continuing.

This is where Step mode comes in. While implementation details can change, the core idea is:

  • MultiOn runs the workflow in steps.
  • You can explicitly move to the next step while keeping the same session_id.

A common pattern is:

  1. Call web/browse with a cmd that describes the next chunk of work.
  2. Get back:
    • session_id (same as before).
    • A step-level summary or state.
  3. Decide if you:
    • Send another cmd in the same session to continue the flow, or
    • Stop and close out the session on your side.

For long workflows (e.g., multi-page checkout with address selection, shipping selection, and payment), you’d do one step per decision point, using the same session_id across all of them.
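
One way to picture "one step per decision point" is a loop that sends a single cmd, inspects the response, and only then decides whether to continue. The sketch below is an illustration under assumptions: it models stepping purely as one web/browse call per decision point and does not use any Step-mode-specific request field. run_stepwise, fake_send, approve, and the canned summaries are all hypothetical.

```python
from typing import Callable, List, Optional, Tuple


def run_stepwise(
    start_url: str,
    cmds: List[str],
    send: Callable[[dict], dict],
    approve: Callable[[dict], bool],
) -> Tuple[Optional[str], int]:
    """Send one cmd per decision point; stop as soon as a step is not approved.

    Returns the session_id and the number of approved steps.
    """
    session_id: Optional[str] = None
    url = start_url
    approved = 0
    for cmd in cmds:
        body = {"url": url, "cmd": cmd}
        if session_id is not None:
            body["session_id"] = session_id
        response = send(body)
        session_id = response.get("session_id", session_id)
        if not approve(response):
            break  # stop here; the caller decides what to do with the session
        approved += 1
        # Follow the page the agent ended on, if the response reports one.
        url = response.get("result", {}).get("url", url)
    return session_id, approved


# Hypothetical canned responses standing in for the real API.
SUMMARIES = {
    "Select the default shipping address.": "Default address selected.",
    "Choose standard shipping.": "Shipping cost is higher than expected.",
    "Place the order.": "Order placed.",
}
sent: List[dict] = []


def fake_send(body: dict) -> dict:
    sent.append(body)
    return {
        "session_id": "sess_checkout_001",
        "result": {"summary": SUMMARIES[body["cmd"]], "url": body["url"]},
    }


def approve(response: dict) -> bool:
    # Hypothetical review rule: halt when the shipping cost looks wrong.
    return "higher than expected" not in response["result"]["summary"]


session_id, approved_steps = run_stepwise(
    "https://www.amazon.com/checkout", list(SUMMARIES), fake_send, approve
)
```

Because the loop halts at the shipping step, the "Place the order." command is never sent, which is the point of reviewing intermediate state.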


Session usage example: full Amazon checkout

Here’s a simplified end-to-end pattern using the same session_id across multiple calls.

Call 1: search (session created)

# 1) Search for a product on Amazon (no session_id yet)
curl -X POST "https://api.multion.ai/v1/web/browse" \
  -H "Content-Type: application/json" \
  -H "X_MULTION_API_KEY: $MULTION_API_KEY" \
  -d '{
    "url": "https://www.amazon.com/",
    "cmd": "Search for the latest Apple AirPods Pro and open the results page."
  }'

Response includes:

{
  "session_id": "sess_checkout_001",
  "result": { "summary": "Search results page loaded." }
}

Call 2: open product and add to cart (same session)

# 2) Add a specific product to cart in the same session
curl -X POST "https://api.multion.ai/v1/web/browse" \
  -H "Content-Type: application/json" \
  -H "X_MULTION_API_KEY: $MULTION_API_KEY" \
  -d '{
    "session_id": "sess_checkout_001",
    "url": "https://www.amazon.com/s?k=apple+airpods+pro",
    "cmd": "Open the latest Apple AirPods Pro product page and add one item to the cart."
  }'

Call 3: go to checkout (still same session)

# 3) Navigate to checkout
curl -X POST "https://api.multion.ai/v1/web/browse" \
  -H "Content-Type: application/json" \
  -H "X_MULTION_API_KEY: $MULTION_API_KEY" \
  -d '{
    "session_id": "sess_checkout_001",
    "url": "https://www.amazon.com/cart",
    "cmd": "Proceed to the checkout page."
  }'

Call 4: finalize order (session continuity preserved)

# 4) Confirm and place the order
curl -X POST "https://api.multion.ai/v1/web/browse" \
  -H "Content-Type: application/json" \
  -H "X_MULTION_API_KEY: $MULTION_API_KEY" \
  -d '{
    "session_id": "sess_checkout_001",
    "url": "https://www.amazon.com/checkout",
    "cmd": "Confirm the default shipping address and payment method, then place the order."
  }'

Throughout all four calls, you’re in the same remote browser session because the session_id never changes.


Session usage example: posting on X in multiple steps

The same pattern applies to social flows, which typically depend on a login.

Call 1: log in and land on the home timeline

curl -X POST "https://api.multion.ai/v1/web/browse" \
  -H "Content-Type: application/json" \
  -H "X_MULTION_API_KEY: $MULTION_API_KEY" \
  -d '{
    "url": "https://x.com/login",
    "cmd": "Log in using the saved credentials for this test account and go to the home timeline."
  }'

Assume the response returns:

{ "session_id": "sess_x_post_01", "result": { "summary": "Logged in and on home timeline." } }

Call 2: draft and post

curl -X POST "https://api.multion.ai/v1/web/browse" \
  -H "Content-Type: application/json" \
  -H "X_MULTION_API_KEY: $MULTION_API_KEY" \
  -d '{
    "session_id": "sess_x_post_01",
    "url": "https://x.com/home",
    "cmd": "Compose a new post that says: `Testing MultiOn sessions with a persistent browser agent.` and publish it."
  }'

The login holds across calls because the browser session is the same.


How Retrieve fits into session-based workflows

When you need structured data mid-workflow (e.g., pull product metadata after search, or validate items in a cart), use MultiOn’s Retrieve function to convert dynamic pages into JSON arrays of objects.

While specifics can vary, the general pattern is:

  1. Use web/browse to move the browser to the right page using session_id.
  2. Call the Retrieve function with controls like:
    • renderJs – to fully render JS-heavy pages.
    • scrollToBottom – to handle lazy-loaded content.
    • maxItems – to limit items extracted.
  3. Receive a JSON array of objects with structured fields you care about.

This lets you make decisions in your own application (e.g., pick the cheapest item, filter by rating) and then send another web/browse call in the same session to act on that decision.
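
The decision step in between is ordinary application code. Below is a minimal sketch of it, assuming Retrieve returned items with name, price, and rating fields (those field names and values are hypothetical sample data). The option names renderJs, scrollToBottom, and maxItems come from this guide; how they are wrapped in the Retrieve request is not shown here, so the sketch keeps them as a plain dictionary.

```python
from typing import List, Optional

# Option names are from this guide; the surrounding request shape is not specified here.
retrieve_options = {"renderJs": True, "scrollToBottom": True, "maxItems": 20}


def pick_cheapest(items: List[dict], min_rating: float = 4.0) -> Optional[dict]:
    """Return the lowest-priced item rated at or above min_rating, or None."""
    eligible = [
        item for item in items
        if item.get("rating", 0) >= min_rating and "price" in item
    ]
    return min(eligible, key=lambda item: item["price"]) if eligible else None


def next_cmd(item: dict) -> str:
    """Turn the chosen item into the next natural-language command."""
    name = item["name"]
    return f'Open the product named "{name}" and add it to the cart.'


# Hypothetical Retrieve output: a JSON array of objects.
items = [
    {"name": "Headphones A", "price": 129.99, "rating": 4.5},
    {"name": "Headphones B", "price": 89.99, "rating": 3.8},
    {"name": "Headphones C", "price": 99.50, "rating": 4.2},
]

choice = pick_cheapest(items)
follow_up_body = {
    "session_id": "sess_abc123",  # same session that loaded the results page
    "url": "https://www.amazon.com/s?k=wireless+noise+cancelling+headphones",
    "cmd": next_cmd(choice),
}
```

Headphones B is cheaper but falls below the rating threshold, so the follow-up command targets Headphones C.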


Implementation best practices for managing session_id

As someone who’s spent years watching brittle test harnesses break on login flows, here’s what matters for making MultiOn sessions reliable:

1. Treat session_id as a first-class resource

  • Persist it per user workflow (DB row, Redis key, or in-memory map in short-lived services).
  • Include metadata like:
    • user_id
    • workflow_type (e.g., amazon_checkout, x_post)
    • started_at, last_used_at
  • Explicitly decide when to expire or drop a session in your app layer.
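
As a sketch of what "first-class resource" can mean in code, the record below carries the metadata listed above, and an in-memory store keys it by user and workflow type. SessionRecord and SessionStore are hypothetical names, and keying by (user_id, workflow_type) is one possible design that allows a single active session per user per workflow type. A production service would likely back this with a database or Redis.

```python
import time
from dataclasses import dataclass, field
from typing import Dict, Optional, Tuple


@dataclass
class SessionRecord:
    session_id: str
    user_id: str
    workflow_type: str  # e.g. "amazon_checkout", "x_post"
    started_at: float = field(default_factory=time.time)
    last_used_at: float = field(default_factory=time.time)


class SessionStore:
    """In-memory store keyed by (user_id, workflow_type)."""

    def __init__(self) -> None:
        self._records: Dict[Tuple[str, str], SessionRecord] = {}

    def save(self, record: SessionRecord) -> None:
        self._records[(record.user_id, record.workflow_type)] = record

    def get(self, user_id: str, workflow_type: str) -> Optional[SessionRecord]:
        return self._records.get((user_id, workflow_type))

    def touch(self, user_id: str, workflow_type: str, now: Optional[float] = None) -> None:
        """Update last_used_at after a successful call; raises KeyError if unknown."""
        record = self._records[(user_id, workflow_type)]
        record.last_used_at = time.time() if now is None else now

    def drop(self, user_id: str, workflow_type: str) -> None:
        """Forget the session once the app considers the workflow finished."""
        self._records.pop((user_id, workflow_type), None)


store = SessionStore()
store.save(SessionRecord("sess_abc123", "user_42", "amazon_checkout",
                         started_at=100.0, last_used_at=100.0))
store.touch("user_42", "amazon_checkout", now=160.0)
record = store.get("user_42", "amazon_checkout")
```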

2. Pass the session_id on every continuation call

It’s easy to accidentally omit or mismatch the ID. Guard against this:

  • Enforce a rule in your client wrapper:
    • “If this is not the first step of a workflow, a session_id is required.”
  • In typed languages, model this as:
    • StartSessionRequest (no session_id allowed).
    • ContinueSessionRequest (requires session_id).
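
The two request types named above could be modeled as follows. This is a sketch in Python using dataclasses rather than an official client type; the to_body helper is a hypothetical convenience for producing the JSON body.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class StartSessionRequest:
    """First step of a workflow: there is deliberately no session_id field."""
    url: str
    cmd: str

    def to_body(self) -> dict:
        return asdict(self)


@dataclass(frozen=True)
class ContinueSessionRequest:
    """Any later step: session_id is mandatory and must be non-empty."""
    session_id: str
    url: str
    cmd: str

    def __post_init__(self) -> None:
        if not self.session_id:
            raise ValueError("session_id is required to continue a workflow")

    def to_body(self) -> dict:
        return asdict(self)


start_body = StartSessionRequest(
    url="https://www.amazon.com/",
    cmd="Search for wireless noise-cancelling headphones.",
).to_body()

continue_body = ContinueSessionRequest(
    session_id="sess_abc123",
    url="https://www.amazon.com/cart",
    cmd="Proceed to the checkout page.",
).to_body()
```

With this split, forgetting the session_id on a continuation call fails when the request object is constructed instead of silently starting a fresh browser.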

3. Handle error states and billing gating

MultiOn surfaces operational constraints explicitly, including responses like 402 Payment Required. Your session handling logic should:

  • Check for a non-success status on every call.
  • Avoid reusing a session_id if:
    • The session is clearly expired/invalid.
    • You are repeatedly hitting payment or quota gating.

Instead, start a new session, or surface the issue to the user.
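
A small classifier makes that policy explicit. The sketch below rests on assumptions: only the 402 status is confirmed by this guide, the "status": "success" field is taken from the simplified example response earlier, and treating every other failure as a reason to restart is a deliberate simplification. NextAction and decide_next_action are hypothetical names.

```python
from enum import Enum


class NextAction(Enum):
    CONTINUE = "continue"          # keep using the same session_id
    SURFACE_TO_USER = "surface"    # billing or quota gating; retrying will not help
    START_NEW_SESSION = "restart"  # treat the session as unusable


def decide_next_action(http_status: int, body: dict) -> NextAction:
    """Map one API response to what the workflow should do next."""
    if http_status == 402:
        return NextAction.SURFACE_TO_USER
    if 200 <= http_status < 300 and body.get("status") == "success":
        return NextAction.CONTINUE
    # Assumption: anything else means the session should not be reused.
    return NextAction.START_NEW_SESSION


ok = decide_next_action(200, {"session_id": "sess_abc123", "status": "success"})
gated = decide_next_action(402, {})
failed = decide_next_action(200, {"status": "error"})
```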

4. Respect secure remote sessions and bot protection

MultiOn runs in secure remote sessions with native proxy support for tricky bot protection. From your side, that means:

  • Don’t spin up your own fragile proxy farm to “help” it—let the platform handle that layer.
  • Focus on good cmd design and correct session_id continuity instead of hacking around bot checks.

Common pitfalls when using MultiOn sessions

Here are the failure modes I’ve seen most often in session-based agent setups—and how to avoid them.

Pitfall 1: Forgetting to store the initial session_id

If you treat the first response as just “result text,” you’ll lose the session handle.

Fix: Make capturing session_id non-optional. For example:

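// multionBrowse and saveSession stand for your own helpers (an HTTP wrapper and a persistence call).
const response = await multionBrowse(request);
if (!response.session_id) {
  // Fail fast: without the handle, later steps cannot rejoin this browser session.
  throw new Error("MultiOn response did not include session_id");
}
saveSession(response.session_id, workflowContext);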

Pitfall 2: Reusing a session_id across unrelated workflows

Using one session for multiple independent tasks can cross-contaminate state (wrong cart contents, wrong logged-in account).

Fix: One session_id per logical workflow. If the user starts a brand-new flow, start a new MultiOn session even if the previous one still exists.

Pitfall 3: Assuming a session lasts forever

Sessions are a runtime resource. Treat them as bounded.

Fix:

  • Implement timeouts (e.g., invalidate after N minutes of inactivity).
  • Handle “session is no longer valid” by starting a fresh session and prompting the user if needed.
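
An inactivity timeout can be enforced with a couple of small functions on your side. This is a sketch under assumptions: the 900-second default is an arbitrary stand-in for "N minutes", and the function names are hypothetical. It bounds sessions in your app layer only and says nothing about when MultiOn itself expires a session.

```python
import time
from typing import Optional


def is_expired(last_used_at: float, max_idle_seconds: float,
               now: Optional[float] = None) -> bool:
    """True when the session has been idle for longer than the allowed window."""
    current = time.time() if now is None else now
    return current - last_used_at > max_idle_seconds


def session_for_next_call(session_id: Optional[str], last_used_at: float,
                          max_idle_seconds: float = 900.0,
                          now: Optional[float] = None) -> Optional[str]:
    """Return the session_id to send, or None to start a fresh session."""
    if session_id is None or is_expired(last_used_at, max_idle_seconds, now):
        return None
    return session_id


# A session last used at t=1000 is still usable at t=1500 but not at t=2000.
still_valid = session_for_next_call("sess_abc123", last_used_at=1000.0, now=1500.0)
timed_out = session_for_next_call("sess_abc123", last_used_at=1000.0, now=2000.0)
```

Returning None maps directly onto the first-call pattern: omit session_id and let a new session be created.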

Quick checklist: continuing workflows with the same session_id

When you want your MultiOn agent to continue a workflow across multiple calls:

  1. First call
    • Call POST https://api.multion.ai/v1/web/browse with url + cmd.
    • Do not send session_id.
    • Capture session_id from the response.
  2. Subsequent calls
    • Always include that session_id in the request body.
    • Keep the url aligned with the current/target page.
    • Update cmd to describe the next step (e.g., add to cart, navigate to checkout, post, etc.).
  3. Optional: Step mode
    • Use stepwise commands to inspect and control each part of the flow.
    • Continue stepping with the same session_id until the workflow is complete.
  4. Data extraction mid-flow
    • Use Retrieve with renderJs, scrollToBottom, and maxItems to get structured JSON.
    • Use that JSON to drive your next cmd in the same session.
  5. End-of-life
    • Decide when your app considers the session done.
    • Drop or archive the session_id and start fresh for new workflows.

Ranking comparison: best ways to use MultiOn sessions for continuous workflows

Quick Answer: The best overall choice for production-grade, continuous workflows is Sessions + Step mode via the Agent API (V1 Beta). If your priority is quick local experimentation in your own browser, the Chrome Browser Extension is often a stronger fit. For high-volume data-driven flows that mix browsing with structured JSON outputs, consider Agent API + Retrieve in tandem.

At-a-Glance Comparison

| Rank | Option | Best For | Primary Strength | Watch Out For |
|------|--------|----------|------------------|---------------|
| 1 | Sessions + Step mode via Agent API (V1 Beta) | End-to-end, multi-step web workflows in production | Precise control over session_id and stepwise progression | Requires you to manage session lifecycle in your backend |
| 2 | Chrome Browser Extension | Local, single-user sessions and quick workflow prototyping | Reuses your own browser context for intuitive debugging | Not designed for backend-scale, multi-user orchestration |
| 3 | Agent API + Retrieve in tandem | Data-rich flows that mix actions with structured extraction | Combines persistent sessions with JSON arrays of objects | You must design the handoff between action steps and retrieval logic |

Comparison Criteria

We evaluated each option against the following criteria to ensure a fair comparison:

  • Session Continuity Control: How directly you can manage and reuse a session_id across multiple calls and steps.
  • Scalability & Parallelism: How well the option supports many concurrent workflows, each with its own session.
  • Data & Integration Depth: How easily you can pull structured outputs (JSON) and plug session-based workflows into your existing backend.

Detailed Breakdown

1. Sessions + Step mode via Agent API (V1 Beta) (Best overall for production workflows)

Sessions + Step mode via the Agent API (V1 Beta) ranks as the top choice because it gives you full control over session_id continuity and stepwise execution in a secure remote browser environment.

What it does well:

  • Session continuity control:
    You explicitly capture session_id from the first web/browse response and pass it in every subsequent call. That means the same remote browser is used for the full workflow—login, cart, forms, and all.

  • Scalable orchestration:
    Because the Agent API is plain HTTP (POST https://api.multion.ai/v1/web/browse with an X_MULTION_API_KEY header), each workflow is independent of the others. Pair every session_id with your own workflow identifier and you can run many sessions in parallel. Conceptually, that is what makes “millions of concurrent AI Agents” possible.
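
To make the pairing of workflow identifiers and sessions concrete, here is a minimal sketch that runs several workflows concurrently with a thread pool, each ending up with its own session_id. The function names, the workflow IDs, and the fake transport are hypothetical; a real transport would POST to web/browse, and real concurrency limits would depend on your plan and infrastructure.

```python
import uuid
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, List, Optional, Tuple

Steps = List[Tuple[str, str]]  # (url, cmd) pairs


def run_one(workflow_id: str, steps: Steps,
            send: Callable[[dict], dict]) -> Tuple[str, Optional[str]]:
    """Run one workflow sequentially inside its own session."""
    session_id: Optional[str] = None
    for url, cmd in steps:
        body = {"url": url, "cmd": cmd}
        if session_id is not None:
            body["session_id"] = session_id
        session_id = send(body)["session_id"]
    return workflow_id, session_id


def run_many(workflows: Dict[str, Steps], send: Callable[[dict], dict],
             max_workers: int = 4) -> Dict[str, Optional[str]]:
    """Run workflows in parallel and map each workflow_id to its session_id."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(run_one, wid, steps, send)
                   for wid, steps in workflows.items()]
        return dict(future.result() for future in futures)


def fake_send(body: dict) -> dict:
    # Hypothetical transport: mint a session on the first call, echo it afterwards.
    return {"session_id": body.get("session_id") or "sess_" + uuid.uuid4().hex[:8]}


checkout_steps = [
    ("https://www.amazon.com/", "Search for the latest Apple AirPods Pro."),
    ("https://www.amazon.com/cart", "Proceed to the checkout page."),
]
sessions = run_many({"order-1": checkout_steps, "order-2": checkout_steps}, fake_send)
```

Steps within a workflow stay sequential because each depends on the browser state left by the previous one; only separate workflows run side by side.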

Tradeoffs & Limitations:

  • You own lifecycle logic:
    You’re responsible for deciding when to create, reuse, or expire a session_id. That’s a feature (flexibility) but also a responsibility—ignore it and you’ll leak sessions or conflate workflows.

Decision Trigger: Choose Sessions + Step mode via Agent API (V1 Beta) if you want end-to-end automation (e.g., Amazon ordering, posting on X) running from your backend and you’re willing to manage session_id lifecycle as a first-class concern.


2. Chrome Browser Extension (Best for local prototyping and single-user flows)

The Chrome Browser Extension ranks second: it is the strongest fit for local prototyping because it lets you test and iterate on session-based behavior in your own browser before you operationalize it via the Agent API.

What it does well:

  • Fast feedback loops:
    You can see how commands behave in a real browser that you control, including logins and complex UIs, without writing API calls first. This is ideal for shaping good cmd patterns before you encode them into code.

  • Leverages your existing browser context:
    The extension can work within your current session (cookies, logins, tabs), making it easy to debug visual flows and confirm the steps you later replicate in the Agent API with session_id.

Tradeoffs & Limitations:

  • Not a backend orchestration layer:
    The extension is tied to a user’s Chrome instance. It’s not meant for “infinite scalability with parallel agents” or multi-tenant session management in your backend.

Decision Trigger: Choose the Chrome Browser Extension if you want to prototype workflows and understand how sessions should behave before you wire up the Agent API and take full control of session_id in a server environment.


3. Agent API + Retrieve in tandem (Best for data-driven, mixed action + extraction flows)

Agent API + Retrieve in tandem stands out for this scenario because it combines persistent sessions with structured extraction, letting you use session_id both to act and to read data in a controlled way.

What it does well:

  • Action + JSON output loop:
    You use web/browse with a stable session_id to navigate and act (e.g., search an H&M catalog), then call Retrieve with renderJs, scrollToBottom, and maxItems to get a JSON array of objects (name, price, colors, URLs, images). Your app makes a decision, and you send another web/browse in the same session to act on that decision.

  • Robust on dynamic pages:
    Because Retrieve is built for JS-heavy, lazy-loaded pages, you avoid rolling your own brittle scrapers. The session ensures you’re extracting from the same browser state you used for actions.

Tradeoffs & Limitations:

  • You design the orchestration:
    You must design how your backend alternates between “act” (web/browse) and “read” (Retrieve) for each session_id. That’s powerful but more complex than an action-only flow.

Decision Trigger: Choose Agent API + Retrieve in tandem if you want workflows that not only act on websites but also yield structured JSON outputs mid-flow to feed pricing logic, selection rules, or downstream systems.


Final Verdict

To keep a MultiOn agent reliably continuing a workflow with the same session_id across multiple calls, use the Agent API (V1 Beta) with Sessions + Step mode as your default. Start a session with web/browse, capture the session_id, and include it on every subsequent call until the workflow is complete. For visual debugging and shaping good commands, lean on the Chrome Browser Extension first, then encode those flows in your backend. When you need both persistent actions and structured data, pair the Agent API with Retrieve so you can navigate in-session and still get JSON arrays of objects from dynamic pages.

The real reliability lever isn’t another flaky selector—it’s treating session_id as your unit of continuity.
