MultiOn Chrome extension: how do I run actions locally in the user’s own browser session for authenticated sites?

Most teams hit the same wall with authenticated sites: your users are already logged in, their cookies and sessions live in their own browser, but your automation can’t “see” any of it. The MultiOn Chrome extension exists for exactly this gap: it lets MultiOn’s agents operate inside the user’s active Chrome session, using their real auth state, without you rebuilding login flows or fighting bot protection from a remote environment.

This guide walks through how to run actions locally in the user’s own browser session with the MultiOn Chrome extension, when to reach for the extension vs the Agent API, and how to wire it into an app that needs reliable, authenticated web actions.


Why use the MultiOn Chrome extension for authenticated sites?

If you’ve ever tried to automate authenticated flows with Playwright/Selenium, you already know the pain:

  • You either hard-code credentials or build a full login harness per site.
  • Cookies and session tokens expire on you.
  • Bot protection trips as soon as your “test browser” doesn’t look human.

The MultiOn Chrome extension flips that pattern:

  • Runs in the user’s real browser – Reuses the user’s existing cookies, logins, and local storage.
  • Executes actions locally – MultiOn’s agent drives the current tab instead of a remote headless browser.
  • No credential handling – Your app never has to see passwords or 2FA flows; the extension works on top of whatever session the user already has.

That makes it ideal for:

  • Authenticated dashboards (SaaS tools, CRMs, analytics)
  • User-specific carts, wishlists, and order histories
  • Social accounts (posting, scheduling, profile updates) from the user’s own login
  • Any “logged-in-only” page where building and maintaining a login harness is a losing battle

How local actions with the Chrome extension actually work

Conceptually, the extension is a local “agent host” for a single browser profile:

  1. User installs and authorizes the MultiOn extension.
  2. Your app sends an instruction (natural language or structured) to MultiOn.
  3. MultiOn routes the task to the extension (when the user is active and the policy allows it).
  4. The extension:
    • Attaches to the active tab (or opens a new one).
    • Uses the existing auth session in that tab.
    • Executes the sequence of actions (click, type, scroll, navigate).
  5. The agent returns:
    • Confirmation/status that the workflow completed.
    • Optional structured output (if you’re doing retrieval or verification).

You’re not replicating the full Agent API environment locally; instead, you’re delegating the “browser control” part into the user’s current Chrome context where authentication already lives.
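
The five steps above can be read as a small state machine. The sketch below is purely illustrative: the state names, the `TRANSITIONS` table, and the `LocalTask` class are assumptions made for this article, not part of any MultiOn SDK. It only shows how an app might track where a locally routed task currently stands.

```python
from dataclasses import dataclass, field
from enum import Enum


class TaskState(Enum):
    """Hypothetical lifecycle states mirroring the steps above."""
    RECEIVED = "received"    # your app sent the instruction
    ROUTED = "routed"        # task handed to the extension
    EXECUTING = "executing"  # extension is driving the tab
    COMPLETED = "completed"  # status (and optional data) returned
    FAILED = "failed"


# Allowed transitions; anything else is rejected.
TRANSITIONS = {
    TaskState.RECEIVED: {TaskState.ROUTED, TaskState.FAILED},
    TaskState.ROUTED: {TaskState.EXECUTING, TaskState.FAILED},
    TaskState.EXECUTING: {TaskState.COMPLETED, TaskState.FAILED},
    TaskState.COMPLETED: set(),
    TaskState.FAILED: set(),
}


@dataclass
class LocalTask:
    instruction: str
    state: TaskState = TaskState.RECEIVED
    history: list = field(default_factory=list)

    def advance(self, new_state: TaskState) -> None:
        """Move to new_state, refusing transitions the flow does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"cannot go from {self.state.value} to {new_state.value}"
            )
        self.history.append(self.state)
        self.state = new_state
```

Keeping the transitions in one table makes it easy to surface "where did this task stop?" in your UI when a local run does not finish.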


When to use the Chrome extension vs the Agent API

You effectively have two surfaces:

  • Agent API (V1 Beta) – POST https://api.multion.ai/v1/web/browse with cmd + url, running in MultiOn’s secure remote sessions (great for server-side workflows, parallel agents, and automation that doesn’t depend on a user’s live browser session).
  • Chrome extension – Runs actions in the user’s own browser, perfect when:
    • You need their logged-in session.
    • The site has aggressive bot protection.
    • You want visible, user-trust-building interactions (“you can watch it do the thing”).

Use the Agent API when you own the account/session and need scale. Use the Chrome extension when the user owns the session and you want to piggyback on that context.
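
The rule of thumb above can be encoded as a small routing helper. This is a minimal sketch, assuming your app can compute a few boolean flags per task; `Surface`, `choose_surface`, and every parameter name are hypothetical and exist only for illustration.

```python
from enum import Enum


class Surface(Enum):
    AGENT_API = "agent_api"                # MultiOn's remote sessions
    CHROME_EXTENSION = "chrome_extension"  # the user's own browser


def choose_surface(needs_user_session: bool,
                   aggressive_bot_protection: bool,
                   user_should_watch: bool,
                   extension_installed: bool) -> Surface:
    """Pick an execution surface using the criteria listed above.

    All parameters are hypothetical flags your app would compute itself.
    """
    wants_local = (
        needs_user_session or aggressive_bot_protection or user_should_watch
    )
    if wants_local and not extension_installed:
        # Local execution is impossible without the extension, so fail loudly
        # and let the caller prompt the user to install it.
        raise RuntimeError("local execution requires the Chrome extension")
    return Surface.CHROME_EXTENSION if wants_local else Surface.AGENT_API
```

Raising when the extension is missing, rather than silently falling back to the remote surface, keeps a task that needs the user's session from running somewhere it cannot succeed.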


Step-by-step: Running actions locally in the user’s browser session

1. Have the user install the MultiOn Chrome extension

From your product, link users to the MultiOn Chrome extension listing and prompt installation during onboarding or when they enable “AI actions on websites.”

Implementation patterns you can use:

  • Inline CTA: “Enable AI actions in your browser” with a direct link to the extension.
  • Capability gating: Show a banner on authenticated pages: “To let us act on your behalf on this page, install the MultiOn Chrome extension.”

Once installed, the extension attaches to the user’s Chrome profile and can operate on any site within that profile, subject to permissions.

2. Ensure the user is authenticated on the target site

Because the extension operates inside the user’s browser, the auth model is simple:

  • The user logs in to the target site like they always do.
  • The site stores session state in cookies, local storage, or IndexedDB.
  • The MultiOn agent, via the extension, inherits that session state automatically.

As long as:

  • The tab is open (or can be opened).
  • The user session is still valid.

…the agent can act as that user without any additional login scripting on your side.

3. Trigger an action from your app

Your app is still the “intent source.” Typical flows look like:

  • A button like “Have MultiOn do this for me” next to a task.
  • A textbox where the user types “Clean up my cart on this site” or “Update my profile settings to X”.
  • Contextual triggers, like “Sync my latest wishlist items” while they’re on an ecommerce site.

Under the hood, you send a request to MultiOn describing:

  • What the user wants done (natural-language cmd-style instruction).
  • Optional target URL/context if you want to force navigation or validation.

In code terms (mirroring the Agent API shape, even if you’re using a client SDK), you’d conceptually be sending something like the following, where the context fields are illustrative rather than a documented schema:

{
  "cmd": "On the current page, go to my order history and export my last 5 orders.",
  "context": {
    "mode": "local-chrome-extension",
    "tabBehavior": "useActiveTab"
  }
}

MultiOn uses that intent to route the workflow to the extension and construct the action plan.
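
To make the request shape concrete, here is a minimal sketch that serializes an intent in the conceptual form shown above. The `context` keys come from this article's illustrative JSON, not from a documented schema; the `build_local_request` helper, the top-level `url` field, and the `"openNewTab"` value are assumptions added for the example.

```python
import json
from typing import Optional


def build_local_request(cmd: str,
                        url: Optional[str] = None,
                        use_active_tab: bool = True) -> str:
    """Serialize a user intent in the conceptual shape shown above.

    "mode" and "tabBehavior" mirror the article's illustrative JSON;
    "openNewTab" and the top-level "url" key are hypothetical additions.
    """
    if not cmd.strip():
        raise ValueError("cmd must be a non-empty instruction")
    body = {
        "cmd": cmd.strip(),
        "context": {
            "mode": "local-chrome-extension",
            "tabBehavior": "useActiveTab" if use_active_tab else "openNewTab",
        },
    }
    if url is not None:
        body["url"] = url  # optional target URL to force navigation
    return json.dumps(body)
```

Validating the instruction before serializing it means an empty textbox submission fails in your app instead of becoming a confusing agent run.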

4. Extension executes the workflow in the active session

Once the task is routed:

  • The extension attaches to the relevant tab (or opens a new tab with the target url).
  • It inspects the DOM in the real, fully-rendered page (with all dynamic JS applied).
  • It performs the same class of actions you’d script in Playwright/Selenium:
    • Click buttons or links.
    • Type into form fields.
    • Scroll, wait for elements, handle navigation.
  • It adapts when selectors or layouts shift, since the agent is plan-driven rather than hard-coded per-selector.

You don’t have to:

  • Script selectors.
  • Manage waits/timeouts.
  • Maintain brittle locators per site.

You simply rely on the agent to translate the cmd into the exact actions needed for the live page.

5. Get results and state back into your app

Once the action completes, MultiOn can return:

  • A success/failure status: “Order placed”, “Post published”, “Wishlist item added to cart.”
  • Optional data extraction, e.g., JSON describing:
    • Items purchased.
    • Updated settings.
    • Posted content and resulting URLs.

For teams that already use the Agent API’s Retrieve function to get “JSON arrays of objects” from web pages, the mental model is similar. The difference is that the extension operates on the user’s auth-bound pages rather than on a remote, generic view.
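
On the receiving side, it helps to normalize whatever comes back into one small type before it reaches your UI. The sketch below assumes a hypothetical result payload with `status`, `message`, and `data` keys, and assumes the literal `"success"` marks a successful run; the real response fields may differ, so treat `ActionResult` and `parse_result` as placeholders.

```python
import json
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class ActionResult:
    ok: bool
    message: str
    data: Optional[Any] = None  # optional structured extraction


def parse_result(raw: str) -> ActionResult:
    """Parse a hypothetical payload: {"status": ..., "message": ..., "data": ...}."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return ActionResult(ok=False, message="response was not valid JSON")
    if not isinstance(payload, dict) or "status" not in payload:
        return ActionResult(ok=False, message="response is missing a status field")
    return ActionResult(
        ok=payload["status"] == "success",  # assumed success marker
        message=str(payload.get("message", "")),
        data=payload.get("data"),
    )
```

Malformed responses become an ordinary failed result rather than an exception, so the calling code has a single path for "the action did not complete."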


Example scenarios: running local actions on authenticated sites

Example 1: Act on a logged-in ecommerce wishlist

User story: “Buy my latest saved item from my wishlist.”

Flow with the Chrome extension:

  1. User is logged into an ecommerce site in Chrome with the extension installed.
  2. In your app (or a command interface powered by MultiOn), the user issues:
    “Buy my latest saved item from my wishlist.”
  3. Your backend sends the intent to MultiOn, specifying local execution via the extension.
  4. The extension:
    • Opens the wishlist (if not already there).
    • Identifies the latest item.
    • Adds it to cart.
    • Walks through checkout using the stored shipping and payment details.
  5. MultiOn returns:
    • A completion status.
    • Optional structured JSON: {"itemName": "...", "price": "...", "orderId": "..."}.

No scripts. No cookies exported. Everything happens in the user’s own logged-in session.

Example 2: Managing a logged-in social media account from the user’s browser

User story: “Post this text and image to my X account.”

  1. User installs the extension and logs into X in their browser.
  2. In your product, they enable the “X automation” feature (which really just checks that the extension is active and authorized).
  3. They click “Post” on some content. Your app sends something like:
    “In the active tab, create a new post on X with this text and attached image.”
  4. The extension:
    • Brings X to the foreground (or opens it).
    • Clicks Compose.
    • Types the given text.
    • Uploads the image (if that’s part of the flow).
    • Submits the post.
  5. MultiOn responds to your app with:
    {"status": "posted", "postUrl": "https://x.com/.../status/..."}

Again, auth and 2FA are fully handled by the user’s own login; you’re not touching those flows directly.


How this differs from running in remote sessions

The core Agent API (V1 Beta) and Sessions + Step mode are built for secure remote sessions you manage server-side:

  • You call POST https://api.multion.ai/v1/web/browse with cmd + url.
  • You get back a session_id that lets you:
    • Keep workflows alive across multiple steps (add to cart → then checkout).
    • Run parallel agents using that same pattern at scale.
  • You use controls like renderJs, scrollToBottom, and maxItems in Retrieve to shape how content is loaded and extracted as structured JSON.

This is ideal when:

  • You’re building backend workflows (like bulk Amazon ordering or catalog extraction from H&M).
  • You own or control the accounts involved.
  • You care about “infinite scalability with parallel agents.”
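
For comparison, the remote pattern described above can be sketched with only the Python standard library. The endpoint and the cmd + url body come from this article; the `X_MULTION_API_KEY` header name and the idea of passing `session_id` back in the request body are assumptions to verify against the official API reference. The helper only builds the request and does not send it.

```python
import json
import urllib.request
from typing import Optional

BROWSE_URL = "https://api.multion.ai/v1/web/browse"  # endpoint named above


def build_browse_request(api_key: str,
                         cmd: str,
                         url: str,
                         session_id: Optional[str] = None) -> urllib.request.Request:
    """Build (but do not send) a POST request for a remote browse step."""
    body = {"cmd": cmd, "url": url}
    if session_id:
        # Assumption: a follow-up step reuses the session by echoing its id.
        body["session_id"] = session_id
    return urllib.request.Request(
        BROWSE_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "X_MULTION_API_KEY": api_key,  # assumed header name
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call. Separating construction from sending keeps the request shape easy to unit test without network access.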

The Chrome extension variant is a complement, not a replacement:

  • It gives you access to sites where auth and bot protection make remote sessions impractical.
  • It keeps sensitive credentials local to the user’s browser.
  • It exposes visible, predictable actions that users can watch and verify.

Many teams end up using both:

  • Remote Agent API for scalable backend flows.
  • Chrome extension for user-specific, high-friction, logged-in experiences.

Reliability and constraints to keep in mind

Running locally via the extension is powerful, but you still need to be explicit about boundaries:

  • User presence: The local browser needs to be running; the extension can’t act if Chrome is closed.
  • Tab context: If you depend on a specific tab, make your instructions clear (“on the current tab” vs “open this URL in a new tab”).
  • Permissions: The user must grant the extension the required site access in Chrome. If they decline, your automations will be constrained.
  • Error handling: Treat failure states as first-class:
    • The user isn’t logged in.
    • The site layout has changed enough that the agent can’t complete the task.
    • Network issues on the user’s machine.

From a product standpoint, surface these clearly (for example, “We couldn’t complete this action, please check you’re logged into [Site] and try again”) and let the user retry.
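
The failure cases listed above can be mapped to user-facing messages with a small retry wrapper. This is a sketch under stated assumptions: the `Failure` categories, the message text, and the choice to retry only network errors are all hypothetical, and `action` stands in for whatever function in your app actually dispatches the task.

```python
from enum import Enum
from typing import Callable, Optional, Tuple


class Failure(Enum):
    """Hypothetical failure categories drawn from the list above."""
    BROWSER_CLOSED = "browser_closed"
    PERMISSION_DENIED = "permission_denied"
    NOT_LOGGED_IN = "not_logged_in"
    LAYOUT_CHANGED = "layout_changed"
    NETWORK = "network"


USER_MESSAGES = {
    Failure.BROWSER_CLOSED: "Open Chrome and try again.",
    Failure.PERMISSION_DENIED: "Allow the extension to access {site}, then retry.",
    Failure.NOT_LOGGED_IN: "Please check you're logged into {site} and try again.",
    Failure.LAYOUT_CHANGED: "We couldn't complete this action on {site}.",
    Failure.NETWORK: "Network problem. Check your connection and retry.",
}

# Only transient problems are worth retrying automatically.
RETRYABLE = {Failure.NETWORK}


def run_with_retry(action: Callable[[], Optional[Failure]],
                   site: str,
                   max_attempts: int = 3) -> Tuple[bool, str]:
    """Call action() until it succeeds (returns None) or fails for good."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    failure: Optional[Failure] = None
    for _ in range(max_attempts):
        failure = action()
        if failure is None:
            return True, "Done."
        if failure not in RETRYABLE:
            break  # the user has to fix this one; retrying will not help
    return False, USER_MESSAGES[failure].format(site=site)
```

A production version would likely add a delay between attempts. It is omitted here to keep the control flow easy to follow.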


How this fits into your build vs. buy equation

If you’re currently:

  • Maintaining brittle Playwright/Selenium scripts just to click through authenticated sites.
  • Fighting session persistence, proxies, and dynamic rendering for user-specific flows.
  • Avoiding “take actions on the web on behalf of your users” features because of operational risk.

…then the MultiOn Chrome extension plus the Agent API gives you a saner architecture:

  • Intent in – your app collects a high-level user instruction.
  • Actions executed in a real browser – either in MultiOn’s secure remote sessions (Agent API) or the user’s own Chrome session (extension).
  • Structured JSON out – where needed, use Retrieve-style extractions to pull structured data back into your app.

You stop writing and maintaining selectors. You start wiring clear, reproducible workflows to predictable primitives: endpoints, sessions, and a local extension that already “sees” the user’s authenticated world.


Final verdict

To run actions locally in the user’s own browser session for authenticated sites with MultiOn:

  • Use the Chrome extension as your bridge into the user’s real, logged-in Chrome environment.
  • Treat your app as the intent orchestrator, sending clear instructions that MultiOn routes to the extension.
  • Let the extension leverage the user’s existing authentication and UI context to execute actions.
  • Combine this with the Agent API, Sessions + Step mode, and Retrieve when you need backend-scale automation and structured JSON outputs.

If you want to move from brittle, selector-based scripts to “intent in, real-browser actions, JSON out,” start by wiring one high-value, authenticated flow through the MultiOn Chrome extension and expand from there.
