
Best tool to automate authenticated portal workflows and return structured JSON (not raw HTML)
Most teams find out the hard way that “automation that logs in” and “production-ready web agents” are two very different things. It’s easy enough to script a login and click a few buttons. It’s brutal to keep that script alive across dozens of portals, CAPTCHAs, and layout changes—and still get clean JSON back, not brittle HTML blobs your engineers have to parse.
Quick Answer: The best tool is an enterprise-grade Web Agent platform that can navigate authenticated portals like a user, handle anti-bot/CAPTCHAs, and return structured JSON via API. TinyFish is built specifically for this: one API, any website, authenticated workflows, and live, structured outputs—not raw HTML.
Frequently Asked Questions
What kind of tool do I actually need to automate authenticated portal workflows and get JSON back?
Short Answer: You need Web Agents—an infrastructure layer that can authenticate, navigate multi-step workflows, and return structured JSON outputs via API, not just a headless browser that dumps HTML.
Expanded Explanation:
Most stacks start with Playwright/Selenium plus proxies and a parser. They work for one or two sites. Then portals add MFA, rotate anti-bot vendors, or refactor their DOM, and you’re back to fixing automation scripts instead of shipping product. The failure mode is always the same: the “automation” doesn’t understand the workflow; it only understands selectors.
The right tool abstracts all of that. You define the workflow (which portals, which credentials, what data you need), and the platform runs Web Agents that behave like a user: log in, navigate across tabs, fill forms, handle OTP/MFA where appropriate, and submit. Instead of pages, you get structured JSON: the specific fields or entities you care about, normalized for downstream systems. That’s the difference between a demo script and a production data pipeline.
Key Takeaways:
- Look for Web Agents that navigate, authenticate, extract, and transact, not just “scrape pages.”
- Structured JSON outputs via API should be a first-class product feature, not an afterthought on top of raw HTML.
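To make "structured JSON, not HTML" concrete, here is a minimal sketch of how a downstream consumer might validate one result before loading it into a pricing engine or warehouse. The field names (`portal`, `quote_id`, `total_price`, `currency`) and the `Quote` type are hypothetical, chosen only for illustration; they are not TinyFish's actual output schema, which you would define per workflow.

```python
import json
from dataclasses import dataclass

# Hypothetical fields for illustration; a real schema comes from your workflow.
REQUIRED_FIELDS = {
    "portal": str,
    "quote_id": str,
    "total_price": float,
    "currency": str,
}


@dataclass(frozen=True)
class Quote:
    portal: str
    quote_id: str
    total_price: float
    currency: str


def parse_quote(payload: str) -> Quote:
    """Parse one JSON result; raise ValueError on a missing or mistyped field."""
    data = json.loads(payload)
    clean = {}
    for name, expected in REQUIRED_FIELDS.items():
        if name not in data:
            raise ValueError(f"missing field: {name}")
        value = data[name]
        # JSON has a single number type, so accept ints where a float is expected.
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected):
            raise ValueError(f"field {name!r} should be {expected.__name__}")
        clean[name] = value
    # Extra keys (metadata, timestamps) are ignored rather than rejected.
    return Quote(**clean)
```

Failing loudly on a missing field is a deliberate choice here: a silent `None` in a price column is usually worse than a rejected record.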
How does TinyFish automate authenticated portal workflows end‑to‑end?
Short Answer: You define the workflow and data schema; TinyFish Web Agents handle login, multi-step navigation, and form submission across portals, then return normalized JSON via a single API.
Expanded Explanation:
TinyFish is built as enterprise infrastructure for web data operations. You describe your portal workflows—targets, credentials, steps, and required outputs—as a goal, not a set of selectors. TinyFish then deploys Web Agents that run those workflows live, in parallel, against real websites.
Under the hood, each agent can authenticate (username/password, SSO, cookies, and compatible MFA flows), navigate multi-step forms, deal with dynamic content, and withstand CAPTCHAs and bot detection. It chooses the right interaction pattern per portal and keeps adapting as the UI shifts. You don’t manage browsers, proxies, or LLM calls; you just receive structured results as JSON via API, plus observability (logs, screenshots, run history) so you can audit exactly what happened.
Steps:
- Define the workflow: Which portals, which credentials, what actions, and what fields/entities you need in the final JSON.
- Deploy Web Agents: Use the TinyFish API to run agents concurrently across your target portals (from 1 to 1,000+ in parallel).
- Consume structured outputs: Receive JSON with the requested fields, alongside metadata, timestamps, and run traces—ready for your data warehouse, pricing engine, or internal tools.
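The three steps above can be sketched in client code as follows. This is an illustration under stated assumptions, not the TinyFish SDK: the `Workflow` type, `run_agent`, and `run_all` are hypothetical names, and `run_agent` is a stub standing in for whatever API call actually launches an agent. The sketch only shows the shape of the pattern: describe workflows as data, fan them out with a concurrency cap, and collect one JSON-like result per portal.

```python
import asyncio
from dataclasses import dataclass, field


@dataclass
class Workflow:
    """Hypothetical workflow description: a target, a goal, and wanted fields."""
    portal_url: str
    goal: str
    output_fields: list[str] = field(default_factory=list)


async def run_agent(workflow: Workflow) -> dict:
    """Stub for a real agent run; replace the body with an actual API call."""
    await asyncio.sleep(0)  # stands in for network I/O
    return {
        "portal": workflow.portal_url,
        "status": "completed",
        "data": {name: None for name in workflow.output_fields},
    }


async def run_all(workflows: list[Workflow], max_concurrency: int = 10) -> list[dict]:
    """Run every workflow, with at most max_concurrency in flight at once."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(wf: Workflow) -> dict:
        async with semaphore:
            return await run_agent(wf)

    # gather preserves input order, so results[i] belongs to workflows[i].
    return await asyncio.gather(*(bounded(wf) for wf in workflows))


workflows = [
    Workflow(
        portal_url=f"https://portal-{i}.example.com",
        goal="Log in and fetch the latest quote",
        output_fields=["quote_id", "total_price"],
    )
    for i in range(3)
]
results = asyncio.run(run_all(workflows))
```

The semaphore is a client-side cap chosen for this sketch; a hosted platform may enforce its own concurrency limits regardless.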
How is TinyFish different from traditional scrapers, RPA bots, or custom Playwright/Selenium stacks?
Short Answer: Traditional tools give you HTML and headaches; TinyFish gives you live, authenticated execution at scale with structured JSON outputs and no browser/proxy/LLM infrastructure to manage.
Expanded Explanation:
Most teams end up choosing between three flawed options:
- Scrapers / search APIs: Fast, but only work on public pages and indexed content. They break on portals and give you stale data.
- RPA or DIY Playwright/Selenium: Can log in and click through complicated flows, but don’t scale. You own selectors, proxies, CAPTCHA solvers, and constant maintenance.
- Manual ops: Humans in portals. Accurate but slow, expensive, and impossible to scale when volume spikes.
TinyFish is designed as the alternative: Web Agents that run in the cloud, handle dynamic, authenticated workflows, and return structured JSON, not HTML. You get the accuracy of manual workflows with the speed and scale of a search engine.
Comparison Snapshot:
- Option A: DIY automation (Playwright/Selenium + proxies + CAPTCHAs): Works behind auth, but breaks often, requires in-house specialists, and scales poorly across portals/countries.
- Option B: TinyFish Web Agents: One API. Any website. Live, authenticated workflows executed concurrently, with structured JSON results and 99.99% uptime.
- Best for: Teams that can’t tolerate stale or partial data: insurance quotes, competitor checkout totals, availability/eligibility checks, or any workflow where the “truth” only exists after completing multi-step forms inside a portal.
What does implementation look like if I want to use TinyFish for my portals?
Short Answer: You share your workflows and target portals; TinyFish helps define the agent behaviors and JSON schema, then you integrate a single API endpoint to run agents and consume results.
Expanded Explanation:
Most implementations follow a pattern: you come in with a set of portals (carrier portals, vendor dashboards, partner platforms, internal tools with no API), plus a clear definition of “what we need out the other side.” That might be rating factors, final prices including taxes/fees, eligibility flags, or availability in specific geos.
TinyFish works with you to codify that into agent goals and output schemas. From there, agents can be deployed across your portal list and scaled up or down based on your volume. Production runs stream progress over Server-Sent Events (SSE), so you can monitor thousands of simultaneous executions without polling. Typical teams go from initial workflow definition to production runs in weeks, not quarters, with success rates above 95%. The platform already runs 30M+ workflows per month across customers like Google and DoorDash.
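For readers who haven't consumed SSE before, here is a minimal parser for the standard `text/event-stream` wire format, which is what makes "monitor without polling" possible. The parsing rules (blank line ends an event, `:` starts a comment, `data:` lines accumulate) follow the SSE format itself. The event names `progress` and `complete` and the payload fields in the sample are assumptions made for illustration, not TinyFish's documented events.

```python
import json
from typing import Iterable, Iterator


def parse_sse(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one {"event", "data"} dict per event in a text/event-stream."""
    event, data = "message", []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it has any data.
            if data:
                yield {"event": event, "data": "\n".join(data)}
            event, data = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        name, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # the format strips one leading space
        if name == "event":
            event = value
        elif name == "data":
            data.append(value)


# Hypothetical stream: event names and payload fields are illustrative only.
sample = [
    "event: progress\n",
    'data: {"step": "login"}\n',
    "\n",
    ": keep-alive\n",
    "event: complete\n",
    'data: {"total_price": 120.5}\n',
    "\n",
]
events = list(parse_sse(sample))
final = json.loads(events[-1]["data"])
```

In practice you would feed this generator the decoded lines of a long-lived HTTP response and react to each event as it arrives, rather than collecting them into a list.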
What You Need:
- Clear workflow definitions: Target portals, credential models, steps (login → search → form → checkout → confirmation), and required fields in the JSON output.
- API integration & governance: An owner for integrating TinyFish’s API into your data pipeline or app, plus security alignment (SSO, permissions, audit trail, and data handling requirements).
How should I think strategically about using a tool like TinyFish for portal automation and JSON data pipelines?
Short Answer: Treat TinyFish as your web data infrastructure layer—centralizing authenticated workflows into one API that reliably turns portal actions into structured, decision-ready JSON at production speed.
Expanded Explanation:
The strategic mistake I see over and over: teams treat portal automation as one-off projects. One carrier. One vendor. One marketplace. Each gets its own brittle script and set of proxies, which become operational debt the minute someone changes a login screen.
A Web Agent platform lets you flip that logic. Instead of “one script per portal,” you converge on one infrastructure layer that handles every portal with the same operational guarantees: sub-minute runs where possible, 1,000+ concurrent agents, 99.99% uptime, centralized observability, and unified JSON outputs. That gives you a real-time, reliable view of the web “truth” your business depends on—quotes, rates, offers, inventory, eligibility—without relying on stale indexed data or a human workforce.
Over time, your most critical workflows move from AI-driven adaptation to deterministic execution—codified sequences that are cheaper per run and easier to govern. You get better unit economics per operation while keeping the agility to adapt as portals change.
Why It Matters:
- Operational risk: Relying on cached or partial web data is dangerous when pricing, availability, or eligibility shifts hour by hour. Live, authenticated execution reduces that risk.
- Scalable advantage: Once your portal workflows are standardized as Web Agents with JSON outputs, you can add new markets, carriers, or partners without multiplying complexity—just add targets to the same infrastructure.
Quick Recap
Automating authenticated portal workflows isn’t about “scraping pages”—it’s about reliably executing full user journeys (logins, forms, checkouts, portals) and turning the results into clean, structured JSON your systems can trust. Traditional scrapers, RPA, and DIY Playwright/Selenium stacks struggle with scale, reliability, and maintenance. A Web Agent platform like TinyFish centralizes this into one API: define your workflow, deploy agents concurrently, and receive live, structured outputs with enterprise reliability and observability.