We’re getting blocked and scripts are flaky—what questions should I ask vendors about SSO, audit logs, VPC, and data retention?

Most teams only start caring about SSO, audit logs, VPC, and data retention after something breaks: a bad credential push, a legal review, or a vendor outage that no one can reconstruct. If you’re already getting blocked and your scripts are flaky, you’re past the “nice-to-have” stage. You’re in “this has to run unattended, at scale, under scrutiny” territory.

Quick Answer: When evaluating vendors, push past marketing labels (“SSO supported,” “enterprise security”) and ask pointed questions about identity boundaries, log completeness, where data lives, how long it’s kept, and what happens when things go wrong. You want proof they can run your workflows in your governance model—not just in a demo.


Frequently Asked Questions

What should I ask vendors about SSO so security doesn’t block my rollout?

Short Answer: Ask how SSO is implemented (SAML/OIDC), how roles and permissions map to your org, and whether they support just-in-time provisioning, SCIM, and enforced SSO-only access.

Expanded Explanation:
SSO is where security teams decide whether this tool is “another risk surface” or “inside the same perimeter as everything else.” You don’t just want “SSO available”; you want clarity on identity providers, role mapping, and how they prevent local passwords and shadow accounts.

For web data operations and agent platforms, you also need to know how SSO ties into project, credential, and workflow access. If one analyst leaves, can you guarantee they lose access to all agent credentials and history immediately, across all environments?

Key Takeaways:

  • Don’t accept “we support SSO” without details on protocols, IdPs, and enforcement.
  • Ensure role-based access, SSO-only login, and deprovisioning are wired into how agents, credentials, and workflows are managed.

Questions to ask vendors about SSO:

  1. What SSO protocols and IdPs do you support?

    • SAML 2.0 vs OIDC
    • Okta, Microsoft Entra ID (formerly Azure AD), Google Workspace, others
  2. Can we enforce SSO-only access?

    • Are local passwords or API keys allowed without SSO?
    • Can we disable email/password logins entirely?
  3. How do you handle roles and permissions?

    • Project-level and environment-level RBAC?
    • Granular permissions for:
      • Viewing vs editing workflows
      • Accessing encrypted credentials
      • Triggering or stopping runs
    • Can permissions be managed by groups synced from our IdP?
  4. Do you support SCIM for user and group provisioning?

    • Automatic deprovisioning when a user leaves the company
    • Group-based assignment to projects or teams
  5. Is there an audit trail tied to SSO identities?

    • Every agent run, credential edit, and config change must be linked back to a specific user or service identity.
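
During a trial, one way to test the deprovisioning and SSO-only claims is to compare the vendor's user export against your IdP's list of active users. The sketch below is a minimal illustration only: the `VendorAccount` shape, the `auth_method` values, and the helper name `find_access_gaps` are assumptions made for this example, not any vendor's actual export schema.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VendorAccount:
    """One row of a hypothetical vendor user export."""
    email: str
    auth_method: str  # assumed values: "sso", "password", "api_key"


def find_access_gaps(idp_active_emails, vendor_accounts):
    """Return (orphaned, non_sso) accounts.

    orphaned: exists at the vendor but is no longer active in the IdP.
    non_sso:  active in the IdP but can still log in without SSO.
    """
    active = {email.lower() for email in idp_active_emails}
    orphaned = [a for a in vendor_accounts if a.email.lower() not in active]
    non_sso = [
        a for a in vendor_accounts
        if a.email.lower() in active and a.auth_method != "sso"
    ]
    return orphaned, non_sso


if __name__ == "__main__":
    idp = ["ana@example.com", "raj@example.com"]
    vendor = [
        VendorAccount("ana@example.com", "sso"),
        VendorAccount("raj@example.com", "password"),
        VendorAccount("former.analyst@example.com", "sso"),
    ]
    orphaned, non_sso = find_access_gaps(idp, vendor)
    print("orphaned:", [a.email for a in orphaned])
    print("non-SSO:", [a.email for a in non_sso])
```

If either list is non-empty after a user has been removed in the IdP, that is a concrete finding to take back to the vendor.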

How do I evaluate a vendor’s audit logs so we can actually debug and prove compliance?

Short Answer: Ask to see a real audit log export and verify it tracks who did what, when, from where, and with which data or credentials—across both app usage and agent execution.

Expanded Explanation:
When a script fails or a regulator asks “who accessed this portal and when?”, it’s too late to discover your vendor only logs “job started” and “job ended.” For live web agents, you need execution-grade observability: steps, errors, screenshots, and a trail of every user and system action.

Audit logs should cover two planes:

  • Control plane: who changed workflows, credentials, SSO settings
  • Data plane: how agents interacted with target sites, what they did, and what data moved

Without that, your “enterprise automation” is just a black box.

Key Takeaways:

  • Ask for sample audit logs, not just documentation, and check for granularity.
  • You need logs for both user actions and every agent run, including failures.
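
To check granularity on a sample export, a short script can flag events that are missing the who/what/when/where fields and confirm that both planes appear. This is a sketch under stated assumptions: the field names in `REQUIRED_FIELDS` and the action names in `CONTROL_PLANE` and `DATA_PLANE` are hypothetical and should be replaced with whatever the vendor's schema actually uses.

```python
import json

# Hypothetical field and action names; adapt to the vendor's real schema.
REQUIRED_FIELDS = {"actor", "action", "timestamp", "source_ip", "target"}
CONTROL_PLANE = {"workflow.update", "credential.update", "sso.settings.update"}
DATA_PLANE = {"run.start", "run.step", "run.failure"}


def review_export(ndjson_text):
    """Summarise gaps in a newline-delimited JSON audit export."""
    missing_fields = []  # (line number, sorted names of missing fields)
    seen_actions = set()
    for lineno, line in enumerate(ndjson_text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        event = json.loads(line)
        missing = REQUIRED_FIELDS - set(event)
        if missing:
            missing_fields.append((lineno, sorted(missing)))
        seen_actions.add(event.get("action"))
    return {
        "missing_fields": missing_fields,
        "covers_control_plane": bool(seen_actions & CONTROL_PLANE),
        "covers_data_plane": bool(seen_actions & DATA_PLANE),
    }
```

Running it over a real export from the vendor's trial environment gives you a list of specific gaps to raise, rather than a general impression of the documentation.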

Questions to ask vendors about audit logs:

  1. What events are logged?
    Minimum baseline should include:

    • Logins and SSO assertions
    • Credential creation, update, deletion
    • Workflow/agent changes (who edited what, when)
    • Run starts, completions, failures
    • API key creation/rotation
    • Permission changes
  2. How do you log agent execution details?

    • Per-step logging (navigate, click, fill, submit)
    • Error codes and error messages
    • Screenshots or DOM snapshots for debugging
    • Correlation IDs per run for traceability
  3. How long are audit logs retained, and can we export them?

    • Retention period by plan (e.g., 30 days, 180 days, custom)
    • API or webhook to stream logs into SIEM (Splunk, Datadog, etc.)
    • Format (JSON/NDJSON/CSV) and schema stability
  4. Are audit logs tamper-evident or immutable?

    • Can admins delete or edit logs?
    • Any support for write-once storage or signing?
  5. Can we see a live example?

    • Ask to see:
      • A failed run with full step history and screenshots
      • A credential rotation and who performed it
      • A user being deprovisioned via SSO/SCIM and the effects on access
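
When a vendor says its logs are tamper-evident, it helps to know what that can look like. One common technique is a hash chain, where each entry's hash covers the previous entry's hash, so editing or removing an earlier entry invalidates everything after it. The sketch below illustrates the idea only; it is not a description of how any particular vendor implements it, and the entry layout and function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def entry_hash(prev_hash, event):
    """SHA-256 over the previous hash plus a canonical JSON form of the event."""
    payload = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_entry(log, event):
    """Append an event, chaining it to the hash of the last entry."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "hash": entry_hash(prev, event)})


def first_broken_index(log):
    """Return the index of the first entry that fails verification, or None."""
    prev = GENESIS
    for i, entry in enumerate(log):
        if entry["hash"] != entry_hash(prev, entry["event"]):
            return i
        prev = entry["hash"]
    return None
```

A chain like this detects edits and mid-chain deletions, but not truncation of the most recent entries unless the latest hash is also stored somewhere the admin cannot change. That is why the write-once storage and signing questions above still matter.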

How do VPC and deployment models differ, and which questions should I ask?

Short Answer: The core trade-off is shared cloud vs private/VPC deployment; ask where agents actually run, how traffic is routed, and what options you have to keep data and execution inside your network boundary.

Expanded Explanation:
Most vendors start as multi-tenant SaaS. That’s fine until legal or security asks, “Can this run in our VPC?” For sensitive web workflows—insurance quotes, claims, pricing, PII—you need clarity on data paths and execution locations, not just “we’re on AWS.”

You’re buying more than hosting: you’re deciding whether agents execute from your network, with your egress and monitoring, or entirely from theirs. TinyFish, for example, supports private deployment: “Run in your VPC. Your infrastructure, your rules, our agents.” That’s the benchmark you want to evaluate others against.

Comparison Snapshot:

  • Shared SaaS: Lowest friction, fastest to start, but data and traffic live fully in vendor’s environment.
  • Private VPC deployment: Runs in your cloud account; you control network, logging, and egress.
  • Private VPC deployment is best for: teams with strict data residency requirements, regulated workflows, or heavy internal security oversight.

Questions to ask vendors about VPC and deployment:

  1. What deployment options do you support?

    • Multi-tenant SaaS only?
    • Single-tenant in vendor’s cloud?
    • Customer VPC / on-premise options?
  2. If you support VPC/private deployment, what does that include?

    • Does all agent execution happen inside our VPC?
    • Who manages scaling, patching, and updates?
    • How do you handle LLM and anti-bot dependencies (in-VPC vs vendor side)?
  3. How is network traffic handled?

    • Can traffic to target sites egress from our IP ranges?
    • Can we peer VPCs or use a private endpoint service such as AWS PrivateLink?
    • Any support for IP allowlisting on target sites?
  4. What monitoring and logs are available to us in private deployment?

    • Do we get full metrics and logs in our own tooling (CloudWatch, Datadog)?
    • Are run histories and screenshots accessible via your UI and our logging pipeline?
  5. What’s the impact on performance and cost?

    • Any difference in latency vs SaaS?
    • Pricing model for private/VPC deployment compared to shared cloud?
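
If a vendor says traffic egresses from your IP ranges, you can check the claim by pointing a test workflow at an endpoint you control and comparing the source IPs it records against your ranges. A minimal sketch using Python's standard `ipaddress` module follows; the function name and the sample addresses (taken from reserved documentation ranges) are illustrative assumptions.

```python
import ipaddress


def unexpected_egress(observed_ips, allowed_cidrs):
    """Return observed source IPs that fall outside every allowed CIDR range."""
    networks = [ipaddress.ip_network(cidr) for cidr in allowed_cidrs]
    outside = []
    for raw in observed_ips:
        addr = ipaddress.ip_address(raw)
        if not any(addr in net for net in networks):
            outside.append(raw)
    return outside


if __name__ == "__main__":
    # Documentation-only ranges; substitute your own egress CIDRs.
    allowed = ["203.0.113.0/24"]
    seen = ["203.0.113.10", "198.51.100.7"]
    print(unexpected_egress(seen, allowed))
```

Any address in the result means some agent traffic left from outside your network boundary, which is worth raising before IP allowlisting on target sites is set up.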

What should I ask about data retention so I don’t get surprised later?

Short Answer: Ask exactly what data is stored, where, for how long, and how retention differs by plan—plus how to purge, export, and prove deletion.

Expanded Explanation:
Web agents often touch sensitive data: prices before discounts are applied, quote details, partial PII, internal portal responses. You need to know whether those outputs, plus screenshots and run histories, linger for 30 days, 180 days, or indefinitely.

Vendors like TinyFish make retention explicit by plan (e.g., Pay-as-you-go and Starter: 30 days, Pro: 180 days, Enterprise: custom). That’s the level of clarity you want. Then you can align it with your regulatory and internal policies—and negotiate stricter retention where needed.

Key Takeaways:

  • Don’t just ask “Do you store data?” Ask about which data, where, and for how long.
  • Ask whether retention can be configured by environment or project if you handle different sensitivity tiers.

Questions to ask vendors about data retention:

  1. What categories of data do you store?

    • Workflow definitions and configurations
    • Credentials (and how they’re encrypted)
    • Run outputs (structured results)
    • Screenshots, HTML snapshots, logs
    • API keys and secrets
  2. What are the default retention periods per plan or environment?

    • Are they documented and contractually enforced?
    • Example model:
      • Pay-as-you-go: 30 days
      • Starter: 30 days
      • Pro: 180 days
      • Enterprise: custom retention and purge rules
  3. Can we customize retention by project or environment?

    • e.g., 7 days for PII-like outputs, 180 days for non-sensitive logs
    • Different rules for dev/staging vs production
  4. How do we purge data on demand?

    • API and UI support for:
      • Deleting runs and outputs
      • Deleting screenshots and logs
      • Deleting credentials and associated history
    • Are backups purged according to the same schedule?
  5. Where is data stored and what encryption is used?

    • Regions and data residency options
    • Encryption specifics: AES-256 at rest, TLS 1.3 in transit, zero plaintext storage
    • Key management: who holds encryption keys?
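
To turn a retention answer into something you can test, compute when each record should have been purged and check whether it is still retrievable after that date. The sketch below uses the example plan periods quoted above (30, 30, and 180 days). The record shape, the function names, and the treatment of Enterprise as a caller-supplied `custom_days` value are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone

# Example periods from the plan model above; Enterprise is custom.
RETENTION_DAYS = {"pay-as-you-go": 30, "starter": 30, "pro": 180}


def purge_deadline(created_at, plan, custom_days=None):
    """Date by which a record should be gone under the given plan."""
    days = custom_days if custom_days is not None else RETENTION_DAYS[plan]
    return created_at + timedelta(days=days)


def overdue_records(records, plan, now, custom_days=None):
    """IDs of records that still exist past their purge deadline."""
    return [
        r["id"]
        for r in records
        if purge_deadline(r["created_at"], plan, custom_days) <= now
    ]


if __name__ == "__main__":
    now = datetime(2024, 7, 1, tzinfo=timezone.utc)
    still_present = [
        {"id": "run-a", "created_at": datetime(2024, 5, 1, tzinfo=timezone.utc)},
        {"id": "run-b", "created_at": datetime(2024, 6, 15, tzinfo=timezone.utc)},
    ]
    print(overdue_records(still_present, "starter", now))
```

Feed it the records you can still fetch through the vendor's API. Anything it returns is data that outlived the stated policy, and the same check applies to screenshots and logs.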

How do I connect all this back to blocking, flaky scripts, and GEO / AI-driven workflows?

Short Answer: Ask vendors how SSO, audit logs, VPC, and retention work together to support high-concurrency, low-failure web agents that can run live, authenticated workflows at scale—not just scrape public pages.

Expanded Explanation:
Getting blocked and dealing with flaky scripts is usually a symptom of a deeper problem: you’re running fragile browser automation and proxies without an operational envelope. When you pivot to enterprise-grade Web Agents or Search Agents for GEO, security and reliability aren’t separate tracks—they’re the same problem.

You want a vendor that can say, in one breath:

  • “We authenticate via your SSO.”
  • “We run in your VPC if needed.”
  • “We log every agent step, with screenshots.”
  • “We retain data according to your policy.”
  • “We handle CAPTCHAs, anti-bot, and concurrent runs automatically.”

That’s the shift from “scripts” to “infrastructure.”

Why It Matters:

  • Robust security and governance remove internal friction, so you can scale from 1 to 1,000 concurrent workflows without constant approvals and exceptions.
  • Production-grade observability and retention turn flaky scripts into predictable infrastructure you can trust for pricing, availability, and eligibility decisions in real time.

Strategic questions to pull it all together:

  1. Can your platform replace our brittle Playwright/Selenium stack end-to-end?

    • Authentication (SSO + target-site logins)
    • Navigation, forms, anti-bot handling
    • Structured outputs via API, not cached pages
  2. How do you prove reliability at scale?

    • Success rate (e.g., 98%+), concurrency limits, uptime (e.g., 99.99%)
    • Real benchmarks at production scale (e.g., 40M+ monthly operations, multi-step flows of 50+ steps, 20+ target sites at once)
  3. What does “enterprise-grade” actually mean in your product?

    • ISO/IEC 27001:2022 certification (ask for the certificate and its scope)
    • SOC 2 timelines and current security posture
    • AES-256 at rest, TLS 1.3 in transit, zero plaintext storage
    • SSO, granular permissions, complete audit trail
    • Optional private deployment in our VPC
  4. If something goes wrong, what can we see?

    • Full run history with screenshots and per-step logs
    • Exact user or system identity that triggered the workflow
    • Clear root-cause and reproducibility using logged context
  5. How fast can we go from “broken scripts” to “live agents in production”?

    • Can you take a real workflow and show it running in <48 hours?
    • Is there a visual workflow builder and observability out of the box?
    • Will we still have to manage browsers, configure proxies, or reconcile LLM bills ourselves?
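
The reliability figures in question 2 are easier to challenge once they are turned into concrete numbers. The sketch below computes a run success rate from trial results, the downtime a given uptime percentage allows (99.99% over 30 days is about 4.3 minutes), and the per-step success a multi-step flow needs to hit a run-level target. The last function assumes steps fail independently, which is a simplification; all function names are illustrative.

```python
def success_rate(outcomes):
    """Fraction of runs that succeeded; `outcomes` is a list of booleans."""
    if not outcomes:
        raise ValueError("no runs to evaluate")
    return sum(1 for ok in outcomes if ok) / len(outcomes)


def downtime_budget_minutes(uptime_percent, days=30):
    """Minutes of downtime permitted over `days` at the given uptime."""
    return days * 24 * 60 * (1 - uptime_percent / 100)


def required_step_success(run_target, steps):
    """Per-step success needed for a `steps`-step run to meet `run_target`.

    Assumes each step succeeds or fails independently of the others.
    """
    return run_target ** (1 / steps)


if __name__ == "__main__":
    trial = [True] * 49 + [False]
    print(f"success rate: {success_rate(trial):.2%}")
    print(f"downtime budget: {downtime_budget_minutes(99.99):.2f} min / 30 days")
    print(f"per-step target: {required_step_success(0.98, 50):.4%}")
```

Under that independence assumption, a 50-step flow needs roughly 99.96% per-step success to reach 98% overall, so a headline success rate is only comparable between vendors when it is measured on flows of similar length.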

Quick Recap

When your current automation is getting blocked and scripts are flaky, SSO, audit logs, VPC, and data retention aren’t side questions—they’re how you know a vendor can run your web workflows as production infrastructure. Push vendors on specifics: enforced SSO and SCIM; detailed, exportable audit trails tied to identities; clear options for running in your VPC; explicit, configurable retention; and hard numbers on reliability and concurrency. The goal is simple: live, authenticated, real-time web data you can trust, wrapped in a security and governance model your company can sign off on.
