How do I enable Operant “private mode” and inline redaction so PII/PHI doesn’t leave our environment?
AI Application Security

How do I enable Operant “private mode” and inline redaction so PII/PHI doesn’t leave our environment?

10 min read

Most security teams I talk to want one simple guarantee: sensitive data never leaves their environment without their explicit say‑so. That’s exactly what Operant’s “private mode” and inline auto‑redaction are designed to enforce—at runtime, on live traffic, without yet another months‑long instrumentation project.

This guide walks through how private mode works, how inline redaction behaves on the wire, and how to enable and tune them so that PII/PHI and other sensitive data stay inside your boundaries while your AI apps, APIs, and agents keep running at full speed.


What “private mode” and inline redaction actually do

Before flipping switches, it’s important to get the mental model right.

Private mode in Operant means:

  • All runtime analysis happens inside your environment (your Kubernetes cluster / cloud account).
  • You don’t have to ship raw payloads or secrets to another vendor (including Operant) just to get detection.
  • You retain full control over what, if anything, leaves your perimeter for logging, analytics, or support.

Inline auto‑redaction means:

  • Operant inspects data in‑flight across AI prompts, tool calls, APIs, and agent workflows.
  • It detects PII/PHI and other sensitive tokens (e.g., SSNs, API keys, phone numbers).
  • It either:
    • Blocks the transfer outright, or
    • Redacts/obfuscates the sensitive parts inline and lets the rest of the request/response through.

All of this happens inside your environment, before any data is sent to third parties—LLM providers, SaaS APIs, external tools, or other clouds. That’s the core guarantee you’re configuring.


Prerequisites: Get Operant running in your environment

To enable private mode and inline redaction, you first need Operant’s Runtime AI Application Defense Platform deployed on your live stack.

At a minimum, you should have:

  • A Kubernetes cluster (EKS, AKS, GKE, OpenShift, or similar).
  • Network paths going through components Operant can observe and enforce on (e.g., sidecar, daemonset, or gateway mode depending on your deployment plan).
  • Permissions to apply a Helm chart and update cluster configuration.

The rollout pattern:

  • Single step Helm install.
  • Zero instrumentation. Zero integrations.
  • Works in < 5 minutes on live traffic.

Your Operant deployment guide or support contact will give you the exact Helm repo and values file. The key thing: once the Operant control plane and data plane are active in your cluster, you can start enforcing private mode and inline redaction without touching application code.


Step 1: Run Operant in private mode

Most teams deploy Operant in private mode by default as part of their standard Helm values. The high‑level goals:

  • Keep data‑in‑use entirely inside your environment.
  • Limit external communication to aggregated telemetry or metadata (if you choose to enable it).
  • Maintain compatibility with your compliance regimes (HIPAA, PCI DSS v4, NIST 800‑53, EU AI Act controls).

Conceptual configuration

In private mode, Operant:

  • Processes runtime traffic locally (Kubernetes‑native).
  • Keeps sensitive payload content (PII/PHI, secrets, proprietary data) local to the cluster.
  • Restricts any outbound communication to non‑sensitive stats or explicitly allowed logs.

Your Helm values.yaml will typically reflect this with flags such as (illustrative example, naming may vary by version):

operant:
  mode: private           # Keep processing inside your environment
  dataPrivacy:
    sendPayloadsOffCluster: false
    sendAnonymizedEvents: true   # Optional, can also be false for strictest regimes
    piiLogging: redacted         # Ensure logs never contain raw PII/PHI

What matters for your threat model:

  • sendPayloadsOffCluster: false ensures no request/response bodies (prompts, tool calls, API payloads) are transmitted to Operant’s SaaS or any other external endpoint.
  • Any analytics you do enable are built on derived signals, not on raw content, letting you align with privacy and residency requirements.

If you’re in a highly regulated environment (healthcare, financial services, public sector), you can run private mode with all outbound events disabled, and Operant still performs full inline detection and enforcement inside your cluster.


Step 2: Turn on inline auto‑redaction of PII/PHI

With private mode in place, the next move is to enable inline auto‑redaction so PII/PHI never exits your perimeter in the first place—even to approved third parties.

How inline redaction behaves at runtime

When inline auto‑redaction is enabled:

  1. Operant inspects traffic across:
    • AI prompts and responses (GenAI, LLMs, RAG flows).
    • MCP tools and agentic workflows.
    • Internal and external APIs, including AI endpoints.
  2. It detects sensitive data patterns, including:
    • PII (names, emails, phone numbers, addresses).
    • PHI (patient identifiers, medical record numbers in healthcare flows).
    • Secrets and identifiers (SSNs, API keys, tokens).
  3. It applies your policy:
    • Block the request/response containing sensitive data, or
    • Inline auto‑redact/obfuscate the sensitive segments and forward the non‑sensitive remainder.

Redaction happens in‑line, in real time, before the data leaves your application perimeter or your cloud account.

Example policy: redact PII/PHI leaving the cluster

An illustrative configuration snippet might look like:

securityPolicies:
  inlineRedaction:
    enabled: true
    scopes:
      - name: redact-pii-phi-outbound
        match:
          destinations:
            - type: external
              category: llm           # e.g., OpenAI, Anthropic, hosted models
            - type: external
              category: saas-api      # CRM, support tools, etc.
        detect:
          pii: true
          phi: true
          secrets: true
        action:
          mode: redact               # or "block" for hard‑fail
          redactionStyle: tokenized  # e.g., ****, or structured masking
          logSample: false           # don’t log raw examples

Key behaviors:

  • Any outbound call to an external LLM or SaaS API will have PII/PHI removed or masked inline.
  • Non‑sensitive fields (e.g., prompt instructions, product metadata) continue to flow normally.
  • You maintain full data privacy while still leveraging AI tools and external services.

Step 3: Decide when to block vs redact

Inline auto‑redaction is powerful because it avoids blocking entire workflows when only a few fields are problematic. But for some surfaces, you’ll still want hard stops.

A pragmatic pattern I recommend:

  • Block on:

    • Direct attempts to export full datasets (e.g., entire patient tables to an LLM).
    • Known secret types (API keys, access tokens, database passwords).
    • AI supply chain paths where you don’t trust the endpoint.
  • Redact on:

    • User‑generated prompts that occasionally contain PII/PHI.
    • Internal app‑to‑LLM flows where context is important but identifiers aren’t.
    • Agentic workflows where you want the workflow to proceed but not leak identity.

Example:

securityPolicies:
  piiAndSecretsProtection:
    match:
      destinations:
        - type: external
    detect:
      pii: true
      phi: true
      secrets: true
    action:
      # Mixed behavior based on data type
      onSecrets: block
      onPII: redact
      onPHI: redact

This gives you a clean operational stance: no secrets ever leave; PII/PHI get masked, and developer teams don’t get paged for every user who typed a phone number into a chat.


Step 4: Scope redaction to AI apps, APIs, and agentic workflows

The real attack surface is the “cloud within the cloud”: AI agents, MCP tools, internal APIs, and east–west traffic that never hits a WAF. Private mode plus inline redaction should be scoped to that surface, not just to obvious north–south calls.

Apply redaction to:

  • GenAI / LLM / RAG applications

    • Chat interfaces.
    • Embedded assistants in your SaaS or internal tools.
    • Backend RAG pipelines that fetch from internal data stores before hitting an LLM.
  • MCP and agentic workflows

    • MCP servers and tools serving sensitive internal data.
    • AI agents wired into CI/CD, ticketing, CRM, or observability tools.
    • Cross‑tool “0‑click” workflows where the user never sees the intermediate data.
  • Internal APIs and east–west traffic

    • Service‑to‑service calls carrying PII/PHI across microservices.
    • “Ghost” and “zombie” APIs that were never fully decommissioned but still hold data.
    • Internal AI endpoints (e.g., your own hosted models).

Sample scoping snippet:

securityPolicies:
  inlineRedaction:
    enabled: true
    scopes:
      - name: ai-and-agentic-flows
        match:
          services:
            - labelSelector: "operant.ai/app=genai"
            - labelSelector: "operant.ai/component=agent"
          protocols:
            - http
            - grpc
        detect:
          pii: true
          phi: true
          secrets: true
        action:
          mode: redact

Label‑based scoping lets you roll this out iteratively: start with one AI service, validate behavior, then expand to the rest of your stack.


Step 5: Validate that PII/PHI never leaves your environment

Once you’ve enabled private mode and inline redaction, you should prove to yourself—and to your auditors—that it actually works.

1. Synthetic test flows

  • Craft requests that include:
    • Sample SSNs.
    • Test patient IDs.
    • Faker‑generated PII (names, emails, phone numbers).
  • Send them through:
    • Your AI chat interfaces.
    • Agentic workflows that invoke tools.
    • Internal APIs that fan out to external services.

Observe:

  • Operant’s runtime detections (mapped to OWASP Top 10 for API/LLM/K8s).
  • Whether the outbound payloads show redacted values (e.g., ***REDACTED***) where PII/PHI originally appeared.

2. Boundary checks on external endpoints

  • Inspect logs on your external LLM providers or SaaS tools.
  • Confirm they never see raw PII/PHI or secrets from your tests.
  • For stricter assurance, run targeted packet captures at your egress boundary (short‑lived, tightly scoped) to confirm only masked values cross the line.

3. Audit and compliance evidence

With private mode and inline redaction active, you gain:

  • An auditable record of:
    • Sensitive data detected.
    • Redaction or blocking actions taken.
  • Proof points for:
    • HIPAA/HITECH workflows (PHI never leaves your controlled environment unmasked).
    • PCI DSS v4 requirements around cardholder data exposure.
    • NIST 800‑53 controls on data‑in‑use, data minimization, and access control.
    • EU AI Act expectations around AI data governance and privacy by design.

You can share these artifacts with your security, privacy, and compliance teams as part of your AI risk review.


Step 6: Tune for performance and developer experience

Inline enforcement is only useful if it doesn’t grind your app to a halt or drown teams in noise. Operant’s runtime‑native design helps here, but you should still tune.

Performance

  • Redaction runs in the same cluster, close to your workloads.
  • You can configure:
    • Rate limits and token quotas for sensitive AI endpoints.
    • Timeouts and fallbacks if an enforcement path is overloaded.

Example:

performance:
  aiEndpoints:
    defaultTimeoutMs: 500
    maxTokensPerMinute: 60000
    overloadPolicy: fail-open-safe-redacted  # continue with redaction, never leak raw

Developer experience

Set up policies and views that help developers, not just security:

  • Sanitized logs that show what was redacted, without exposing the original value.
  • Dashboards of top redaction events by service so teams can refactor hotspots (e.g., stop sending entire user objects to LLMs).
  • Clear documentation in your internal portal that:
    • Operant is performing inline redaction.
    • Developers don’t need to add their own regex‑based masking everywhere.
    • How to request exceptions or stricter policies when needed.

This is what “secure by default” looks like in a modern AI stack: developers ship features; Operant quietly blocks and redacts the dangerous parts at runtime.


Putting it all together: A practical rollout sequence

If you want a concrete plan you can execute in a week, I’d structure it like this:

  1. Day 1–2: Deploy Operant in private mode

    • Helm install into a non‑prod cluster.
    • Confirm that no payload data leaves your environment.
    • Validate baseline traffic visibility.
  2. Day 3: Enable inline auto‑redaction in monitor mode

    • Turn on detection + simulated redaction for your primary AI app.
    • Review detections and what would have been redacted or blocked.
  3. Day 4: Flip to enforcement on a narrow surface

    • Enable inline redaction for outbound LLM traffic from that app.
    • Keep secrets in “block” mode; PII/PHI in “redact” mode.
    • Run synthetic tests and verify behavior.
  4. Day 5+: Expand to more services and agentic workflows

    • Add MCP tools, internal APIs, and high‑risk east–west paths.
    • Tweak policies to match your risk tolerance and compliance requirements.
    • Share results with security and privacy stakeholders.

At the end of this cycle, your AI stack should:

  • Run entirely in private mode with no sensitive payloads leaving your environment.
  • Have inline auto‑redaction protecting PII/PHI and secrets in real time.
  • Provide auditable evidence that data exfiltration is being actively blocked or neutralized, not just observed.

Next Step

If you want help mapping these policies to your specific AI apps, MCP tools, and internal APIs—or you’re trying to align them with HIPAA/PCI/NIST/EU AI Act controls—schedule time with the Operant team to walk through your architecture and threat model on live traffic.

Get Started