We already have a WAF and an API gateway—why are we still blind to east–west API abuse inside Kubernetes?

Most teams assume that once a WAF and an API gateway are in place, API abuse is “handled.” The external perimeter looks clean in dashboards. OWASP-style signatures fire at the edge. Yet incidents keep tracing back to internal services talking to each other inside Kubernetes—abuse that never touched the WAF, never crossed the gateway, and often never even showed up in your SIEM with enough context to act.

That’s the core problem: the real attack surface has shifted to the “cloud within the cloud.” East–west API calls between microservices, sidecar proxies, internal MCP connections, and agent toolchains all live behind your traditional controls. A WAF and north–south gateway can’t see, segment, or enforce policy on that traffic. So you’re blind where it matters most.

This post breaks down why that happens and what a runtime, Kubernetes‑native defense layer needs to do differently—especially if you’re serious about stopping modern API and AI-driven attacks, not just logging them.

Why WAFs and Traditional API Gateways Go Blind Inside Kubernetes

From the outside, your architecture diagram shows a clean perimeter:

Internet → WAF → API Gateway → Cluster.

From inside the cluster, the picture is completely different:

Dozens to hundreds of services.
Envoy/sidecar meshes.
Internal APIs never registered in the gateway.
AI agents and MCP tools calling internal APIs directly.
Jobs, CronJobs, and batch pipelines hitting APIs from inside.

Here’s why your existing stack doesn’t see east–west abuse.

1. WAFs are perimeter filters, not in-cluster enforcement

A WAF sits in front of your public entrypoints. It:

Protects ingress HTTP/HTTPS traffic.
Matches patterns and signatures (SQLi, XSS, obvious path traversal).
Sometimes adds basic bot or DDoS mitigation.

What it does not do:

Inspect or enforce on pod-to-pod or namespace-to-namespace API calls.
Understand Kubernetes identities (ServiceAccounts, namespaces, labels).
Follow internal service discovery (ClusterIP, headless services, mesh-level routing).
Track API behavior over time across internal consumers and tools.

So when a compromised pod calls order-service or billing-service directly via cluster DNS, none of that traffic hits the WAF. From a WAF perspective, the attack never happened.
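To make the distinction concrete, here is a minimal Python sketch that labels a flow as east–west or north–south by checking whether both ends sit inside the cluster. The CIDR ranges and the classify_flow helper are assumptions for illustration; real pod and service ranges differ per cluster.

```python
import ipaddress

# Assumed cluster ranges for illustration; real pod and service CIDRs vary per cluster.
CLUSTER_NETWORKS = [
    ipaddress.ip_network("10.244.0.0/16"),  # pod CIDR (assumption)
    ipaddress.ip_network("10.96.0.0/12"),   # service CIDR (assumption)
]


def in_cluster(ip: str) -> bool:
    """True if the address falls inside any of the assumed cluster ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in net for net in CLUSTER_NETWORKS)


def classify_flow(src_ip: str, dst_ip: str) -> str:
    """Label a flow east-west when both ends are in the cluster, else north-south."""
    return "east-west" if in_cluster(src_ip) and in_cluster(dst_ip) else "north-south"


print(classify_flow("203.0.113.7", "10.244.1.5"))  # external client to a pod
print(classify_flow("10.244.1.5", "10.96.0.10"))   # pod to an in-cluster service
```

Only the first kind of flow ever passes an edge WAF; the second starts and ends behind it.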

2. API gateways front a subset of APIs—and mostly north–south

API gateways (Kong, Apigee, etc.) are great for:

Public or partner-facing APIs.
A curated set of “official” internal APIs.
Rate limits and auth at a boundary (tokens, keys, OAuth clients).

The blind spots:

Most internal microservices never get registered as “APIs” in the gateway.
Mesh‑internal paths (http://user-service:8080) bypass the gateway.
Third-party or legacy services talk over raw cluster networking.
AI and MCP-driven traffic often uses internal endpoints the gateway has never seen.

In real environments, you rarely route every internal call through a gateway—it’s a performance, complexity, and team-ownership non-starter. That leaves most east–west traffic ungoverned.

3. Kubernetes networking and service discovery create “shadow APIs”

Kubernetes creates an internal, dynamic API fabric:

Services get virtual IPs and DNS names (e.g., payments.default.svc.cluster.local).
New endpoints spin up and down with scaling and deployments.
Meshes introduce virtual services, routes, and retries.
Jobs and agents can call APIs directly by service name.

Each service endpoint is effectively an API—often:

Never documented in a central API catalog.
Never onboarded to the gateway.
Never covered by WAF rules.

That’s how you end up with ghost/zombie APIs inside the cluster: internal-only endpoints that still expose sensitive data or business logic, but never passed through your external control plane.
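One way to surface these endpoints is to diff what is actually running against what the gateway knows about. The sketch below is a minimal illustration; the service names, the gateway_catalog set, and the find_shadow_services helper are all hypothetical, and in practice the observed list would come from the Kubernetes API or live traffic rather than a hard-coded list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClusterService:
    """A Service observed in the cluster (hypothetical record shape)."""
    name: str
    namespace: str

    @property
    def dns_name(self) -> str:
        # In-cluster DNS form with the default cluster domain:
        # <service>.<namespace>.svc.cluster.local
        return f"{self.name}.{self.namespace}.svc.cluster.local"


def find_shadow_services(observed, gateway_catalog):
    """Return observed services whose DNS name is absent from the gateway catalog."""
    return sorted(
        (svc for svc in observed if svc.dns_name not in gateway_catalog),
        key=lambda svc: svc.dns_name,
    )


# Illustrative data only: these names are assumptions, not a real inventory.
observed = [
    ClusterService("payments", "default"),
    ClusterService("report-service", "default"),
    ClusterService("user-service", "default"),
]
gateway_catalog = {"payments.default.svc.cluster.local"}

shadow = find_shadow_services(observed, gateway_catalog)
for svc in shadow:
    print(f"not fronted by gateway: {svc.dns_name}")
```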

4. Observability ≠ enforcement

Many teams respond by adding:

Service mesh telemetry.
Distributed tracing.
API logging/analytics tools.

Useful—until an attacker is already walking across your internal graph.

These tools:

Show you what happened, but do not stop it in flight.
Often lack identity-aware context (who/what called this, with what role?).
Turn into dashboards and backlogs rather than runtime guardrails.

In a fast-moving cluster, “we saw abnormal traffic yesterday” is not a defense strategy. You need inline block/contain capabilities at the same speed as the abuse.

How East–West API Abuse Actually Happens in Kubernetes

To see why WAF + gateway isn’t enough, it helps to look at concrete patterns. These are exactly the kinds of incidents we see in cloud-native and AI-heavy stacks.

Lateral movement via internal APIs

An attacker:

Phishes a developer and gets access to an internal CI system.
Drops a container with a valid ServiceAccount into your cluster.
Uses that pod to call user-profile-service, billing-service, and document-service via cluster DNS.

Because:

The traffic is “internal,” so it bypasses the WAF and gateway.
Network policies are permissive or missing.
No inline checks tie requests back to workload identity and least privilege.

You end up with data exfiltration from APIs never meant to be externally accessible—but fully reachable from a compromised pod.
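The missing inline check can be sketched as a default-deny allowlist keyed on workload identity. This is a minimal illustration, not how any particular product implements it; the namespaces, ServiceAccount names, and the ALLOWED_CALLERS table are hypothetical.

```python
from typing import NamedTuple


class WorkloadIdentity(NamedTuple):
    namespace: str
    service_account: str


# Hypothetical least-privilege policy: which identities may call which services.
ALLOWED_CALLERS = {
    "billing-service": {WorkloadIdentity("payments", "checkout")},
    "user-profile-service": {
        WorkloadIdentity("payments", "checkout"),
        WorkloadIdentity("web", "frontend"),
    },
}


def is_call_allowed(caller: WorkloadIdentity, target_service: str) -> bool:
    """Default deny: a call passes only if the caller is explicitly listed."""
    return caller in ALLOWED_CALLERS.get(target_service, set())


# A pod dropped in via CI holds a valid ServiceAccount, but not one on the list.
rogue = WorkloadIdentity("ci", "build-runner")
print(is_call_allowed(rogue, "billing-service"))  # False
print(is_call_allowed(WorkloadIdentity("payments", "checkout"), "billing-service"))  # True
```

The design choice that matters is the default: a service with no entry in the table accepts no callers, so a valid but unlisted identity gets nowhere.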

Business logic abuse without signatures

Think of classic OWASP API issues:

Excessive Data Exposure (returning more data than needed).
Broken Object Level Authorization (BOLA).
Mass assignment and parameter tampering.

Now put them inside your cluster:

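The report-service endpoint GET /report?user=alice returns data for whatever value is passed in the user parameter, with no check that the caller is entitled to it.
A compromised internal tool or agent can simply loop through user=bob, user=carol, and so on from inside the cluster.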

Your WAF doesn’t see this; the gateway doesn’t front it. There are no signatures, only behavior patterns—and no runtime enforcement watching those internal calls.
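Two checks would have helped in this example: an object-level authorization test on each request, and a behavioral counter that notices one client asking for many different users. The sketch below shows both under simple assumptions; the function and class names, the admin set, and the threshold are hypothetical.

```python
from collections import defaultdict


def can_read_report(caller_user: str, requested_user: str, admins: frozenset) -> bool:
    """Object-level check: callers may read their own report; admins may read any."""
    return caller_user == requested_user or caller_user in admins


class EnumerationDetector:
    """Flags a client that requests more distinct users than a threshold allows."""

    def __init__(self, max_distinct_users: int):
        self.max_distinct_users = max_distinct_users
        self._seen = defaultdict(set)  # client id -> distinct user params requested

    def record(self, client_id: str, requested_user: str) -> bool:
        """Record one request; return True once the client exceeds the threshold."""
        self._seen[client_id].add(requested_user)
        return len(self._seen[client_id]) > self.max_distinct_users


detector = EnumerationDetector(max_distinct_users=2)
flags = [detector.record("internal-tool", u) for u in ["alice", "bob", "carol"]]
print(flags)  # [False, False, True]
print(can_read_report("alice", "bob", admins=frozenset()))  # False
```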

API and AI agent interplay: “0-click” internal abuse

As AI agents and MCP workflows spread, they become new internal API clients:

Agents in dev tools (IDEs, Jira, Slack) calling back into internal services.
MCP tools giving agents access to internal APIs and databases.
LLM-based automation pipelines hitting APIs from inside the VPC/cluster.

Attack chain example:

A prompt injection or jailbreak in a SaaS IDE agent convinces it to call an MCP tool.
The MCP tool is wired to an internal API (e.g., customer-360-service).
The agent issues malicious queries: “Dump all customer records,” “Delete logs from the last 24 hours.”
Calls stay inside your network and cluster. They never cross the WAF or external gateway.

From your current controls’ perspective, it looks like “trusted internal automation.” But it’s a 0-click attack path to sensitive APIs.

Why This Blindness Gets Worse in AI-Driven, API-Heavy Architectures

GenAI and agentic workflows amplify the east–west API problem:

API growth explodes – internal + third-party APIs multiply to feed models and agents.
MCP becomes the “API for agents” – adding a new class of internal tools and services.
Automation generates traffic you never modeled – agents chain tools and APIs in ways you didn’t anticipate.

If you don’t have runtime controls for:

Discovering unmanaged internal APIs (ghost/zombie endpoints).
Mapping agent/MCP/API relationships.
Enforcing trust boundaries and rate limits inside Kubernetes.

…then every new automation and model integration increases your internal attack surface—while your WAF and gateway metrics still look “green.”

What a Runtime, Kubernetes-Native API Defense Layer Must Do

You don’t fix east–west blindness by pushing more rules into the WAF or trying to force all traffic through a gateway. You fix it by putting enforcement where the abuse actually happens: inside the runtime, alongside your workloads.

Here’s what that layer needs to deliver.

1. Instant discovery of internal and third-party APIs

You can’t protect what you don’t know exists.

A runtime AI application defense platform like Operant:

Auto-discovers all live APIs across dev, staging, and prod:
- Internal microservice endpoints.
- Legacy services wrapped in Kubernetes.
- Third-party APIs called from within the cluster.
Builds live API blueprints and security graphs:
- Which workloads talk to which APIs.
- What data flows where (PII, secrets, business-critical paths).
- Which APIs are exposed only internally vs via gateway.

This gives you a living map of the “cloud within the cloud,” not a static OpenAPI doc that’s outdated in a week.
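As a rough illustration of what such a map contains, the sketch below folds observed flow records into a caller-to-endpoint graph and then lists the services that are reachable only internally. The record shape, workload names, and helper functions are assumptions made for the example, not Operant's data model.

```python
from collections import defaultdict


def build_call_graph(flows):
    """Map each caller workload to the set of (service, method, path) it was seen calling."""
    graph = defaultdict(set)
    for caller, service, method, path in flows:
        graph[caller].add((service, method, path))
    return dict(graph)


def internal_only(graph, gateway_services):
    """Services that appear in observed traffic but are not exposed via the gateway."""
    called = {service for endpoints in graph.values() for service, _, _ in endpoints}
    return called - gateway_services


# Hypothetical flow records: (caller workload, target service, HTTP method, path).
flows = [
    ("frontend", "user-service", "GET", "/users/{id}"),
    ("frontend", "billing-service", "POST", "/charges"),
    ("batch-job", "report-service", "GET", "/report"),
    ("frontend", "user-service", "GET", "/users/{id}"),  # duplicate, collapsed by the set
]

graph = build_call_graph(flows)
print(sorted(graph["frontend"]))
print(sorted(internal_only(graph, gateway_services={"user-service"})))
```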

2. Inline threat protection at the Kubernetes layer

Visibility is not enough; you need inline controls that can block and contain abuse as it happens.

Operant’s Kubernetes-native controls support:

Protocol-specific authentication and authorization for internal APIs.
Workload- and identity-aware microsegmentation:
- Only the services and agents that should talk to billing-service can.
- East–west communication is governed by trust zones, not flat networking.
Rate limiting and anomaly-based blocking:
- Lock down an API when a “trusted” internal client suddenly scrapes data at 100x normal rate.
- Stop mass enumeration or exfiltration across internal resources.

Instead of hoping your WAF signatures catch something at the edge, you’re enforcing least privilege and behavioral limits at runtime.
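The "100x normal rate" case can be sketched as a sliding-window guard that compares a client's recent request count with a multiple of its baseline. The class name, the baseline, and the window size below are assumptions; a real deployment would learn baselines from traffic and keep one guard per client and API.

```python
from collections import deque


class BaselineRateGuard:
    """Blocks a client whose request count in a sliding window exceeds a multiple of its baseline."""

    def __init__(self, baseline_per_window: int, multiplier: float, window_seconds: float):
        self.limit = baseline_per_window * multiplier
        self.window_seconds = window_seconds
        self._timestamps = deque()

    def allow(self, now: float) -> bool:
        """Record a request at time `now` (seconds); return False once the limit is exceeded."""
        # Drop requests that have aged out of the window.
        while self._timestamps and now - self._timestamps[0] >= self.window_seconds:
            self._timestamps.popleft()
        self._timestamps.append(now)
        return len(self._timestamps) <= self.limit


# Assumed numbers: a baseline of 2 requests per 60 s, blocked beyond 100x that.
guard = BaselineRateGuard(baseline_per_window=2, multiplier=100, window_seconds=60)
decisions = [guard.allow(now=i * 0.1) for i in range(250)]  # 250 requests in 25 s
print(decisions.count(True), decisions.count(False))  # 200 50
```

Timestamps are passed in explicitly so the behavior is easy to test; blocked requests still count toward the window, which keeps a misbehaving client locked out until it slows down.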

3. 3D Runtime Defense: Discovery, Detection, Defense

Operant frames this as 3D Runtime Defense:

Discovery – Live API and agent/MCP catalogs, ghost/zombie API detection, “cloud within the cloud” maps.
Detection – Real-time detections mapped to the OWASP Top 10 lists for APIs, LLMs, and Kubernetes, plus agentic risks like 0-click abuses and tool poisoning.
Defense – Inline actions:
- Block or isolate flows.
- Auto-redact sensitive data inline before it leaves the service.
- Apply allowlists/denylists, trust zones, and non-human identity (NHI) access controls.

This is a fundamentally different posture from “collect logs and open tickets.” It’s active, inline runtime enforcement.
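To give a feel for inline redaction, here is a deliberately small sketch that masks email addresses and card-like digit runs in a response body before it leaves the service. The two regular expressions and the placeholder strings are illustrative assumptions; production detectors need far more care than this.

```python
import re

# Illustrative patterns only; real detectors need far more care than two regexes.
EMAIL = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
CARD = re.compile(r"\b(?:\d[ -]?){13,16}\b")


def redact(text: str) -> str:
    """Replace email addresses and card-like digit runs with fixed placeholders."""
    text = EMAIL.sub("[REDACTED_EMAIL]", text)
    return CARD.sub("[REDACTED_CARD]", text)


print(redact("contact alice@example.com, card 4111 1111 1111 1111"))
# contact [REDACTED_EMAIL], card [REDACTED_CARD]
```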

4. AI- and MCP-aware controls for modern traffic patterns

For AI-intensive environments, you need controls that speak the language of agents and MCP, not just HTTP methods:

MCP Registry/Catalog – See which MCP servers, tools, and clients exist, who uses them, and which APIs they can reach.
Agent and tool governance – Discover managed and unmanaged agents (dev tools, SaaS, internal workflows) and their API call graph.
Runtime enforcement on agentic workflows:
- Enforce trust zones for which agents can access which tools/APIs.
- Detect and block prompt injection-driven behavior that tries to exfiltrate data or abuse internal APIs.
- Inline auto-redaction of sensitive data flowing through AI/NHI interfaces.

You’re not just securing “APIs in general”; you’re securing the actual agent + MCP + API patterns that define your east–west attack surface in the AI era.
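A trust zone for agents can be sketched as a group of agents plus the tools that group may invoke, with every tool call checked against it. The zone names, agent names, and tool names below are hypothetical, and the sketch leaves out how agent identity is established.

```python
from dataclasses import dataclass, field


@dataclass
class TrustZone:
    """A named group of agents and the MCP tools they are permitted to invoke."""
    name: str
    agents: set = field(default_factory=set)
    allowed_tools: set = field(default_factory=set)


def authorize_tool_call(zones, agent: str, tool: str) -> bool:
    """Allow the call only if some zone contains both the agent and the tool."""
    return any(agent in z.agents and tool in z.allowed_tools for z in zones)


# Hypothetical zones, agents, and tool names for illustration.
zones = [
    TrustZone("dev-tools", agents={"ide-agent"}, allowed_tools={"search-docs"}),
    TrustZone("support", agents={"support-bot"}, allowed_tools={"customer-360-read"}),
]

print(authorize_tool_call(zones, "ide-agent", "search-docs"))        # True
print(authorize_tool_call(zones, "ide-agent", "customer-360-read"))  # False
```

With a check like this in the path, a prompt-injected IDE agent that tries to reach a customer data tool is refused, however convincing the injected instruction was.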

5. Deployable in minutes, not another instrumentation project

All of this only works if it fits the way you actually run your clusters:

Kubernetes-native.
Minimal friction.
No six-month integration plan.

Operant is designed to be:

A single-step Helm install.
Zero instrumentation. Zero integrations. Works in <5 minutes.
Immediately useful on live traffic, with default guards and visibility.

That matters. Because if the answer to east–west abuse is “add sidecars and custom code everywhere,” most teams will never get there.

How This Complements (Not Replaces) Your WAF and API Gateway

This isn’t an either/or proposition. WAFs and gateways still matter:

WAF – Filters generic web exploits, DDoS, and obvious badness at the edge.
API gateway – Auth, rate limiting, and governance for curated public/partner APIs.

A runtime AI application defense platform like Operant:

Extends protection beyond the WAF:
- Internal and third-party API interactions.
- East–west traffic inside Kubernetes.
Closes gaps the gateway can’t:
- Unregistered internal APIs.
- Ghost/zombie endpoints.
- AI agent/MCP toolchains that never touch the gateway.
Consolidates controls:
- API threat protection.
- Kubernetes-native runtime security.
- AI runtime controls (agents, MCP, LLM supply chain).

The outcome: better protection, lower cost, more control—and no more pretending the perimeter is the whole story.

Final Verdict: You’re Blind Because Your Controls Stop at the Edge

If you’re wondering why you’re still blind to east–west API abuse inside Kubernetes, even with a WAF and an API gateway, the answer is simple:

Your controls were built for north–south, perimeter traffic. The real attacks now happen in the “cloud within the cloud”—internal APIs, services, MCP tools, and agents talking to each other at runtime.

To close that gap, you need:

Live discovery of internal and third-party APIs and agent/MCP workflows.
Kubernetes-native, identity-aware enforcement on east–west traffic.
Inline blocking, rate limiting, segmentation, and auto-redaction as data flows.
AI- and MCP-aware guardrails that protect modern agentic workflows, not just legacy REST calls.

You can’t secure AI without securing APIs. And you can’t secure APIs by looking only at the edge.
