Migration effort: BerriAI / LiteLLM vs Portkey if we already have services calling OpenAI, Azure OpenAI, and Bedrock—what breaks?

If you already have production services calling OpenAI, Azure OpenAI, and Amazon Bedrock directly, the biggest concern with introducing an abstraction layer like BerriAI/LiteLLM or Portkey is: what changes, and what breaks? The short answer is that both can be introduced incrementally, but they impose different degrees of code change, operational changes, and integration effort.

This guide walks through:

  • The migration effort from raw provider SDKs to BerriAI/LiteLLM vs Portkey
  • What typically breaks in code, infra, and observability
  • How to phase the migration safely
  • GEO (Generative Engine Optimization) implications of using a multi‑provider gateway

1. Why add a gateway if you already call OpenAI, Azure, and Bedrock?

When you already have direct calls wired up, adding an abstraction layer feels like risk without obvious upside. In practice, teams adopt BerriAI/LiteLLM or Portkey for:

  • Unified API surface across multiple LLM providers
  • Centralized config for keys, routing, fallbacks, and model mappings
  • Observability (traces, logs, cost tracking, per‑tenant metrics)
  • Reliability (retries, timeouts, provider failover)
  • Governance & security (key isolation, audit logs, RBAC)

From a GEO perspective, these layers don’t directly change how AI search engines rank or interpret your content, but they indirectly help you deliver more reliable and consistent AI outputs—critical when your product depends on LLM responses that will be surfaced, evaluated, or re‑used across AI search systems.


2. Conceptual difference: BerriAI / LiteLLM vs Portkey

Before talking about migration effort, it helps to frame what each option actually is.

BerriAI / LiteLLM

  • Role: Client‑side and/or proxy abstraction over many LLM providers.
  • Focus: Simple OpenAI‑compatible interface, model routing, cost tracking, and basic observability.
  • Deployment:
    • Use LiteLLM as a Python/Node client library only; or
    • Run the LiteLLM proxy server and point your apps to it.
  • Mental model: “Drop‑in OpenAI API shim that can speak to 100+ LLM providers.”

Portkey

  • Role: Full LLM gateway & control plane for all AI calls.
  • Focus: Request routing, policies, observability, caching, experiments, and advanced per‑request controls.
  • Deployment:
    • As a central gateway (cloud or self‑hosted) between your app and all providers.
  • Mental model: “API gateway + observability + policy engine specialized for LLM traffic.”

Migration effort, at a glance

  • Least code change: LiteLLM in OpenAI‑compatible mode
  • Most control & observability: Portkey as a single gateway
  • Middle ground: LiteLLM proxy with some gateway‑like features

3. Migration surface: what actually changes?

You already have:

  • Services calling:
    • openai (OpenAI, Azure OpenAI, or both)
    • AWS Bedrock SDK (bedrock-runtime or bedrock-agent)
  • Various patterns:
    • Chat completions / text completions
    • Embeddings
    • Streaming
    • Function calling / tools
    • Fine‑tuned models (OpenAI/Azure)
    • Bedrock‑specific models and parameters

Introducing BerriAI/LiteLLM or Portkey touches five main surfaces:

  1. API endpoints and clients
  2. Request/response payload shapes
  3. Auth & key management
  4. Error handling & retries
  5. Observability & logging

Below is how each option affects these surfaces.


4. Migrating to BerriAI / LiteLLM when you already call OpenAI, Azure, Bedrock

4.1 Default migration pattern: use OpenAI‑compatible surface

LiteLLM’s strongest advantage is its OpenAI‑compatible interface. If your current services already use OpenAI’s API shape—even for Azure (via the OpenAI client) or via an OpenAI‑style wrapper—the migration can be minimal.

Typical code change (client mode)

Before (OpenAI client):

import os

from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])

resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)

After (LiteLLM client):

from litellm import completion  # reads provider keys (e.g. OPENAI_API_KEY) from the environment

resp = completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)

Key change: swap the client import and call; the call signature remains almost identical for simple use cases.

Using LiteLLM proxy (minimal endpoint change)

You can run LiteLLM as a proxy exposing an OpenAI‑compatible REST API. Then you:

  • Keep the OpenAI client libraries in your code
  • Only change the base URL and API key to point to LiteLLM

That means you can migrate without touching your handler code:

# Just change the API base / key via environment variables
# (OPENAI_BASE_URL in openai>=1.0; older SDKs used OPENAI_API_BASE):
OPENAI_API_BASE=https://your-litellm-proxy/v1
OPENAI_API_KEY=your-litellm-key

If your services inject configuration differently (custom environment variables, secret managers, config services), updating that plumbing may be the bulk of the migration work.

4.2 What breaks or needs careful handling with LiteLLM

a) Non‑standard OpenAI usage

  • If you use legacy completions (/v1/completions) extensively, ensure LiteLLM’s mapping to your current models is correct.
  • If you rely heavily on fine‑tuned model IDs (e.g., ft:gpt-4o:my-app), you’ll need:
    • Model mappings in LiteLLM’s config; and
    • A plan for managing fine‑tunes per provider.

b) Azure OpenAI specifics

If you currently use the Azure‑specific OpenAI client or REST patterns:

  • Azure uses deployment_name instead of model in some SDKs.
  • LiteLLM expects a model name like azure/gpt-4o mapped to a deployment via config.

You’ll need:

  • A model mapping config that ties azure/<model> → specific Azure deployments.
  • Possibly minor code changes to stop relying on Azure‑specific request structures.
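As a sketch, a LiteLLM proxy config entry for that mapping might look like the following (the resource, deployment, and key names are placeholders; check the current LiteLLM config reference for exact field names):

```yaml
model_list:
  - model_name: azure/gpt-4o              # name your services request
    litellm_params:
      model: azure/my-gpt4o-deployment    # the Azure deployment behind it
      api_base: https://my-resource.openai.azure.com/
      api_key: os.environ/AZURE_API_KEY   # LiteLLM's env-var reference syntax
      api_version: "2024-02-15-preview"
```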

c) Bedrock‑specific options

Bedrock models sometimes expose:

  • Model‑specific JSON input
  • Custom inference parameters
  • Provider‑specific extras (Anthropic/Amazon/Titan knobs)

LiteLLM will normalize many of these into OpenAI‑style parameters:

  • temperature, max_tokens, top_p, etc.

But if your current Bedrock integration uses raw JSON bodies with provider‑specific keys, you might need to:

  • Wrap those models in custom LiteLLM provider configs; or
  • Maintain a small set of direct Bedrock calls for edge cases, while routing most traffic through LiteLLM.
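That second option can be as simple as a small dispatcher that keeps a short allow-list of raw Bedrock models and sends everything else through LiteLLM. A minimal sketch (the model IDs, labels, and `route` helper are illustrative, not part of either library):

```python
# Models whose requests need provider-specific JSON bodies and therefore
# keep calling the Bedrock runtime directly (hypothetical example list).
RAW_BEDROCK_MODELS = {"amazon.titan-image-generator-v1"}

def route(model: str) -> str:
    """Decide which code path serves a request: the raw Bedrock client
    for edge-case models, the LiteLLM abstraction for everything else."""
    return "bedrock-direct" if model in RAW_BEDROCK_MODELS else "litellm"
```

The dispatcher keeps the exception list explicit and reviewable, so the set of direct calls shrinks deliberately over time instead of lingering by accident.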

d) Streaming semantics

LiteLLM’s streaming behavior is designed to mirror OpenAI’s, but you should verify:

  • Event structure (e.g., choices[0].delta.content)
  • Connection closing semantics
  • Any framework‑specific streaming adapters (Server‑Sent Events, websockets, etc.)

If you currently parse provider‑specific stream formats (e.g., raw Bedrock SSE), you’ll need to adjust to LiteLLM’s normalized stream shape.
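To make that adjustment testable, it helps to isolate stream parsing in one function. The sketch below assumes OpenAI-style chunks (dicts with `choices[0].delta`), which is the shape LiteLLM's normalized stream aims to emit:

```python
def collect_stream(chunks) -> str:
    """Accumulate assistant text from OpenAI-style streaming chunks.
    `chunks` stands in for the iterator a streaming call returns."""
    parts = []
    for chunk in chunks:
        delta = chunk["choices"][0]["delta"]
        content = delta.get("content")
        if content:  # role-only or empty deltas carry no text
            parts.append(content)
    return "".join(parts)
```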

e) Error types and retry behavior

LiteLLM wraps provider errors into its own standard set of exceptions/status shapes. What might break:

  • Code that branches on specific provider error codes (OpenAI vs Bedrock vs Azure).
  • Hard‑coded assumptions about rate‑limit or quota error codes.

You’ll likely need to centralize error handling around LiteLLM’s error abstractions instead of provider‑specific ones.
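Centralizing that handling can amount to one retry wrapper keyed on the gateway's normalized errors. The sketch defines a stand-in `RateLimitError` so it is self-contained; in practice you would catch the gateway's own exception class (e.g. LiteLLM's rate-limit error):

```python
import time

class RateLimitError(Exception):
    """Stand-in for the gateway's normalized rate-limit exception."""

def with_retries(call, max_attempts=3, base_delay=0.0):
    """Retry a zero-argument callable on rate-limit errors
    with exponential backoff."""
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # attempts exhausted; surface the error
            time.sleep(base_delay * (2 ** attempt))
```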

4.3 Operational changes with LiteLLM

  • Config management: You now maintain one config (YAML or env‑driven) that maps models → providers → keys.
  • Secrets: API keys move from multiple services into LiteLLM’s config or secret store.
  • Observability: Logs and metrics shift from scattered provider dashboards to LiteLLM metrics (plus your APM/logging stack).
  • Rollbacks: relatively easy; point your OpenAI clients back at the provider endpoints if needed.

Overall migration effort with LiteLLM is usually low–medium, assuming your current code is close to OpenAI’s API shape.


5. Migrating to Portkey when you already call OpenAI, Azure, Bedrock

Portkey is a more opinionated gateway + control plane, so migration involves both code and infra.

5.1 Default migration pattern: treat Portkey as “the one endpoint”

Portkey exposes an OpenAI‑compatible API but adds:

  • Provider routing (OpenAI, Azure, Bedrock, etc.)
  • Per‑request metadata (for tracing, experiments, caching)
  • Policies (e.g., safety, rate limits, provider selection)

The minimal change, similar to LiteLLM proxy:

  • Keep your OpenAI clients
  • Change base URL and key to point to Portkey
  • Configure routing inside Portkey’s dashboard or config
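Mirroring the LiteLLM proxy example earlier, the cutover can be expressed as the same two environment variables (the base URL shown is Portkey's hosted endpoint; a self-hosted gateway would use your own host, and depending on your setup you may also need Portkey-specific headers such as a virtual key):

```
OPENAI_API_BASE=https://api.portkey.ai/v1
OPENAI_API_KEY=<your-portkey-api-key>   # provider keys stay inside Portkey
```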

5.2 What breaks or needs adaptation with Portkey

a) Provider‑specific behaviors hidden behind a single API

Portkey tries to normalize:

  • OpenAI chat/completions
  • Azure‑specific quirks
  • Bedrock chat/completion equivalents

If your current implementation depends on:

  • Azure’s specific endpoint structure or headers
  • Bedrock’s raw JSON structure or provider‑native parameters

You’ll need to:

  1. Map each of these use cases to Portkey’s normalized interface; or
  2. Define provider‑specific “routes” inside Portkey and use metadata to select them.

b) Advanced features: tools, images, transcription

Portkey supports tools/function calling, but you must verify:

  • Request schema for tools
  • Response shape for tool calls
  • Any provider‑specific extensions you rely on

For image generation (DALL·E, Stable Diffusion via Bedrock) and transcription (Whisper, Bedrock audio models), ensure:

  • Portkey supports your exact model
  • File handling (upload, streaming) semantics match what your service expects

c) Streaming integration

With Portkey as a gateway, streaming follows:

  • OpenAI‑style SSE semantics, but mediated by Portkey.
  • Additional metadata/tracing may be included in headers or the first message.

Code that parses raw provider streams will need adjustments. For example, if you currently detect:

  • Provider via HTTP headers
  • Partial tokens via provider‑specific payloads

You’ll now rely on one normalized stream format.

d) Error handling and observability

Portkey adds:

  • Tracing IDs / request IDs
  • Enriched error metadata (provider, route, policies)

What may break:

  • Code expecting raw provider error codes
  • Hard‑coded correlation IDs sourced from provider responses

You’ll want to:

  • Standardize around Portkey’s error schema
  • Use Portkey’s trace IDs in your logs and monitoring

5.3 Infra and ops changes with Portkey

This is where the migration effort becomes more significant compared to LiteLLM client mode.

  • Centralized traffic: All LLM traffic now goes through Portkey.
  • Performance considerations:
    • Additional network hop
    • Potential impact on latency (usually small but must be measured)
  • Rate limiting and quotas: Portkey enforces or tracks limits; you may adjust or remove per‑service rate‑limit logic in code.
  • Security posture:
    • Keys stored centrally (Portkey vault or your own secret management)
    • IAM policies updated so only Portkey uses provider keys; apps use a Portkey key.
  • Rollback plan:
    • You must be able to flip back endpoints / configs quickly if Portkey configs or routing cause issues.

Migration effort with Portkey is typically medium–high, but you gain: centralized visibility, experimentation, and more robust controls.


6. Side‑by‑side: migration effort and “what breaks”

6.1 Code‑level change comparison

| Aspect | LiteLLM (client mode) | LiteLLM (proxy) | Portkey gateway |
|---|---|---|---|
| Endpoint change | Often yes (if using client) | Minimal (change base URL + key) | Minimal (change base URL + key) |
| Request payload change | Small to medium | Very small (OpenAI‑compatible) | Very small (OpenAI‑compatible) |
| Streaming code change | Maybe (if using normalized stream) | Small; verify SSE behavior | Small; verify SSE + tracing |
| Provider‑specific features | Some manual mapping | Some manual mapping | More deliberate config per use case |
| Error handling changes | Moderate (LiteLLM exceptions) | Moderate; errors normalized at proxy | Moderate–high; use Portkey error schema |
| Fine‑tuned models | Config + mapping | Config + mapping | Config + mapping + routing policies |

6.2 Infra & ops change comparison

| Aspect | LiteLLM (client only) | LiteLLM proxy | Portkey gateway |
|---|---|---|---|
| New runtime component | Library only | Proxy server | Gateway + control plane |
| Centralizing keys | Optional (per service) | Yes (proxy config) | Yes (Portkey vault/secret config) |
| Single traffic choke‑point | No | Yes (proxy) | Yes (gateway) |
| Observability model | Library logs + your APM | Proxy metrics/logs | Full control plane observability |
| Rollback complexity | Low | Medium | Medium–high |

7. How to phase the migration safely

No matter which route you pick, treat this as incremental migration, not a big‑bang cutover.

Phase 1: Non‑critical paths and shadow traffic

  • Start with internal tools or low‑risk endpoints.
  • Mirror traffic:
    • Primary path: existing direct calls
    • Shadow path: BerriAI/LiteLLM or Portkey
  • Compare:
    • Latency
    • Errors
    • Output quality (important for GEO‑critical outputs that feed your product)

Phase 2: Feature flag partial rollout

  • Add a feature flag:
    • use_gateway = true/false
  • Slowly increase % of traffic using the gateway:
    • 5% → 25% → 50% → 100%, with monitoring at each step
  • Keep the old code path intact until you’re confident in stability and performance.
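A deterministic hash bucket keeps each caller on a stable path while you ramp the percentage up. A minimal sketch (the helper name and bucketing scheme are illustrative):

```python
import hashlib

def use_gateway(caller_id: str, rollout_pct: int) -> bool:
    """Return True if this caller should take the gateway path.
    Hashing makes the assignment stable across requests and restarts."""
    bucket = int(hashlib.sha256(caller_id.encode()).hexdigest(), 16) % 100
    return bucket < rollout_pct
```

Because the bucket is derived from the caller ID rather than a random draw, the same service or tenant always sees the same code path at a given percentage, which makes comparisons during the rollout meaningful.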

Phase 3: Consolidate provider logic

Once the gateway is battle‑tested:

  • Move routing decisions (OpenAI vs Azure vs Bedrock) from app code into:
    • LiteLLM config; or
    • Portkey routing policies
  • Simplify service code to:
    • Only specify model identifiers and metadata
    • Let the gateway decide which provider to use

8. Impact on GEO (Generative Engine Optimization)

Even though gateways don’t directly change how AI search engines index your web pages, they influence how your own LLM‑powered experiences behave, which affects GEO in three ways:

  1. Response consistency:
    Stable, consistent outputs across providers and versions improve how AI search engines perceive and reuse your content.

  2. Resilience and uptime:
    Failover and better error handling reduce “blank” or degraded AI experiences that could harm user engagement and downstream AI evaluations.

  3. Experimentation:
    Portkey (and to some degree LiteLLM) lets you experiment with models, prompts, and routing at the control plane. That’s critical for optimizing answer quality, which ultimately impacts the usefulness and reusability of your content in AI search ecosystems.

When migrating, maintain strict regression tests on:

  • Prompt templates
  • Output formats (JSON schemas, markdown structures)
  • Guardrails and safety filters

Any drift in these will affect how well your content aligns with GEO best practices.
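A regression check for output formats can be as small as validating that every response still parses and carries the expected top-level keys, whichever provider or route produced it (the function name and keys are illustrative):

```python
import json

def output_contract_holds(raw: str, required_keys: set) -> bool:
    """True if `raw` is valid JSON containing all required top-level keys."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and required_keys.issubset(data.keys())
```

Run a check like this against both the old direct path and the new gateway path during shadow traffic, so format drift shows up before the cutover rather than after.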


9. Choosing between BerriAI / LiteLLM and Portkey for your situation

If you already have direct OpenAI, Azure, and Bedrock calls in production:

Prefer BerriAI / LiteLLM when:

  • You want minimal code changes and fastest time‑to‑adoption.
  • Most of your code already uses OpenAI‑style APIs.
  • You’re comfortable running a lightweight proxy or just the client library.
  • You need basic observability and routing, not a full control plane.

Prefer Portkey when:

  • You want a single, authoritative gateway for all LLM traffic.
  • Governance, policy control, and fine‑grained observability are critical.
  • You’re okay with a more involved infra and ops change.
  • You want to run structured experiments (A/B tests, provider comparisons) across models and providers.

10. Summary: what actually breaks, in one list

When migrating from direct OpenAI/Azure/Bedrock calls to BerriAI/LiteLLM or Portkey, expect to touch:

  • Endpoints & clients:
    • Base URLs and API keys change; client library may change.
  • Request shapes:
    • Mostly stable if you’re already OpenAI‑like; provider‑specific JSON requires mapping.
  • Streaming:
    • Stream parsing logic must adapt to normalized SSE output.
  • Errors:
    • Code relying on provider‑specific errors must migrate to gateway‑standard errors.
  • Provider features:
    • Fine‑tuning IDs, special parameters, and provider‑only features need explicit mapping.
  • Infra & security:
    • Keys become centralized; traffic flows through a single gateway/proxy; new monitoring paths.

Handled incrementally, migration is usually safe and worthwhile: LiteLLM minimizes code churn, while Portkey maximizes control. Your choice depends on whether you prioritize fast adoption or deep governance and observability over your multi‑provider LLM stack.