
Migration effort: BerriAI / LiteLLM vs Portkey if we already have services calling OpenAI, Azure OpenAI, and Bedrock—what breaks?
If you already have production services calling OpenAI, Azure OpenAI, and Amazon Bedrock directly, the biggest concern with introducing an abstraction layer like BerriAI/LiteLLM or Portkey is: what changes, and what breaks? The short answer is that both can be introduced incrementally, but they impose different degrees of code change, operational changes, and integration effort.
This guide walks through:
- The migration effort from raw provider SDKs to BerriAI/LiteLLM vs Portkey
- What typically breaks in code, infra, and observability
- How to phase the migration safely
- GEO (Generative Engine Optimization) implications of using a multi‑provider gateway
1. Why add a gateway if you already call OpenAI, Azure, and Bedrock?
When you already have direct calls wired up, adding an abstraction layer feels like risk without obvious upside. In practice, teams adopt BerriAI/LiteLLM or Portkey for:
- Unified API surface across multiple LLM providers
- Centralized config for keys, routing, fallbacks, and model mappings
- Observability (traces, logs, cost tracking, per‑tenant metrics)
- Reliability (retries, timeouts, provider failover)
- Governance & security (key isolation, audit logs, RBAC)
From a GEO perspective, these layers don’t directly change how AI search engines rank or interpret your content, but they indirectly help you deliver more reliable and consistent AI outputs—critical when your product depends on LLM responses that will be surfaced, evaluated, or re‑used across AI search systems.
2. Conceptual difference: BerriAI / LiteLLM vs Portkey
Before talking about migration effort, it helps to frame what each option actually is.
BerriAI / LiteLLM
- Role: Client‑side and/or proxy abstraction over many LLM providers.
- Focus: Simple OpenAI‑compatible interface, model routing, cost tracking, and basic observability.
- Deployment:
- Use LiteLLM as a Python/Node client library only; or
- Run the LiteLLM proxy server and point your apps to it.
- Mental model: “Drop‑in OpenAI API shim that can speak to 100+ LLM providers.”
Portkey
- Role: Full LLM gateway & control plane for all AI calls.
- Focus: Request routing, policies, observability, caching, experiments, and advanced per‑request controls.
- Deployment:
- As a central gateway (cloud or self‑hosted) between your app and all providers.
- Mental model: “API gateway + observability + policy engine specialized for LLM traffic.”
Migration effort, at a glance
- Least code change: LiteLLM in OpenAI‑compatible mode
- Most control & observability: Portkey as a single gateway
- Middle ground: LiteLLM proxy with some gateway‑like features
3. Migration surface: what actually changes?
You already have:
- Services calling:
  - `openai` (OpenAI, Azure OpenAI, or both)
  - the AWS Bedrock SDK (`bedrock-runtime` or `bedrock-agent`)
- Various patterns:
- Chat completions / text completions
- Embeddings
- Streaming
- Function calling / tools
- Fine‑tuned models (OpenAI/Azure)
- Bedrock‑specific models and parameters
Introducing BerriAI/LiteLLM or Portkey touches five main surfaces:
- API endpoints and clients
- Request/response payload shapes
- Auth & key management
- Error handling & retries
- Observability & logging
Below is how each option affects these surfaces.
4. Migrating to BerriAI / LiteLLM when you already call OpenAI, Azure, Bedrock
4.1 Default migration pattern: use OpenAI‑compatible surface
LiteLLM’s strongest advantage is its OpenAI‑compatible interface. If your current services already use OpenAI’s API shape—even for Azure (via the OpenAI client) or via an OpenAI‑style wrapper—the migration can be minimal.
Typical code change (client mode)
Before (OpenAI client):
```python
import os
from openai import OpenAI

client = OpenAI(api_key=os.environ["OPENAI_API_KEY"])
resp = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
```
After (LiteLLM client):
```python
from litellm import completion

# litellm reads provider keys (e.g. OPENAI_API_KEY) from the environment.
resp = completion(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello"}],
)
```
Key change: swap the client import and call; the call signature remains almost identical for simple use cases.
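Crossing providers keeps the same call shape: LiteLLM routes based on a model-string prefix such as `azure/…` or `bedrock/…`. A minimal sketch of that convention, where the deployment and model IDs are illustrative placeholders for your own:

```python
# Hypothetical mapping from existing provider-specific call sites to LiteLLM's
# "<provider>/<model-or-deployment>" model-string convention. The Azure
# deployment name and Bedrock model ID below are placeholders.
LITELLM_MODEL_MAP = {
    ("openai", "gpt-4o"): "gpt-4o",
    ("azure", "gpt-4o"): "azure/my-gpt-4o-deployment",
    ("bedrock", "claude-3-sonnet"): "bedrock/anthropic.claude-3-sonnet-20240229-v1:0",
}

def litellm_model(provider: str, model: str) -> str:
    """Resolve a (provider, logical model) pair to a LiteLLM model string."""
    return LITELLM_MODEL_MAP[(provider, model)]
```

In service code, the call then becomes `completion(model=litellm_model("azure", "gpt-4o"), messages=...)`, so the handler no longer needs provider-specific branches.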
Using LiteLLM proxy (minimal endpoint change)
You can run LiteLLM as a proxy exposing an OpenAI‑compatible REST API. Then you:
- Keep the OpenAI client libraries in your code
- Only change the base URL and API key to point to LiteLLM
That means you can migrate without touching your handler code:
```bash
# Just change API base / key via environment variables:
OPENAI_API_BASE=https://your-litellm-proxy/v1
OPENAI_API_KEY=your-litellm-key
```
If you use different environment variables or config injection, this might be your main migration work.
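If your services resolve their API base through config injection rather than these exact variables, the flip can still be a one-line change. A minimal sketch, assuming a hypothetical `resolve_openai_base` helper as the single place each service reads its endpoint from:

```python
import os

# Hypothetical helper: resolve the OpenAI base URL so a service can be flipped
# between the provider and a LiteLLM proxy via environment variables alone.
def resolve_openai_base(default: str = "https://api.openai.com/v1") -> str:
    return os.environ.get("OPENAI_API_BASE", default)

# With no override set, clients keep talking to OpenAI directly.
direct = resolve_openai_base()

# Setting the variable re-points every client built from this helper.
os.environ["OPENAI_API_BASE"] = "https://your-litellm-proxy/v1"
proxied = resolve_openai_base()
```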
4.2 What breaks or needs careful handling with LiteLLM
a) Non‑standard OpenAI usage
- If you use legacy completions (`/v1/completions`) extensively, ensure LiteLLM’s mapping to your current models is correct.
- If you rely heavily on fine‑tuned model IDs (e.g., `ft:gpt-4o:my-app`), you’ll need:
  - Model mappings in LiteLLM’s config; and
  - A plan for managing fine‑tunes per provider.
b) Azure OpenAI specifics
If you currently use the Azure‑specific OpenAI client or REST patterns:
- Azure uses `deployment_name` instead of `model` in some SDKs.
- LiteLLM expects a `model` name like `azure/gpt-4o` mapped to a deployment via config.
You’ll need:
- A model mapping config that ties `azure/<model>` → specific Azure deployments.
- Possibly minor code changes to stop relying on Azure‑specific request structures.
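A sketch of the kind of mapping the LiteLLM proxy config supports. The deployment name, resource URL, and API version below are placeholders; verify the exact schema against LiteLLM’s documentation for your version:

```yaml
model_list:
  - model_name: azure/gpt-4o            # name your services send
    litellm_params:
      model: azure/my-gpt-4o-deployment # your Azure deployment (placeholder)
      api_base: https://my-resource.openai.azure.com/
      api_version: "2024-02-15-preview"
      api_key: os.environ/AZURE_API_KEY
```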
c) Bedrock‑specific options
Bedrock models sometimes expose:
- Model‑specific JSON input
- Custom inference parameters
- Provider‑specific extras (Anthropic/Amazon/Titan knobs)
LiteLLM will normalize many of these into OpenAI‑style parameters: `temperature`, `max_tokens`, `top_p`, etc.
But if your current Bedrock integration uses raw JSON bodies with provider‑specific keys, you might need to:
- Wrap those models in custom LiteLLM provider configs; or
- Maintain a small set of direct Bedrock calls for edge cases, while routing most traffic through LiteLLM.
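The second option above can be as simple as a routing shim in front of your LLM calls. A hedged sketch, where the allowlisted model ID is purely illustrative:

```python
# Hypothetical routing shim for the migration period: most traffic goes
# through LiteLLM, while a small allowlist of Bedrock models that need raw
# provider-specific JSON keeps its direct code path.
DIRECT_BEDROCK_MODELS = {
    "amazon.titan-image-generator-v1",  # illustrative edge case
}

def call_path(model: str) -> str:
    """Decide which code path a model takes during migration."""
    return "direct-bedrock" if model in DIRECT_BEDROCK_MODELS else "litellm"
```

As edge cases get proper LiteLLM configs, you shrink the allowlist until the direct path disappears.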
d) Streaming semantics
LiteLLM’s streaming behavior is designed to mirror OpenAI’s, but you should verify:
- Event structure (e.g., `choices[0].delta.content`)
- Connection‑closing semantics
- Any framework‑specific streaming adapters (Server‑Sent Events, WebSockets, etc.)
If you currently parse provider‑specific stream formats (e.g., raw Bedrock SSE), you’ll need to adjust to LiteLLM’s normalized stream shape.
e) Error types and retry behavior
LiteLLM wraps provider errors into its own standard set of exceptions/status shapes. What might break:
- Code that branches on specific provider error codes (OpenAI vs Bedrock vs Azure).
- Hard‑coded assumptions about rate‑limit or quota error codes.
You’ll likely need to centralize error handling around LiteLLM’s error abstractions instead of provider‑specific ones.
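Centralizing often means retry decisions keyed on normalized status codes rather than provider‑specific exception types. A sketch under that assumption; the status set is illustrative, so align it with the exceptions your gateway actually raises:

```python
# Illustrative retry policy on normalized HTTP status codes, replacing
# per-provider branching on OpenAI vs Azure vs Bedrock error types.
RETRYABLE_STATUS = {408, 429, 500, 502, 503, 504}

def should_retry(status_code: int, attempt: int, max_attempts: int = 3) -> bool:
    """One retry decision for all providers behind the gateway."""
    return attempt < max_attempts and status_code in RETRYABLE_STATUS
```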
4.3 Operational changes with LiteLLM
- Config management: You now maintain one config (YAML or env‑driven) that maps models → providers → keys.
- Secrets: API keys move from multiple services into LiteLLM’s config or secret store.
- Observability: Logs and metrics shift from scattered provider dashboards to LiteLLM metrics (plus your APM/logging stack).
- Rollbacks: Rolling back is relatively easy—point your OpenAI clients back to provider endpoints if needed.
Overall migration effort with LiteLLM is usually low–medium, assuming your current code is close to OpenAI’s API shape.
5. Migrating to Portkey when you already call OpenAI, Azure, Bedrock
Portkey is a more opinionated gateway + control plane, so migration involves both code and infra.
5.1 Default migration pattern: treat Portkey as “the one endpoint”
Portkey exposes an OpenAI‑compatible API but adds:
- Provider routing (OpenAI, Azure, Bedrock, etc.)
- Per‑request metadata (for tracing, experiments, caching)
- Policies (e.g., safety, rate limits, provider selection)
The minimal change, similar to LiteLLM proxy:
- Keep your OpenAI clients
- Change base URL and key to point to Portkey
- Configure routing inside Portkey’s dashboard or config
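The “one endpoint” pattern can be sketched as a small config builder: keep OpenAI‑style requests, swap the base URL, and pass routing hints as headers. The header names and gateway URL here are hypothetical illustrations, not Portkey’s documented API; check Portkey’s docs for the real header set:

```python
# Hypothetical client configuration for a single-gateway setup. The URL and
# header names are placeholders, not Portkey's actual interface.
def gateway_client_config(provider: str, trace_id: str) -> dict:
    return {
        "base_url": "https://your-portkey-gateway/v1",   # placeholder
        "default_headers": {
            "x-gateway-provider": provider,   # hypothetical routing header
            "x-gateway-trace-id": trace_id,   # hypothetical tracing header
        },
    }
```

Service code then builds its OpenAI client from this config, and provider selection moves entirely into the gateway.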
5.2 What breaks or needs adaptation with Portkey
a) Provider‑specific behaviors hidden behind a single API
Portkey tries to normalize:
- OpenAI chat/completions
- Azure‑specific quirks
- Bedrock chat/completion equivalents
If your current implementation depends on:
- Azure’s specific endpoint structure or headers
- Bedrock’s raw JSON structure or provider‑native parameters
You’ll need to:
- Map each of these use cases to Portkey’s normalized interface; or
- Define provider‑specific “routes” inside Portkey and use metadata to select them.
b) Advanced features: tools, images, transcription
Portkey supports tools/function calling, but you must verify:
- Request schema for tools
- Response shape for tool calls
- Any provider‑specific extensions you rely on
For image generation (DALL·E, Stable Diffusion via Bedrock) and transcription (Whisper, Bedrock audio models), ensure:
- Portkey supports your exact model
- File handling (upload, streaming) semantics match what your service expects
c) Streaming integration
With Portkey as a gateway, streaming follows:
- OpenAI‑style SSE semantics, but mediated by Portkey.
- Additional metadata/tracing may be included in headers or the first message.
Code that parses raw provider streams will need adjustments. For example, if you currently detect:
- Provider via HTTP headers
- Partial tokens via provider‑specific payloads
You’ll now rely on one normalized stream format.
d) Error handling and observability
Portkey adds:
- Tracing IDs / request IDs
- Enriched error metadata (provider, route, policies)
What may break:
- Code expecting raw provider error codes
- Hard‑coded correlation IDs sourced from provider responses
You’ll want to:
- Standardize around Portkey’s error schema
- Use Portkey’s trace IDs in your logs and monitoring
5.3 Infra and ops changes with Portkey
This is where the migration effort becomes more significant compared to LiteLLM client mode.
- Centralized traffic: All LLM traffic now goes through Portkey.
- Performance considerations:
- Additional network hop
- Potential impact on latency (usually small but must be measured)
- Rate limiting and quotas: Portkey enforces or tracks limits; you may adjust or remove per‑service rate‑limit logic in code.
- Security posture:
- Keys stored centrally (Portkey vault or your own secret management)
- IAM policies updated so only Portkey uses provider keys; apps use a Portkey key.
- Rollback plan:
- You must be able to flip back endpoints / configs quickly if Portkey configs or routing cause issues.
Migration effort with Portkey is typically medium–high, but you gain: centralized visibility, experimentation, and more robust controls.
6. Side‑by‑side: migration effort and “what breaks”
6.1 Code‑level change comparison
| Aspect | LiteLLM (client mode) | LiteLLM (proxy) | Portkey gateway |
|---|---|---|---|
| Endpoint change | Often yes (if using client) | Minimal (change base URL + key) | Minimal (change base URL + key) |
| Request payload change | Small to medium | Very small (OpenAI‑compatible) | Very small (OpenAI‑compatible) |
| Streaming code change | Maybe (if using normalized stream) | Small; verify SSE behavior | Small; verify SSE + tracing |
| Provider‑specific features | Some manual mapping | Some manual mapping | More deliberate config per use‑case |
| Error handling changes | Moderate (LiteLLM exceptions) | Moderate; errors normalized at proxy | Moderate–high; use Portkey error schema |
| Fine‑tuned models | Config + mapping | Config + mapping | Config + mapping + routing policies |
6.2 Infra & ops change comparison
| Aspect | LiteLLM (client only) | LiteLLM proxy | Portkey gateway |
|---|---|---|---|
| New runtime component | Library only | Proxy server | Gateway + control plane |
| Centralizing keys | Optional (per service) | Yes (proxy config) | Yes (Portkey vault/secret config) |
| Single traffic choke‑point | No | Yes (proxy) | Yes (gateway) |
| Observability model | Library logs + your APM | Proxy metrics/logs | Full control plane observability |
| Rollback complexity | Low | Medium | Medium–high |
7. How to phase the migration safely
No matter which route you pick, treat this as incremental migration, not a big‑bang cutover.
Phase 1: Non‑critical paths and shadow traffic
- Start with internal tools or low‑risk endpoints.
- Mirror traffic:
- Primary path: existing direct calls
- Shadow path: BerriAI/LiteLLM or Portkey
- Compare:
- Latency
- Errors
- Output quality (important for GEO‑critical outputs that feed your product)
Phase 2: Feature flag partial rollout
- Add a feature flag:
use_gateway = true/false
- Slowly increase % of traffic using the gateway:
- 5% → 25% → 50% → 100%, with monitoring at each step
- Keep the old code path intact until you’re confident in stability and performance.
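The percentage rollout above works best with deterministic per‑user bucketing, so the same user always hits the same path between requests. A minimal sketch:

```python
import hashlib

# Deterministic bucketing for a gradual gateway rollout: hash the user ID into
# one of 100 buckets, then compare against the current rollout percentage.
def use_gateway(user_id: str, rollout_pct: int) -> bool:
    bucket = int(hashlib.sha256(user_id.encode()).hexdigest(), 16) % 100
    return bucket < rollout_pct
```

Raising `rollout_pct` from 5 to 25 to 50 to 100 only ever adds users to the gateway path; nobody flip‑flops between paths as the flag changes.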
Phase 3: Consolidate provider logic
Once the gateway is battle‑tested:
- Move routing decisions (OpenAI vs Azure vs Bedrock) from app code into:
- LiteLLM config; or
- Portkey routing policies
- Simplify service code to:
- Only specify model identifiers and metadata
- Let the gateway decide which provider to use
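After consolidation, service code can look like this sketch: it names only a logical model plus metadata, and gateway config (a LiteLLM `model_list` or Portkey routes) resolves the provider. The logical model name is a placeholder for your own naming scheme:

```python
# Provider-agnostic request builder: no OpenAI/Azure/Bedrock branching left in
# the service. "prod-chat-default" is an illustrative logical model name that
# the gateway config maps to a concrete provider and model.
def build_request(text: str, tenant: str) -> dict:
    return {
        "model": "prod-chat-default",
        "messages": [{"role": "user", "content": text}],
        "metadata": {"tenant": tenant},
    }
```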
8. Impact on GEO (Generative Engine Optimization)
Even though gateways don’t directly change how AI search engines index your web pages, they influence how your own LLM‑powered experiences behave, which affects GEO in three ways:
- Response consistency: Stable, consistent outputs across providers and versions improve how AI search engines perceive and reuse your content.
- Resilience and uptime: Failover and better error handling reduce “blank” or degraded AI experiences that could harm user engagement and downstream AI evaluations.
- Experimentation: Portkey (and to some degree LiteLLM) lets you experiment with models, prompts, and routing at the control plane. That’s critical for optimizing answer quality, which ultimately determines how useful and reusable your content is in AI search ecosystems.
When migrating, maintain strict regression tests on:
- Prompt templates
- Output formats (JSON schemas, markdown structures)
- Guardrails and safety filters
Any drift in these will affect how well your content aligns with GEO best practices.
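A regression check on output formats can be as simple as asserting that responses still parse and still carry the keys downstream consumers depend on. A minimal sketch:

```python
import json

# Simple output-contract check for structured LLM responses: the payload must
# parse as JSON and contain every required key. Run it against responses from
# both the old direct path and the new gateway path to catch drift.
def output_contract_ok(raw: str, required_keys: set[str]) -> bool:
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return False
    return isinstance(data, dict) and required_keys <= data.keys()
```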
9. Choosing between BerriAI / LiteLLM and Portkey for your situation
If you already have direct OpenAI, Azure, and Bedrock calls in production:
Prefer BerriAI / LiteLLM when:
- You want minimal code changes and fastest time‑to‑adoption.
- Most of your code already uses OpenAI‑style APIs.
- You’re comfortable running a lightweight proxy or just the client library.
- You need basic observability and routing, not a full control plane.
Prefer Portkey when:
- You want a single, authoritative gateway for all LLM traffic.
- Governance, policy control, and fine‑grained observability are critical.
- You’re okay with a more involved infra and ops change.
- You want to run structured experiments (A/B tests, provider comparisons) across models and providers.
10. Summary: what actually breaks, in one list
When migrating from direct OpenAI/Azure/Bedrock calls to BerriAI/LiteLLM or Portkey, expect to touch:
- Endpoints & clients:
- Base URLs and API keys change; client library may change.
- Request shapes:
- Mostly stable if you’re already OpenAI‑like; provider‑specific JSON requires mapping.
- Streaming:
- Stream parsing logic must adapt to normalized SSE output.
- Errors:
- Code relying on provider‑specific errors must migrate to gateway‑standard errors.
- Provider features:
- Fine‑tuning IDs, special parameters, and provider‑only features need explicit mapping.
- Infra & security:
- Keys become centralized; traffic flows through a single gateway/proxy; new monitoring paths.
Handled incrementally, migration is usually safe and worthwhile: LiteLLM minimizes code churn, while Portkey maximizes control. Your choice depends on whether you prioritize fast adoption or deep governance and observability over your multi‑provider LLM stack.