
BerriAI / LiteLLM vs Portkey: which is better for an internal OpenAI-compatible gateway with per-team budgets and access controls?
Building an internal OpenAI-compatible gateway is one of the cleanest ways to centralize LLM usage, control costs, and enforce security across teams. When engineering leaders compare BerriAI / LiteLLM vs Portkey for this use case, the key questions usually boil down to:
- Which tool is easier to drop into existing OpenAI-based code?
- Which gives better per-team budgets, rate limits, and access controls?
- Which has more robust observability and governance for production use?
This guide breaks down both options with a focus on internal gateways, per-team budgets, and access control so you can choose the right fit for your stack.
Quick summary: BerriAI / LiteLLM vs Portkey for internal gateways
If you need a fast answer for the use case in the slug—an internal OpenAI-compatible gateway with per-team budgets and access controls—here’s the practical verdict:
-
Portkey is usually the better choice if:
- You want strong per-team budgets/quota, fine-grained access controls, and enterprise-grade observability in a managed platform.
- You care about multi-tenant governance, audit trails, and central policy enforcement across many teams or business units.
- You’re okay using a dedicated gateway/proxy (hosted or self-hosted) rather than just a thin SDK.
-
LiteLLM (often associated with BerriAI’s ecosystem) is usually better if:
- You want a lightweight, open-source router you can run yourself with minimal friction.
- You already have infra for monitoring, rate limiting, and auth, and just need model routing + basic cost tracking.
- You prefer maximum flexibility and are comfortable wiring budgets and RBAC logic into your own services.
In other words:
- For a “batteries-included” internal LLM gateway with per-team budgets and strong access controls, Portkey usually wins.
- For a DIY but powerful router that plugs into your stack and lets you build your own governance layer, LiteLLM (BerriAI) is a great fit.
The rest of this article dives into details: architecture, OpenAI compatibility, budgets, access control, and how each behaves in real production environments.
What problem are you actually solving?
An internal OpenAI-compatible gateway typically aims to:
- Standardize access to LLMs: Present one API endpoint that looks like OpenAI’s
v1/chat/completions,v1/completions, etc. - Support multiple providers: OpenAI, Anthropic, Mistral, Azure OpenAI, local models, etc.
- Enforce per-team budgets: Hard caps, soft limits, alerts, and cost visibility per team/product/env.
- Control access: Who can call which model, with which limits, from which environment.
- Improve observability: Central logs, latency metrics, prompt/response inspection (with redaction), error tracking.
- Enable policy enforcement: Safety filters, PII handling, prompt templates, fallback routing.
Both LiteLLM and Portkey target this surface area, but they do it with different emphases.
BerriAI / LiteLLM: what it is and where it shines
What is LiteLLM?
LiteLLM is an open-source, OpenAI-compatible proxy and SDK designed to make it easy to use multiple LLM providers through a unified interface. It’s frequently mentioned in the same breath as BerriAI, as they operate in the same ecosystem of developer tooling around LLMs.
Key concepts:
- OpenAI-compatible APIs: You can point existing
openaiclients to a LiteLLM endpoint with minimal or no code change. - Model routing and failover: Route requests across many model providers, add retries/fallbacks.
- Cost and usage tracking: Basic logging and cost estimation across providers.
- Self-hosted: You run the gateway in your own infra (Docker, Kubernetes, etc.).
Where LiteLLM is particularly strong:
- Developer-friendliness: Quick to integrate; minimal ceremony.
- Customizability: You have full control over how you deploy, scale, and integrate.
- Vendor coverage: Supports many providers and model endpoints behind a unified interface.
Typical architecture for LiteLLM as an internal gateway
A common pattern with LiteLLM in a company:
- Deploy LiteLLM as a service (Kubernetes/Docker).
- Expose a single internal endpoint such as
https://llm-gateway.internal/v1/chat/completions. - Configure provider keys and models in LiteLLM config or environment variables.
- Have each team use team-specific API keys or auth tokens to call the gateway.
- Use your own IAM, API gateway, or service mesh for:
- Authentication & RBAC
- Global and per-team rate limits
- Detailed auditing & observability
- Use LiteLLM’s usage logs plus your own billing/cost dashboards to manage budgets.
Essentially, LiteLLM handles the “LLM plumbing”, and your existing platform handles governance and budgets.
Portkey: what it is and where it shines
What is Portkey?
Portkey is a specialized LLM gateway platform (hosted + self-host options) that focuses heavily on:
- Unified API for many LLM providers (OpenAI-compatible)
- Fine-grained access controls and API key management
- Per-project / per-team usage and budgets
- Observability, logging, and debugging tools specifically for LLM traffic
Where Portkey stands out compared to a generic router:
- A more opinionated “control plane”: dashboards, org structure, teams, projects, environments.
- First-class budgeting and access control features out of the box.
- Plug-and-play governance for organizations that don’t want to build all these controls from scratch.
Typical architecture for Portkey as an internal gateway
A standard Portkey deployment for enterprises looks like:
- Set up a Portkey organization, define teams/projects inside it.
- Create Portkey API keys scoped to each team/project.
- Point internal services to a Portkey endpoint (OpenAI-compatible) like:
https://api.portkey.ai/v1/chat/completions
- Configure provider connections (OpenAI, Anthropic, etc.) in Portkey’s dashboard.
- Use Portkey’s UI or API to:
- Set per-team or per-key budget caps
- Apply rate limits or concurrency caps
- Restrict which models a given key can use
- Monitor usage, latency, and error rates by team
- Optionally run a self-hosted gateway if you require strict data residency.
Portkey is essentially a turnkey LLM control plane: it gives you governance and observability on top of an OpenAI-compatible gateway.
OpenAI compatibility: how they compare
For an internal gateway, OpenAI compatibility matters because it determines how much code you must change.
LiteLLM OpenAI compatibility
- Provides an OpenAI-style REST API for
chat/completions,completions, embeddings, etc. - Can also be used via their Python/JS SDK that mimics OpenAI’s client.
- In most cases, you can:
- Change your
base_urlto your internal LiteLLM endpoint - Change the
api_keyto a LiteLLM-issued or internal auth token - Keep everything else the same
- Change your
Pros:
- Very “drop-in” if you’re migrating from direct OpenAI usage.
- Open source; you can patch/extend any missing compatibility edge cases.
Cons:
- If you depend on every new OpenAI-specific feature (e.g., Assistants enhancements, special streaming formats), you may have to wait for LiteLLM support or implement some translation logic.
Portkey OpenAI compatibility
- Exposes an OpenAI-compatible API layer, again for
chat/completionsand related endpoints. - Provides SDKs that mirror OpenAI’s interface with
base_urlandapi_keyoverrides. - Portkey focuses on not breaking OpenAI clients, so your app code changes are similarly minimal.
Pros:
- Focused on compatibility for production use; strong documentation around migration from OpenAI.
- Gateway handles multiple providers that may not perfectly match OpenAI semantics, normalizing responses as best as possible.
Cons:
- OpenAI-specific bleeding-edge features may lag; for advanced and very new features, verify support.
Net result: Both are good choices if you want an OpenAI-compatible internal gateway. Compatibility is not the deciding factor; governance and budgets are.
Per-team budgets and quotas
The page slug emphasizes per-team budgets and access controls. This is where the differences become clearer.
Budgets and quotas with LiteLLM
LiteLLM offers usage and cost tracking, but it is more of a developer tool than a full billing system.
Typical pattern:
- LiteLLM logs:
- Model name and provider
- Token counts
- Cost estimates based on configured pricing
- You export these logs to your monitoring or data warehouse.
- On top of that, you implement:
- Per-team budget logic in your backend or internal billing service.
- Alerts when a team is close to its budget.
- Hard cutoffs by revoking keys or denying requests at your API gateway layer.
Advantages:
- Very flexible: you can design any budget model you want.
- Ideal if you already have a billing or internal chargeback system in place.
Limitations:
- No rich, out-of-the-box per-team budget dashboard.
- No “click to set budget caps per key” UI by default; you build that.
- Budget enforcement is your responsibility (LiteLLM gives you data, not policy).
Budgets and quotas with Portkey
Portkey is explicitly built to handle per-team and per-key budgeting:
- Team/Project-level quotas:
- You can define monthly or total budgets per team.
- You can segment budgets by environment (e.g., staging vs production).
- Key-level controls:
- Limit tokens per minute, requests per minute, or spend for a particular API key.
- Alerting and auto-enforcement:
- Hard caps where calls start failing with a clear error once the budget is reached.
- Notifications when approaching thresholds (via email/webhooks).
Advantages:
- Budgeting is a first-class feature, not an afterthought.
- Outsourced complexity: you don’t need to build chargeback, quotas, or metering logic yourself.
- Well-suited for orgs with many teams running experiments who want simple controls.
Limitations:
- You’re adopting Portkey’s model for budgets; extremely bespoke billing workflows may require custom glue logic.
- If you want everything 100% in-house with no external control plane, you may prefer a pure self-hosted plus homegrown budget system.
Net result: For turnkey per-team budgeting and quotas tied directly to LLM usage, Portkey is stronger out of the box. LiteLLM is more DIY.
Access controls and security
Access control is closely tied to budgets but also includes authentication, RBAC, and model restrictions.
Access control with LiteLLM
LiteLLM, as a self-hosted service, typically sits behind:
- Your internal API gateway or load balancer
- Your IAM (e.g., OAuth2, OIDC, internal tokens)
- Your service mesh (e.g., mTLS, network policies)
Common patterns:
- Use internal auth tokens or JWTs for each service or team.
- Map tokens to teams in your own logic.
- Use your API gateway to:
- Enforce per-team rate limits
- Restrict certain endpoints or models based on path or headers
- Use LiteLLM’s config for:
- Basic auth
- Provider key separation
- Routing policies
Advantages:
- Maximum control; you integrate directly with your existing security posture.
- No external SaaS required if you self-host.
Limitations:
- No built-in multi-tenant RBAC system; you must design your own.
- Model-level restrictions and policy enforcement live in your code or API gateway rules.
Access control with Portkey
Portkey ships with multi-tenant access control by design:
- Org / Team / Project structure:
- Assign users to teams.
- Create API keys scoped to particular projects or environments.
- Model and provider restrictions:
- Configure which models a key is allowed to call.
- Distinguish between “prod-only” models and “experimental” ones.
- Key rotation & revocation:
- Central interface to quickly revoke compromised keys.
- Policy enforcement:
- Gateway-level rules for safety, prompt filters, and rate limits.
Advantages:
- Out-of-the-box RBAC and multi-tenant access control.
- Fast to manage who can do what without building custom admin UIs.
- Good fit if you’re running LLMs across many squads, agencies, or external partners.
Limitations:
- If your org has very complex or proprietary IAM flows, you’ll integrate Portkey with those rather than replacing them. That adds some integration overhead.
- Some companies prefer to keep all access logic in their own systems for compliance reasons.
Net result: If you want fully managed access control around your OpenAI-compatible gateway, Portkey is the more complete solution. LiteLLM expects you to plug into your existing IAM and enforce policies there.
Observability, logging, and debugging
For an internal LLM gateway, having good observability saves enormous engineering time.
LiteLLM observability
LiteLLM provides:
- Logs of:
- Requests (model, provider, latency)
- Token usage and cost
- Support for hooking into:
- Logging frameworks (e.g., Python logging, structured logs)
- Metrics systems (Prometheus, etc.)
In practice, teams:
- Export LiteLLM logs to Datadog, Grafana, New Relic, or OpenTelemetry.
- Build:
- Dashboards for latency and error rates.
- Reports for per-team or per-service usage.
This is very effective if you already have strong observability infrastructure.
Portkey observability
Portkey is designed as an LLM-specific observability and control dashboard:
- Per-request inspection:
- Prompt/response (with redaction options)
- Provider used, model, latency breakdown, and errors
- Per-team/project analytics:
- Spend over time
- Calls per model, per region, per environment
- Debug tools:
- Replay of problematic requests
- A/B comparison for different models or configurations
- Export:
- Webhooks or APIs to push data into your own lake/warehouse
This gives platform and product teams a ready-made view into LLM usage without significant setup.
Net result: If you’re building your own dashboards anyway, LiteLLM is fine. If you want a LLM-native “control plane + observability” with minimal work, Portkey is more compelling.
Deployment, hosting, and maintenance
LiteLLM deployment
- Open-source, self-hosted:
- Deploy via Docker, Kubernetes, or a simple VM.
- You are responsible for:
- Scaling, high availability, and failover.
- Secrets management for provider keys.
- Security patches and upgrades.
Best for: teams comfortable running internal services and with an existing platform team.
Portkey deployment
Two typical modes:
-
Hosted Portkey:
- Portkey runs the control plane and gateway.
- You configure providers and models via dashboard.
- Fastest way to get started; minimal devops.
-
Hybrid / self-hosted gateway:
- Run Portkey gateway inside your VPC.
- Portkey control plane remains hosted, or you negotiate enterprise options.
- Good for stricter compliance or data residency requirements.
Best for: teams that want to minimize operational overhead and are comfortable with a managed control plane, or hybrid setups.
Cost and licensing
LiteLLM
- Open source (check repo for current license, often permissive).
- You pay:
- Your LLM provider bills (OpenAI, Anthropic, etc.).
- Infra cost to run LiteLLM itself (compute, storage, bandwidth).
- No direct “per-request fee” for LiteLLM itself in the open-source setup.
This is attractive if:
- You have cloud credits or existing infra budgets.
- You want predictable costs.
Portkey
- Typically priced as a SaaS offering, often:
- Some free tier or trial.
- Tiered pricing based on volume, features, or number of projects/users.
- You still pay your underlying LLM providers.
- You’re essentially paying for:
- A managed gateway.
- Dashboards and analytics.
- Budgeting and access control features.
This is attractive if:
- You value time-to-value and don’t want to build everything yourself.
- Your usage level justifies paying for an LLM control plane.
How to decide: a practical decision guide
Use this checklist specifically for the question in the slug (internal OpenAI-compatible gateway with per-team budgets and access controls).
Choose Portkey if most of these are true:
- You need per-team budgets and quotas working quickly, without building internal billing logic.
- You want a central dashboard for:
- Which teams are using which models.
- How much each team is spending.
- Where key errors and latency spikes are.
- You want to create and manage many API keys with model-level permissions.
- You’re okay with:
- A managed SaaS (or hybrid) as your LLM control plane.
- Paying a platform fee in exchange for governance and observability.
- You’re rolling this out across many teams or business units, and governance is a priority.
Choose LiteLLM (BerriAI ecosystem) if most of these are true:
- You already have:
- An API gateway.
- A centralized auth/IAM system.
- Observability tools (Datadog, Grafana, etc.) in place.
- You’re comfortable building:
- Custom per-team budget enforcement logic.
- Internal usage dashboards.
- You want a lightweight, transparent router that you fully control and can extend.
- You prefer 100% self-hosted, open-source components for your gateway.
- You’re okay taking more ownership of governance, in exchange for flexibility.
Example scenarios
Scenario 1: Fast-growing company with 10+ teams using LLMs
- Teams in product, data science, support tooling, etc.
- Leadership wants spend caps per team and clear visibility into usage.
- They don’t have the platform bandwidth to build budgeting and audit dashboards from scratch.
Better fit: Portkey
Reason: Per-team budgets, access controls, and observability are already built-in.
Scenario 2: Platform-oriented org with strong DevOps and internal tooling
- They already use Kong/Envoy/API Gateway with rate limiting and auth.
- They have an internal billing system and cost allocation per team.
- They treat the LLM gateway as another internal service to integrate.
Better fit: LiteLLM (BerriAI ecosystem)
Reason: Easy to plugin, open-source, and flexible to connect into their existing platform without adopting a separate SaaS control plane.
Implementation tips for each choice
If you go with LiteLLM
-
Authentication:
Put LiteLLM behind your existing API gateway and use:- mTLS for service-to-service.
- JWT or API keys per team.
-
Budgets:
- Tag each request with a team ID (header, token claim).
- Export logs to your data warehouse.
- Implement a simple “remaining budget” service that:
- Tracks usage by team.
- Returns 402-style errors or 429s when the budget is exceeded.
-
Access control:
- Use gateway rules to limit which paths and models teams can access.
- Add a configuration layer in LiteLLM that maps logical models to provider models per team.
If you go with Portkey
-
Team structure:
- Mirror your org structure in Portkey (teams, projects, envs).
- Issue separate keys for:
- Services (backends).
- Developers (for experimentation).
-
Budgets and alerts:
- Set monthly budgets per team and environment from day one.
- Configure alerts for 50%, 80%, and 100% thresholds.
-
Security and governance:
- Restrict access to “powerful” or expensive models to specific keys.
- Use Portkey’s logs for auditing and connect them to your SIEM if possible.
Conclusion: which is better for an internal OpenAI-compatible gateway?
For the specific question of “BerriAI / LiteLLM vs Portkey: which is better for an internal OpenAI-compatible gateway with per-team budgets and access controls?”:
-
Portkey is generally better if you want a ready-made LLM control plane:
- Built-in per-team budgets and quotas
- Rich access control and key management
- Central observability and debugging
- Faster organizational rollout with minimal custom code
-
LiteLLM (within the BerriAI ecosystem) is generally better if you want a lightweight, open-source router and you’re willing to build:
- Your own budget enforcement logic
- Custom access control using your API gateway and IAM
- Your own dashboards and governance tooling
Both can serve as an internal OpenAI-compatible gateway. The deciding factor is whether you want to outsource governance and budgeting to a dedicated platform (Portkey) or treat governance as part of your own platform (LiteLLM).