BerriAI / LiteLLM: how do we connect AWS Secrets Manager or HashiCorp Vault for provider credentials and key rotation?

Most teams adopting BerriAI / LiteLLM quickly run into the same operational question: how do you manage provider credentials securely, support key rotation, and avoid hard-coding secrets in config files, container images, or deployment manifests? Integrating AWS Secrets Manager or HashiCorp Vault is a robust way to solve this, and LiteLLM is flexible enough to fit into that workflow.

This guide walks through practical patterns to connect AWS Secrets Manager and HashiCorp Vault to LiteLLM for provider credentials and automated key rotation, while keeping the setup readable and production‑ready.

Core concepts: how LiteLLM uses provider credentials

LiteLLM provides a unified interface to multiple LLM providers, either as a Python library or as a proxy server. Under the hood, it needs API keys or credentials for each provider you configure. You can give those credentials to LiteLLM in three main ways:

Environment variables (e.g., OPENAI_API_KEY, ANTHROPIC_API_KEY)
Config files (e.g., config.yaml, litellm_config.yaml, model_list.yaml)
Direct arguments / code (when using LiteLLM as a Python library)
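
The three options differ mainly in where the key lives at call time. As a minimal sketch, a service that embeds LiteLLM might prefer an explicitly passed key and fall back to the environment variable. The helper name resolve_api_key is hypothetical and is not part of LiteLLM.

```python
import os
from typing import Mapping, Optional


def resolve_api_key(
    env_name: str,
    explicit_key: Optional[str] = None,
    environ: Mapping[str, str] = os.environ,
) -> str:
    """Return the explicitly passed key, else the value of env_name.

    Hypothetical helper for illustration; it is not a LiteLLM function.
    """
    if explicit_key:
        return explicit_key
    value = environ.get(env_name)
    if not value:
        raise RuntimeError(f"{env_name} is not set and no key was passed")
    return value
```

The returned value can then be passed as the api_key argument of litellm.completion when LiteLLM is used as a library.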

When you bring in AWS Secrets Manager or HashiCorp Vault, the aim is:

Never store secrets directly in LiteLLM config or source
Let your secret manager remain the source of truth
Keep rotation invisible to applications: LiteLLM just sees the current environment variables or configuration each time it starts or reloads

In practice, the most stable pattern is:

Your secret manager → inject credentials into env vars or config → LiteLLM loads them
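
The flow above can be sketched independently of any particular secret manager. In this illustrative snippet, fetch stands in for whatever call retrieves the secrets (AWS Secrets Manager, Vault, or a test double), and inject_secrets is a hypothetical name, not a LiteLLM API.

```python
import os
from typing import Callable, Dict, List, MutableMapping


def inject_secrets(
    fetch: Callable[[], Dict[str, str]],
    environ: MutableMapping[str, str] = os.environ,
) -> List[str]:
    """Copy secrets returned by fetch() into environ.

    Returns only the variable names that were set, so the result is
    safe to log without exposing any secret values.
    """
    secrets = fetch()
    for name, value in secrets.items():
        # Environment values must be strings
        environ[name] = str(value)
    return sorted(secrets)
```

Keeping the fetch step behind a callable makes it straightforward to swap secret managers or to unit-test the bootstrap logic without network access.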

Strategy overview: connecting secret managers to LiteLLM

There are two broad integration patterns:

Bootstrap pattern (recommended)
- A short script or entrypoint pulls secrets from AWS Secrets Manager or Vault
- Exports them as environment variables (or writes a config file)
- Launches LiteLLM (server or Python process)
- Rotation is handled by periodically rerunning the fetch or by shorter secret TTLs and restarts
Runtime pattern (advanced)
- LiteLLM is embedded in a Python service
- Your code calls AWS/Vault SDKs at runtime to:
  - Fetch a key on each request, or
  - Refresh keys on a schedule and update LiteLLM’s configuration

Most production setups start with the bootstrap pattern and only move to runtime integration if they have very strict rotation or dynamic credential requirements.

Using AWS Secrets Manager with BerriAI / LiteLLM

1. Store provider credentials in AWS Secrets Manager

Create secrets to hold your LLM provider keys. You can store each key as a separate secret or bundle them in a JSON object.

Separate secrets example

Secret name: prod/openai/api_key
- Value: "sk-..."
Secret name: prod/anthropic/api_key
- Value: "sk-ant-..."

Bundled JSON secret example

Secret name: prod/litellm/provider-keys

Value:

{
  "OPENAI_API_KEY": "sk-...",
  "ANTHROPIC_API_KEY": "sk-ant-...",
  "OPENROUTER_API_KEY": "sk-or-..."
}

Bundled JSON is easier when multiple providers are involved and you want a single rotation policy.
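
If you mix both layouts, the loading code has to cope with a secret value that is either a JSON bundle or a single raw key. A rough sketch of one way to normalise the two, assuming a hypothetical helper parse_secret and a caller-supplied variable name for the single-key case:

```python
import json
from typing import Dict


def parse_secret(secret_string: str, single_name: str) -> Dict[str, str]:
    """Normalise a secret value into {ENV_NAME: key}.

    A JSON object is treated as a bundle of keys. Anything else is
    treated as one raw key and stored under single_name.
    """
    try:
        parsed = json.loads(secret_string)
    except json.JSONDecodeError:
        parsed = None

    if isinstance(parsed, dict):
        return {str(k): str(v) for k, v in parsed.items()}
    if isinstance(parsed, str):
        # The value was stored as a quoted JSON string, e.g. "sk-..."
        return {single_name: parsed}
    return {single_name: secret_string}
```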

2. Bootstrap script: fetch secrets and start LiteLLM

Use the AWS SDK (boto3 for Python) or the AWS CLI to pull secrets at container startup, export them as environment variables, and then start the LiteLLM server.

Python entrypoint example (Docker‑friendly)

# entrypoint.py
import os
import json
import subprocess
import boto3

def load_litellm_secrets():
    client = boto3.client("secretsmanager", region_name=os.getenv("AWS_REGION", "us-east-1"))
    secret_id = os.getenv("LITELLM_SECRET_ID", "prod/litellm/provider-keys")

    response = client.get_secret_value(SecretId=secret_id)
    secret_str = response["SecretString"]
    secret_data = json.loads(secret_str)

    # Export each key into the environment for LiteLLM
    # (environment values must be strings, so coerce defensively)
    for key, value in secret_data.items():
        os.environ[key] = str(value)

if __name__ == "__main__":
    load_litellm_secrets()

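    # Start the LiteLLM proxy server; it inherits the environment set above
    cmd = ["litellm", "--host", "0.0.0.0", "--port", "4000"]
    # To load a config file, append e.g. "--config", "litellm_config.yaml"
    # Propagate LiteLLM's exit code so the container runtime sees failures
    raise SystemExit(subprocess.run(cmd).returncode)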

Dockerfile snippet

FROM python:3.11-slim

RUN pip install "litellm[proxy]" boto3

WORKDIR /app
COPY entrypoint.py .

ENV AWS_REGION=us-east-1
ENV LITELLM_SECRET_ID=prod/litellm/provider-keys

CMD ["python", "entrypoint.py"]

In this setup:

AWS IAM (task role, instance profile, or IRSA in Kubernetes) controls access to the secret
LiteLLM sees only environment variables, not the secret manager directly
Rotating the secret in AWS + restarting the container is enough for LiteLLM to pick up the new keys
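
Because the proxy only sees environment variables, a missing or empty key tends to surface later as a provider authentication error. A small pre-flight check before launching LiteLLM can fail fast instead. This is a sketch; missing_keys is a hypothetical helper and the list of required names is up to you.

```python
import os
from typing import Iterable, List, Mapping


def missing_keys(
    required: Iterable[str],
    environ: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in required if not environ.get(name, "").strip()]
```

In an entrypoint, you could call this after loading secrets and exit with a non-zero status if the returned list is not empty, logging only the names.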

3. Using AWS CLI instead of boto3

If you prefer shell scripts:

#!/usr/bin/env bash
set -euo pipefail

SECRET_ID="${LITELLM_SECRET_ID:-prod/litellm/provider-keys}"
AWS_REGION="${AWS_REGION:-us-east-1}"

SECRET_JSON=$(aws secretsmanager get-secret-value \
  --secret-id "$SECRET_ID" \
  --region "$AWS_REGION" \
  --query SecretString \
  --output text)

# Export each key into the environment
export OPENAI_API_KEY=$(echo "$SECRET_JSON" | jq -r '.OPENAI_API_KEY')
export ANTHROPIC_API_KEY=$(echo "$SECRET_JSON" | jq -r '.ANTHROPIC_API_KEY')
export OPENROUTER_API_KEY=$(echo "$SECRET_JSON" | jq -r '.OPENROUTER_API_KEY')

# Start LiteLLM proxy
exec litellm --host 0.0.0.0 --port 4000

You can use this as the container’s entrypoint script.

4. Referencing keys in LiteLLM config

If you prefer using a YAML config instead of pure env vars, LiteLLM can read provider keys from environment variables you just populated via AWS Secrets Manager.

Example litellm_config.yaml

model_list:
  - model_name: gpt-4o
    litellm_params:
      model: openai/gpt-4o
      api_key: os.environ/OPENAI_API_KEY

  - model_name: claude-3-opus
    litellm_params:
      model: anthropic/claude-3-opus
      api_key: os.environ/ANTHROPIC_API_KEY

Start LiteLLM:

litellm --config litellm_config.yaml

Values of the form os.environ/VAR are read from the process environment when LiteLLM loads the config, so when AWS Secrets Manager updates keys and you restart the process, LiteLLM uses the new credentials automatically.

5. Key rotation with AWS Secrets Manager

To enable rotation:

Turn on automatic rotation for the secret in AWS Secrets Manager
Use a Lambda rotation function that:
- Generates a new API key (if your provider supports API‑driven key creation)
- Stores the new key as the latest version of the secret
Ensure your deployment:
- Restarts on a schedule (e.g., nightly)
- Or has a mechanism to restart containers when the secret updates (e.g., external controller relying on secret metadata or a push event)

In many real‑world setups, it’s acceptable to:

Rotate keys every X days
Restart workloads on the same schedule using CI/CD or a cron‑like mechanism
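
For the "restart when the secret updates" option, one possible building block is a poller that remembers the last seen secret version. In the sketch below, RotationWatcher is a hypothetical class and get_version is a placeholder callable. With boto3, one option would be to return the VersionId field of the get_secret_value response.

```python
from typing import Callable, Optional


class RotationWatcher:
    """Remember the last seen secret version and report when it changes."""

    def __init__(self, get_version: Callable[[], str]) -> None:
        self._get_version = get_version
        self._last: Optional[str] = None

    def changed(self) -> bool:
        """Return True when the version differs from the previous poll.

        The first poll only records a baseline and returns False.
        """
        current = self._get_version()
        previous, self._last = self._last, current
        return previous is not None and previous != current
```

A scheduler or sidecar could call changed() on an interval and trigger a rolling restart when it returns True. How the restart is triggered depends on your platform.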

Using HashiCorp Vault with BerriAI / LiteLLM

HashiCorp Vault is more flexible than AWS Secrets Manager, especially for short‑lived or dynamic secrets. The integration idea is similar: fetch secrets from Vault, then inject them into LiteLLM via environment variables or configuration.

1. Store LLM provider keys in Vault

You can use KV v2 or any other appropriate secrets engine.

Example: KV v2

Path: secret/data/litellm/providers

Data:

{
  "data": {
    "OPENAI_API_KEY": "sk-...",
    "ANTHROPIC_API_KEY": "sk-ant-..."
  }
}

2. Authenticate your service to Vault

Your LiteLLM deployment needs a Vault token or a method to obtain one securely:

Kubernetes: Kubernetes auth method
AWS: AWS auth method (IAM)
Static token: For development only

Example using Kubernetes auth (high-level steps):

Enable the Kubernetes auth method in Vault
Create a Kubernetes ServiceAccount for the LiteLLM pod
Create a Vault role bound to that ServiceAccount and namespace, attached to a policy that allows read access to secret/data/litellm/*
Use that role in your app to log in to Vault and obtain a short-lived token

3. Bootstrap script: fetch from Vault then start LiteLLM

Python example (using hvac)

# vault_entrypoint.py
import os
import subprocess
import hvac

def get_vault_client():
    url = os.getenv("VAULT_ADDR", "https://vault.your-domain.com")
    client = hvac.Client(url=url, verify=True)

    # Example: Kubernetes auth
    if os.getenv("VAULT_AUTH_METHOD", "kubernetes") == "kubernetes":
        role = os.environ["VAULT_K8S_ROLE"]
        token_path = os.getenv("VAULT_K8S_JWT_PATH", "/var/run/secrets/kubernetes.io/serviceaccount/token")
        with open(token_path, "r") as f:
            jwt = f.read()
        # Log in via the Kubernetes auth method; hvac stores the returned token on the client
        client.auth.kubernetes.login(role=role, jwt=jwt)
    else:
        # Dev/testing: static VAULT_TOKEN env var
        client.token = os.environ["VAULT_TOKEN"]

    return client

def load_litellm_secrets_from_vault():
    client = get_vault_client()
    # KV v2 path
    secret_path = os.getenv("LITELLM_VAULT_SECRET_PATH", "secret/data/litellm/providers")
    secret = client.secrets.kv.v2.read_secret_version(path=secret_path.replace("secret/data/", ""))

    data = secret["data"]["data"]
    for key, value in data.items():
        os.environ[key] = value

if __name__ == "__main__":
    load_litellm_secrets_from_vault()

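    # Start the LiteLLM proxy server; it inherits the environment set above
    cmd = ["litellm", "--host", "0.0.0.0", "--port", "4000"]
    # Propagate LiteLLM's exit code so the container runtime sees failures
    raise SystemExit(subprocess.run(cmd).returncode)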

Dockerfile snippet

FROM python:3.11-slim

RUN pip install "litellm[proxy]" hvac

WORKDIR /app
COPY vault_entrypoint.py .

ENV VAULT_ADDR=https://vault.your-domain.com
ENV LITELLM_VAULT_SECRET_PATH=secret/data/litellm/providers

CMD ["python", "vault_entrypoint.py"]

This gives you:

Vault as the single source of truth for LLM provider keys
Runtime env vars exported to LiteLLM
A central place to apply policies and audit secret access

4. Using Vault Agent sidecar for automatic injection

In Kubernetes, the Vault Agent sidecar is often the cleanest way to inject secrets without handling tokens in your app code.

Typical pattern:

Configure a Vault Agent sidecar container in your pod
Vault Agent:
- Authenticates to Vault using Kubernetes auth
- Reads secrets from secret/data/litellm/providers
- Writes them to a file (e.g., /vault/secrets/litellm.env) and/or environment
- Automatically renews tokens and secrets

Vault Agent template example (env file)

template {
  source      = "/vault/templates/litellm-env.tpl"
  destination = "/vault/secrets/litellm.env"
}

litellm-env.tpl template

OPENAI_API_KEY={{ with secret "secret/data/litellm/providers" }}{{ .Data.data.OPENAI_API_KEY }}{{ end }}
ANTHROPIC_API_KEY={{ with secret "secret/data/litellm/providers" }}{{ .Data.data.ANTHROPIC_API_KEY }}{{ end }}

Then in your app container:

#!/usr/bin/env bash
# entrypoint.sh
# Export every variable defined while sourcing the rendered env file
set -a
source /vault/secrets/litellm.env
set +a

exec litellm --config /app/litellm_config.yaml

Your litellm_config.yaml references the keys via environment variables as shown earlier.
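
If your entrypoint is written in Python rather than shell, the rendered env file can be parsed directly. The following is a minimal sketch with a hypothetical helper, parse_env_file. It handles only plain KEY=value lines, blank lines, and # comments, and does not attempt shell features such as export prefixes or variable expansion.

```python
from typing import Dict


def parse_env_file(text: str) -> Dict[str, str]:
    """Parse simple KEY=value lines; skip blanks and # comments."""
    result: Dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may contain "="
        name, _, value = line.partition("=")
        value = value.strip()
        # Strip one pair of matching surrounding quotes, if present
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        result[name.strip()] = value
    return result
```

The resulting dictionary can be copied into os.environ before launching LiteLLM.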

5. Key rotation with Vault

Vault allows sophisticated rotation patterns:

Short‑lived secrets with TTLs
Automated rotation via vault write on a schedule
Dynamic secrets from custom secret engines (if your provider supports API key generation via Vault)

To ensure LiteLLM uses rotated keys:

If Vault Agent writes to an env file, configure a reload mechanism:
- For stateless proxies, the safest option is to restart the pod when Vault Agent template changes
- Kubernetes+Vault Agent injection can be combined with tools like reloader or custom sidecars to trigger restarts on config changes
Alternatively, embed LiteLLM in a Python service that reloads keys from Vault on a timer and updates LiteLLM’s config object
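
To restart only when the rendered file actually changes, a watcher can compare content digests rather than timestamps. This is an illustrative sketch: file_digest and has_changed are hypothetical helpers, and how the restart is triggered is left to your platform.

```python
import hashlib
from pathlib import Path


def file_digest(path: str) -> str:
    """Return the SHA-256 hex digest of the file's current contents."""
    return hashlib.sha256(Path(path).read_bytes()).hexdigest()


def has_changed(path: str, known_digest: str) -> bool:
    """Return True when the file's contents no longer match known_digest."""
    return file_digest(path) != known_digest
```

A sidecar could record the digest of /vault/secrets/litellm.env at startup, poll has_changed on an interval, and exit or signal a restart once it returns True.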

Runtime integration: rotating keys without restarts (advanced)

If you need to rotate provider credentials without restarting the LiteLLM server, you can embed LiteLLM as a library and manage keys programmatically.

High‑level approach:

Run a background task that:
- Periodically pulls keys from AWS Secrets Manager or Vault
- Detects changes
Update LiteLLM’s configuration in memory
All new requests use the refreshed keys

Example sketch using Python and AWS Secrets Manager:

import os
import json
import threading
import time
import boto3
import litellm

PROVIDER_KEYS = {}
LOCK = threading.Lock()

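def fetch_keys(client, secret_id):
    # Read the bundled JSON secret and return it as a dict
    response = client.get_secret_value(SecretId=secret_id)
    return json.loads(response["SecretString"])

def refresh_keys_loop(client, secret_id, interval=300):
    while True:
        time.sleep(interval)
        try:
            new_keys = fetch_keys(client, secret_id)
            with LOCK:
                # Replace the contents so removed keys do not linger
                PROVIDER_KEYS.clear()
                PROVIDER_KEYS.update(new_keys)
        except Exception as e:
            # log error in real code; keep serving with the last known keys
            print("Error refreshing keys", e)

def get_key(name: str) -> str:
    with LOCK:
        return PROVIDER_KEYS[name]

client = boto3.client("secretsmanager", region_name=os.getenv("AWS_REGION", "us-east-1"))
secret_id = os.getenv("LITELLM_SECRET_ID", "prod/litellm/provider-keys")

# Load keys once before serving so the first request cannot hit an empty cache
PROVIDER_KEYS.update(fetch_keys(client, secret_id))

# Start background refresher
threading.Thread(target=refresh_keys_loop, args=(client, secret_id), daemon=True).start()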

# Example usage in your API service route
def chat(request_body):
    response = litellm.completion(
        model="openai/gpt-4o",
        messages=request_body["messages"],
        api_key=get_key("OPENAI_API_KEY")
    )
    return response

This approach avoids restarts but adds complexity. For most use cases, restarting containers/pods on a rotation schedule is simpler and more robust.
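
The "detects changes" step can be as simple as comparing the previous and freshly fetched key sets. A sketch, assuming a hypothetical helper changed_keys that returns names only so that a refresher can log what rotated without printing secret values:

```python
from typing import Dict, List


def changed_keys(old: Dict[str, str], new: Dict[str, str]) -> List[str]:
    """Return names that were added, removed, or given a new value."""
    names = set(old) | set(new)
    return sorted(n for n in names if old.get(n) != new.get(n))
```

A refresher could compute this against a copy of PROVIDER_KEYS before swapping in the new values and skip the update entirely when the list is empty.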

Security best practices

When connecting AWS Secrets Manager or HashiCorp Vault to BerriAI / LiteLLM for provider credentials and key rotation, keep these practices in mind:

Least privilege
- IAM roles or Vault policies should only allow access to the exact secret paths needed for LiteLLM
No secrets in git
- Config files use environment references, never raw keys
Audit and logging
- Use AWS CloudTrail or Vault audit logs to track secret access
Shorter lifetimes
- Prefer periodic rotation (e.g., every 30–90 days) and shorter TTLs for tokens and dynamic secrets
Separation of concerns
- Let AWS Secrets Manager / Vault own secret lifecycle
- LiteLLM only reads already‑injected credentials at runtime

This pattern also holds up as your infrastructure scales: you can deploy LiteLLM proxies across regions and environments, all wired to the same centralized secret management and key rotation pipeline.

Summary: choosing between AWS Secrets Manager and HashiCorp Vault

Both AWS Secrets Manager and HashiCorp Vault integrate cleanly with BerriAI / LiteLLM:

Use AWS Secrets Manager if:
- You’re primarily on AWS
- You prefer a managed, low‑maintenance service
- Rotation requirements are moderate (rotation every few days/weeks)
Use HashiCorp Vault if:
- You run multi‑cloud or on‑premises
- You need advanced policies, dynamic secrets, or short TTLs
- You already use Vault as your standard secret manager

In both cases, the core pattern is the same:

Store provider credentials in Secrets Manager or Vault
Inject them (via bootstrap script or sidecar) as environment variables or config into LiteLLM
Configure rotation on the secret manager side
Ensure your LiteLLM deployment reloads or restarts to pick up new keys

This gives you a secure, scalable foundation for managing provider credentials and key rotation with BerriAI / LiteLLM.