LiteLLM Integration — Zukünftiger Plan

Status: Zurückgestellt ( Juni 2026 ) Grund: Aktuell wird alles direkt über Ollama (lokal) und OpenRouter (API) genutzt. LiteLLM als Proxy ist nicht notwendig solange wir nur 2-3 Provider haben.

Was ist LiteLLM?

LiteLLM ist ein OpenAI-kompatibler LLM Proxy / API Gateway.

Dein Code → LiteLLM Proxy (:4000) → OpenRouter / Ollama / Anthropic / Google / ...

Vorteile

Feature	Beschreibung
Einheitliche API	OpenAI-kompatible API für ALLE Anbieter
Multi-Provider Fallback	Automatischer Wechsel bei Ausfall
Kosten-Tracking	Token-Nutzung pro Anbieter tracken
Rate Limiting	Zentrale Kontrolle
Load Balancing	Requests auf mehrere Keys verteilen

Wann LiteLLM sinnvoll wird

5+ AI-Provider gleichzeitig
Automatisches Failover zwischen Anbietern nötig
Kosten-Tracking pro Team/User
Rate Limiting für verschiedene User-Gruppen
Model-Routing (einfache Anfragen → billiges Modell, komplexe → teures)

Konfiguration (für später)

docker-compose.yml

litellm:
  image: ghcr.io/berriai/litellm:latest
  ports:
    - "4000:4000"
  volumes:
    - ./litellm-config.yaml:/app/config.yaml
  environment:
    - OPENROUTER_API_KEY=${OPENROUTER_KEY_PRIMARY}
    - OPENROUTER_FALLBACK_KEY=${OPENROUTER_KEY_FALLBACK1}
    - OLLAMA_BASE_URL=http://ollama:11434

litellm-config.yaml

model_list:
  - model_name: hermes-default
    litellm_params:
      model: openrouter/anthropic/claude-sonnet-4
      api_key: ${OPENROUTER_KEY_PRIMARY}
  - model_name: hermes-fast
    litellm_params:
      model: openrouter/google/gemini-2.0-flash-001:free
      api_key: ${OPENROUTER_KEY_FALLBACK1}
  - model_name: hermes-local
    litellm_params:
      model: ollama/llama3.1:8b
      api_base: http://ollama:11434

fallbacks:
  - hermes-default: [hermes-fast, hermes-local]
  - hermes-fast: [hermes-local]

Nächste Schritte (wenn implementiert)

LiteLLM Container in docker-compose.yml
Config mit allen Providern
Scripts auf localhost:4000/v1/chat/completions umstellen
Fallback-Konfiguration testen
Kosten-Tracking Dashboard

Aktueller Stand (Juni 2026)

❌ LiteLLM Container läuft OHNE Konfiguration
✅ Ollama direkt: http://localhost:11434
✅ OpenRouter direkt: https://openrouter.ai/api/v1
✅ API Key Rotation: PRIMARY → FALLBACK1 → Ollama

Referenzen

Docs: https://docs.litellm.ai/
GitHub: https://github.com/BerriAI/litellm
Docker: https://docs.litellm.ai/docs/proxy/docker

2.6 KiB Raw Blame History

LiteLLM Integration — Zukünftiger Plan

Was ist LiteLLM?

Vorteile

Wann LiteLLM sinnvoll wird

Konfiguration (für später)

docker-compose.yml

litellm-config.yaml

Nächste Schritte (wenn implementiert)

Aktueller Stand (Juni 2026)

Referenzen

2.6 KiB

Raw Blame History