Name | Modified | Size | Downloads / Week |
---|---|---|---|
Parent folder | |||
load_test.html | 2025-10-09 | 1.7 MB | |
load_test_stats.csv | 2025-10-09 | 546 Bytes | |
README.md | 2025-10-09 | 4.2 kB | |
v1.77.7.dev9 source code.tar.gz | 2025-10-09 | 198.7 MB | |
v1.77.7.dev9 source code.zip | 2025-10-09 | 201.2 MB | |
Totals: 5 Items | 401.5 MB | 1 |
What's Changed
- (Bug) Fix reasoning response ID by @Sameerlite in https://github.com/BerriAI/litellm/pull/15265
- fix gemini cli by actually streaming the response by @Sameerlite in https://github.com/BerriAI/litellm/pull/15264
- Add gpt-realtime-mini support by @Sameerlite in https://github.com/BerriAI/litellm/pull/15283
- [Feat] Proxy CLI - dont store existing key in the URL, store it in the state param by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/15290
- Fix: Make PATCH /model/{model_id}/update handle team_id consistently with POST /model/new by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/15297
- [Fix] Networking: remove limitations by @AlexsanderHamir in https://github.com/BerriAI/litellm/pull/15302
- [MCP Gateway] Litellm mcp fixes team control by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/15304
- [MCP Gateway] QA/Fixes - Ensure Team/Key level enforcement works for MCPs by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/15305
- fix: model + endpoints page crash when config file contains router_settings.model_group_alias by @ARajan1084 in https://github.com/BerriAI/litellm/pull/15308
- Upgrades tenacity version to 8.5.0 by @ARajan1084 in https://github.com/BerriAI/litellm/pull/15303
- [QA/Fixes] - Dynamic Rate Limiter v3 - final QA by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/15311
- Add Cohere Embed v4 support for AWS Bedrock by @timelfrink in https://github.com/BerriAI/litellm/pull/15298
- fix(bedrock): include cacheWriteInputTokens in prompt_tokens calculation by @timelfrink in https://github.com/BerriAI/litellm/pull/15292
- fix issue with parsing assistant messages by @Sameerlite in https://github.com/BerriAI/litellm/pull/15320
- feat(files): add @client decorator to file operations by @FelipeRodriguesGare in https://github.com/BerriAI/litellm/pull/15339
- [Fix] Watsonx - Apply correct prompt templates for openai/gpt-oss model family by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/15341
- potentially fixes a UI spasm issue with an expired cookie by @ARajan1084 in https://github.com/BerriAI/litellm/pull/15309
- Add gpt-5-pro-2025-10-06 to model costs by @sandeshghanta in https://github.com/BerriAI/litellm/pull/15344
- Fix - (openrouter): move cache_control to content blocks for claude/gemini by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/15345
- [Fix] x-litellm-cache-key header not being returned on cache hit by @ishaan-jaff in https://github.com/BerriAI/litellm/pull/15348
- Add native Responses API support for litellm_proxy provider by @Copilot in https://github.com/BerriAI/litellm/pull/15347
- AzureAD Default credentials - select credential type based on environment by @krrishdholakia in https://github.com/BerriAI/litellm/pull/14470
- SSO - support EntraID app roles by @krrishdholakia in https://github.com/BerriAI/litellm/pull/15351
- MCP - support converting OpenAPI specs to MCP servers by @krrishdholakia in https://github.com/BerriAI/litellm/pull/15343
- LiteLLM UI Refactor Infrastructure by @ARajan1084 in https://github.com/BerriAI/litellm/pull/15236
- MCP - specify allowed params per tool by @krrishdholakia in https://github.com/BerriAI/litellm/pull/15346
Full Changelog: https://github.com/BerriAI/litellm/compare/v1.77.7.dev.3...v1.77.7.dev9
Docker Run LiteLLM Proxy
docker run \
-e STORE_MODEL_IN_DB=True \
-p 4000:4000 \
ghcr.io/berriai/litellm:main-v1.77.7.dev9
Don't want to maintain your internal proxy? get in touch 🎉
Hosted Proxy Alpha: https://calendly.com/d/4mp-gd3-k5k/litellm-1-1-onboarding-chat
Load Test LiteLLM Proxy Results
Name | Status | Median Response Time (ms) | Average Response Time (ms) | Requests/s | Failures/s | Request Count | Failure Count | Min Response Time (ms) | Max Response Time (ms) |
---|---|---|---|---|---|---|---|---|---|
/chat/completions | Failed ❌ | 60 | 77.84764691897514 | 6.515099712566108 | 6.515099712566108 | 1950 | 1950 | 40.86414299996477 | 2919.70134099995 |
Aggregated | Failed ❌ | 60 | 77.84764691897514 | 6.515099712566108 | 6.515099712566108 | 1950 | 1950 | 40.86414299996477 | 2919.70134099995 |