Llama Stack v0.2.16

Highlights

  • Automatic model registration for self-hosted providers (currently Ollama and vLLM). The INFERENCE_MODEL environment variable, which had to be kept up to date manually, is no longer needed.
  • A much simpler starter distribution. Most ENABLE_-style environment variables are gone: setting VLLM_URL auto-enables the vllm provider, and likewise for MILVUS_URL, PGVECTOR_DB, and so on. See the run.yaml for details.
  • All tests have been migrated to pytest (thanks @Elbehery).
  • DPO (Direct Preference Optimization) implementation in the post-training provider (thanks @Nehanth).
  • (Huge!) Support for external APIs and their providers (thanks @leseb, @cdoern, and others). This is a really big deal: you can now add new APIs completely out of tree and experiment with them before (optionally) contributing them back.
  • The inline::vllm provider is gone, thank you very much.
  • Several improvements to the OpenAI inference implementations and the LiteLLM backend (thanks @mattf).
  • Chroma now supports the Vector Store API (thanks @franciscojavierarceo).
  • Authorization improvements: the Vector Store and File APIs now support access control (thanks @franciscojavierarceo), and the Telemetry read APIs are gated by the logged-in user's roles.
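The env-var-driven auto-enablement described above can be sketched roughly as follows. Note that the `PROVIDER_ENV_VARS` mapping and the `enabled_providers` helper here are illustrative stand-ins, not the actual Llama Stack implementation:

```python
import os

# Hypothetical mapping from provider name to the environment variable
# whose presence auto-enables it (per the release notes: VLLM_URL,
# MILVUS_URL, PGVECTOR_DB, etc.).
PROVIDER_ENV_VARS = {
    "vllm": "VLLM_URL",
    "milvus": "MILVUS_URL",
    "pgvector": "PGVECTOR_DB",
}

def enabled_providers(env: dict) -> list:
    """Return the providers whose configuration variable is set and non-empty."""
    return [name for name, var in PROVIDER_ENV_VARS.items() if env.get(var)]

# Example: with only a vLLM endpoint configured, only vllm is enabled.
print(enabled_providers({"VLLM_URL": "http://localhost:8000/v1"}))
```

The point of the design is that presence of configuration implies intent: there is no separate ENABLE_ switch to keep in sync with the URL it guards.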

Full Changelog: https://github.com/meta-llama/llama-stack/compare/v0.2.15...v0.2.16

Source: README.md, updated 2025-07-28