## Highlights
- Automatic model registration for self-hosted providers (`ollama` and `vllm` currently). No need for `INFERENCE_MODEL` environment variables which need to be kept up to date, etc.
- Much simplified starter distribution. Most `ENABLE_`-prefixed env variables are now gone. When you set `VLLM_URL`, the `vllm` provider is auto-enabled. Similarly for `MILVUS_URL`, `PGVECTOR_DB`, etc. Check the run.yaml for more details.
- All tests migrated to pytest now (thanks @Elbehery)
- DPO implementation in the post-training provider (thanks @Nehanth)
- (Huge!) Support for external APIs and providers thereof (thanks @leseb, @cdoern and others). This is a really big deal: you can now add new APIs completely out of tree and experiment with them before (optionally) contributing them back.
- The `inline::vllm` provider is gone, thank you very much
- Several improvements to OpenAI inference implementations and the LiteLLM backend (thanks @mattf)
- Chroma now supports the Vector Store API (thanks @franciscojavierarceo).
- Authorization improvements: the Vector Store/File APIs now support access control (thanks @franciscojavierarceo); Telemetry read APIs are gated according to the logged-in user's roles.
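The env-var-driven auto-enablement described above can be sketched as follows (a minimal illustration, not verbatim from the release: the endpoint URL is a placeholder, and `starter` is the distribution name referenced in these notes):

```shell
# Instead of setting ENABLE_* flags, pointing a provider URL at a live
# endpoint is enough to auto-enable that provider in the starter distro.
export VLLM_URL=http://localhost:8000/v1   # auto-enables the `vllm` provider
# export MILVUS_URL=...                    # same pattern for other providers

# Models served by the endpoint are then registered automatically;
# no INFERENCE_MODEL variable is required.
llama stack run starter
```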
## What's Changed
- fix: re-hydrate requirement and fix package by @leseb in https://github.com/meta-llama/llama-stack/pull/2774
- ci: do not pull model by @leseb in https://github.com/meta-llama/llama-stack/pull/2776
- docs: fix typo and link self loop for index.html#running-tests by @r3v5 in https://github.com/meta-llama/llama-stack/pull/2777
- chore: remove vision model URL workarounds and simplify client creation by @mattf in https://github.com/meta-llama/llama-stack/pull/2775
- chore: remove 'gha_workflow_llama_stack_tests.yml' by @nathan-weinberg in https://github.com/meta-llama/llama-stack/pull/2767
- fix: add shutdown function for localfs provider by @nathan-weinberg in https://github.com/meta-llama/llama-stack/pull/2781
- fix: SQLiteVecIndex.create(..., bank_id="test_bank.123") - bank_id with a dot - leads to sqlite3.OperationalError (#2770) by @syedriko in https://github.com/meta-llama/llama-stack/pull/2771
- docs: add missing bold title to match others by @nathan-weinberg in https://github.com/meta-llama/llama-stack/pull/2782
- fix: de-clutter `llama stack run` logs by @cdoern in https://github.com/meta-llama/llama-stack/pull/2783
- feat: create dynamic model registration for OpenAI and Llama openai compat remote inference providers by @r3v5 in https://github.com/meta-llama/llama-stack/pull/2745
- chore: update k8s template by @ehhuang in https://github.com/meta-llama/llama-stack/pull/2786
- fix: Move sentence-transformers to the top by @derekhiggins in https://github.com/meta-llama/llama-stack/pull/2703
- chore: internal change, make Model.provider_model_id non-optional by @mattf in https://github.com/meta-llama/llama-stack/pull/2690
- feat: allow dynamic model registration for nvidia inference provider by @mattf in https://github.com/meta-llama/llama-stack/pull/2726
- chore(test): migrate unit tests from unittest to pytest for server en… by @Elbehery in https://github.com/meta-llama/llama-stack/pull/2795
- test: add some tests for Telemetry API by @ehhuang in https://github.com/meta-llama/llama-stack/pull/2787
- chore(test): migrate unit tests from unittest to pytest for prompt adapter by @Elbehery in https://github.com/meta-llama/llama-stack/pull/2788
- chore: block asyncio marks in tests by @mattf in https://github.com/meta-llama/llama-stack/pull/2744
- fix(cli): image name should not default to CONDA_DEFAULT_ENV by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2806
- fix: remove async test markers (fix pre-commit) by @cdoern in https://github.com/meta-llama/llama-stack/pull/2808
- chore(test): migrate unit tests from unittest to pytest nvidia test p… by @Elbehery in https://github.com/meta-llama/llama-stack/pull/2792
- chore(test): migrate unit tests from unittest to pytest for nvidia datastore by @Elbehery in https://github.com/meta-llama/llama-stack/pull/2790
- chore(test): migrate unit tests from unittest to pytest for system prompt by @Elbehery in https://github.com/meta-llama/llama-stack/pull/2789
- chore(api): add `mypy` coverage to `chat_format` by @Elbehery in https://github.com/meta-llama/llama-stack/pull/2654
- chore: add `mypy` inference parallel utils by @Elbehery in https://github.com/meta-llama/llama-stack/pull/2670
- chore(test): migrate unit tests from unittest to pytest nvidia test f… by @Elbehery in https://github.com/meta-llama/llama-stack/pull/2794
- fix: amend integration test workflow by @cdoern in https://github.com/meta-llama/llama-stack/pull/2812
- docs: add virtualenv instructions for running starter distro by @nathan-weinberg in https://github.com/meta-llama/llama-stack/pull/2780
- test: Measure and track code coverage by @ChristianZaccaria in https://github.com/meta-llama/llama-stack/pull/2636
- docs: fix steps in the Quick Start Guide by @nathan-weinberg in https://github.com/meta-llama/llama-stack/pull/2800
- feat(server): construct the stack in a persistent event loop by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2818
- chore: Add slekkala1 to codeowners by @slekkala1 in https://github.com/meta-llama/llama-stack/pull/2817
- fix: remove disabled providers from model dump by @cdoern in https://github.com/meta-llama/llama-stack/pull/2784
- fix: DPOAlignmentConfig schema to use correct DPO parameters by @Nehanth in https://github.com/meta-llama/llama-stack/pull/2804
- feat: enable ls client for files tests by @ehhuang in https://github.com/meta-llama/llama-stack/pull/2769
- feat(ollama): periodically refresh models by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2805
- chore: kill inline::vllm by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2824
- feat(vllm): periodically refresh models by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2823
- feat(ci): add a ci-tests distro by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2826
- feat: enable auth for LocalFS Files Provider by @ehhuang in https://github.com/meta-llama/llama-stack/pull/2773
- chore(github-deps): bump astral-sh/setup-uv from 6.3.1 to 6.4.1 by @dependabot[bot] in https://github.com/meta-llama/llama-stack/pull/2827
- fix: Add permissions for pull request creation in coverage-badge workflow by @ChristianZaccaria in https://github.com/meta-llama/llama-stack/pull/2832
- chore: add contribution guideline around PRs by @leseb in https://github.com/meta-llama/llama-stack/pull/2811
- fix(vectordb): VectorDBInput has no provider_id by @Elbehery in https://github.com/meta-llama/llama-stack/pull/2830
- feat: Allow application/yaml as mime_type by @onmete in https://github.com/meta-llama/llama-stack/pull/2575
- fix: remove @pytest.mark.asyncio from test_get_raw_document_text.py by @r3v5 in https://github.com/meta-llama/llama-stack/pull/2840
- test: skip flaky telemetry tests by @ehhuang in https://github.com/meta-llama/llama-stack/pull/2814
- fix: graceful SIGINT on server by @leseb in https://github.com/meta-llama/llama-stack/pull/2831
- fix: uvicorn respect log_config by @cdoern in https://github.com/meta-llama/llama-stack/pull/2842
- chore: merge --config and --template in server.py by @ehhuang in https://github.com/meta-llama/llama-stack/pull/2716
- chore: Adding Access Control for OpenAI Vector Stores methods by @franciscojavierarceo in https://github.com/meta-llama/llama-stack/pull/2772
- chore: Adding demo script and importing it into the docs by @franciscojavierarceo in https://github.com/meta-llama/llama-stack/pull/2848
- docs: minor fix of the pgvector provider spec description by @jeremychoi in https://github.com/meta-llama/llama-stack/pull/2847
- fix(agent): ensure turns are sorted by @omertuc in https://github.com/meta-llama/llama-stack/pull/2854
- chore: remove *_openai_compat providers by @ehhuang in https://github.com/meta-llama/llama-stack/pull/2849
- chore: Making name optional in openai_create_vector_store by @franciscojavierarceo in https://github.com/meta-llama/llama-stack/pull/2858
- fix(install): explicit docker.io usage by @omertuc in https://github.com/meta-llama/llama-stack/pull/2850
- chore(test): fix flaky telemetry tests by @Elbehery in https://github.com/meta-llama/llama-stack/pull/2815
- feat(registry): more flexible model lookup by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2859
- fix: optimize container build by enabling uv cache by @derekhiggins in https://github.com/meta-llama/llama-stack/pull/2855
- fix: honour deprecation of --config and --template by @leseb in https://github.com/meta-llama/llama-stack/pull/2856
- chore: add some documentation for access policy rules by @grs in https://github.com/meta-llama/llama-stack/pull/2785
- chore: create OpenAIMixin for inference providers with an OpenAI-compat API that need to implement openai_* methods by @mattf in https://github.com/meta-llama/llama-stack/pull/2835
- chore: Moving vector store and vector store files helper methods to openai_vector_store_mixin by @franciscojavierarceo in https://github.com/meta-llama/llama-stack/pull/2863
- fix: search mode validation for rag query by @Bobbins228 in https://github.com/meta-llama/llama-stack/pull/2857
- fix: bring back dell template by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2880
- fix: fixed test_access_control.py unit test by @r3v5 in https://github.com/meta-llama/llama-stack/pull/2876
- fix: cleanup after build_container.sh by @derekhiggins in https://github.com/meta-llama/llama-stack/pull/2869
- fix: prevent shell redirection issues with pip dependencies by @derekhiggins in https://github.com/meta-llama/llama-stack/pull/2867
- chore: Added openai compatible vector io endpoints for chromadb by @cheesecake100201 in https://github.com/meta-llama/llama-stack/pull/2489
- fix: starter template and litellm backward compat conflict for openai by @mattf in https://github.com/meta-llama/llama-stack/pull/2885
- fix: update check-workflows-use-hashes to use github error format by @Mohit-Gaur in https://github.com/meta-llama/llama-stack/pull/2875
- docs: Document use cases for Responses and Agents APIs by @ChristianZaccaria in https://github.com/meta-llama/llama-stack/pull/2756
- docs: Update CHANGELOG.md by @terrytangyuan in https://github.com/meta-llama/llama-stack/pull/2874
- chore(test): migrate unit tests from unittest to pytest nvidia test safety by @Elbehery in https://github.com/meta-llama/llama-stack/pull/2793
- test: Add VLLM provider support to integration tests by @derekhiggins in https://github.com/meta-llama/llama-stack/pull/2757
- fix: various improvements on install.sh by @leseb in https://github.com/meta-llama/llama-stack/pull/2724
- docs: update list of apis by @cdoern in https://github.com/meta-llama/llama-stack/pull/2697
- chore: add mypy coverage to inspect.py and library_client.py in /distribution by @SjjThaler in https://github.com/meta-llama/llama-stack/pull/2707
- feat(registry): make the Stack query providers for model listing by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2862
- chore: return webmethod from find_matching_route by @ehhuang in https://github.com/meta-llama/llama-stack/pull/2883
- chore: install script should use starter by @leseb in https://github.com/meta-llama/llama-stack/pull/2891
- docs: Update nvidia docs template by @kelbrown20 in https://github.com/meta-llama/llama-stack/pull/2893
- fix: use logger for console telemetry by @cdoern in https://github.com/meta-llama/llama-stack/pull/2844
- feat: Bring Your Own API (BYOA) by @leseb in https://github.com/meta-llama/llama-stack/pull/2228
- feat: add MCP Streamable HTTP support by @Cali0707 in https://github.com/meta-llama/llama-stack/pull/2554
- feat(auth): API access control by @ehhuang in https://github.com/meta-llama/llama-stack/pull/2822
- chore(deps): bump form-data from 4.0.2 to 4.0.4 in /llama_stack/ui by @dependabot[bot] in https://github.com/meta-llama/llama-stack/pull/2898
- ci: Remove `open-pull-requests-limit: 0` from dependabot.yml by @terrytangyuan in https://github.com/meta-llama/llama-stack/pull/2900
- chore(github-deps): bump astral-sh/setup-uv from 6.4.1 to 6.4.3 by @dependabot[bot] in https://github.com/meta-llama/llama-stack/pull/2902
- refactor: install external providers from module by @cdoern in https://github.com/meta-llama/llama-stack/pull/2637
- fix(registry): ensure clean shutdown by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2901
- chore: Fix chroma unit tests by @franciscojavierarceo in https://github.com/meta-llama/llama-stack/pull/2896
- feat: implement chunk deletion for vector stores by @derekhiggins in https://github.com/meta-llama/llama-stack/pull/2701
- feat: add auto-generated CI documentation pre-commit hook by @nathan-weinberg in https://github.com/meta-llama/llama-stack/pull/2890
- fix: separate build and run provider types by @cdoern in https://github.com/meta-llama/llama-stack/pull/2917
- feat(starter)!: simplify starter distro; litellm model registry changes by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2916
- test: upload logs for external provider tests by @cdoern in https://github.com/meta-llama/llama-stack/pull/2914
- fix: litellm_provider_name for llama-api by @mattf in https://github.com/meta-llama/llama-stack/pull/2934
- fix: switch refresh to debug log by @cdoern in https://github.com/meta-llama/llama-stack/pull/2933
- fix: Fix unit tests CI and failing tests by @ChristianZaccaria in https://github.com/meta-llama/llama-stack/pull/2928
- feat: implement dynamic model detection support for inference providers using litellm by @mattf in https://github.com/meta-llama/llama-stack/pull/2886
- fix: adjust provider type used in external provider test by @cdoern in https://github.com/meta-llama/llama-stack/pull/2921
- docs: remove provider_id from external docs by @cdoern in https://github.com/meta-llama/llama-stack/pull/2922
- feat(openai): add configurable base_url support with OPENAI_BASE_URL env var by @mattf in https://github.com/meta-llama/llama-stack/pull/2919
- fix(openai-compat): restrict developer/assistant/system/tool messages to text-only content by @mattf in https://github.com/meta-llama/llama-stack/pull/2932
- fix(dependabot): run pre-commit on dependabot PRs by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2935
- chore(packaging): remove requirements.txt by @ashwinb in https://github.com/meta-llama/llama-stack/pull/2938
- chore(python-deps): bump pydantic from 2.10.6 to 2.11.7 by @dependabot[bot] in https://github.com/meta-llama/llama-stack/pull/2925
- chore: revert [#2855] by @ehhuang in https://github.com/meta-llama/llama-stack/pull/2939
## New Contributors
- @r3v5 made their first contribution in https://github.com/meta-llama/llama-stack/pull/2777
- @syedriko made their first contribution in https://github.com/meta-llama/llama-stack/pull/2771
- @slekkala1 made their first contribution in https://github.com/meta-llama/llama-stack/pull/2817
- @Nehanth made their first contribution in https://github.com/meta-llama/llama-stack/pull/2804
- @onmete made their first contribution in https://github.com/meta-llama/llama-stack/pull/2575
- @jeremychoi made their first contribution in https://github.com/meta-llama/llama-stack/pull/2847
- @omertuc made their first contribution in https://github.com/meta-llama/llama-stack/pull/2854
- @Mohit-Gaur made their first contribution in https://github.com/meta-llama/llama-stack/pull/2875
- @SjjThaler made their first contribution in https://github.com/meta-llama/llama-stack/pull/2707
- @Cali0707 made their first contribution in https://github.com/meta-llama/llama-stack/pull/2554
**Full Changelog**: https://github.com/meta-llama/llama-stack/compare/v0.2.15...v0.2.16