| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-03-04 | 5.5 kB | |
| Release 2.25.0 source code.tar.gz | 2026-03-04 | 76.7 MB | |
| Release 2.25.0 source code.zip | 2026-03-04 | 77.3 MB | |
| Totals: 3 Items | | 154.0 MB | 0 |
# 2.25.0

## Changes

### Triton-based Inference Architecture

Marqo's inference layer has been restructured from a monolithic design into three dedicated components:
- Inference Orchestrator — A FastAPI service that coordinates inference requests. (https://github.com/marqo-ai/marqo/pull/1315)
- Model Management Container — A FastAPI service for managing ML model lifecycles with Triton Inference Server, including model loading/unloading, health checks, and environment variable consistency. (https://github.com/marqo-ai/marqo/pull/1322)
- Marqo API adaptations — The core Marqo API has been updated to work with the new Triton-backed components. (https://github.com/marqo-ai/marqo/pull/1317)
This architecture enables independent scaling and deployment of inference, model management, and search API layers.
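
To make the model lifecycle operations above more concrete, here is a minimal sketch of the HTTP routes that Triton Inference Server itself exposes for readiness checks and for loading and unloading models. It is not taken from the Marqo codebase: the release notes do not say how the Model Management Container talks to Triton, so the helper name `triton_request`, the base URL (Triton's default HTTP port 8000), and the model name `my-model` are all illustrative assumptions.

```python
from urllib.parse import quote
from urllib.request import Request

# Assumption: Triton's HTTP endpoint on its default port 8000.
TRITON_URL = "http://localhost:8000"


def triton_request(action: str, model_name: str = "",
                   base_url: str = TRITON_URL) -> Request:
    """Build (but do not send) a request for one model-lifecycle action.

    Hypothetical helper; only the URL paths are Triton's own HTTP API.
    """
    name = quote(model_name, safe="")
    routes = {
        # Server-wide readiness probe.
        "server_ready": ("GET", "/v2/health/ready"),
        # Per-model readiness probe.
        "model_ready": ("GET", f"/v2/models/{name}/ready"),
        # Model repository extension: explicit load and unload.
        "load": ("POST", f"/v2/repository/models/{name}/load"),
        "unload": ("POST", f"/v2/repository/models/{name}/unload"),
    }
    if action not in routes:
        raise ValueError(f"unknown action: {action!r}")
    if action != "server_ready" and not model_name:
        raise ValueError(f"action {action!r} requires a model name")
    method, path = routes[action]
    return Request(base_url.rstrip("/") + path, method=method)


if __name__ == "__main__":
    for action in ("server_ready", "load", "model_ready", "unload"):
        req = triton_request(action, "my-model")
        print(req.get_method(), req.full_url)
```

Passing the returned object to `urllib.request.urlopen` would send the request. The load and unload routes only take effect when Triton runs with `--model-control-mode=explicit`; a management service in this style would typically wrap such calls behind its own API so that callers never address Triton directly.
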
## Other Changes
- Centralize the model registry in a shared components/common package (marqo-common). (https://github.com/marqo-ai/marqo/pull/1356)
- Fix model download auth handling to support public S3 buckets. (https://github.com/marqo-ai/marqo/pull/1352)