| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| gpustack-2.0.3-py3-none-any.whl | 2026-01-09 | 12.6 MB | |
| README.md | 2026-01-09 | 1.4 kB | |
| v2.0.3 source code.tar.gz | 2026-01-09 | 42.3 MB | |
| v2.0.3 source code.zip | 2026-01-09 | 42.7 MB | |
| Totals: 4 Items | | 97.6 MB | 5 |
Bug Fixes
- Fixed an issue where the Gateway timed out after 3 minutes. (Issue [#4175])
- Fixed the `copy-images` command's `--platform` flag not filtering architectures correctly. (Issue [#4173])
- Fixed a failure when running vLLM distributed inference with NPU. (Issue [#4171])
- Fixed an error in the model serving logs. (Issue [#4156])
- Fixed the `mem-fraction-static` parameter for SGLang not being accounted for during scheduling. (Issue [#4153])
- Fixed the Higress access log file increasing rapidly in size. (Issue [#4150])
- Fixed deployment with a custom backend still reporting an “Unrecognized architecture” error. (Issue [#4146])
- Fixed model instances oscillating between pending and analyzing after scaling a model's replicas from 0 to 5. (Issue [#4138])
- Fixed an issue where the TLS port check blocked startup when TLS was not configured. (Issue [#4127])
- Fixed a duplicated model ingress being created after a model was deleted. (Issue [#4125])
- Fixed AMD GPUs not being detected. (Issues [#4123], [#4116])
- Fixed unexpected errors when calling the embeddings and reranker APIs. (Issues [#4114], [#4113])
- Fixed an incorrect status code being returned when a model had no running instances. (Issue [#4103])
- Fixed a “No HIP GPUs are available” error when using additional GPUs beyond the first one in an AMD GPU worker. (Issue [#4033])