Name Modified Size
torchserve-0.11.1-py3-none-any.whl 2024-07-18 24.4 MB
win-64_torchserve-0.11.1-py311_0.tar.bz2 2024-07-18 24.5 MB
win-64_torchserve-0.11.1-py310_0.tar.bz2 2024-07-18 24.5 MB
win-64_torchserve-0.11.1-py39_0.tar.bz2 2024-07-18 24.5 MB
win-64_torchserve-0.11.1-py38_0.tar.bz2 2024-07-18 24.5 MB
osx-arm64_torchserve-0.11.1-py311_0.tar.bz2 2024-07-18 24.5 MB
osx-arm64_torchserve-0.11.1-py310_0.tar.bz2 2024-07-18 24.4 MB
osx-arm64_torchserve-0.11.1-py39_0.tar.bz2 2024-07-18 24.4 MB
osx-arm64_torchserve-0.11.1-py38_0.tar.bz2 2024-07-18 24.4 MB
osx-64_torchserve-0.11.1-py311_0.tar.bz2 2024-07-18 24.5 MB
osx-64_torchserve-0.11.1-py310_0.tar.bz2 2024-07-18 24.4 MB
osx-64_torchserve-0.11.1-py39_0.tar.bz2 2024-07-18 24.4 MB
linux-aarch64_torchserve-0.11.1-py311_0.tar.bz2 2024-07-18 24.5 MB
linux-aarch64_torchserve-0.11.1-py310_0.tar.bz2 2024-07-18 24.4 MB
linux-aarch64_torchserve-0.11.1-py39_0.tar.bz2 2024-07-18 24.4 MB
linux-aarch64_torchserve-0.11.1-py38_0.tar.bz2 2024-07-18 24.4 MB
osx-64_torchserve-0.11.1-py38_0.tar.bz2 2024-07-18 24.4 MB
linux-64_torchserve-0.11.1-py311_0.tar.bz2 2024-07-18 24.5 MB
linux-64_torchserve-0.11.1-py310_0.tar.bz2 2024-07-18 24.4 MB
linux-64_torchserve-0.11.1-py39_0.tar.bz2 2024-07-18 24.4 MB
linux-64_torchserve-0.11.1-py38_0.tar.bz2 2024-07-18 24.4 MB
README.md 2024-07-16 9.5 kB
TorchServe v0.11.1 Release Notes source code.tar.gz 2024-07-16 63.0 MB
TorchServe v0.11.1 Release Notes source code.zip 2024-07-16 63.7 MB
Totals: 24 Items, 640.3 MB

This is the release of TorchServe v0.11.1.

Highlights Include

  • Security Updates
    • Token Authorization: TorchServe now enforces token authorization by default, so every HTTP/S or gRPC API call must supply the correct token. This protects a running TorchServe instance against unauthorized API calls. When the feature is enabled, TorchServe creates a key file containing the tokens to use for API calls. Users can disable the feature if they do not want API calls to require a token. For more details, refer to the token authorization documentation: https://github.com/pytorch/serve/blob/master/docs/token_authorization_api.md
    • Model API Control: By default, TorchServe no longer allows models to be registered or deleted through HTTP/S or gRPC API calls once it is running. This guards against a user uploading malicious code to the model server in the form of a model, or deleting a model that is in use. Model API control can be enabled to allow users to register and delete models through the TorchServe model load and delete APIs. For more details, refer to the model API control documentation: https://github.com/pytorch/serve/blob/master/docs/model_api_control.md
  • PyTorch 2.x updates
    • Standardized torch.compile configuration
    • Added examples for TensorRT and HPU backends
  • GenAI updates
    • Support continuous batching in sequence batch streaming
    • Asynchronous backend worker communication for continuous batching
    • No code LLM deployment
  • Support for Intel GPUs
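
As an illustration of the token authorization highlight, the following minimal Python sketch builds an authorized inference request. It rests on several assumptions taken from the linked token authorization documentation rather than from these release notes: that the key file is JSON (named `key_file.json` here) with `management`, `inference`, and `API` entries that each hold a `key` field, and that the token is sent as an `Authorization: Bearer <token>` header. Verify both against the documentation for your version. The base URL and the model name `my_model` are placeholders.

```python
import json
import urllib.request
from pathlib import Path


def load_token(key_file: str, kind: str = "inference") -> str:
    """Return one token from a TorchServe key file.

    ASSUMPTION: the file looks like
    {"management": {"key": "..."}, "inference": {"key": "..."}, "API": {"key": "..."}}
    """
    keys = json.loads(Path(key_file).read_text())
    return keys[kind]["key"]


def build_inference_request(
    base_url: str, model_name: str, token: str, payload: bytes
) -> urllib.request.Request:
    """Build (but do not send) a POST request that carries the token.

    ASSUMPTION: the token is passed as a bearer token in the Authorization header.
    """
    return urllib.request.Request(
        url=f"{base_url}/predictions/{model_name}",
        data=payload,
        headers={"Authorization": f"Bearer {token}"},
        method="POST",
    )
```

With a running server, something like `urllib.request.urlopen(build_inference_request("http://localhost:8080", "my_model", load_token("key_file.json"), data))` would send the request. Calls to the management API would presumably use the `management` token instead.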

Platform Support

Ubuntu 20.04, macOS 10.14+, Windows 10 Pro, Windows Server 2019, and Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe requires Python >= 3.8 and JDK 17.
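
The Python and JDK requirements can be checked before installation. The sketch below is one hypothetical way to do that: the function names are illustrative, and it reads the JDK requirement strictly, as major version 17 exactly, which is an assumption (a newer JDK may also work).

```python
import re
from typing import Optional


def java_major_version(version_output: str) -> Optional[int]:
    """Extract the major Java version from `java -version` style output."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if match is None:
        return None
    major = int(match.group(1))
    # Legacy scheme: Java 8 and earlier report themselves as "1.8", "1.7", ...
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def meets_requirements(python_version, java_output: str) -> bool:
    """Check Python >= 3.8 and (ASSUMPTION) a JDK major version of exactly 17."""
    python_ok = tuple(python_version[:2]) >= (3, 8)
    return python_ok and java_major_version(java_output) == 17
```

In practice, `python_version` would be `sys.version_info`, and `java_output` would be the text that `java -version` writes to standard error, for example captured with `subprocess.run(["java", "-version"], capture_output=True, text=True).stderr`.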

GPU Support Matrix

| TorchServe version | PyTorch version | Python | Stable CUDA | Experimental CUDA |
| --- | --- | --- | --- | --- |
| 0.11.1 | 2.3.0 | >=3.8, <=3.11 | CUDA 11.8, CUDNN 8.7.0.84 | CUDA 12.1, CUDNN 8.9.2.26 |
| 0.11.0 | 2.3.0 | >=3.8, <=3.11 | CUDA 11.8, CUDNN 8.7.0.84 | CUDA 12.1, CUDNN 8.9.2.26 |
| 0.10.0 | 2.2.1 | >=3.8, <=3.11 | CUDA 11.8, CUDNN 8.7.0.84 | CUDA 12.1, CUDNN 8.9.2.26 |
| 0.9.0 | 2.1 | >=3.8, <=3.11 | CUDA 11.8, CUDNN 8.7.0.84 | CUDA 12.1, CUDNN 8.9.2.26 |
| 0.8.0 | 2.0 | >=3.8, <=3.11 | CUDA 11.7, CUDNN 8.5.0.96 | CUDA 11.8, CUDNN 8.7.0.84 |
| 0.7.0 | 1.13 | >=3.7, <=3.10 | CUDA 11.6, CUDNN 8.3.2.44 | CUDA 11.7, CUDNN 8.5.0.96 |
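
For scripted environment checks, the GPU support matrix above can be encoded as a small lookup table. The following is a minimal sketch: the dictionary layout and the `python_supported` helper are illustrative names rather than part of TorchServe, and the values are transcribed from the matrix.

```python
# Rows transcribed from the GPU support matrix; Python bounds are inclusive.
GPU_SUPPORT = {
    "0.11.1": {"pytorch": "2.3.0", "python": ((3, 8), (3, 11)), "stable_cuda": "11.8", "experimental_cuda": "12.1"},
    "0.11.0": {"pytorch": "2.3.0", "python": ((3, 8), (3, 11)), "stable_cuda": "11.8", "experimental_cuda": "12.1"},
    "0.10.0": {"pytorch": "2.2.1", "python": ((3, 8), (3, 11)), "stable_cuda": "11.8", "experimental_cuda": "12.1"},
    "0.9.0": {"pytorch": "2.1", "python": ((3, 8), (3, 11)), "stable_cuda": "11.8", "experimental_cuda": "12.1"},
    "0.8.0": {"pytorch": "2.0", "python": ((3, 8), (3, 11)), "stable_cuda": "11.7", "experimental_cuda": "11.8"},
    "0.7.0": {"pytorch": "1.13", "python": ((3, 7), (3, 10)), "stable_cuda": "11.6", "experimental_cuda": "11.7"},
}


def python_supported(torchserve_version: str, python_version) -> bool:
    """Return True if the (major, minor) Python version is within the listed range."""
    low, high = GPU_SUPPORT[torchserve_version]["python"]
    return low <= tuple(python_version[:2]) <= high
```

For example, `python_supported("0.11.1", sys.version_info)` checks the running interpreter against the 0.11.1 row. A TorchServe version that is not in the table raises `KeyError`.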

Inferentia2 Support Matrix

| TorchServe version | PyTorch version | Python | Neuron SDK |
| --- | --- | --- | --- |
| 0.11.1 | 2.1 | >=3.8, <=3.11 | 2.18.2+ |
| 0.11.0 | 2.1 | >=3.8, <=3.11 | 2.18.2+ |
| 0.10.0 | 1.13 | >=3.8, <=3.11 | 2.16+ |
| 0.9.0 | 1.13 | >=3.8, <=3.11 | 2.13.2+ |
Source: README.md, updated 2024-07-16