Name  Modified  Size
win64-torchserve-0.9.0-py38_0.tar.bz2 2023-10-13 23.9 MB
win64-torchserve-0.9.0-py310_0.tar.bz2 2023-10-13 23.9 MB
osx64-torchserve-0.9.0-py38_0.tar.bz2 2023-10-13 23.8 MB
win64-torchserve-0.9.0-py311_0.tar.bz2 2023-10-13 23.9 MB
osx64-torchserve-0.9.0-py310_0.tar.bz2 2023-10-13 23.8 MB
win64-torchserve-0.9.0-py39_0.tar.bz2 2023-10-13 23.9 MB
osx64-torchserve-0.9.0-py311_0.tar.bz2 2023-10-13 23.9 MB
linux64-torchserve-0.9.0-py39_0.tar.bz2 2023-10-13 23.8 MB
osx64-torchserve-0.9.0-py39_0.tar.bz2 2023-10-13 23.8 MB
linux64-torchserve-0.9.0-py311_0.tar.bz2 2023-10-13 23.9 MB
linux64-torchserve-0.9.0-py310_0.tar.bz2 2023-10-13 23.8 MB
linux64-torchserve-0.9.0-py38_0.tar.bz2 2023-10-13 23.8 MB
torch_workflow_archiver-0.2.11-py3-none-any.whl 2023-10-13 12.6 kB
torch_model_archiver-0.9.0-py3-none-any.whl 2023-10-13 14.7 kB
torchserve-0.9.0-py3-none-any.whl 2023-10-13 23.8 MB
README.md 2023-10-12 6.2 kB
TorchServe v0.9.0 Release Notes source code.tar.gz 2023-10-12 33.7 MB
TorchServe v0.9.0 Release Notes source code.zip 2023-10-12 34.2 MB
Totals: 18 items, 378.0 MB

This is the release of TorchServe v0.9.0.

Security

Our security process is documented in our security policy.

We rely heavily on automation to improve the security of TorchServe, namely:

  1. Updating our Gradle and pip dependencies on a monthly basis
  2. Docker scanning via Snyk
  3. Code analysis via CodeQL

A key point to remember is that TorchServe will allow you to configure things in an insecure way, so make sure to read our security docs and the relevant security warnings to ensure your product is secure in production. In general, we do not encourage you to download untrusted .mar files from the internet: running a .mar file is effectively running arbitrary Python code, so unzip .mar files and validate that they are not doing anything suspicious.
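
Since a .mar file is a standard zip archive, you can inspect its contents without running it. A minimal sketch (the archive name here is hypothetical):

```python
import zipfile

# Hypothetical archive name, for illustration only.
archive = "my_model.mar"

with zipfile.ZipFile(archive) as mar:
    # List every file packed into the archive, including the handler code.
    for name in mar.namelist():
        print(name)
    # The manifest records the model name, handler, and serialized weights file.
    print(mar.read("MAR-INF/MANIFEST.json").decode("utf-8"))
```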

Code scanning fixes

  1. Used SHA-256 in ziputils [#2629] @msaroufim
  2. Verified the default hostname in tests [#2631] @msaroufim
  3. Fixed a zip slip error [#2634] @msaroufim
  4. Used a string array as Process arguments input [#2632] [#2635] @msaroufim
  5. Enabled Netty HTTP header validation by default [#2630] @msaroufim
  6. Verified the 3rd party package installation path [#2687] @lxning
  7. Improved allowed-URL validation [#2685] @lxning (see the sketch below), including:
     • Disabled loading TS_ALLOWED_URLS from the environment by default
     • Moved the model URL validation to the last step
     • Sanity-checked the model archive name to guard against uncontrolled data used in a path expression
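
To illustrate the allow-list idea behind [#2685]: the sketch below is a conceptual illustration only, not TorchServe's actual code; the pattern list and helper function are hypothetical.

```python
import re

# Hypothetical allow-list of model URL patterns, analogous in spirit to
# TorchServe's allowed_urls setting (illustration only).
ALLOWED_URL_PATTERNS = [
    r"https://models\.example\.com/[\w\-/]+\.mar",
]

def is_allowed_model_url(url: str) -> bool:
    # A URL passes validation only if it fully matches an allowed pattern.
    return any(re.fullmatch(pattern, url) for pattern in ALLOWED_URL_PATTERNS)

print(is_allowed_model_url("https://models.example.com/resnet18.mar"))   # True
print(is_allowed_model_url("http://untrusted.example.net/payload.mar"))  # False
```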

Address configuration updates

  1. Updated default address from 0.0.0.0 to 127.0.0.1 [#2624] [#2704] @namannandan @agunapal
  2. Bound container ports to localhost ports [#2646] @namannandan
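
With the new default, the inference and management APIs are reachable only from the local machine unless you explicitly configure otherwise. A quick way to confirm this from the host (assuming a locally running TorchServe on the default inference port, with the requests package installed):

```python
import requests

# With the 127.0.0.1 default, this request succeeds only from the local machine.
resp = requests.get("http://127.0.0.1:8080/ping", timeout=5)
print(resp.status_code, resp.json())  # expected: 200 {'status': 'Healthy'}
```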

Documentation improvements

  1. Updated security readme [#2643] [#2690] @msaroufim @agunapal
  2. Updated security guidance in docker readme [#2669] @agunapal

Dependency improvements

  1. Created dependabot.yml [#2642] [#2675] @msaroufim
  2. Bumped packaging from 23.1 to 23.2
  3. Bumped pygit2 from 1.12.1 to 1.13.1
  4. Bumped com.github.spotbugs from 4.0.2 to 5.1.3
  5. Bumped ONNX from 1.14.0 to 1.14.1
  6. Bumped Pillow from 9.3.0 to 10.0.1
  7. Bumped com.amazonaws:DynamoDBLocal from 1.13.2 to 2.0.0
  8. Upgraded node to version 18 [#2663] @agunapal

New Features

  • Supported PyTorch 2.1.0 and Python 3.11 [#2621] [#2691] [#2697] @agunapal
  • Supported continuous batching for single GPU LLM inference [#2628] @mreso @lxning
  • Supported dynamically loading 3rd party package on SageMaker Multi-Model Endpoint [#2535] @lxning
  • Added a DALI handler for preprocessing and updated the NVIDIA DALI example [#2485] @jagadeeshi2i

New Examples

  1. Deploy Llama2 on Inferentia2 [#2458] @namannandan
  2. Using TorchServe on SageMaker Inf2.24xlarge with Llama2-13B @lxning
  3. PyTorch tensor parallel on Llama2 example [#2623] [#2689] @HamidShojanazeri
  4. Enabled Better Transformer (i.e., Flash Attention 2) on Llama2 [#2700] @HamidShojanazeri @lxning
  5. Llama2 Chatbot on Mac [#2618] @agunapal
  6. ASR speech recognition example [#2047] @husenzhang

Improvements

  • Fixed typo in BaseHandler [#2547] @a-ys
  • Created a merge_queue workflow for CI [#2548] @msaroufim
  • Fixed typo in artifact terminology unification [#2551] @park12sj
  • Added env hints in model_service_worker [#2540] @ZachOBrien
  • Refactored conda build scripts to publish all binaries [#2561] @agunapal
  • Fixed response return type in KServe [#2566] @jagadeeshi2i
  • Added torchserve-kfs nightly build [#2574] @jagadeeshi2i
  • Added regression for all CPU binaries [#2562] @agunapal
  • Updated CI/CD runners [#2586] [#2597] [#2636] [#2627] [#2677] [#2710] [#2696] @agunapal @msaroufim
  • Upgraded newman version to 5.3.2 [#2598] [#2603] @agunapal
  • Updated opt benchmark config for inf2 [#2617] @namannandan
  • Added ModelRequestEncoderTest [#2580] @abergmeier
  • Added a manually dispatched workflow [#2686] @msaroufim
  • Updated test wheels with PyTorch 2.1.0 [#2684] @agunapal
  • Allowed parallel level = 1 to run in torchrun mode [#2608] @lxning
  • Fixed metric unit assignment backward compatibility [#2693] @namannandan

Documentation

  • Updated MPS readme [#2543] @sekyondaMeta
  • Updated large model inference readme [#2542] @sekyondaMeta
  • Fixed bash snippets in examples/image_classifier/mnist/Docker.md [#2345] @dmitsf
  • Fixed typo in kubernetes/autoscale.md [#2393] @CandiedCode
  • Fixed path in examples/image_classifier/resnet_18/README.md [#2568] @udaij12
  • Added model loading guidance [#2592] @agunapal
  • Updated Metrics readme [#2560] @sekyondaMeta
  • Display nightly workflow status badge in README [#2619] [#2666] @agunapal @msaroufim
  • Updated torch.compile information in examples/pt2/README.md [#2706] @agunapal
  • Added a tutorial on deploying models using TorchServe on SageMaker @lxning

Platform Support

Ubuntu 16.04, Ubuntu 18.04, Ubuntu 20.04; macOS 10.14+; Windows 10 Pro, Windows Server 2019; Windows Subsystem for Linux (Windows Server 2019, WSLv1, Ubuntu 18.04). TorchServe now requires Python 3.8 and above, and JDK 17.

GPU Support

  • Torch 2.1.0 + CUDA 11.8, 12.1
  • Torch 2.0.1 + CUDA 11.7
  • Torch 2.0.0 + CUDA 11.7
  • Torch 1.13 + CUDA 11.7
  • Torch 1.11 + CUDA 10.2, 11.3, 11.6
  • Torch 1.9.0 + CUDA 11.1
  • Torch 1.8.1 + CUDA 9.2
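
To check which Torch/CUDA pairing your environment provides and compare it against the supported combinations above, a minimal sketch:

```python
import torch

# Report the installed PyTorch version and the CUDA version it was built with.
print("torch:", torch.__version__)
print("cuda:", torch.version.cuda)  # None on CPU-only builds
print("gpu available:", torch.cuda.is_available())
```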

Source: README.md, updated 2023-10-12