v3.0.0

Name                              Modified    Size       Downloads / Week
README.md                         2022-10-24  778 Bytes
Release 3.0.0 source code.tar.gz  2022-10-24  2.6 MB
Release 3.0.0 source code.zip     2022-10-24  3.1 MB
Totals: 3 Items                               5.7 MB     0

It's been a long time since our last release (v2.2.0). Over the past year, we have focused on int8 quantization.

In this release, LightSeq supports int8 quantized training and inference. Compared with PyTorch QAT (quantization-aware training), LightSeq int8 training achieves a 3x speedup without any performance loss. Compared with the previous LightSeq fp16 inference, the int8 engine achieves a speedup of up to 1.7x.

The LightSeq int8 engine supports multiple models, such as Transformer, BERT, and GPT. For int8 training, users only need to enable quantization mode on the model by calling model.apply(enable_quant). For int8 inference, users only need to use QuantTransformer in place of the fp16 Transformer.
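
To illustrate why a single model.apply(enable_quant) call can switch a whole model into quantization mode, here is a minimal sketch in plain Python. It is not LightSeq code: the Module, QuantLinear, and Encoder classes, the quant_mode attribute, and this enable_quant function are all hypothetical stand-ins. The apply method mirrors the behavior of torch.nn.Module.apply, which calls the given function on every submodule and then on the module itself.

```python
# Illustrative stand-ins only; none of these classes come from LightSeq.


class Module:
    """Tiny substitute for torch.nn.Module, just enough to show apply()."""

    def __init__(self):
        self.children = []

    def apply(self, fn):
        # Visit every child first, then this module, and return self,
        # which is the same order torch.nn.Module.apply uses.
        for child in self.children:
            child.apply(fn)
        fn(self)
        return self


class QuantLinear(Module):
    """Hypothetical layer that carries a quantization switch."""

    def __init__(self):
        super().__init__()
        self.quant_mode = False  # assumed attribute name, off by default


class Encoder(Module):
    """Hypothetical container holding several quantizable layers."""

    def __init__(self, num_layers):
        super().__init__()
        self.children = [QuantLinear() for _ in range(num_layers)]


def enable_quant(module):
    # Hypothetical callback: flip the switch only on layers that have one.
    if hasattr(module, "quant_mode"):
        module.quant_mode = True


model = Encoder(num_layers=3)
model.apply(enable_quant)

# Collect the switch state of every layer after the single apply() call.
quantized = [layer.quant_mode for layer in model.children]
```

Because the callback is applied recursively, the caller does not need to know which layers are quantizable; layers without the switch are simply left untouched.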

Other changes in this release include support for new models such as MoE, bug fixes, and performance improvements.

Source: README.md, updated 2022-10-24