Name | Modified | Size | Downloads / Week |
---|---|---|---|
add GPU quantization support.tar.gz | 2021-12-08 | 3.4 MB | |
add GPU quantization support.zip | 2021-12-08 | 3.4 MB | |
README.md | 2021-12-08 | 957 Bytes | |
Totals: 3 Items | | 6.9 MB | 0 |
- support INT-8 GPU quantization
- add a tutorial to perform quantization end to end
- add `QDQRoberta` model
- switch to ONNX opset 13
- refactor the TensorRT engine creation
- fix bugs
- add auth token (for private Hugging Face repos)
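
For readers unfamiliar with the "QDQ" prefix above: it usually stands for quantize/dequantize, where a tensor is rounded to INT-8 and immediately mapped back to floating point so the rounding error is visible to the model. The sketch below illustrates that arithmetic only. It assumes a symmetric, per-tensor scale, and every function name in it is hypothetical; it is not the code used by this release.

```python
# Illustrative sketch of INT-8 quantize/dequantize (QDQ) arithmetic.
# Assumptions: symmetric per-tensor scaling; all names here are hypothetical
# and are not part of transformer-deploy, TensorRT, or ONNX.


def compute_scale(values, num_bits=8):
    """Return a scale mapping the largest magnitude to the largest signed integer."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for INT-8
    amax = max(abs(v) for v in values)
    return amax / qmax if amax > 0 else 1.0


def quantize(values, scale, num_bits=8):
    """Round each value to the nearest integer step and clamp to [-qmax, qmax]."""
    qmax = 2 ** (num_bits - 1) - 1
    return [max(-qmax, min(qmax, round(v / scale))) for v in values]


def dequantize(q_values, scale):
    """Map integer steps back to floating point."""
    return [q * scale for q in q_values]


def fake_quantize(values, num_bits=8):
    """Quantize then dequantize, returning floats that carry the rounding error."""
    scale = compute_scale(values, num_bits)
    return dequantize(quantize(values, scale, num_bits), scale)


weights = [0.02, -1.27, 0.64, 0.5]
scale = compute_scale(weights)
q_weights = quantize(weights, scale)
approx = fake_quantize(weights)

for w, q, a in zip(weights, q_weights, approx):
    print(f"{w:+.4f} -> {q:+4d} -> {a:+.4f}")
```

Each reconstructed value differs from the original by at most half a quantization step (`scale / 2`), which is the error a quantized model has to tolerate.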
What's Changed
- Update triton by @pommedeterresautee in https://github.com/ELS-RD/transformer-deploy/pull/11
- fix README.md by @pommedeterresautee in https://github.com/ELS-RD/transformer-deploy/pull/13
- Fix install errors by @sam-writer in https://github.com/ELS-RD/transformer-deploy/pull/20
- Add auth token by @sam-writer in https://github.com/ELS-RD/transformer-deploy/pull/19
- Support GPU INT-8 quantization by @pommedeterresautee in https://github.com/ELS-RD/transformer-deploy/pull/15
New Contributors
- @sam-writer made their first contribution in https://github.com/ELS-RD/transformer-deploy/pull/20
Full Changelog: https://github.com/ELS-RD/transformer-deploy/compare/v0.1.1...v0.2.0