| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-03-01 | 754 Bytes | |
| v1.4.1 -- ORT optimization + ONNX GPU benchmarks source code.tar.gz | 2026-03-01 | 1.4 MB | |
| v1.4.1 -- ORT optimization + ONNX GPU benchmarks source code.zip | 2026-03-01 | 1.5 MB | |
| Totals: 3 Items | | 2.9 MB | 0 |
Changes
- ORT_ENABLE_ALL graph optimization on ONNX inference sessions
- Suppress Memcpy transformer warnings (log_severity_level=3)
- MiniCheck AggreFact benchmark (benchmarks/aggrefact_minicheck.py)
- ONNX GPU batch benchmarks: 14.6 ms/pair (DeBERTa-v3-Large, GTX 1060)
- Zenodo DOI: 10.5281/zenodo.18822167
- Version bump to 1.4.1
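
The first two changes map onto standard ONNX Runtime session options. The sketch below shows one way they could be set. It is not the project's actual code: the function name `build_session`, the `use_gpu` flag, and the provider fallback logic are illustrative assumptions.

```python
# ONNX Runtime severity levels: 0=verbose, 1=info, 2=warning, 3=error, 4=fatal.
# Level 3 hides warnings, which covers the Memcpy transformer messages.
LOG_SEVERITY_ERROR = 3


def build_session(model_path, use_gpu=True):
    """Create an inference session with full graph optimization (hypothetical helper)."""
    # Imported lazily so this module loads even where onnxruntime is not installed.
    import onnxruntime as ort

    opts = ort.SessionOptions()
    opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_ALL
    opts.log_severity_level = LOG_SEVERITY_ERROR

    # Assumption: prefer CUDA when available, otherwise fall back to CPU.
    providers = ["CPUExecutionProvider"]
    if use_gpu and "CUDAExecutionProvider" in ort.get_available_providers():
        providers.insert(0, "CUDAExecutionProvider")

    return ort.InferenceSession(model_path, sess_options=opts, providers=providers)
```

Calling `build_session("model.onnx")` (a placeholder path) would return a session ready for `session.run(...)`.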
Performance (FactCG-DeBERTa-v3-Large)
| Backend | Latency (ms/pair) |
|---|---|
| PyTorch GPU batch | 19.0 |
| ONNX GPU batch | 14.6 |
| Production NLIScorer | 13.2 |
| Raw ORT (short inputs) | 9.0 |
| Heuristic (no model) | 0.03 |
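
For a quick comparison, the latencies above can be turned into speedup ratios and throughput. This is a small arithmetic sketch using only the table's figures. The helper names are illustrative, and the rows may have been measured under different conditions (for example, the raw ORT row uses short inputs), so the ratios are indicative only.

```python
# Latencies in ms per (premise, hypothesis) pair, copied from the table above.
latency_ms = {
    "PyTorch GPU batch": 19.0,
    "ONNX GPU batch": 14.6,
    "Production NLIScorer": 13.2,
    "Raw ORT (short inputs)": 9.0,
    "Heuristic (no model)": 0.03,
}


def speedup(baseline, candidate):
    """How many times faster `candidate` is than `baseline`."""
    return latency_ms[baseline] / latency_ms[candidate]


def pairs_per_second(backend):
    """Throughput implied by the per-pair latency."""
    return 1000.0 / latency_ms[backend]


if __name__ == "__main__":
    print(f"ONNX vs PyTorch: {speedup('PyTorch GPU batch', 'ONNX GPU batch'):.2f}x")
    print(f"ONNX throughput: {pairs_per_second('ONNX GPU batch'):.1f} pairs/s")
```

With these numbers, ONNX GPU batch comes out roughly 1.3x faster than PyTorch GPU batch, or about 68 pairs per second.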
Accuracy
75.8% balanced accuracy on LLM-AggreFact (29K samples, 4th on leaderboard)
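
For reference, balanced accuracy is the mean of per-class recall, so it is not inflated when one label dominates. The sketch below implements that standard definition on toy labels. It is not the benchmark script, and how the leaderboard aggregates across its constituent datasets is not stated here.

```python
def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall over the classes present in y_true."""
    recalls = []
    for cls in sorted(set(y_true)):
        # Indices of all examples whose gold label is this class.
        idx = [i for i, t in enumerate(y_true) if t == cls]
        correct = sum(1 for i in idx if y_pred[i] == cls)
        recalls.append(correct / len(idx))
    return sum(recalls) / len(recalls)


if __name__ == "__main__":
    # Toy data: 1 = supported, 0 = unsupported (made-up labels for illustration).
    y_true = [1, 1, 1, 1, 0, 0]
    y_pred = [1, 1, 1, 0, 0, 1]
    print(balanced_accuracy(y_true, y_pred))
```

On the toy data, recall is 3/4 for class 1 and 1/2 for class 0, giving a balanced accuracy of 0.625, while plain accuracy would be 4/6.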