| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2025-11-05 | 1.4 kB | |
| v0.7.0_ add options for faster diffusion inference_ shared variable caching, efficient bias fusion, and TF32 acceleration. source code.tar.gz | 2025-11-05 | 45.5 MB | |
| v0.7.0_ add options for faster diffusion inference_ shared variable caching, efficient bias fusion, and TF32 acceleration. source code.zip | 2025-11-05 | 45.6 MB | |
| Totals: 3 items | | 91.0 MB | 0 |
## What's Changed
We're excited to announce the open-source release of Protenix v0.7.0, supported by @yangyanpinghpc, featuring several performance optimizations for diffusion inference. This version introduces three new optional acceleration flags (enabled by default at the inference stage) and improved support for batched inference:

- `--enable_cache`: precomputes and caches shared intermediate variables (`pair_z`, `p_lm`, `c_l`) across the N_sample and N_step dimensions.
- `--enable_fusion`: fuses bias transformations and normalization in the 24-layer diffusion transformer blocks at compile time.
- `--enable_tf32`: enables TF32 precision for matrix multiplications when computing in FP32, trading a slight loss of numerical accuracy for speed.
- Batched diffusion support (N_sample > 1): shares `s_trunk` and `z_pair` across the N_sample dimension during diffusion, reducing memory and compute overhead without affecting results.
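The shared-variable caching idea behind `--enable_cache` can be sketched as follows. This is a hypothetical, framework-free illustration (not Protenix internals): an intermediate that depends on neither the diffusion step nor the sample index, named `pair_z` after the release notes, is computed once and reused across every N_sample × N_step iteration instead of being recomputed each time.

```python
from functools import lru_cache

# Counter so we can observe how many times the expensive step actually runs.
calls = {"pair_z": 0}

def compute_pair_z(token_count: int) -> list:
    # Stand-in for an expensive pair-representation transform.
    calls["pair_z"] += 1
    return [[i * j for j in range(token_count)] for i in range(token_count)]

@lru_cache(maxsize=None)
def cached_pair_z(token_count: int) -> tuple:
    # Cached wrapper: repeated calls with the same input hit the cache.
    # (Tuples make the result hashable/immutable for safe reuse.)
    return tuple(map(tuple, compute_pair_z(token_count)))

def run_diffusion(n_sample: int, n_step: int, token_count: int) -> int:
    # pair_z is step- and sample-independent, so every iteration reuses it.
    for _ in range(n_sample):
        for _ in range(n_step):
            _ = cached_pair_z(token_count)
    return calls["pair_z"]
```

With caching enabled, the expensive transform runs once regardless of N_sample and N_step; without it, the cost would scale with N_sample × N_step.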
You can run it using the following example command (note: if not specified, `--enable_cache`, `--enable_fusion`, and `--enable_tf32` default to `true`):

```shell
protenix predict -i examples/example.json -o ./test_outputs/cmd/output_mini -s 105,106 -n "protenix_mini_default_v0.5.0" --triatt_kernel "torch" --trimul_kernel "torch" --enable_cache true --enable_fusion true --enable_tf32 true
```
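For intuition on the precision trade-off behind `--enable_tf32`: the TF32 format keeps FP32's 8-bit exponent but shortens the mantissa from 23 bits to 10, which is what lets tensor cores run FP32 matmuls faster at slightly reduced accuracy. The snippet below is an illustrative emulation of that mantissa truncation in pure Python; it is not Protenix or CUDA code.

```python
import struct

def truncate_to_tf32(x: float) -> float:
    # Emulate TF32 rounding: reinterpret the value as an IEEE-754 float32
    # bit pattern, then zero the low 13 mantissa bits (23 - 10 = 13),
    # leaving the 10-bit mantissa that TF32 retains.
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    bits &= ~((1 << 13) - 1)
    return struct.unpack("<f", struct.pack("<I", bits))[0]

# Powers of two are exact in any mantissa width; values like pi lose a
# small amount of precision, which is the accuracy cost the flag trades
# for matmul speed.
```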
