FlashInfer - Browse /v0.2.7 at SourceForge.net

Name	Modified	Size	Downloads / Week
README.md	2025-06-30	5.0 kB	0
v0.2.7 source code.tar.gz	2025-06-30	1.2 MB	0
v0.2.7 source code.zip	2025-06-30	1.7 MB	0
Totals: 3 Items		2.8 MB	0

What's Changed

ci: Update images for self-hosted ARM64 runner by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/1128
Fix pointer dtype bug in rope by @Edenzzzz in https://github.com/flashinfer-ai/flashinfer/pull/1129
feat: update and test create_ipc_buffer by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1130
misc: update runllm widget by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1132
misc: correct runllm widget (again) by @MasterJH5574 in https://github.com/flashinfer-ai/flashinfer/pull/1133
[Feature] Support PDL for batch Prefill and Decode by @Edenzzzz in https://github.com/flashinfer-ai/flashinfer/pull/1117
fix: negative zero by type trait --> binary value by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1136
fix: sync after create_workspace by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1138
refactor: use functools.cache instead of global dict for caching modules by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1135
[feat] add unified batch attention w/ correctness tests. by @happierpig in https://github.com/flashinfer-ai/flashinfer/pull/1137
Fix FA2 and FA3 multi-item scoring and cuda illegal memory access error by @arde171 in https://github.com/flashinfer-ai/flashinfer/pull/1140
feat: Add support for FLASHINFER_EXTRA_LDFLAGS environment variable by @jennifgcrl in https://github.com/flashinfer-ai/flashinfer/pull/1144
misc: remove sync between persistent runners and use packed_causal_kv_end for SM90Plan by @Edenzzzz in https://github.com/flashinfer-ai/flashinfer/pull/1146
[fix] fix precision errors when applying causal mask on Qwen-2.5 series models by @happierpig in https://github.com/flashinfer-ai/flashinfer/pull/1148
ci: Install mpi4py by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/1149
feat: add trtllm moe_allreduce_fusion by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1108
feat: add trtllm all-reduce fusion by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1131
Add more logging to TRTLLM-GEN debug trace (NFC) by @joker-eph in https://github.com/flashinfer-ai/flashinfer/pull/1158
feat: update non-fused moe by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1161
Add fp4 quantization swizzling tests by @wenscarl in https://github.com/flashinfer-ai/flashinfer/pull/1157
refactor: communication module by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1162
feat: add finalize_moe_allreduce from trtllm by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1159
feat: experimental support of green ctx by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1163
feat: Fused temperature online softmax kernel by @xslingcn in https://github.com/flashinfer-ai/flashinfer/pull/1153
MNNVL MoE All-to-All Support by @cyx-6 in https://github.com/flashinfer-ai/flashinfer/pull/1134
feat: nvshmem python bindings by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1160
Fix missing symbols in trtllm_utils.so by @tiran in https://github.com/flashinfer-ai/flashinfer/pull/1168
feat: logits processor fusion rule for temperature softmax by @xslingcn in https://github.com/flashinfer-ai/flashinfer/pull/1170
Expose fp4 blockscale swizzling kernel by @wenscarl in https://github.com/flashinfer-ai/flashinfer/pull/1176
add nvshmem sum_reduce for mnnvl allreduce by @Amir-19 in https://github.com/flashinfer-ai/flashinfer/pull/1152
bugfix: softmax NaN results caused by large -inf masks by @xslingcn in https://github.com/flashinfer-ai/flashinfer/pull/1178
[CI] Update is_last_build by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/1183
[feat] support block sparse attention w/ variable block sizes and head-wise sparse patterns by @happierpig in https://github.com/flashinfer-ai/flashinfer/pull/1177
bugfix: fix invalid blackwell fmha unittests by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1181
feat: support green ctx creation by a list of SM counts by @Conless in https://github.com/flashinfer-ai/flashinfer/pull/1190
fix: trtllm_comm module aot arch issues by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1196
bugfix: fix broken docs build by adding missing dependencies by @Conless in https://github.com/flashinfer-ai/flashinfer/pull/1197
chore: bump v0.2.7 by @zhyncs in https://github.com/flashinfer-ai/flashinfer/pull/1199

New Contributors

@jennifgcrl made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1144
@tiran made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1168
@Amir-19 made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1152
@Conless made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1190

Full Changelog: https://github.com/flashinfer-ai/flashinfer/compare/v0.2.6.post1...v0.2.7

Source: README.md, updated 2025-06-30

FlashInfer: Kernel Library for LLM Serving