| Name | Modified | Size |
|---|---|---|
| README.md | 2025-06-06 | 10.0 kB |
| v0.2.6 source code.tar.gz | 2025-06-06 | 1.1 MB |
| v0.2.6 source code.zip | 2025-06-06 | 1.5 MB |
## What's Changed
- ci: select 2_28 manylinux builder for new torch+cuda versions by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1000
- misc: update README.md by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1003
- bugfix: Fix illegal memory access due to custom mask ptr by @yongchaoding in https://github.com/flashinfer-ai/flashinfer/pull/1008
- misc: fix kv-layout doc references by @Edenzzzz in https://github.com/flashinfer-ai/flashinfer/pull/1009
- misc: more benchmark scripts in Python by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1010
- misc: fix instrument code for mla profiler by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1014
- bugfix: import wrapper of mla decode by @dhy2000 in https://github.com/flashinfer-ai/flashinfer/pull/1013
- feat: update decode attention APIs by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1007
- doc: use latest protobuf for profiler by @xslingcn in https://github.com/flashinfer-ai/flashinfer/pull/1021
- feat: SM-constraint Communication Kernels by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/994
- feat: ragged tensor padding kernel for blackwell kernel alignment by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1025
- bugfix: fix custom mask not being reset after converting custom mask into causal or non-causal by @yongchaoding in https://github.com/flashinfer-ai/flashinfer/pull/1028
- fix: add zero init for KV tiled copy by @happierpig in https://github.com/flashinfer-ai/flashinfer/pull/1029
- [NVIDIA] Add Cutlass MLA backend by @kaixih in https://github.com/flashinfer-ai/flashinfer/pull/1031
- Add workflow to build aarch64 wheel by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/1036
- Non-blocking host-to-device copy in the ragged prefill wrapper by @nandor in https://github.com/flashinfer-ai/flashinfer/pull/1040
- fix: remove default ubuntu user in Lunar/Noble by @rickyfeng0119 in https://github.com/flashinfer-ai/flashinfer/pull/1042
- feat: Softmax free sampling by @kf-zhang in https://github.com/flashinfer-ai/flashinfer/pull/1035
- feat: add functional per-head FP8 quantization for FA3 by @happierpig in https://github.com/flashinfer-ai/flashinfer/pull/1033
- add multi-item scoring by @arde171 in https://github.com/flashinfer-ai/flashinfer/pull/1015
- [nvidia] cutlass fp8 blockwise/groupwise gemm support by @cyx-6 in https://github.com/flashinfer-ai/flashinfer/pull/1045
- [nvidia] cutlass fp8 groupwise grouped gemm support by @cyx-6 in https://github.com/flashinfer-ai/flashinfer/pull/1047
- fix: top_k_mask_logits hangs on -inf inputs by @xslingcn in https://github.com/flashinfer-ai/flashinfer/pull/1050
- Benchmark: POD vs batched prefill by @Edenzzzz in https://github.com/flashinfer-ai/flashinfer/pull/1052
- [nvidia] initial support for blackwell kernels by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1039
- Fix KV chunking for POD. by @AKKamath in https://github.com/flashinfer-ai/flashinfer/pull/1054
- bugfix: temporarily disable split-kv in blackwell mla by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1055
- bugfix: remove device allocation by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1056
- Parameterize prefix mask call (needed by POD-Attention) by @AKKamath in https://github.com/flashinfer-ai/flashinfer/pull/1059
- bugfix: move `cum_m` calculation inside kernels by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1060
- misc: add pull request template by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1062
- bugfix: Cast build paths to str before setuputils Extension by @farnasirim in https://github.com/flashinfer-ai/flashinfer/pull/1058
- Add PyTorch 2.7.0 build by @huydhn in https://github.com/flashinfer-ai/flashinfer/pull/1063
- bugfix: adding lse output to blackwell fmha kernels by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1071
- bugfix: follow user-specified sm_scale for blackwell cutlass fmha by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1072
- misc: jit: Introduce JitSpec and Generate ninja file by @abcdabcd987 in https://github.com/flashinfer-ai/flashinfer/pull/1065
- fix: fix a typo in docs by @acelyc111 in https://github.com/flashinfer-ai/flashinfer/pull/1077
- misc: jit: Deprecate `load_cuda_ops()` by @abcdabcd987 in https://github.com/flashinfer-ai/flashinfer/pull/1066
- misc: jit: fix missing _get_glibcxx_abi_build_flags by @abcdabcd987 in https://github.com/flashinfer-ai/flashinfer/pull/1080
- misc: jit: Refactor gen JitSpec out of get_xxx_module by @abcdabcd987 in https://github.com/flashinfer-ai/flashinfer/pull/1069
- misc: jit: Replace parallel_load_modules() with build_jit_specs() by @abcdabcd987 in https://github.com/flashinfer-ai/flashinfer/pull/1070
- misc: jit: Import jit_env as a module by @abcdabcd987 in https://github.com/flashinfer-ai/flashinfer/pull/1073
- misc: aot: Add script to build all AOT ops by @abcdabcd987 in https://github.com/flashinfer-ai/flashinfer/pull/1067
- misc: aot: Refactor AOT packaging by @abcdabcd987 in https://github.com/flashinfer-ai/flashinfer/pull/1075
- misc: aot: Remove has_prebuilt_ops by @abcdabcd987 in https://github.com/flashinfer-ai/flashinfer/pull/1076
- ci: upgrade docker ci image by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1082
- bugfix: fix custom allreduce compilation in AOT mode by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1083
- perf: accelerate blackwell grouped gemm by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1086
- misc: update pull request template by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1088
- Fix Cutlass grouped GEMM stride by @cyx-6 in https://github.com/flashinfer-ai/flashinfer/pull/1081
- bugfix: fix fp8 attention kernels aot compilation issue by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1087
- comm: refactor and initialize `flashinfer.comm` module by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1089
- misc: cleanup by @b8zhong in https://github.com/flashinfer-ai/flashinfer/pull/1092
- misc: followup by @b8zhong in https://github.com/flashinfer-ai/flashinfer/pull/1093
- [nvidia] Add Blackwell FMHA decode kernel from TRT-LLM by @joker-eph in https://github.com/flashinfer-ai/flashinfer/pull/1051
- bugfix: fix ninja generation rule for non-cuda input by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1097
- jit: Update TVM JIT binding with the latest FFI refactor by @MasterJH5574 in https://github.com/flashinfer-ai/flashinfer/pull/1100
- SM100 Groupwise GeMM K-Major Scale Supports by @cyx-6 in https://github.com/flashinfer-ai/flashinfer/pull/1102
- misc: aot: Add platform tag to wheel by @abcdabcd987 in https://github.com/flashinfer-ai/flashinfer/pull/1105
- feat: composable logits processor by @xslingcn in https://github.com/flashinfer-ai/flashinfer/pull/1099
- feat: add trtllm all-reduce (non-MoE) by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1096
- bugfix: host-precomputed plan function for blackwell fmha by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1106
- doc: fix LogitsPipe example by @xslingcn in https://github.com/flashinfer-ai/flashinfer/pull/1110
- bugfix: bugfix for blackwell mla split-k by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1109
- Add CUTLASS fused moe kernels from TensorRT-LLM. by @wenscarl in https://github.com/flashinfer-ai/flashinfer/pull/1113
- fix: initialize lamport buffer only once after creating new workspace by @yyihuang in https://github.com/flashinfer-ai/flashinfer/pull/1111
- hotfix: fix the blackwell fmha stream by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1116
- fix head_dim not defined if sm_scale is not None by @majian4work in https://github.com/flashinfer-ai/flashinfer/pull/1119
- doc: add Ask-AI widget by @xslingcn in https://github.com/flashinfer-ai/flashinfer/pull/1121
- bugfix: Fix test and output shape of fp4 quantize by @wenscarl in https://github.com/flashinfer-ai/flashinfer/pull/1114
- misc: update slack link by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1120
- release: bump version to v0.2.6 by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/1122
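
Several of the sampling entries above (e.g. the softmax-free sampling in #1035 and the `top_k_mask_logits` fix in #1050) revolve around drawing tokens from logits without materializing a full softmax. One classic technique in this family is the Gumbel-max trick; the sketch below illustrates the idea in plain Python and is not FlashInfer's actual API (the function name and signature here are hypothetical):

```python
import math
import random

def sample_from_logits(logits, rng=random):
    """Draw an index with probability softmax(logits)[i], softmax-free.

    Gumbel-max trick: adding independent Gumbel(0, 1) noise to each logit
    and taking the argmax is equivalent to sampling from the softmax
    distribution, so no normalization or exponentiation pass is needed.
    """
    # Gumbel(0, 1) sample: -log(-log(U)) for U ~ Uniform(0, 1).
    gumbels = [-math.log(-math.log(rng.random())) for _ in logits]
    return max(range(len(logits)), key=lambda i: logits[i] + gumbels[i])

# Empirical check: frequencies approach softmax([2, 1, 0]) over many draws.
rng = random.Random(0)
logits = [2.0, 1.0, 0.0]
counts = [0, 0, 0]
for _ in range(20000):
    counts[sample_from_logits(logits, rng)] += 1
```

The same argmax-over-perturbed-logits structure is why such kernels compose well with top-k masking: masked positions are simply set to a large negative value before the argmax.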
## New Contributors
- @yongchaoding made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1008
- @Edenzzzz made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1009
- @dhy2000 made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1013
- @kaixih made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1031
- @yongwww made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1036
- @rickyfeng0119 made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1042
- @kf-zhang made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1035
- @arde171 made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1015
- @farnasirim made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1058
- @huydhn made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1063
- @acelyc111 made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1077
- @b8zhong made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1092
- @joker-eph made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1051
- @wenscarl made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1113
- @majian4work made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/1119
**Full Changelog**: https://github.com/flashinfer-ai/flashinfer/compare/v0.2.5...v0.2.6