| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| flashinfer_jit_cache-0.6.4+cu130-cp39-abi3-manylinux_2_28_aarch64.whl | 2026-02-19 | 1.8 GB | |
| flashinfer_jit_cache-0.6.4+cu130-cp39-abi3-manylinux_2_28_x86_64.whl | 2026-02-19 | 1.8 GB | |
| flashinfer_jit_cache-0.6.4+cu129-cp39-abi3-manylinux_2_28_aarch64.whl | 2026-02-19 | 1.3 GB | |
| flashinfer_jit_cache-0.6.4+cu129-cp39-abi3-manylinux_2_28_x86_64.whl | 2026-02-19 | 1.3 GB | |
| flashinfer_jit_cache-0.6.4+cu128-cp39-abi3-manylinux_2_28_aarch64.whl | 2026-02-19 | 1.2 GB | |
| flashinfer_jit_cache-0.6.4+cu128-cp39-abi3-manylinux_2_28_x86_64.whl | 2026-02-19 | 1.2 GB | |
| flashinfer_cubin-0.6.4-py3-none-any.whl | 2026-02-19 | 245.7 MB | |
| flashinfer_python-0.6.4-py3-none-any.whl | 2026-02-19 | 7.8 MB | |
| flashinfer_python-0.6.4.tar.gz | 2026-02-19 | 5.3 MB | |
| README.md | 2026-02-19 | 5.6 kB | |
| Release v0.6.4 source code.tar.gz | 2026-02-19 | 2.8 MB | |
| Release v0.6.4 source code.zip | 2026-02-19 | 3.8 MB | |
| Totals: 12 Items | | 8.9 GB | 0 |
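
The wheel names in the table follow the standard wheel filename layout (`distribution-version-python_tag-abi_tag-platform_tag.whl`). The sketch below splits such a name into its fields, which can help when picking the right file for a machine. It assumes that the `+cu130` / `+cu129` / `+cu128` local version segment identifies the CUDA toolkit the `flashinfer_jit_cache` wheel targets; the listing itself does not state this. `WheelInfo` and `parse_wheel_filename` are hypothetical helper names for illustration, not part of FlashInfer.

```python
from typing import NamedTuple, Optional


class WheelInfo(NamedTuple):
    distribution: str
    version: str                  # public version, e.g. "0.6.4"
    local_version: Optional[str]  # e.g. "cu130"; None when absent
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelInfo:
    """Split a wheel filename into its tags (hypothetical helper).

    Handles only the five-field form used in the table above;
    wheels with an optional build tag (six fields) are rejected.
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) != 5:
        raise ValueError(f"unexpected number of fields in {filename!r}")
    distribution, full_version, python_tag, abi_tag, platform_tag = parts
    # A "+" separates the public version from the local version segment.
    public, _, local = full_version.partition("+")
    return WheelInfo(distribution, public, local or None,
                     python_tag, abi_tag, platform_tag)


info = parse_wheel_filename(
    "flashinfer_jit_cache-0.6.4+cu130-cp39-abi3-manylinux_2_28_aarch64.whl"
)
print(info.local_version, info.platform_tag)
```

Under the same assumption, filtering the listing by `local_version` and `platform_tag` would narrow the six `flashinfer_jit_cache` wheels down to the one matching a given CUDA toolkit and CPU architecture.
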
What's Changed
- perf: add fp4 GEMM tile configs and streamK scheduler for SM120 by @Yuening-wa in https://github.com/flashinfer-ai/flashinfer/pull/2460
- refactor: Port upstream CUTLASS fixes and refactor grouped_gemm_nt_masked GEMM module location by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2503
- feat: cuteDSL fp4 moe for better DSR1 performance. by @nv-yunzheq in https://github.com/flashinfer-ai/flashinfer/pull/2398
- ci: refactor PR tests to hide failed spot jobs from PR status by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2500
- Enable setting user in CI containers by @dierksen in https://github.com/flashinfer-ai/flashinfer/pull/2515
- perf: cache cudaGetDeviceProperties in gdn_prefill to avoid per-call overhead by @xutizhou in https://github.com/flashinfer-ai/flashinfer/pull/2509
- Revert "ci: refactor PR tests to hide failed spot jobs from PR status… by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2524
- feat: Add TRTLLM-Gen Skip-Softmax kernels for prefill and decode by @DomBrown in https://github.com/flashinfer-ai/flashinfer/pull/2477
- add saltyminty (me) to authorized_codeowners, fix alphabetical ordering by @saltyminty in https://github.com/flashinfer-ai/flashinfer/pull/2537
- chore: update benchmark scripts; fix trtllm-gen moe comments by @IwakuraRein in https://github.com/flashinfer-ai/flashinfer/pull/2412
- Add sm90 guard to fence.acquire by @jhalabi-nv in https://github.com/flashinfer-ai/flashinfer/pull/2535
- feat: Add MXFP8 GEMM mm_mxfp8 (cutlass) by @danisereb in https://github.com/flashinfer-ai/flashinfer/pull/2464
- fallback to fa2 (instead of fa3) for unsupported configuration (bf16 Q, Fp8 KV) by @saltyminty in https://github.com/flashinfer-ai/flashinfer/pull/2536
- misc: point triton blackwell-ptxas to local cuda ptxas by @jimmyzho in https://github.com/flashinfer-ai/flashinfer/pull/2543
- tests: bmm_fp8 for SM110 by @jimmyzho in https://github.com/flashinfer-ai/flashinfer/pull/2538
- Add parallel testing to unit test script by @dierksen in https://github.com/flashinfer-ai/flashinfer/pull/2531
- Add gen_gemm_sm100_module_cutlass_mxfp8 to jit-cache by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2549
- fix: Sampling: CUDA Graph fix by @IzzyPutterman in https://github.com/flashinfer-ai/flashinfer/pull/2432
- fix: include fp8_blockscale_gemm_90 in AOT jit-cache by @Edward-lyz in https://github.com/flashinfer-ai/flashinfer/pull/2533
- bugfix: fix the enum/int type mismatch mentioned in [#2507] by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/2508
- Add test case for Qwen3N by @samuellees in https://github.com/flashinfer-ai/flashinfer/pull/2532
- Chore: Cute dsl moe update (TMA.RED implementation) by @nv-yunzheq in https://github.com/flashinfer-ai/flashinfer/pull/2529
- benchmarks: Add microbenchmark support for Mamba selective_state_update by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2512
- Update Docker CI tags to 20260209-a2d3b39 by @flashinfer-bot in https://github.com/flashinfer-ai/flashinfer/pull/2528
- Ameyn/gdn decode cutedsl kernel by @ameynaik-hub in https://github.com/flashinfer-ai/flashinfer/pull/2498
- [Bugfix][comm] Fix FP4 one-shot launch config instability in trtllm_allreduce_fusion by @baonudesifeizhai in https://github.com/flashinfer-ai/flashinfer/pull/2557
- pick fa2 for BatchDecodeWithPagedKVCacheWrapper auto backend by @saltyminty in https://github.com/flashinfer-ai/flashinfer/pull/2530
- Feat: Trtllm-gen MxFP8 MoE integration by @IwakuraRein in https://github.com/flashinfer-ai/flashinfer/pull/2505
- [Bug] Fix spark unit test failures for test_add_rmsnorm_fp4_quant_cute_dsl by @kahyunnam in https://github.com/flashinfer-ai/flashinfer/pull/2573
- fix: W4A8 autotune crash in cutlass_fused_moe profiler workspace by @ipnon in https://github.com/flashinfer-ai/flashinfer/pull/2564
- Add Hopper to CI by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2552
- fix: allow fmha_v2_prefill_deepseek on SM121 (DGX Spark) by @blake-snc in https://github.com/flashinfer-ai/flashinfer/pull/2559
- feat: Enable TRTLLM-Gen Skip-Softmax attention for MLA by @DomBrown in https://github.com/flashinfer-ai/flashinfer/pull/2547
- docs: Add note on feature support for compute capabilities by @sricketts in https://github.com/flashinfer-ai/flashinfer/pull/2578
- bump version to 0.6.4 by @aleozlx in https://github.com/flashinfer-ai/flashinfer/pull/2565
New Contributors
- @Yuening-wa made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2460
- @xutizhou made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2509
- @DomBrown made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2477
- @saltyminty made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2537
- @IzzyPutterman made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2432
- @Edward-lyz made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2533
- @ameynaik-hub made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2498
- @baonudesifeizhai made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2557
- @ipnon made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2564
- @blake-snc made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2559
Full Changelog: https://github.com/flashinfer-ai/flashinfer/compare/v0.6.3...v0.6.4