FlashInfer - Browse /v0.6.4 at SourceForge.net


Name	Modified	Size	Downloads / Week
flashinfer_jit_cache-0.6.4+cu130-cp39-abi3-manylinux_2_28_aarch64.whl	2026-02-19	1.8 GB	0
flashinfer_jit_cache-0.6.4+cu130-cp39-abi3-manylinux_2_28_x86_64.whl	2026-02-19	1.8 GB	0
flashinfer_jit_cache-0.6.4+cu129-cp39-abi3-manylinux_2_28_aarch64.whl	2026-02-19	1.3 GB	0
flashinfer_jit_cache-0.6.4+cu129-cp39-abi3-manylinux_2_28_x86_64.whl	2026-02-19	1.3 GB	0
flashinfer_jit_cache-0.6.4+cu128-cp39-abi3-manylinux_2_28_aarch64.whl	2026-02-19	1.2 GB	0
flashinfer_jit_cache-0.6.4+cu128-cp39-abi3-manylinux_2_28_x86_64.whl	2026-02-19	1.2 GB	0
flashinfer_cubin-0.6.4-py3-none-any.whl	2026-02-19	245.7 MB	0
flashinfer_python-0.6.4-py3-none-any.whl	2026-02-19	7.8 MB	0
flashinfer_python-0.6.4.tar.gz	2026-02-19	5.3 MB	0
README.md	2026-02-19	5.6 kB	0
Release v0.6.4 source code.tar.gz	2026-02-19	2.8 MB	0
Release v0.6.4 source code.zip	2026-02-19	3.8 MB	0
Totals: 12 Items		8.9 GB	0
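
The wheel names in the listing above follow the standard wheel filename layout (distribution, version, Python tag, ABI tag, platform tag). As a rough aid for picking the right file, here is a minimal sketch that splits a filename into those fields. The helper name `parse_wheel_name` and the `WheelTags` type are hypothetical, introduced only for illustration. Reading a local version suffix such as `cu130` as a CUDA toolkit version is an assumption based on common packaging convention, not something this page states.

```python
from typing import NamedTuple, Optional


class WheelTags(NamedTuple):
    """Fields of a wheel filename (hypothetical helper type)."""
    distribution: str
    version: str
    local: Optional[str]   # local version suffix after "+", e.g. "cu130"
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelTags:
    """Split a wheel filename into its tags.

    Handles the common 5-field form and the 6-field form that
    carries an optional build tag (which is discarded here).
    """
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel filename: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # Drop the optional build tag that sits after the version.
        parts = parts[:2] + parts[3:]
    if len(parts) != 5:
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    dist, full_version, python_tag, abi_tag, platform_tag = parts
    # A "+" separates the public version from the local version label.
    version, _, local = full_version.partition("+")
    return WheelTags(dist, version, local or None,
                     python_tag, abi_tag, platform_tag)


if __name__ == "__main__":
    name = "flashinfer_jit_cache-0.6.4+cu130-cp39-abi3-manylinux_2_28_aarch64.whl"
    print(parse_wheel_name(name))
```

Under that reading, the `flashinfer_jit_cache` wheels differ by local version label (`cu128`, `cu129`, `cu130`) and by platform tag (`x86_64` or `aarch64`), while `flashinfer_cubin` and `flashinfer_python` are tagged `py3-none-any` and carry no local label.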

What's Changed

perf: add fp4 GEMM tile configs and streamK scheduler for SM120 by @Yuening-wa in https://github.com/flashinfer-ai/flashinfer/pull/2460
refactor: Port upstream CUTLASS fixes and refactor grouped_gemm_nt_masked GEMM module location by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2503
feat: cuteDSL fp4 moe for better DSR1 performance. by @nv-yunzheq in https://github.com/flashinfer-ai/flashinfer/pull/2398
ci: refactor PR tests to hide failed spot jobs from PR status by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2500
Enable setting user in CI containers by @dierksen in https://github.com/flashinfer-ai/flashinfer/pull/2515
perf: cache cudaGetDeviceProperties in gdn_prefill to avoid per-call overhead by @xutizhou in https://github.com/flashinfer-ai/flashinfer/pull/2509
Revert "ci: refactor PR tests to hide failed spot jobs from PR status… by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2524
feat: Add TRTLLM-Gen Skip-Softmax kernels for prefill and decode by @DomBrown in https://github.com/flashinfer-ai/flashinfer/pull/2477
add saltyminty (me) to authorized_codeowners, fix alphabetical ordering by @saltyminty in https://github.com/flashinfer-ai/flashinfer/pull/2537
chore: update benchmark scripts; fix trtllm-gen moe comments by @IwakuraRein in https://github.com/flashinfer-ai/flashinfer/pull/2412
Add sm90 guard to fence.acquire by @jhalabi-nv in https://github.com/flashinfer-ai/flashinfer/pull/2535
feat: Add MXFP8 GEMM mm_mxfp8 (cutlass) by @danisereb in https://github.com/flashinfer-ai/flashinfer/pull/2464
fallback to fa2 (instead of fa3) for unsupported configuration (bf16 Q, Fp8 KV) by @saltyminty in https://github.com/flashinfer-ai/flashinfer/pull/2536
misc: point triton blackwell-ptxas to local cuda ptxas by @jimmyzho in https://github.com/flashinfer-ai/flashinfer/pull/2543
tests: bmm_fp8 for SM110 by @jimmyzho in https://github.com/flashinfer-ai/flashinfer/pull/2538
Add parallel testing to unit test script by @dierksen in https://github.com/flashinfer-ai/flashinfer/pull/2531
Add gen_gemm_sm100_module_cutlass_mxfp8 to jit-cache by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2549
fix: Sampling: CUDA Graph fix by @IzzyPutterman in https://github.com/flashinfer-ai/flashinfer/pull/2432
fix: include fp8_blockscale_gemm_90 in AOT jit-cache by @Edward-lyz in https://github.com/flashinfer-ai/flashinfer/pull/2533
bugfix: fix the enum/int type mismatch mentioned in [#2507] by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/2508
Add test case for Qwen3N by @samuellees in https://github.com/flashinfer-ai/flashinfer/pull/2532
Chore: Cute dsl moe update (TMA.RED implementation) by @nv-yunzheq in https://github.com/flashinfer-ai/flashinfer/pull/2529
benchmarks: Add microbenchmark support for Mamba selective_state_update by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2512
Update Docker CI tags to 20260209-a2d3b39 by @flashinfer-bot in https://github.com/flashinfer-ai/flashinfer/pull/2528
Ameyn/gdn decode cutedsl kernel by @ameynaik-hub in https://github.com/flashinfer-ai/flashinfer/pull/2498
[Bugfix][comm] Fix FP4 one-shot launch config instability in trtllm_allreduce_fusion by @baonudesifeizhai in https://github.com/flashinfer-ai/flashinfer/pull/2557
pick fa2 for BatchDecodeWithPagedKVCacheWrapper auto backend by @saltyminty in https://github.com/flashinfer-ai/flashinfer/pull/2530
Feat: Trtllm-gen MxFP8 MoE integration by @IwakuraRein in https://github.com/flashinfer-ai/flashinfer/pull/2505
[Bug] Fix spark unit test failures for test_add_rmsnorm_fp4_quant_cute_dsl by @kahyunnam in https://github.com/flashinfer-ai/flashinfer/pull/2573
fix: W4A8 autotune crash in cutlass_fused_moe profiler workspace by @ipnon in https://github.com/flashinfer-ai/flashinfer/pull/2564
Add Hopper to CI by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2552
fix: allow fmha_v2_prefill_deepseek on SM121 (DGX Spark) by @blake-snc in https://github.com/flashinfer-ai/flashinfer/pull/2559
feat: Enable TRTLLM-Gen Skip-Softmax attention for MLA by @DomBrown in https://github.com/flashinfer-ai/flashinfer/pull/2547
docs: Add note on feature support for compute capabilities by @sricketts in https://github.com/flashinfer-ai/flashinfer/pull/2578
bump version to 0.6.4 by @aleozlx in https://github.com/flashinfer-ai/flashinfer/pull/2565

New Contributors

@Yuening-wa made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2460
@xutizhou made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2509
@DomBrown made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2477
@saltyminty made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2537
@IzzyPutterman made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2432
@Edward-lyz made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2533
@ameynaik-hub made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2498
@baonudesifeizhai made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2557
@ipnon made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2564
@blake-snc made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2559

Full Changelog: https://github.com/flashinfer-ai/flashinfer/compare/v0.6.3...v0.6.4

Source: README.md, updated 2026-02-19

FlashInfer: Kernel Library for LLM Serving
