FlashInfer - Browse /v0.6.3 at SourceForge.net

Name	Modified	Size	Downloads / Week
flashinfer_jit_cache-0.6.3+cu130-cp39-abi3-manylinux_2_28_aarch64.whl	2026-02-06	1.8 GB	0
flashinfer_jit_cache-0.6.3+cu130-cp39-abi3-manylinux_2_28_x86_64.whl	2026-02-06	1.8 GB	0
flashinfer_jit_cache-0.6.3+cu129-cp39-abi3-manylinux_2_28_aarch64.whl	2026-02-06	1.3 GB	0
flashinfer_jit_cache-0.6.3+cu129-cp39-abi3-manylinux_2_28_x86_64.whl	2026-02-06	1.3 GB	0
flashinfer_jit_cache-0.6.3+cu128-cp39-abi3-manylinux_2_28_aarch64.whl	2026-02-06	1.2 GB	0
flashinfer_jit_cache-0.6.3+cu128-cp39-abi3-manylinux_2_28_x86_64.whl	2026-02-06	1.2 GB	0
flashinfer_cubin-0.6.3-py3-none-any.whl	2026-02-06	150.7 MB	1
flashinfer_python-0.6.3-py3-none-any.whl	2026-02-06	7.6 MB	0
flashinfer_python-0.6.3.tar.gz	2026-02-06	5.2 MB	0
README.md	2026-02-05	6.0 kB	0
Release v0.6.3 source code.tar.gz	2026-02-05	2.7 MB	0
Release v0.6.3 source code.zip	2026-02-05	3.6 MB	0
Totals: 12 Items		8.8 GB	1
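
The wheel names above follow the standard wheel filename layout (distribution, version, optional build tag, Python tag, ABI tag, platform tag). As a minimal sketch of how one might pick the matching `flashinfer_jit_cache` file from this listing, the snippet below splits a filename into those fields using only the standard library. The helper names `parse_wheel_name` and `select_wheels` are hypothetical, and reading the `+cu130` local version label as "built for CUDA 13.0" is an assumption based on common naming practice, not something this page states.

```python
from typing import List, NamedTuple, Optional


class WheelInfo(NamedTuple):
    distribution: str
    version: str            # public version, e.g. "0.6.3"
    local: Optional[str]    # local version label, e.g. "cu130" (assumed CUDA tag)
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelInfo:
    """Split a wheel filename into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # Layout with the optional build tag present; the build tag is ignored.
        dist, version, _build, py, abi, plat = parts
    elif len(parts) == 5:
        dist, version, py, abi, plat = parts
    else:
        raise ValueError(f"unexpected wheel filename layout: {filename}")
    public, _, local = version.partition("+")
    return WheelInfo(dist, public, local or None, py, abi, plat)


def select_wheels(filenames: List[str], local: str, arch: str) -> List[str]:
    """Return the filenames whose local label and CPU architecture match."""
    selected = []
    for name in filenames:
        info = parse_wheel_name(name)
        if info.local == local and info.platform_tag.endswith(arch):
            selected.append(name)
    return selected


# Filenames copied from the listing above.
FILES = [
    "flashinfer_jit_cache-0.6.3+cu130-cp39-abi3-manylinux_2_28_aarch64.whl",
    "flashinfer_jit_cache-0.6.3+cu130-cp39-abi3-manylinux_2_28_x86_64.whl",
    "flashinfer_jit_cache-0.6.3+cu129-cp39-abi3-manylinux_2_28_aarch64.whl",
    "flashinfer_jit_cache-0.6.3+cu129-cp39-abi3-manylinux_2_28_x86_64.whl",
    "flashinfer_jit_cache-0.6.3+cu128-cp39-abi3-manylinux_2_28_aarch64.whl",
    "flashinfer_jit_cache-0.6.3+cu128-cp39-abi3-manylinux_2_28_x86_64.whl",
    "flashinfer_cubin-0.6.3-py3-none-any.whl",
    "flashinfer_python-0.6.3-py3-none-any.whl",
]

if __name__ == "__main__":
    for match in select_wheels(FILES, local="cu129", arch="x86_64"):
        print(match)
```

The `py3-none-any` wheels carry no local label and no architecture suffix, so the selector skips them; they are platform-independent and would be installed regardless of the CUDA build chosen.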

What's Changed

ci: add permission control for public ci tests by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2397
Remove cudaMalloc/Free in GDN prefill kernel by @KevinZeng08 in https://github.com/flashinfer-ai/flashinfer/pull/2415
Update cudnn prefill to use correct sequence strides by @vedaanta in https://github.com/flashinfer-ai/flashinfer/pull/2414
perf: mm_fp4 heuristic prioritizes CUTLASS over cuDNN on SM103 by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2404
test: add coverage for all cli commands by @sricketts in https://github.com/flashinfer-ai/flashinfer/pull/1848
feat: BF16 GEMM using cuDNN backend by @raayandhar in https://github.com/flashinfer-ai/flashinfer/pull/2376
refactor: simplify fp4 rmsnorm by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/2421
feat: update trtllm-gen MoE cubins by @nekorobov in https://github.com/flashinfer-ai/flashinfer/pull/2416
chore/feat: A2A + MoE benchmark; add routed counterpart for trtllm_gen_fp8_fused_moe by @rosenrodt in https://github.com/flashinfer-ai/flashinfer/pull/2379
[CI] Add on-demand rerun for spot-terminated jobs by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2403
fix: Fix NaN output in mxfp8_quantize for very small input values by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2441
feat: Support Fused MoE non gated Relu2 NVFP4 & FP8 and support Nemotron by @amitz-nv in https://github.com/flashinfer-ai/flashinfer/pull/2304
infra: add manual code owner override support in codeowner_analyzer.py by @sricketts in https://github.com/flashinfer-ai/flashinfer/pull/2418
fix: improve numerical stability of Gumbel sampling by @ixlmar in https://github.com/flashinfer-ai/flashinfer/pull/2438
ci: CI build workflow should always pull fresh and do not cache by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2454
Update Docker CI tags to 20260131-a52eff1 by @flashinfer-bot in https://github.com/flashinfer-ai/flashinfer/pull/2457
Revert "feat: Support Fused MoE non gated Relu2 NVFP4 & FP8 and support Nemotron" by @nv-yunzheq in https://github.com/flashinfer-ai/flashinfer/pull/2451
Skip trtllm_alltoall tests on Thor by @dierksen in https://github.com/flashinfer-ai/flashinfer/pull/2448
Fix argument type error in _cudnn_gemm_fp4_requirement by @Kangyan-Zhou in https://github.com/flashinfer-ai/flashinfer/pull/2450
fix: set_log_level now properly sets logger level to enable DEBUG logs by @kahyunnam in https://github.com/flashinfer-ai/flashinfer/pull/2449
bugfix: fix stub generation directory in fused_moe module by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/2445
[Perf][Feature] Add SM103-specific schedulers for NVFP4 CUTLASS kernels by @LopezCastroRoberto in https://github.com/flashinfer-ai/flashinfer/pull/2303
ci: set LD_LIBRARY_PATH in Docker images for correct cuBLAS detection by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2468
add sgl_kernel.fast_topk_v2 to top_k benchmark by @huangzhilin-hzl in https://github.com/flashinfer-ai/flashinfer/pull/2461
Update Docker CI tags to 20260203-9b5901e by @flashinfer-bot in https://github.com/flashinfer-ai/flashinfer/pull/2475
MTP for mamba by @ishovkun in https://github.com/flashinfer-ai/flashinfer/pull/2444
Add sm90 guard to fence ptx by @jhalabi-nv in https://github.com/flashinfer-ai/flashinfer/pull/2439
perf: improve gdn decode cute-dsl kernels by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/2405
ci: migrate release workflows to ci-infra runners by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2467
fix: blockscale moe routine supports non-DS routing by @hypdeb in https://github.com/flashinfer-ai/flashinfer/pull/2476
Fix autotuner oom by @zack041 in https://github.com/flashinfer-ai/flashinfer/pull/2442
refactor: reduce hopper's gdn prefill compilation time and fix docstring. by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/2422
fix: Fix memory bandwidth calculation in MLA benchmarks by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2479
fix: Rename tests/mamba/test_utils.py to tests/mamba/utils.py to fix CI test discovery by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2481
Add/update multi node/multi GPU test scripts by @dierksen in https://github.com/flashinfer-ai/flashinfer/pull/2410
feat: Support Fused MoE non gated Relu2 NVFP4 & FP8 and support Nemotron, fixed by @amitz-nv in https://github.com/flashinfer-ai/flashinfer/pull/2462
ci: fix permission errors in release workflow on ci-infra runner by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2488
benchmarks: Expand microbenchmark harness to include sampling and RoPe APIs by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2484
fix: add support check for gemm config for cutlass moe by @nv-yunzheq in https://github.com/flashinfer-ai/flashinfer/pull/2495
Allow non-DeepSeekV3 routing with one group by @dbari in https://github.com/flashinfer-ai/flashinfer/pull/2502
bump version to 0.6.3 by @aleozlx in https://github.com/flashinfer-ai/flashinfer/pull/2497

New Contributors

@KevinZeng08 made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2415
@vedaanta made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2414
@ixlmar made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2438
@Kangyan-Zhou made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2450
@LopezCastroRoberto made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2303
@huangzhilin-hzl made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2461
@zack041 made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2442

Full Changelog: https://github.com/flashinfer-ai/flashinfer/compare/v0.6.2...v0.6.3

Source: README.md, updated 2026-02-05

FlashInfer: Kernel Library for LLM Serving