| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| flashinfer_jit_cache-0.6.3+cu130-cp39-abi3-manylinux_2_28_aarch64.whl | 2026-02-06 | 1.8 GB | |
| flashinfer_jit_cache-0.6.3+cu130-cp39-abi3-manylinux_2_28_x86_64.whl | 2026-02-06 | 1.8 GB | |
| flashinfer_jit_cache-0.6.3+cu129-cp39-abi3-manylinux_2_28_aarch64.whl | 2026-02-06 | 1.3 GB | |
| flashinfer_jit_cache-0.6.3+cu129-cp39-abi3-manylinux_2_28_x86_64.whl | 2026-02-06 | 1.3 GB | |
| flashinfer_jit_cache-0.6.3+cu128-cp39-abi3-manylinux_2_28_aarch64.whl | 2026-02-06 | 1.2 GB | |
| flashinfer_jit_cache-0.6.3+cu128-cp39-abi3-manylinux_2_28_x86_64.whl | 2026-02-06 | 1.2 GB | |
| flashinfer_cubin-0.6.3-py3-none-any.whl | 2026-02-06 | 150.7 MB | |
| flashinfer_python-0.6.3-py3-none-any.whl | 2026-02-06 | 7.6 MB | |
| flashinfer_python-0.6.3.tar.gz | 2026-02-06 | 5.2 MB | |
| README.md | 2026-02-05 | 6.0 kB | |
| Release v0.6.3 source code.tar.gz | 2026-02-05 | 2.7 MB | |
| Release v0.6.3 source code.zip | 2026-02-05 | 3.6 MB | |
| Totals: 12 Items | | 8.8 GB | 1 |
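
The wheel names in the table follow the standard wheel naming layout (`distribution-version-python tag-abi tag-platform tag.whl`), with the `+cu128` / `+cu129` / `+cu130` suffix appearing as a local version label, presumably denoting the CUDA toolkit each `flashinfer_jit_cache` build targets. As a minimal sketch for picking the right file, the snippet below splits a filename into those fields. The names `WheelInfo` and `parse_wheel_name` are hypothetical helpers introduced here for illustration and are not part of FlashInfer.

```python
from typing import NamedTuple, Optional


class WheelInfo(NamedTuple):
    """Fields of a wheel filename (hypothetical helper, not a FlashInfer API)."""
    distribution: str
    version: str
    local_tag: Optional[str]  # e.g. "cu130"; None when there is no "+local" suffix
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_name(filename: str) -> WheelInfo:
    """Split a wheel filename into its standard components."""
    if not filename.endswith(".whl"):
        raise ValueError(f"not a wheel file: {filename!r}")
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 6:
        # An optional build tag sits between version and python tag; ignore it.
        dist, version, _build, py_tag, abi_tag, plat_tag = parts
    elif len(parts) == 5:
        dist, version, py_tag, abi_tag, plat_tag = parts
    else:
        raise ValueError(f"unexpected wheel filename layout: {filename!r}")
    # The part after "+" is the local version label (assumed here to name the CUDA build).
    public, _, local = version.partition("+")
    return WheelInfo(dist, public, local or None, py_tag, abi_tag, plat_tag)


if __name__ == "__main__":
    info = parse_wheel_name(
        "flashinfer_jit_cache-0.6.3+cu130-cp39-abi3-manylinux_2_28_aarch64.whl"
    )
    print(info.local_tag, info.platform_tag)
```

In this listing only the `flashinfer_jit_cache` wheels carry a local label and a platform-specific tag; `flashinfer_cubin` and `flashinfer_python` are `py3-none-any` wheels, so the same helper returns `None` for their `local_tag`.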
What's Changed
- ci: add permission control for public ci tests by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2397
- Remove cudaMalloc/Free in GDN prefill kernel by @KevinZeng08 in https://github.com/flashinfer-ai/flashinfer/pull/2415
- Update cudnn prefill to use correct sequence strides by @vedaanta in https://github.com/flashinfer-ai/flashinfer/pull/2414
- perf: mm_fp4 heuristic prioritizes CUTLASS over cuDNN on SM103 by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2404
- test: add coverage for all cli commands by @sricketts in https://github.com/flashinfer-ai/flashinfer/pull/1848
- feat: BF16 GEMM using cuDNN backend by @raayandhar in https://github.com/flashinfer-ai/flashinfer/pull/2376
- refactor: simplify fp4 rmsnorm by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/2421
- feat: update trtllm-gen MoE cubins by @nekorobov in https://github.com/flashinfer-ai/flashinfer/pull/2416
- chore/feat: A2A + MoE benchmark; add routed counterpart for trtllm_gen_fp8_fused_moe by @rosenrodt in https://github.com/flashinfer-ai/flashinfer/pull/2379
- [CI] Add on-demand rerun for spot-terminated jobs by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2403
- fix: Fix NaN output in mxfp8_quantize for very small input values by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2441
- feat: Support Fused MoE non gated Relu2 NVFP4 & FP8 and support Nemotron by @amitz-nv in https://github.com/flashinfer-ai/flashinfer/pull/2304
- infra: add manual code owner override support in codeowner_analyzer.py by @sricketts in https://github.com/flashinfer-ai/flashinfer/pull/2418
- fix: improve numerical stability of Gumbel sampling by @ixlmar in https://github.com/flashinfer-ai/flashinfer/pull/2438
- ci: CI build workflow should always pull fresh and do not cache by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2454
- Update Docker CI tags to 20260131-a52eff1 by @flashinfer-bot in https://github.com/flashinfer-ai/flashinfer/pull/2457
- Revert "feat: Support Fused MoE non gated Relu2 NVFP4 & FP8 and support Nemotron" by @nv-yunzheq in https://github.com/flashinfer-ai/flashinfer/pull/2451
- Skip trtllm_alltoall tests on Thor by @dierksen in https://github.com/flashinfer-ai/flashinfer/pull/2448
- Fix argument type error in _cudnn_gemm_fp4_requirement by @Kangyan-Zhou in https://github.com/flashinfer-ai/flashinfer/pull/2450
- fix: set_log_level now properly sets logger level to enable DEBUG logs by @kahyunnam in https://github.com/flashinfer-ai/flashinfer/pull/2449
- bugfix: fix stub generation directory in fused_moe module by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/2445
- [Perf][Feature] Add SM103-specific schedulers for NVFP4 CUTLASS kernels by @LopezCastroRoberto in https://github.com/flashinfer-ai/flashinfer/pull/2303
- ci: set LD_LIBRARY_PATH in Docker images for correct cuBLAS detection by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2468
- add sgl_kernel.fast_topk_v2 to top_k benchmark by @huangzhilin-hzl in https://github.com/flashinfer-ai/flashinfer/pull/2461
- Update Docker CI tags to 20260203-9b5901e by @flashinfer-bot in https://github.com/flashinfer-ai/flashinfer/pull/2475
- MTP for mamba by @ishovkun in https://github.com/flashinfer-ai/flashinfer/pull/2444
- Add sm90 guard to fence ptx by @jhalabi-nv in https://github.com/flashinfer-ai/flashinfer/pull/2439
- perf: improve gdn decode cute-dsl kernels by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/2405
- ci: migrate release workflows to ci-infra runners by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2467
- fix: blockscale moe routine supports non-DS routing by @hypdeb in https://github.com/flashinfer-ai/flashinfer/pull/2476
- Fix autotuner oom by @zack041 in https://github.com/flashinfer-ai/flashinfer/pull/2442
- refactor: reduce hopper's gdn prefill compilation time and fix docstring. by @yzh119 in https://github.com/flashinfer-ai/flashinfer/pull/2422
- fix: Fix memory bandwidth calculation in MLA benchmarks by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2479
- fix: Rename tests/mamba/test_utils.py to tests/mamba/utils.py to fix CI test discovery by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2481
- Add/update multi node/multi GPU test scripts by @dierksen in https://github.com/flashinfer-ai/flashinfer/pull/2410
- feat: Support Fused MoE non gated Relu2 NVFP4 & FP8 and support Nemotron, fixed by @amitz-nv in https://github.com/flashinfer-ai/flashinfer/pull/2462
- ci: fix permission errors in release workflow on ci-infra runner by @yongwww in https://github.com/flashinfer-ai/flashinfer/pull/2488
- benchmarks: Expand microbenchmark harness to include sampling and RoPE APIs by @bkryu in https://github.com/flashinfer-ai/flashinfer/pull/2484
- fix: add support check for gemm config for cutlass moe by @nv-yunzheq in https://github.com/flashinfer-ai/flashinfer/pull/2495
- Allow non-DeepSeekV3 routing with one group by @dbari in https://github.com/flashinfer-ai/flashinfer/pull/2502
- bump version to 0.6.3 by @aleozlx in https://github.com/flashinfer-ai/flashinfer/pull/2497
New Contributors
- @KevinZeng08 made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2415
- @vedaanta made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2414
- @ixlmar made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2438
- @Kangyan-Zhou made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2450
- @LopezCastroRoberto made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2303
- @huangzhilin-hzl made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2461
- @zack041 made their first contribution in https://github.com/flashinfer-ai/flashinfer/pull/2442
Full Changelog: https://github.com/flashinfer-ai/flashinfer/compare/v0.6.2...v0.6.3