Apache TVM v0.21.0

Introduction

Since the last release, the TVM community has worked to deliver the following improvements.

The main areas of progress are Relax (especially the PyTorch frontend) and FFI.

For a complete view, please see the full listing of commits: v0.21.dev0...v0.21.0.rc0.

Community

None.

RFCs

None.

Arith

  • #18067 - Add IsBound method to ConstIntBoundAnalyzer
  • #18031 - Canonicalize mul-coefficient to rhs
  • #18025 - Fix canonical simplify for LE with incorrect range assumptions
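
For context, `ConstIntBoundAnalyzer`-style reasoning propagates integer intervals through expressions. A minimal sketch of the idea behind bounding a product (toy code, not TVM's actual analyzer):

```python
def mul_bound(a_min, a_max, b_min, b_max):
    """Bound of a*b given const-int bounds of a and b: the extreme
    products always occur at the interval endpoints."""
    products = [a_min * b_min, a_min * b_max, a_max * b_min, a_max * b_max]
    return min(products), max(products)

# x in [-3, 2], y in [4, 5]  ->  x*y in [-15, 10]
print(mul_bound(-3, 2, 4, 5))  # (-15, 10)
```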

BugFix

  • #18115 - [Fix][Serialization] Add support for NaN value serialization
  • #18103 - [Fix] Replace dmlc::Error with std::exception in VerifyGPUCode
  • #18092 - [Fix] Fix ExecBuilderDeclareFunction method name in exec_builder.py
  • #18087 - Fix exception when TVM is not built with LLVM support
  • #18035 - [CUDA] Fix: Increase FloatImm precision when printing 64-bit values in CUDA codegen
  • #17968 - [Relax][Pytorch] Bugfix of conv_transpose1d and conv_transpose2d
  • #17950 - [Fix][Relax] Fix dangling reference in GetTargetFunctions()
  • #17902 - Fix off-by-one error in the type index range check within Object::IsInstance()
  • #17882 - [Relax][Pytorch] Fix incorrect behaviour of % (mod) operator in TVM frontend
  • #17875 - [Relax][Pytorch] Incorrect Handling of In-Place Ops in FX-Based TVM Frontend
  • #17838 - [TIR] Schedule support reverse-inline with reduction blocks
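
On the `%` fix (#17882): Python and PyTorch's `%` is floor-mod, while C-style truncated mod differs for negative operands, which is the kind of mismatch such a frontend bug hinges on. A quick illustration of the two conventions (generic code, not TVM's fix):

```python
def trunc_mod(a, b):
    # C-style remainder: the sign follows the dividend.
    return a - b * int(a / b)

a, b = -7, 3
print(a % b)            # Python floor-mod: 2
print(trunc_mod(a, b))  # C-style trunc-mod: -1
```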

CI

  • #18071 - Update windows to 2025
  • #18058 - [TEST] Move temp files into tempdir
  • #18037 - Further robustify is_last_build check
  • #17981 - Update images to 20250513-063354-70aa3797
  • #17891 - Update images to 20250428-080833-03eadc65
  • #17905 - Install PyTorch 2.7 compatible with CUDA 11.8
  • #17887 - Upgrade pytorch to 2.7.0, torchvision to 0.22.0, and vulkan sdk to 1.4.309
  • #17846 - Upgrade ubuntu runner image for GitHub CI

Docker

  • #17955 - [CI] Reintroduce NNEF to CI images

Docs

  • #18056 - Update installation instructions based on FFI refactor

Frontend

  • #18090 - [Relax][ONNX] Update Reduce ops to support axes as input
  • #18072 - [Relax][ONNX] Update ReduceL1 to opset 18
  • #18016 - [Relax][ONNX] Replace deprecated mapping.TENSOR_TYPE_TO_NP_TYPE usage
  • #18001 - [Relax][ONNX] Fix: bitwise_not misclassified as binary (is …
  • #17990 - [Relax]Fix: Output tensor with zero dimension after torch.u…
  • #17925 - [Relax][PyTorch] Re-enable test_subgraph_capture in dynamo test
  • #17980 - [ONNX] Make bias input optional in LayerNormalization
  • #17918 - [Relax][PyTorch] Add ReLU6 Op Support for Exported Program and FX graph
  • #17930 - [Relax][PyTorch] Add torch.outer Op Support for Exported Program and FX graph
  • #17932 - [Relax][PyTorch] Add UpSample Bicubic Op Support for Exported Program and FX graph
  • #17921 - [Relax][PyTorch] Add AvgPool 1D and 3D Op Support for Exported Program and FX graph
  • #17922 - [Relax][PyTorch] Add Adaptive AvgPool 1D and 3D Op Support for Exported Program and FX graph
  • #17863 - [Relax][PyTorch] CrossEntropyLoss
  • #17919 - [Relax][PyTorch] Add MaxPool 1D and 3D Op Support for Exported Program and FX graph
  • #17926 - [Relax][PyTorch] Add tests for all the dtypes supported in the PyTorch frontend
  • #17924 - [Relax][PyTorch] Add div.Tensor_mode and trunc Op Support for Exported Program and FX graph
  • #17904 - [Relax][PyTorch] Add Meshgrid Op Support for Exported Program and FX graph
  • #17915 - [Relax][PyTorch] Add support for linspace op in fx graph
  • #17886 - [Relax][PyTorch] Add Pixel Shuffle Op Support for Exported Program and FX graph
  • #17908 - [Relax][PyTorch] Add support for eye op in fx graph
  • #17893 - [Relax][Pytorch] Add fmod support
  • #17894 - [Relax][PyTorch] Support torch.bfloat16 dtype in pytorch frontend
  • #17878 - [Relax][PyTorch] Add torch.isin Op Support for Exported Program and FX graph
  • #17889 - [Relax][PyTorch] Support linspace op for ExportedProgram importer
  • #17868 - [Relax][Pytorch] Add support for ones_like, zero_, zeros, type_as, item ops
  • #17857 - [Relax][PyTorch] Refactor norm op for ExportedProgram importer
  • #17852 - [Relax][PyTorch] Sort.default
  • #17871 - [Relax][Pytorch] Add support for bitwise_or op support
  • #17836 - [Relax][PyTorch] support for index.Tensor
  • #17864 - [Relax][PyTorch] Support eye op for ExportedProgram importer
  • #17858 - [Relax][PyTorch] Add copy_ op support in fxGraph
  • #17851 - [Relax][PyTorch] Support leaky_relu_.default and reshape_as.default in ExportedProgram frontend
  • #17843 - [Relax][PyTorch] Add mul_.Tensor, max.default, min.default and pow.Scalar Op Support into Exported Program Frontend
  • #17821 - [Relax][PyTorch] Add Pad Op Support for Exported Program and FX graph
  • #17819 - [Relax][PyTorch] Add Stack Op Support for Exported Program
  • #17849 - [Relax][PyTorch] Add RSub Op Support for Exported Program and FX graph
  • #17850 - [Relax][Pytorch] Add masked_fill op support in ExportedProgram
  • #17816 - [Relax][PyTorch] Add PReLU Op Support for Exported Program and FX graph
  • #17803 - [Relax][PyTorch] Add Logaddexp op support for exported program
  • #17841 - [Relax][PyTorch] Add support for norm op
  • #17832 - [Relax][PyTorch] full.default, full_like.default, ones.default
  • #17830 - [Relax][PyTorch] Support narrow and broadcast_to ops for ExportedProgram importer
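
Most of these frontend PRs follow the same pattern: register a converter that maps one torch op onto its Relax equivalent. A toy dispatch-table sketch of that pattern (hypothetical names; TVM's real converters live under `tvm.relax.frontend.torch` and build Relax IR, not Python values):

```python
# Hypothetical mini op-converter registry illustrating the dispatch
# pattern used by FX/ExportedProgram importers.
CONVERT_MAP = {}

def register_op(name):
    def wrap(fn):
        CONVERT_MAP[name] = fn
        return fn
    return wrap

@register_op("relu6")
def convert_relu6(x):
    # relu6(x) = min(max(x, 0), 6), the op added in #17918
    return min(max(x, 0), 6)

@register_op("rsub")
def convert_rsub(x, other):
    # rsub(x, other) = other - x, the op added in #17849
    return other - x

print(CONVERT_MAP["relu6"](8))     # 6
print(CONVERT_MAP["rsub"](2, 10))  # 8
```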

LLVM

  • #17859 - [Codegen] Enable SVE/VLA for RISCV targets
  • #17958 - Fix JIT unknown reloc issue for case of RISCV
  • #17954 - [FFI] Fix compilation errors with clang20

Metal

  • #18034 - Fix GetFunction of metal runtime

ROCm

  • #18029 - Fix ROCm build after FFI refactor

Relax

  • #18102 - Fix rotary embedding buffer size calculation
  • #17928 - [KVCache] Per Layer Sliding Window
  • #17840 - Refactor missing op check into shared utility for Torch frontends
  • #17826 - Fix Torch frontends to report all the missing ops

Runtime

  • #18097 - CutensorMap support

TIR

  • #18068 - Extend address_of to support Buffer objects
  • #18069 - Fix block access region detection for nested let bindings
  • #18057 - Phase out ProducerStore, ProducerRealize and Prefetch

TOPI

  • #18039 - [Relax] Support InstanceNorm & Bugfix of InstanceNorm
  • #18063 - [NN][Layer_Norm] Fix layer_norm error with reduce-only axes
  • #18006 - Fix index handling in expand_like operator for axis expansion
  • #18015 - Support integer type input for log10
  • #17942 - Add shape validation to prevent negative dimensions in conv operations
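
On the conv shape check (#17942): the standard output-extent formula can go negative when the kernel outgrows the padded input, so validating it up front avoids silently producing bogus shapes. A small sketch of such a check (toy helper, not TVM's topi code):

```python
def conv_out_dim(in_dim, kernel, stride=1, pad=0, dilation=1):
    # Standard conv arithmetic: effective kernel accounts for dilation.
    eff_kernel = (kernel - 1) * dilation + 1
    out = (in_dim + 2 * pad - eff_kernel) // stride + 1
    if out <= 0:
        raise ValueError(
            f"kernel {kernel} (effective {eff_kernel}) too large "
            f"for input {in_dim} with pad {pad}")
    return out

print(conv_out_dim(32, 3, stride=2, pad=1))  # 16
```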

Vulkan

  • #18005 - Add TIR unary trigonometric/hyperbolic intrinsic definitions

cuda & cutlass & tensorrt

  • #18064 - [CUTLASS] Fix CUTLASS kernel build on Hopper
  • #18033 - [CUTLASS] Add GeMM kernels for Blackwell GPUs
  • #18024 - [CUDA] Fix thrust with latest FFI refactor
  • #18118 - bump cutlass_fpA_intB_gemm
  • #18113 - [CMake] Refine C++/CUDA standard settings in CMakeLists.txt

FFI

  • #18076 - [FFI][REFACTOR] Stabilize container ABI and implementation
  • #18091 - [FFI] Provide Field Visit bridge so we can do gradual transition
  • #18095 - [FFI][REFACTOR] Migrate attrs to use new reflection
  • #18083 - [FFI] Update typeinfo to speedup parent reflection
  • #18077 - [FFI] Optimize atomic decref in Object
  • #18065 - [FFI] Introduce FFI reflection support in python
  • #18062 - [FFI][REFACTOR] Update registry to have complete meta-data
  • #18059 - [FFI][REFACTOR] Enhance reflection
  • #18050 - [FFI] Enhance FFI Object exception safety during init
  • #18121 - Revert "[FFI] Replace Arg2Str with a more powerful for_each"
  • #18117 - [FFI] Replace Arg2Str with a more powerful for_each
  • #18116 - [FFI] Use fold expression to simplify for_each
  • #18114 - [FFI] Replace __attribute__ with C++ standard attributes
  • #18112 - [FFI] Cleanup visit_attrs attribute after refactor
  • #18111 - [FFI] Introduce GlobalDef for function registration
  • #18106 - [REFACTOR][FFI] Phase out old VisitAttrs mechanism
  • #18042 - [REFACTOR][FFI] Update symbol name for library module
  • #18023 - [FFI] More strict tuple constructor checking
  • #18022 - [REFACTOR][FFI] Cleanup PackedFunc redirections
  • #18020 - [REFACTOR][PYTHON] Phase out tvm._ffi and Limited API support
  • #17979 - [FFI][REFACTOR] Update to distinguish as and cast
  • #17983 - [FFI][JVM] Upgrade tvm4j to latest FFI
  • #18010 - [REFACTOR][FFI] Phase out legacy C API
  • #17943 - [FFI] Variant specialize for all ObjectRef
  • #17939 - [REFACTOR] Phase out legacy rust ffi
  • #17940 - [REFACTOR] Phase out legacy go ffi
  • #17931 - [REFACTOR][FFI][RPC] Migrate RPC to use the latest FFI ABI
  • #17929 - [REFACTOR][FFI] Cleanup container redirections
  • #17927 - [FFI][FEAT] AutoDLPack for taking external tensor objects
  • #17923 - [REFACTOR][FFI] Cleanup PackedFunc related redirection
  • #17920 - [REFACTOR] Introduce and modernize ffi system

web

  • #17946 - [REFACTOR][FFI] Upgrade Web Runtime to new FFI
  • #17917 - [WebGPU][CodeGen] Override PrintVecElemLoad and Store for WebGPU

Misc

  • #18104 - Add LLVM Legalization for tir.erf
  • #18107 - fix: guard tensormap with cuda version check
  • #18101 - [REFACTOR] Formalize namespace for all objects
  • #18040 - Add support for bucketize
  • #18098 - [REFACTOR] Transition VisitAttrs to new reflection mechanism
  • #18096 - [REFACTOR] Transition VisitAttrs to new reflection mechanism in tir/ir_builder/meta_schedule
  • #18093 - [NVSHMEM] Extend CUDA backend to compile and link TIR modules with NVSHMEM
  • #18088 - [Script] Enhance alloc buffer handling in nested frames
  • #18086 - [SCRIPT] Bump Python minimum version to 3.9 and update AST compatibility
  • #18075 - add support for softsign op
  • #18079 - [Script] Add support for merging block annotations
  • #18080 - [REFACTOR] Phase out LegacyReprPrinter and improve CommonSubExprElim
  • #18078 - [REFACTOR] Phase out the RelaxExpr.checked_type in favor of struct_info
  • #18073 - [NVSHMEM] Update NDArray allocation
  • #18066 - [Script] Remove deprecated attributes from Constant AST node
  • #18060 - Add Python functor support for TIR expressions and statements
  • #18054 - [Pytest] Remove obsolete test suite entries
  • #18036 - Add support for hamming_window op
  • #18049 - [Refactor] Rename relax_vm to vm
  • #18046 - [3rdparty] Phasing out FlashInfer AOT from 3rdparty
  • #18047 - [Backend] JIT compile FlashInfer kernel with FFI header
  • #18041 - [DTYPE] Fix dtype functions after dtype refactor
  • #18043 - [REFACTOR] Phase out the relax tuning_api
  • #18038 - Resolving inconsistency between attention/attention_bias
  • #18027 - [Dtype] Low-precision Blackwell Datatype Support
  • #17985 - [Codegen] Resolve issue [#17965] where the same model produces different outputs on the LLVM (CPU) and CUDA (GPU) backends
  • #17978 - Fix IR generation conflict in topi.nn.simplify by separating Tensor and PrimExpr handling
  • #18026 - [Python] Fix library lookup path for pip installed packages
  • #18019 - Add op support for slice_scatter
  • #17974 - Fix FLOP estimation for EvaluateNode by implementing VisitStmt_ handler
  • #18013 - Fix RuntimeError: parallel_for_dynamic
  • #18014 - Fix division truncation in window size calculation for small dtypes in average_pool
  • #17995 - Fix zero-extent loops in PerStoreFeature to prevent crashes
  • #17969 - Add registration for the operators asinh, acosh, atanh in LLVM
  • #17972 - Fix g.costs
  • #17953 - Fix sqrt/rsqrt Compatibility with Integer Data Types
  • #17961 - Fix basic FLOP estimation for WhileNode
  • #17945 - Add registration for the operators asin and acos in LLVM
  • #17951 - [NODE] Fix structural equality for Array<Any> specialization
  • #17913 - [Triton] Support latest triton.compile interface
  • #17911 - Add op support for new_zeros op in Exported Program and fx graph frontend
  • #17909 - Add masked_fill_.scalar, logical_not.default in Exported Program frontend
  • #17910 - [RPC] Fix Bug That Change Dict When Iterate The Keys
  • #17896 - Add op support for zeros_like and fill_
  • #17900 - Fix onnx expand op
  • #17865 - Add support for index_put_ op
  • #17839 - Add op support for roll op
  • #17844 - Fix incorrect docstring in topi softmax
  • #17831 - [3rdparty] Bump DLPack to v1.1 for float8/6/4 dtype supports
  • #17848 - Fix docstring in batch_to_space_nd and bitpack
  • #17845 - Fix incorrect docstring in upsampling.py
  • #17808 - [Install] Fix error during python/tvm installation
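
On #18014: window-size arithmetic done with integer division truncates toward negative infinity in Python (toward zero in C), which can silently shrink a pooling window. The difference between floor and ceiling division that such bugs hinge on, in a generic illustration (not TVM's actual fix):

```python
def floor_div(a, b):
    return a // b

def ceil_div(a, b):
    # Ceiling division via negation; avoids float rounding entirely.
    return -(-a // b)

# A 7-wide extent split over stride 2: floor drops the tail element,
# ceil keeps it.
print(floor_div(7, 2), ceil_div(7, 2))  # 3 4
```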
Source: README.md, updated 2025-07-07