| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-05-12 | 9.2 kB | |
| v8.0.0 source code.tar.gz | 2026-05-12 | 2.2 MB | |
| v8.0.0 source code.zip | 2026-05-12 | 2.4 MB | |
| Totals: 3 Items | | 4.6 MB | 0 |

Breaking Changes
- Major release: TabPFN-3 is now the default model. New users and existing users who do not pin a model will automatically get TabPFN-3 going forward. To use a previous model version, use the `create_default_for_version()` classmethod on `TabPFNClassifier`/`TabPFNRegressor`, or pass an explicit `model_path` to the estimator constructor to pin a specific model file. (#948)
Added
- Add opt-in feature subsampling strategies across ensemble members when the number of features exceeds `max_features_per_estimator`. Set `FEATURE_SUBSAMPLING_METHOD` in the inference config to one of `"random"` (default), `"balanced"`, or `"constant_and_balanced"`. (#851)
- Add `enable_torch_compile` to `PerformanceOptions`. (#879)
- Add GPU preprocessing pipeline that runs feature transformations (quantile normalization, SVD) directly on the GPU as part of the model forward pass. (#884)
- Add `get_inference_config()` method to `TabPFNClassifier` and `TabPFNRegressor`. This method loads the model checkpoint if needed and returns the active `InferenceConfig`, allowing inspection of preprocessing and inference settings before calling `fit()`. (#890)
- Add an optional `show_progress_bar` flag to TabPFN classifier and regressor inference, defaulting to `False`. (#899)
- Add a nightly workflow that reproduces every example notebook's pip-install sequence in a fresh venv and asserts `tabpfn` resolves to the latest PyPI release. (#901)
- Add `gini_feature_importance` and `gini_feature_importance_lightgbm` as new `FEATURE_SUBSAMPLING_METHOD` options. Both rank features by importance and always include the top-K most predictive features per estimator when the dataset exceeds `max_features_per_estimator`. LightGBM is an optional dependency (`pip install tabpfn[lightgbm]`). (#908)
- Add TabPFN v3 support: `TabPFNClassifier` and `TabPFNRegressor` now support `ModelVersion.V3`, including `create_default_for_version(ModelVersion.V3)` and explicit v3 model paths. (#909)
- Add `auto` as a new `FEATURE_SUBSAMPLING_METHOD` option. When selected, it automatically uses `gini_feature_importance` (LightGBM-based) for datasets with more than 100k samples where feature subsampling is needed, and falls back to `balanced` otherwise. LightGBM is now a required dependency (previously optional via `pip install tabpfn[lightgbm]`). (#913)
- Add `embedding_dim` abstract property to the `Architecture` interface, exposing the output embedding dimension for all architecture implementations. (#924)
- Stratified row subsampling for the classifier: when `SUBSAMPLE_SAMPLES` is set, each ensemble member now draws rows that preserve the original class proportions, using a balanced round-robin pool per class to ensure uniform row coverage across estimators. (#928)
- Add opt-in FlashAttention-3 backend selector for v3 (`PerformanceOptions.attention_backend`). On Hopper GPUs, `"auto"` routes to FA3 once the sequence length amortises FA3's dispatch overhead; otherwise it falls back to PyTorch SDPA. (#935)
- Auto-scale `n_estimators` at fit time so every feature is covered by at least one ensemble member. The effective count is exposed as `n_estimators_`; a `UserWarning` is emitted when scaling triggers. (#937)
- Add `TorchSquashingScaler` and `TorchSquashingScalerStep`, a torch implementation of `SquashingScaler` mirroring the CPU version. (#938)
- Run SVD on GPU when `enable_gpu_preprocessing=True` by pre-warming PyTorch's LAPACK lazy wrapper on the main thread before parallel dispatch to avoid a multi-GPU race in `torch.svd_lowrank->torch.linalg.qr`. (#941)
- Schedule the squashing scaler on GPU when the configuration is eligible. This makes the preprocessing significantly faster. (#944)
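
The interplay between balanced feature subsampling (#851) and the `n_estimators` auto-scaling (#937) can be illustrated with a small, library-independent sketch. It is not TabPFN's implementation: the function names below are hypothetical, and it assumes that "balanced" means each feature is used by nearly the same number of ensemble members. Under that assumption, the smallest ensemble that covers every feature has `ceil(n_features / max_features_per_estimator)` members.

```python
import math
import random
import warnings


def min_estimators_for_coverage(n_features, max_features_per_estimator):
    """Smallest ensemble size at which every feature can appear at least once."""
    return max(1, math.ceil(n_features / max_features_per_estimator))


def effective_n_estimators(requested, n_features, max_features_per_estimator):
    """Hypothetical helper: scale the requested count up if coverage needs it."""
    needed = min_estimators_for_coverage(n_features, max_features_per_estimator)
    if requested < needed:
        warnings.warn(
            f"n_estimators raised from {requested} to {needed} to cover all features",
            UserWarning,
        )
        return needed
    return requested


def balanced_feature_subsets(n_features, max_features_per_estimator, n_estimators, seed=0):
    """Assign features to estimators so usage counts differ by at most one.

    Walks cyclically through one shuffled permutation of the feature indices,
    handing each estimator the next `max_features_per_estimator` entries.
    """
    if n_features <= max_features_per_estimator:
        # No subsampling needed: every estimator sees every feature.
        return [list(range(n_features)) for _ in range(n_estimators)]
    order = list(range(n_features))
    random.Random(seed).shuffle(order)
    k = max_features_per_estimator
    subsets = []
    for i in range(n_estimators):
        start = i * k
        # k consecutive positions modulo n_features are distinct because k < n_features.
        subsets.append(sorted(order[(start + j) % n_features] for j in range(k)))
    return subsets
```

With 10 features and at most 4 features per estimator, a request for 2 estimators would be raised to 3 in this sketch, and the three subsets together cover all 10 features.
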
Changed
- Introduce balanced feature subsampling to improve performance on datasets with a large number of features. Results may vary slightly because of different seeds. (#851)
- Model checkpoint caching now automatically invalidates when the file on disk changes (detected via mtime and size), so replaced checkpoints (e.g. during finetuning) are always reloaded. (#863)
- Row subsampling across ensemble members now uses round-robin balanced sampling. This replaces the previous random sampling approach. (#886)
- Remove unused v2.6 defaults from `InferenceConfig.get_default()`. V2.6 checkpoints always embed their own `InferenceConfig`, so these defaults were never used at inference time. The v2.6 preprocessor config factories are also removed from `tabpfn.preprocessing`. (#890)
- Renamed `InferenceConfig.CONSTANT_FEATURE_COUNT` to `FEATURE_SUBSAMPLING_CONSTANT_FEATURE_COUNT` to better reflect its purpose. Old checkpoints that store the previous key name are migrated transparently on load. (#900)
- Updated copyright year to 2026 and consolidated the `authors` field in `pyproject.toml` to a single Prior Labs entry. (#916)
- Speed up `ReshapeFeatureDistributionsStep` ~2x on large numerical workloads (~1670 ms → ~870 ms on 100k×100): inline `SquashingScaler`'s robust/minmax branches into a single `nanpercentile` pass, and call `ColumnTransformer.fit_transform` once instead of `fit` + `transform` (sklearn's `fit` already runs the transform internally). Behavior unchanged. (#938)
- Keep the inference cache on the GPU by default when `fit_mode="fit_with_cache"`, avoiding host/device transfers on each predict call. The per-estimator KV caches are reachable via `model.executor_.kv_caches`. (#942)
- Clean up README and inline references to removed/deprecated tabpfn-extensions modules (`rf_pfn`, `post_hoc_ensembles`, `hpo`) and the retired `large_datasets` example. Drops the now-stale workflow mermaid diagram, updates the OOM error message to link to the Models page, and removes the unused `AutoTabPFNClassifier` import from the Colab demo notebook. (#945)
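
The checkpoint cache invalidation noted above (#863) relies on comparing a file's mtime and size. A minimal sketch of that general technique follows, using only the standard library. The names `load_cached` and `_cache` are hypothetical and are not TabPFN's internal API; the real loader and cache layout may differ.

```python
import os

# Maps a path to ((mtime_ns, size), loaded_value). Hypothetical module-level cache.
_cache = {}


def load_cached(path, loader):
    """Return loader(path), reusing the cached value while the file is unchanged.

    The file is considered changed when its modification time or its size
    differs from the values recorded at the previous load.
    """
    stat = os.stat(path)
    fingerprint = (stat.st_mtime_ns, stat.st_size)
    entry = _cache.get(path)
    if entry is not None and entry[0] == fingerprint:
        return entry[1]  # Cache hit: file on disk looks identical.
    value = loader(path)  # Cache miss or stale entry: reload from disk.
    _cache[path] = (fingerprint, value)
    return value
```

A fingerprint of mtime plus size is cheap because it avoids reading the file, at the cost of missing a rewrite that preserves both values exactly; hashing the contents would be stricter but slower for multi-gigabyte checkpoints.
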
Fixed
- Fix inference precision to respect `force_inference_dtype` in the KV cache engine and skip thinking tokens during cache-building. (#802)
- Reduce TabPFNRegressor peak GPU memory at large test-set sizes by chunking the row dimension inside `translate_probs_across_borders`. Output is unchanged; peak drops ~60% at `n_test=250k` (57.6 GB → 22.8 GB on an H100). (#882)
- Fix v2.6 producing near-random outputs on Apple Silicon (MPS). `F.scaled_dot_product_attention` on MPS silently returns wrong values for non-contiguous q/k/v (upstream: pytorch/pytorch#181133); we now force contiguity before the call. Iris multiclass accuracy on MPS: 0.48 → 0.98. (#888)
- Fix `FinetunedTabPFNClassifier`/`FinetunedTabPFNRegressor` dropping pandas feature names from the final inference model. The raw training inputs are now retained so the fitted inference estimator records `feature_names_in_`, and calling `predict_proba`/`predict` with a DataFrame no longer triggers spurious sklearn feature-name warnings. (#892)
- Adapt `recompute_layer` flag in `FinetunedTabPFNClassifier`/`FinetunedTabPFNRegressor` to new `PerformanceOptions` interface. (#917)
- Fix `save_tabpfn_model` not setting `architecture_name="tabpfn_v3"` for v3 configs and not persisting `inference_config_`, which broke resuming v3 finetuning from a saved checkpoint. (#930)
- Reduce KV cache GPU memory in `fit_with_cache` by materialising only the kept KV head(s) at cache-build time. Output is unchanged. (#933)
- Fix `RuntimeError: No available kernel` on v3 inference for GPUs where none of FlashAttention / EfficientAttention / CuDNN-Attention are eligible (e.g. Turing-class cards like the T4) by adding `SDPBackend.MATH` as a final fallback in `_SDPA_BACKENDS`. (#947)
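
The row-chunking approach behind the regressor memory reduction (#882) can be sketched generically. The example below is plain Python on nested lists, not the tensor code inside `translate_probs_across_borders`; `apply_in_row_chunks` is a hypothetical name. It assumes the wrapped function treats each row independently, which is what makes the chunked output identical to the unchunked one.

```python
def apply_in_row_chunks(rows, fn, chunk_size):
    """Apply a row-wise batch function to `rows` in slices of `chunk_size`.

    Only one slice's worth of intermediate results is alive at a time, so peak
    memory scales with `chunk_size` instead of with the full number of rows.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    out = []
    for start in range(0, len(rows), chunk_size):
        # Process one slice and append its results in the original row order.
        out.extend(fn(rows[start:start + chunk_size]))
    return out
```

Because results are appended in slice order, the final list matches what a single call on all rows would return, while the largest batch ever passed to `fn` has at most `chunk_size` rows.
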