| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| README.md | 2026-03-30 | 16.8 kB | |
| v4.0.0 source code.tar.gz | 2026-03-30 | 13.8 MB | |
| v4.0.0 source code.zip | 2026-03-30 | 14.7 MB | |
| Totals: 3 Items | | 28.6 MB | 0 |
What's Changed
Breaking Changes
- feat!: upgrade DataFusion dependency to 52.1.0 by @wjones127 in https://github.com/lance-format/lance/pull/6015
- refactor!: refactor java access to file format version by @jackye1995 in https://github.com/lance-format/lance/pull/6053
- refactor!: remove create_empty_table usage by @jackye1995 in https://github.com/lance-format/lance/pull/6087
- fix!: bump IVF_RQ version for compatibility check by @BubbleCal in https://github.com/lance-format/lance/pull/6097
- perf(inverted)!: reduce fts indexing time and memory by @BubbleCal in https://github.com/lance-format/lance/pull/6174
- feat: add index segment commit API by @Xuanwo in https://github.com/lance-format/lance/pull/6209
- refactor!: remove staging from distributed vector indexing by @Xuanwo in https://github.com/lance-format/lance/pull/6269
New Features
- feat: compress complex all null by @yingjianwu98 in https://github.com/lance-format/lance/pull/4990
- feat: expose `use_scalar_index` param in Java scanner by @xloya in https://github.com/lance-format/lance/pull/5487
- feat: add file list with sizes to IndexMetadata by @wjones127 in https://github.com/lance-format/lance/pull/5497
- feat(compaction): add Python config for defer_index_remap by @zhangyue19921010 in https://github.com/lance-format/lance/pull/5691
- feat(core): add Levenshtein-based suggestions to not-found errors in schema by @HemantSudarshan in https://github.com/lance-format/lance/pull/5976
- feat: add URI-based commit support to Java SDK by @hamersaw in https://github.com/lance-format/lance/pull/5978
- fix: concurrent read and write to directory namespace by @jackye1995 in https://github.com/lance-format/lance/pull/5983
- feat: add ability to pass custom headers to objectstore requests by @hamersaw in https://github.com/lance-format/lance/pull/5989
- feat: add DeleteResult with num_deleted_rows by @wkalt in https://github.com/lance-format/lance/pull/6001
- feat: introduce IncompatibleTransaction error by @wjones127 in https://github.com/lance-format/lance/pull/6003
- feat(cleanup): add more metrics to RemovalStats by @zhangyue19921010 in https://github.com/lance-format/lance/pull/6025
- feat(java): expose prefilter parameter to support vector search with fragments by @nyl3532016 in https://github.com/lance-format/lance/pull/6040
- feat: surface ambiguous merge insert error as `InvalidInput` by @wjones127 in https://github.com/lance-format/lance/pull/6048
- feat(blob): distribute blob sidecar keys with reversed binary ids by @Xuanwo in https://github.com/lance-format/lance/pull/6060
- feat: handle JSONB literals in Lance SQL planner by @wkalt in https://github.com/lance-format/lance/pull/6061
- feat(java): expose Dataset.dropIndex method to drop specific index by @fangbo in https://github.com/lance-format/lance/pull/6065
- feat(blob): map external blob URIs to multi-base base ids by @Xuanwo in https://github.com/lance-format/lance/pull/6066
- feat: add env toggle for repetition index cache on read by @Xuanwo in https://github.com/lance-format/lance/pull/6069
- feat(compaction): single reserve_fragment_ids after rewriting files by @hamersaw in https://github.com/lance-format/lance/pull/6072
- feat: expose compaction binary copy configuration through python and java SDKs by @hamersaw in https://github.com/lance-format/lance/pull/6074
- feat(cleanup): support rate limiter for cleanup operation by @zhangyue19921010 in https://github.com/lance-format/lance/pull/6084
- feat: mark 2.2 as stable and add 2.3 as the next file format version by @Xuanwo in https://github.com/lance-format/lance/pull/6088
- feat: support prewarm for IVF-based ANN indices by @wjones127 in https://github.com/lance-format/lance/pull/6090
- feat: add skip_transpose flag to vector index builders by @BubbleCal in https://github.com/lance-format/lance/pull/6114
- feat: enable HNSW-accelerated partition assignment for fp16 vectors by @wkalt in https://github.com/lance-format/lance/pull/6119
- feat: clearer progress reporting for IVF by @wkalt in https://github.com/lance-format/lance/pull/6126
- feat: support vector indices in describe_indices filtering by @ndpvt-web in https://github.com/lance-format/lance/pull/6145
- feat: reduce open file handles during IVF training by @westonpace in https://github.com/lance-format/lance/pull/6169
- feat: add compaction options in manifest config by @hamersaw in https://github.com/lance-format/lance/pull/6170
- feat: support atomic multi-table transactions via namespace manifest by @XuQianJin-Stars in https://github.com/lance-format/lance/pull/6173
- feat: add abfss:// scheme support for Azure ADLS Gen2 by @burlacio in https://github.com/lance-format/lance/pull/6192
- feat: bounding source fragments for compaction execution by @hamersaw in https://github.com/lance-format/lance/pull/6232
- fix: filter out detached versions when scanning manifests by @jackye1995 in https://github.com/lance-format/lance/pull/6245
- feat: allow setting transaction properties in various operations by @jackye1995 in https://github.com/lance-format/lance/pull/6246
- feat: add OpenDAL Azdls backend for abfss:// with use_opendal flag by @burlacio in https://github.com/lance-format/lance/pull/6256
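As an illustration of the "Levenshtein-based suggestions to not-found errors in schema" entry above, the general technique can be sketched in a few lines of Python. This is a minimal sketch of the idea only, not Lance's implementation: the function names (`levenshtein`, `suggest_field`, `not_found_message`), the `max_distance` threshold, and the message wording are all hypothetical.

```python
from typing import Optional, Sequence


def levenshtein(a: str, b: str) -> int:
    """Edit distance counting single-character inserts, deletes and substitutions."""
    previous = list(range(len(b) + 1))  # distance from the empty prefix of `a`
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            current.append(
                min(
                    previous[j] + 1,         # delete ca
                    current[j - 1] + 1,      # insert cb
                    previous[j - 1] + cost,  # substitute (or keep) the character
                )
            )
        previous = current
    return previous[-1]


def suggest_field(
    missing: str, known: Sequence[str], max_distance: int = 3
) -> Optional[str]:
    """Return the closest known name within max_distance edits, or None."""
    best: Optional[str] = None
    best_distance = max_distance + 1
    for name in known:
        distance = levenshtein(missing, name)
        if distance < best_distance:
            best, best_distance = name, distance
    return best


def not_found_message(missing: str, known: Sequence[str]) -> str:
    """Build a hypothetical not-found message, with a hint when one is close enough."""
    message = f"field '{missing}' not found in schema"
    hint = suggest_field(missing, known)
    if hint is not None:
        message += f"; did you mean '{hint}'?"
    return message


print(not_found_message("vectr", ["id", "vector", "text"]))
```

The distance threshold keeps the hint from firing on names that are unrelated to the one requested; the value 3 here is an arbitrary choice for the sketch.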
Bug Fixes
- fix(java): transaction fatal bug in java transaction api by @wojiaodoubao in https://github.com/lance-format/lance/pull/5824
- fix: maintaining individual fragment operation when calling take_source by @hamersaw in https://github.com/lance-format/lance/pull/5844
- fix(encoding): handle empty rows in variable packed struct decode by @Xuanwo in https://github.com/lance-format/lance/pull/5995
- fix: various bugs to namespace access by @jackye1995 in https://github.com/lance-format/lance/pull/5996
- fix: set namespace commit handler for LanceDataset.commit by @jackye1995 in https://github.com/lance-format/lance/pull/6002
- fix: fast_search limits full text search to indexed fragments by @BubbleCal in https://github.com/lance-format/lance/pull/6006
- fix: fast_search should ignore any unindexed data for vector search by @BubbleCal in https://github.com/lance-format/lance/pull/6007
- fix: correctly calculate max visible level when a list has no def by @westonpace in https://github.com/lance-format/lance/pull/6008
- perf: avoid oversized variable buffers in full-zip scan batches by @Xuanwo in https://github.com/lance-format/lance/pull/6013
- fix: make overwrites retryable instead of compatible by @jackye1995 in https://github.com/lance-format/lance/pull/6014
- fix(python): avoid interpreter shutdown panic in BackgroundExecutor by @Xuanwo in https://github.com/lance-format/lance/pull/6023
- fix: filter stale row IDs in TakeExec for FTS/vector after delete by @wkalt in https://github.com/lance-format/lance/pull/6042
- fix(btree): include null pages in non-IsNull queries for correct thre… by @wkalt in https://github.com/lance-format/lance/pull/6043
- fix: handle list-level NULLs in NOT filters by @fenfeng9 in https://github.com/lance-format/lance/pull/6044
- fix: allowing headers for static configuration to be consistent by @hamersaw in https://github.com/lance-format/lance/pull/6045
- fix: bitmap iterator exhaustion in mask_to_offset_ranges by @wkalt in https://github.com/lance-format/lance/pull/6046
- fix(build): add Android aarch64 support to lance-linalg by @dardourimohamed in https://github.com/lance-format/lance/pull/6057
- fix: make blob v2 reads base-aware in multi-base datasets by @Xuanwo in https://github.com/lance-format/lance/pull/6064
- fix(lance-linalg): fix missing return value in u8x16::bit_and for non-x86_64/aarch64 targets by @cheungxi in https://github.com/lance-format/lance/pull/6068
- fix: resolve Python lint failure on main by @Xuanwo in https://github.com/lance-format/lance/pull/6073
- fix: restore main CI by formatting take_blob imports by @Xuanwo in https://github.com/lance-format/lance/pull/6082
- fix: incorrect deletion masking in `DatasetPreFilter` by @cijiugechu in https://github.com/lance-format/lance/pull/6083
- fix: avoid thread pool contention between compression and write operations during FTS indexing by @BubbleCal in https://github.com/lance-format/lance/pull/6085
- fix: compile error for err_express by @zhangyue19921010 in https://github.com/lance-format/lance/pull/6094
- fix(python): crash when schema contains nested fixed_size_list or extension type by @erandagan in https://github.com/lance-format/lance/pull/6107
- fix: dont sample if no vectors are needed by @westonpace in https://github.com/lance-format/lance/pull/6110
- fix(index): preserve stable row-id entries during scalar index optimize by @acking-you in https://github.com/lance-format/lance/pull/6117
- fix: disallow wrapping auto-detected fsst in other compression by @hamersaw in https://github.com/lance-format/lance/pull/6120
- fix: pin substrait to 0.62.2 until DF supports 0.62.3 by @westonpace in https://github.com/lance-format/lance/pull/6121
- fix: vector index type shown as unknown in describe_indices by @jackye1995 in https://github.com/lance-format/lance/pull/6122
- fix: handle inverted index worker exits during dispatch by @BubbleCal in https://github.com/lance-format/lance/pull/6129
- fix: add missing type hint for producer function by @Gallardot in https://github.com/lance-format/lance/pull/6133
- fix: prevent duplicate manifest entries from concurrent table creation by @jmhsieh in https://github.com/lance-format/lance/pull/6143
- fix: replace `fetch_arrow_table` with `to_arrow_table` by @BubbleCal in https://github.com/lance-format/lance/pull/6146
- fix: preserve merge insert delete-by-source semantics by @Xuanwo in https://github.com/lance-format/lance/pull/6148
- fix: handle DataType::Null in adjust_child_validity to prevent panic by @wjones127 in https://github.com/lance-format/lance/pull/6160
- fix: persist frag reuse index external file on local filesystem by @wjones127 in https://github.com/lance-format/lance/pull/6163
- fix: avoid empty range reads for zero-length blobs by @Xuanwo in https://github.com/lance-format/lance/pull/6168
- fix: handle nullable validity layers without def levels by @Xuanwo in https://github.com/lance-format/lance/pull/6187
- fix: like queries with a prefix should be accelerated by btree and zonemap by @jackye1995 in https://github.com/lance-format/lance/pull/6188
- fix: use to_arrow_reader in benchmark datagen by @Xuanwo in https://github.com/lance-format/lance/pull/6190
- fix: disallowing stale credentials from directory namespace by @hamersaw in https://github.com/lance-format/lance/pull/6194
- fix: memory_limit and num_workers params are not passed to index worker by @BubbleCal in https://github.com/lance-format/lance/pull/6197
- fix: preserve create index transaction semantics by @Xuanwo in https://github.com/lance-format/lance/pull/6204
- fix: allow same field name with different type in dataset overwrites by @hamersaw in https://github.com/lance-format/lance/pull/6206
- fix: prewarm all segments for named indices by @Xuanwo in https://github.com/lance-format/lance/pull/6211
- fix: respect the old data filter on inverted index by @westonpace in https://github.com/lance-format/lance/pull/6216
- fix: 2.1/2.2 panic when a list column had small values and many empty values by @westonpace in https://github.com/lance-format/lance/pull/6234
- fix: resolve_latest_location converts errors to not_found unconditionally by @wkalt in https://github.com/lance-format/lance/pull/6248
- fix: return errors for unsupported fixed-size-list child types by @myandpr in https://github.com/lance-format/lance/pull/6253
- fix: adding namespace support to java SDK CommitBuilder from dataset by @hamersaw in https://github.com/lance-format/lance/pull/6257
- fix: pass dataset_options to SafeLanceDataset in worker processes by @eddyxu in https://github.com/lance-format/lance/pull/6278
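For context on the "like queries with a prefix should be accelerated by btree and zonemap" entry above: a pure prefix pattern such as `'abc%'` is equivalent to a half-open range over sorted strings, which is the kind of predicate ordered indexes can serve. The sketch below shows that general rewrite under simplifying assumptions (code-point ordering, no escape handling). It is not taken from the Lance source, and `like_prefix_to_range` and `in_range` are hypothetical names.

```python
from typing import Optional, Tuple

Bounds = Tuple[str, Optional[str]]  # (inclusive lower, exclusive upper or None)


def like_prefix_to_range(pattern: str) -> Optional[Bounds]:
    """Rewrite a pure prefix pattern 'foo%' as a range; return None otherwise."""
    if len(pattern) < 2 or not pattern.endswith("%"):
        return None
    prefix = pattern[:-1]
    # Any other wildcard or an escape character means this is not a plain prefix.
    if "%" in prefix or "_" in prefix or "\\" in prefix:
        return None
    last = ord(prefix[-1])
    if last == 0x10FFFF:
        # No successor code point exists, so leave the range open-ended.
        return prefix, None
    # Smallest string that sorts after every string starting with `prefix`.
    return prefix, prefix[:-1] + chr(last + 1)


def in_range(value: str, bounds: Bounds) -> bool:
    """Check lower <= value < upper using Python's code-point string ordering."""
    lower, upper = bounds
    return value >= lower and (upper is None or value < upper)


bounds = like_prefix_to_range("abc%")
print(bounds)
print([v for v in ["abb", "abc", "abcdef", "abd"] if in_range(v, bounds)])
```

A real engine would also need to account for collation and the storage encoding of the column; this sketch ignores both.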
Documentation
- docs: fix incorrect URLs and cleanup by @prrao87 in https://github.com/lance-format/lance/pull/5317
- docs: expand the FTS index doc explaining the training process and multiple partitions by @westonpace in https://github.com/lance-format/lance/pull/5988
- docs: clarify v2.2 nested drop rollback risk by @Xuanwo in https://github.com/lance-format/lance/pull/5999
- docs: require data_storage_version=2.2 in map type example by @Xuanwo in https://github.com/lance-format/lance/pull/6032
- docs: update file versioning matrix for 2.2 rollout by @Xuanwo in https://github.com/lance-format/lance/pull/6033
- docs: reorganize blob docs around blob v2 and clarify legacy compatibility by @Xuanwo in https://github.com/lance-format/lance/pull/6034
- docs: align 2.2 encoding docs and nested add-column notes by @Xuanwo in https://github.com/lance-format/lance/pull/6038
- docs: clarify how to generate TPCH benchmark dataset locally by @Xuanwo in https://github.com/lance-format/lance/pull/6063
- docs: document vector index RAM (training) & storage requirements by @westonpace in https://github.com/lance-format/lance/pull/6108
- docs: update index.md to fix `indexes` to `indices` for uniformity by @wombatu-kun in https://github.com/lance-format/lance/pull/6113
- docs: document the rules for transaction conflicts by @westonpace in https://github.com/lance-format/lance/pull/6158
- docs: add alicloud oss configuration by @FarmerChillax in https://github.com/lance-format/lance/pull/6167
- docs: update the rules for data replacement conflicts to reflect reality by @westonpace in https://github.com/lance-format/lance/pull/6182
- docs: add example to show how to index JSON column by @prrao87 in https://github.com/lance-format/lance/pull/6208
- docs: remove legacy preview index note by @Xuanwo in https://github.com/lance-format/lance/pull/6218
Performance Improvements
- perf: pre-transpose PQ codebook for SIMD-friendly L2 distance by @wkalt in https://github.com/lance-format/lance/pull/5923
- perf: speed up format v2.2 scans by adding shortcut for full page by @Xuanwo in https://github.com/lance-format/lance/pull/5981
- perf: speed up format 2.2 300% by spawning structural decode batch tasks by @Xuanwo in https://github.com/lance-format/lance/pull/5982
- perf: reduce peak memory during cosine IVF-PQ index training by @wkalt in https://github.com/lance-format/lance/pull/6016
- perf: fast rotation for RQ quantization by @BubbleCal in https://github.com/lance-format/lance/pull/6024
- perf: avoid re-open shard indices and small reads by @BubbleCal in https://github.com/lance-format/lance/pull/6026
- perf: disable auto FSST for binary fields by @Xuanwo in https://github.com/lance-format/lance/pull/6047
- perf: speedup flat fts by @westonpace in https://github.com/lance-format/lance/pull/6054
- perf: add dict-values compression controls with lz4 default by @Xuanwo in https://github.com/lance-format/lance/pull/6059
- perf: avoid frequent allocating when computing residual vectors by @BubbleCal in https://github.com/lance-format/lance/pull/6062
- perf: add take_blob benchmark with cache_repetition_index matrix by @Xuanwo in https://github.com/lance-format/lance/pull/6067
- perf: parallelize FTS prewarming by @BubbleCal in https://github.com/lance-format/lance/pull/6144
- perf: remove shard content key sorting from distributed merge by @Xuanwo in https://github.com/lance-format/lance/pull/6179
- perf(inverted): reuse posting batch builder and merge tail partitions by @BubbleCal in https://github.com/lance-format/lance/pull/6191
- perf: reuse distance calculator at selecting candidates by @BubbleCal in https://github.com/lance-format/lance/pull/6202
- perf: new layout for positions and new algo for phrase query by @BubbleCal in https://github.com/lance-format/lance/pull/6203
- perf: batched WAND and new WAND structure, ~50% faster by @BubbleCal in https://github.com/lance-format/lance/pull/6241
Other Changes
- refactor: use dict entries and encoded size instead of cardinality for dict decision by @Xuanwo in https://github.com/lance-format/lance/pull/5891
- refactor: upgrade to SNAFU 0.9 by @shepmaster in https://github.com/lance-format/lance/pull/6071
- refactor: overhaul AGENTS.md with PR review insights by @Xuanwo in https://github.com/lance-format/lance/pull/6103
- refactor: use the dataset file version to determine index file version by @westonpace in https://github.com/lance-format/lance/pull/6142
- refactor: rename arrow-scalar to lance-arrow-scalar by @westonpace in https://github.com/lance-format/lance/pull/6199
- refactor: distributed vector segment build by @Xuanwo in https://github.com/lance-format/lance/pull/6220
Full Changelog: https://github.com/lance-format/lance/compare/release-root/4.0.0-beta.N...v4.0.0