Spice v1.8.0 (Oct 6, 2025)
Spice v1.8.0 delivers major advances in data writes and scalable vector search, plus managed acceleration snapshots (now in preview) for fast cold starts. This release introduces write support for Iceberg tables using standard SQL `INSERT INTO`, partitioned S3 Vector indexes for petabyte-scale vector search, and a preview of the AI SQL function for direct LLM integration in SQL. Additional improvements include better reliability and the v3.0.3 release of the Spice.js Node.js SDK.
What's New in v1.8.0
Iceberg Table Write Support (Preview)
Append Data to Iceberg Tables with SQL `INSERT INTO`: Spice now supports writing to Iceberg tables and catalogs using standard SQL `INSERT INTO` statements. This enables data ingestion, transformation, and pipeline use cases with no Spark or external writer required.
- Append-only: Initial version targets appends; no overwrite or delete.
- Schema validation: Inserted data must match the target table schema.
- Secure by default: Writes are only enabled for datasets or catalogs explicitly marked with `access: read_write`.
Example Spicepod configuration:
:::yaml
catalogs:
  - from: iceberg:https://glue.ap-northeast-3.amazonaws.com/iceberg/v1/catalogs/111111/namespaces
    name: ice
    access: read_write
datasets:
  - from: iceberg:https://iceberg-catalog-host.com/v1/namespaces/my_namespace/tables/my_table
    name: iceberg_table
    access: read_write
Example SQL usage:
:::sql
-- Insert from another table
INSERT INTO iceberg_table
SELECT * FROM existing_table;
-- Insert with values
INSERT INTO iceberg_table (id, name, amount)
VALUES (1, 'John', 100.0), (2, 'Jane', 200.0);
-- Insert into catalog table
INSERT INTO ice.sales.transactions
VALUES (1001, '2025-01-15', 299.99, 'completed');
Note: Only Iceberg datasets and catalogs with `access: read_write` support writes. Internal Spice tables and other connectors remain read-only.
Learn more in the Iceberg Data Connector documentation.
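For programmatic ingestion, the `INSERT INTO ... VALUES` form shown in the SQL examples can be generated from in-memory rows. The following is a minimal JavaScript sketch: `buildInsert` and `sqlLiteral` are hypothetical helpers, not part of Spice or the Spice.js SDK, and the escaping rules assume simple unquoted identifiers and only number, string, boolean, or null values. The target must still be configured with `access: read_write`.

```javascript
// Hypothetical helpers (not a Spice API): build an append-only INSERT statement.
// ASSUMPTION: identifiers are simple, optionally dot-qualified names.
const IDENT = /^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$/;

// Render one JavaScript value as a SQL literal.
function sqlLiteral(value) {
  if (value === null || value === undefined) return 'NULL';
  if (typeof value === 'number') {
    if (!Number.isFinite(value)) throw new Error('non-finite number');
    return String(value);
  }
  if (typeof value === 'boolean') return value ? 'TRUE' : 'FALSE';
  // Strings are single-quoted, with embedded quotes doubled.
  if (typeof value === 'string') return "'" + value.replace(/'/g, "''") + "'";
  throw new Error('unsupported value type: ' + typeof value);
}

function buildInsert(table, columns, rows) {
  if (!IDENT.test(table)) throw new Error('invalid table name: ' + table);
  for (const column of columns) {
    if (!IDENT.test(column)) throw new Error('invalid column name: ' + column);
  }
  if (rows.length === 0) throw new Error('no rows to insert');
  const tuples = rows.map((row) => {
    // Every row must supply exactly one value per listed column.
    if (row.length !== columns.length) throw new Error('row/column count mismatch');
    return '(' + row.map(sqlLiteral).join(', ') + ')';
  });
  return `INSERT INTO ${table} (${columns.join(', ')}) VALUES ${tuples.join(', ')};`;
}
```

The column-count check is only a client-side convenience; the schema validation described above is still performed by Spice when the statement runs.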
Acceleration Snapshots for Fast Cold Starts (Preview)
Bootstrap Managed Accelerations from Object Storage: Spice now supports managed acceleration snapshots in preview, enabling datasets accelerated with file-based engines (DuckDB or SQLite) to bootstrap from a snapshot stored in object storage (such as S3) if the local acceleration file does not exist on startup. This dramatically reduces cold start times and enables ephemeral storage for accelerations with persistent recovery.
Key features:
- Rapid readiness: Datasets can become ready in seconds by downloading a pre-built snapshot, skipping lengthy initial acceleration.
- Hive-style partitioning: Snapshots are organized by month, day, and dataset for easy retention and management.
- Flexible bootstrapping: Configurable fallback and retry behavior if a snapshot is missing or corrupted.
Example Spicepod configuration:
:::yaml
snapshots:
  enabled: true
  location: s3://some_bucket/some_folder/ # Folder for storing snapshots
  bootstrap_on_failure_behavior: warn # Options: warn, retry, fallback
  params:
    s3_auth: iam_role # All S3 dataset params accepted here
datasets:
  - from: s3://some_bucket/some_table/
    name: some_table
    params:
      file_format: parquet
      s3_auth: iam_role
    acceleration:
      enabled: true
      snapshots: enabled # Options: enabled, disabled, bootstrap_only, create_only
      engine: duckdb
      mode: file
      params:
        duckdb_file: /nvme/some_table.db
How it works:
- On startup, if the acceleration file does not exist, Spice checks the snapshot location for the latest snapshot and downloads it.
- Snapshots are stored as: `s3://some_bucket/some_folder/month=2025-09/day=2025-09-30/dataset=some_table/some_table_<timestamp>.db`
- If no snapshot is found, a new acceleration file is created as usual.
- Snapshots are written after each refresh (unless configured otherwise).
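The Hive-style layout in the steps above can be reproduced with a small helper. This JavaScript sketch covers the path format only and is not Spice's implementation: `snapshotKey` and `parseSnapshotKey` are hypothetical names, the partition dates are assumed to be UTC, and the `<timestamp>` suffix is passed through as an opaque string because its exact format is not specified here.

```javascript
// Hypothetical helpers illustrating the documented snapshot layout:
//   <location>/month=YYYY-MM/day=YYYY-MM-DD/dataset=<name>/<name>_<timestamp>.db
function snapshotKey(location, dataset, date, timestamp) {
  const iso = date.toISOString();   // UTC is an assumption
  const month = iso.slice(0, 7);    // YYYY-MM
  const day = iso.slice(0, 10);     // YYYY-MM-DD
  const base = location.endsWith('/') ? location : location + '/';
  return `${base}month=${month}/day=${day}/dataset=${dataset}/${dataset}_${timestamp}.db`;
}

// Extracts the partition values from a key; returns null if the key does not match.
function parseSnapshotKey(key) {
  const match = key.match(
    /month=(\d{4}-\d{2})\/day=(\d{4}-\d{2}-\d{2})\/dataset=([^/]+)\/([^/]+)$/
  );
  return match
    ? { month: match[1], day: match[2], dataset: match[3], file: match[4] }
    : null;
}
```

Because the `month=` and `day=` values sort lexicographically, a retention job can select old snapshots by key prefix.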
Supported snapshot modes:
- `enabled`: Download and write snapshots.
- `bootstrap_only`: Only download on startup, do not write new snapshots.
- `create_only`: Only write snapshots, do not download on startup.
- `disabled`: No snapshotting.
Note: This feature is only supported for file-based accelerations (DuckDB or SQLite) with dedicated files.
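The four snapshot modes reduce to two independent switches: whether a snapshot may be downloaded at startup and whether one is written after a refresh. The JavaScript sketch below encodes that reading as a lookup table; `SNAPSHOT_MODES` and `snapshotPlan` are hypothetical names used for illustration and do not reflect the runtime's internal code.

```javascript
// Hypothetical decision table for the four documented snapshot modes.
const SNAPSHOT_MODES = {
  enabled:        { download: true,  write: true  },
  bootstrap_only: { download: true,  write: false },
  create_only:    { download: false, write: true  },
  disabled:       { download: false, write: false },
};

// Returns what a runtime following the documented rules would do.
function snapshotPlan(mode, localFileExists) {
  if (!Object.prototype.hasOwnProperty.call(SNAPSHOT_MODES, mode)) {
    throw new Error('unknown snapshot mode: ' + mode);
  }
  const caps = SNAPSHOT_MODES[mode];
  return {
    // A snapshot is only fetched when the local acceleration file is missing.
    downloadOnStartup: caps.download && !localFileExists,
    writeAfterRefresh: caps.write,
  };
}
```

For example, `snapshotPlan('bootstrap_only', false)` describes a replica that restores from object storage on a cold start but never publishes snapshots of its own.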
Why use acceleration snapshots?
- Faster cold starts: Skip waiting for full acceleration on startup.
- Ephemeral storage: Use fast local disks (e.g., NVMe) for acceleration, with persistent recovery from object storage.
- Disaster recovery: Recover from federated source outages by bootstrapping from the latest snapshot.
Learn more in the Acceleration Snapshots documentation.
Partitioned S3 Vector Indexes
Efficient, Scalable Vector Search with Partitioning: Spice now supports partitioning Amazon S3 Vector indexes and scatter-gather queries using a `partition_by` expression in the dataset vector engine configuration. Partitioned indexes enable faster ingestion, lower query latency, and scale to billions of vectors.
Example Spicepod configuration:
:::yaml
datasets:
  - name: reviews
    vectors:
      enabled: true
      engine: s3_vectors
      params:
        s3_vectors_bucket: my-bucket
        s3_vectors_index: base-embeddings
      partition_by:
        - 'bucket(50, PULocationID)'
    columns:
      - name: body
        embeddings:
          from: bedrock_titan
      - name: title
        embeddings:
          from: bedrock_titan
See the Amazon S3 Vectors documentation for details.
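To make the scatter-gather idea concrete, the JavaScript sketch below routes rows into partitions by a bucket function, searches each partition for its local top-k, and merges the candidates into a global top-k. It is an in-memory illustration, not how Spice or S3 Vectors is implemented: `bucketOf` is a plain modulo stand-in (the real `bucket()` transform may hash the value first), distances are brute-force squared Euclidean, and all function names are hypothetical.

```javascript
// Stand-in for bucket(n, value). ASSUMPTION: plain modulo on integers;
// the real transform may apply a hash before taking the modulo.
function bucketOf(n, value) {
  return ((value % n) + n) % n;
}

function squaredDistance(a, b) {
  let sum = 0;
  for (let i = 0; i < a.length; i++) sum += (a[i] - b[i]) ** 2;
  return sum;
}

// Ingest: route each row to the partition chosen by its partition key.
function buildPartitions(rows, n, keyOf) {
  const partitions = new Map();
  for (const row of rows) {
    const p = bucketOf(n, keyOf(row));
    if (!partitions.has(p)) partitions.set(p, []);
    partitions.get(p).push(row);
  }
  return partitions;
}

// Query: scatter the search to every partition, keep each local top-k,
// then gather the candidates and keep the global top-k by distance.
function scatterGather(partitions, query, k) {
  const candidates = [];
  for (const rows of partitions.values()) {
    const local = rows
      .map((row) => ({ row, distance: squaredDistance(row.vector, query) }))
      .sort((a, b) => a.distance - b.distance)
      .slice(0, k);
    candidates.push(...local);
  }
  return candidates.sort((a, b) => a.distance - b.distance).slice(0, k);
}
```

Taking the top-k from every partition before merging is what keeps the result exact: any row in the global top-k is necessarily in the top-k of its own partition.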
AI SQL function for LLM Integration (Preview)
LLMs Directly In SQL: A new asynchronous `ai` SQL function enables direct calls to LLMs from SQL queries for text generation, translation, classification, and more. This feature is released in preview and supports both default and model-specific invocation.
Example Spicepod model configuration:
:::yaml
models:
  - name: gpt-4o
    from: openai:gpt-4o
    params:
      openai_api_key: ${secrets:openai_key}
Example SQL usage:
:::sql
-- basic usage with default model
SELECT ai('hi, this prompt is directly from SQL.');
:::sql
-- basic usage with specified model
SELECT ai('hi, this prompt is directly from SQL.', 'gpt-4o');
:::sql
-- Using row data as input to the prompt
SELECT ai(concat_ws(' ', 'Categorize the zone', Zone, 'in a single word. Only return the word.')) AS category
FROM taxi_zones
LIMIT 10;
Learn more in the SQL Reference AI documentation.
Spice.js v3.0.3 SDK
Spice.js v3.0.3 Released: The official Spice.ai Node.js/JavaScript SDK has been updated to v3.0.3, bringing cross-platform support, new APIs, and improved reliability for both Node.js and browser environments.
- Modern Query Methods: Use `sql()`, `sqlJson()`, and `nsql()` for flexible querying, streaming, and natural language to SQL.
- Browser Support: SDK now works in browsers and web applications, automatically selecting the optimal transport (gRPC or HTTP).
- Health Checks & Dataset Refresh: Easily monitor Spice runtime health and trigger dataset refreshes on demand.
- Automatic HTTP Fallback: If gRPC/Flight is unavailable, the SDK falls back to HTTP automatically.
- Migration Guidance: v3 requires Node.js 20+, uses camelCase parameters, and introduces a new package structure.
Example usage:
:::js
import { SpiceClient } from '@spiceai/spice';
const client = new SpiceClient(apiKey);
const table = await client.sql('SELECT * FROM my_table LIMIT 10');
console.table(table.toArray());
See Spice.js SDK documentation for full details, migration tips, and advanced usage.
Additional Improvements
- Reliability: Improved logging, error handling, and network readiness checks across connectors (Iceberg, Databricks, etc.).
- Vector search durability and scale: Refined logging, stricter default limits, safeguards against index-only scans and duplicate results, and always-accessible metadata for robust queryability at scale.
- Cache behavior: Tightened cache logic for modification queries.
- Full-Text Search: FTS metadata columns now usable in projections; max search results increased to 1000.
- RRF Hybrid Search: Reciprocal Rank Fusion (RRF) UDTF enhancements for advanced hybrid search scenarios.
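For readers unfamiliar with Reciprocal Rank Fusion, the core scoring rule is small enough to sketch in JavaScript. This shows the textbook formula only, where each document scores the sum of `1 / (k + rank)` over the result lists it appears in. It does not reproduce Spice's UDTF, its rank or recency boosting, or its default constant; `reciprocalRankFusion` is a hypothetical name, `k = 60` is the commonly cited default, and each input list is assumed to contain unique ids.

```javascript
// Minimal Reciprocal Rank Fusion: score(d) = sum over lists of 1 / (k + rank(d)),
// using 1-based ranks. ASSUMPTION: k = 60 and unique ids within each list.
function reciprocalRankFusion(rankedLists, k = 60) {
  const scores = new Map();
  for (const list of rankedLists) {
    list.forEach((id, index) => {
      const rank = index + 1;
      scores.set(id, (scores.get(id) || 0) + 1 / (k + rank));
    });
  }
  // Highest fused score first.
  return [...scores.entries()]
    .map(([id, score]) => ({ id, score }))
    .sort((a, b) => b.score - a.score);
}
```

Fusing a vector-search ranking `['a', 'b', 'c']` with a full-text ranking `['b', 'c', 'd']` places `b` first, because it ranks highly in both lists, ahead of `c`, `a`, and `d`.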
Contributors
Breaking Changes
This release introduces two breaking changes to search observability and tooling.
First, the `document_similarity` tool has been renamed to `search`. Tracing of these tool calls changes accordingly:
:::bash
## Old: v1.7.1
>> spice trace tool_use::document_similarity
>> curl -XPOST http://localhost:8090/v1/tools/document_similarity \
  -d '{
    "datasets": ["my_tbl"],
    "text": "Welcome to another Spice release"
  }'
## New: v1.8.0
>> spice trace tool_use::search
>> curl -XPOST http://localhost:8090/v1/tools/search \
  -d '{
    "datasets": ["my_tbl"],
    "text": "Welcome to another Spice release"
  }'
Second, the `vector_search` task in `runtime.task_history` has been renamed to `search`.
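Applications that call the tool endpoint over HTTP need the same path change as the curl example. The JavaScript sketch below builds the v1.8.0 request; `buildSearchToolRequest` is a hypothetical helper, and the `Content-Type` header is an assumption, since the curl example does not set one.

```javascript
// Hypothetical helper: builds the v1.8.0 request for the renamed `search` tool.
// In v1.7.1 the path was /v1/tools/document_similarity.
function buildSearchToolRequest(baseUrl, datasets, text) {
  return {
    // Trim trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, '')}/v1/tools/search`,
    init: {
      method: 'POST',
      // ASSUMPTION: JSON content type; not shown in the curl example.
      headers: { 'Content-Type': 'application/json' },
      body: JSON.stringify({ datasets, text }),
    },
  };
}
```

The returned `url` and `init` can be passed directly to `fetch(url, init)` in Node.js 18+ or in a browser.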
Cookbook Updates
- Added new AI SQL function recipe for invoking LLMs within SQL queries.
- Updated Iceberg Catalog Connector recipe for Iceberg Writes.
- Updated Spice.js JavaScript (Node.js) SDK for v3.0.3 with examples and v2 to v3 migration guide.
The Spice Cookbook now includes 80 recipes to help you get started with Spice quickly and easily.
Upgrading
To upgrade to v1.8.0, use one of the following methods:
CLI:
:::console
spice upgrade
Homebrew:
:::console
brew upgrade spiceai/spiceai/spice
Docker:
Pull the `spiceai/spiceai:1.8.0` image:
:::console
docker pull spiceai/spiceai:1.8.0
For available tags, see DockerHub.
Helm:
:::console
helm repo update
helm upgrade spiceai spiceai/spiceai
AWS Marketplace:
🎉 Spice is now available in the AWS Marketplace!
What's Changed
Dependencies
- iceberg-rust: Upgraded to v0.7.0-rc.1
- mimalloc: Upgraded from 0.1.47 to 0.1.48
- azure_core: Upgraded from 0.27.0 to 0.28.0
- Jimver/cuda-toolkit: Upgraded from 0.2.27 to 0.2.28
Changelog
- Add `#[cfg(feature = "postgres")]` to acceleration refresh tests by @Jeadie in https://github.com/spiceai/spiceai/pull/7241
- fix: Update benchmark snapshots by @github-actions[bot] in https://github.com/spiceai/spiceai/pull/7267
- fix: Update benchmark snapshots by @github-actions[bot] in https://github.com/spiceai/spiceai/pull/7268
- fix: Update benchmark snapshots by @github-actions[bot] in https://github.com/spiceai/spiceai/pull/7269
- Update the tpch benchmark snapshots for: federated/databricks[sql_warehouse].yaml by @github-actions[bot] in https://github.com/spiceai/spiceai/pull/7270
- EmbeddingInput cache keys to include model name by @mach-kernel in https://github.com/spiceai/spiceai/pull/7275
- Ensure FTS metadata columns can be used in projection by @Jeadie in https://github.com/spiceai/spiceai/pull/7282
- Use 8-core runners for Windows CUDA builds by @sgrebnov in https://github.com/spiceai/spiceai/pull/7284
- Make search test more robust by @krinart in https://github.com/spiceai/spiceai/pull/7283
- Post-release housekeeping by @sgrebnov in https://github.com/spiceai/spiceai/pull/7272
- fix: Use median cached response duration for test search cache by @peasee in https://github.com/spiceai/spiceai/pull/7286
- Bump dirs from 5.0.1 to 6.0.0 by @dependabot[bot] in https://github.com/spiceai/spiceai/pull/7244
- Bump indexmap from 2.11.0 to 2.11.4 by @dependabot[bot] in https://github.com/spiceai/spiceai/pull/7248
- Fix JOIN level filters not having columns in schema by @Jeadie in https://github.com/spiceai/spiceai/pull/7287
- use SessionContext::new_empty in RRF by @kczimm in https://github.com/spiceai/spiceai/pull/7291
- Use rust:1.89-slim-bookworm for build, more places to bump rust version by @sgrebnov in https://github.com/spiceai/spiceai/pull/7293
- Update openapi.json by @github-actions[bot] in https://github.com/spiceai/spiceai/pull/7290
- Enable chunking in `SearchIndex` by @Jeadie in https://github.com/spiceai/spiceai/pull/7143
- Add index name and remove duplicate records string to S3 Vectors log by @lukekim in https://github.com/spiceai/spiceai/pull/7260
- Use file-based fts index by @Jeadie in https://github.com/spiceai/spiceai/pull/7024
- Remove 'PostApplyCandidateGeneration' by @Jeadie in https://github.com/spiceai/spiceai/pull/7288
- RRF: Rank and recency boosting by @mach-kernel in https://github.com/spiceai/spiceai/pull/7294
- Update ROADMAP.md by removing v1.7 milestone by @sgrebnov in https://github.com/spiceai/spiceai/pull/7297
- RRF: Preserve base ranking when results differ -> FULL OUTER JOIN does not produce time column by @mach-kernel in https://github.com/spiceai/spiceai/pull/7300
- chore: remove unused Dataset methods by @kczimm in https://github.com/spiceai/spiceai/pull/7295
- fix removing embedding column by @Jeadie in https://github.com/spiceai/spiceai/pull/7302
- fix: Add feature flag for using object store in `spicepod` by @peasee in https://github.com/spiceai/spiceai/pull/7303
- Upgrade to iceberg-rust v0.7.0-rc1 by @sgrebnov in https://github.com/spiceai/spiceai/pull/7296
- Enable DML `Update` SQL operations for datasets configured as `access: read_write` by @sgrebnov in https://github.com/spiceai/spiceai/pull/7304
- Create and parse partitioned S3 vector index names by @kczimm in https://github.com/spiceai/spiceai/pull/7198
- RRF: Fix decay for disjoint result sets by @mach-kernel in https://github.com/spiceai/spiceai/pull/7305
- RRF: Project top scores, do not yield duplicate results by @mach-kernel in https://github.com/spiceai/spiceai/pull/7306
- RRF: Case sensitive column/ident handling by @mach-kernel in https://github.com/spiceai/spiceai/pull/7309
- For `vector_search`, use a default limit of 1000 if no limit specified by @lukekim in https://github.com/spiceai/spiceai/pull/7311
- Don’t cache modification queries (DDL, DML, COPY) by @sgrebnov in https://github.com/spiceai/spiceai/pull/7316
- Fix Anthropic model regex and add validation tests by @ewgenius in https://github.com/spiceai/spiceai/pull/7319
- Enhancement: Implement before/after/lag metrics for acceleration refresh by @krinart in https://github.com/spiceai/spiceai/pull/7310
- Refactor chat model health check to lower tokens usage for reasoning models by @ewgenius in https://github.com/spiceai/spiceai/pull/7317
- Add support for writing into Iceberg tables by @sgrebnov in https://github.com/spiceai/spiceai/pull/7315
- Fix lint warnings by @lukekim in https://github.com/spiceai/spiceai/pull/7327
- Use logical plan in `SearchQueryProvider`. by @Jeadie in https://github.com/spiceai/spiceai/pull/7314
- FTS max search results 100 -> 1000 by @Jeadie in https://github.com/spiceai/spiceai/pull/7331
- Improve Databricks SQL Warehouse Error Handling by @sgrebnov in https://github.com/spiceai/spiceai/pull/7332
- Use spicepod embedding model name for 'model_name()' by @Jeadie in https://github.com/spiceai/spiceai/pull/7333
- Handle async queries for Databricks SQL Warehouse API by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7335
- Enable DML (INSERT INTO) operations for catalogs configured as `access:read_write` by @sgrebnov in https://github.com/spiceai/spiceai/pull/7330
- Bump regex from 1.11.2 to 1.11.3 by @dependabot[bot] in https://github.com/spiceai/spiceai/pull/7336
- Update qa_analytics.csv with 1.7.0 release data by @sgrebnov in https://github.com/spiceai/spiceai/pull/7337
- RRF: Fix ident resolution for struct fields, autohashed join key for varying types by @mach-kernel in https://github.com/spiceai/spiceai/pull/7339
- v1.7.1 release notes by @kczimm in https://github.com/spiceai/spiceai/pull/7348
- Bump Jimver/cuda-toolkit from 0.2.27 to 0.2.28 by @dependabot[bot] in https://github.com/spiceai/spiceai/pull/7343
- Add support for writing into Glue (Iceberg) tables and catalogs by @sgrebnov in https://github.com/spiceai/spiceai/pull/7355
- Bump mimalloc from 0.1.47 to 0.1.48 by @dependabot[bot] in https://github.com/spiceai/spiceai/pull/7342
- Add `ai` async UDF by @lukekim in https://github.com/spiceai/spiceai/pull/7328
- Use self-hosted and spiceai-macos runners for workflows where possible by @lukekim in https://github.com/spiceai/spiceai/pull/7371
- Several updates for improved search testing by @Jeadie in https://github.com/spiceai/spiceai/pull/7358
- Update supported versions in SECURITY.md by @Jeadie in https://github.com/spiceai/spiceai/pull/7377
- 1.7.1 release analytics by @mach-kernel in https://github.com/spiceai/spiceai/pull/7380
- Add `acceleration_file_path` helper and refactor `spice_sys` to use Snafu errors by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7376
- fix: Update benchmark snapshots by @github-actions[bot] in https://github.com/spiceai/spiceai/pull/7353
- Robust search test by @Jeadie in https://github.com/spiceai/spiceai/pull/7381
- [bug] Fix ai UDF bug of mismatched column length by @lukekim in https://github.com/spiceai/spiceai/pull/7383
- Add OpenOption to spice_sys acceleration tables by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7379
- Add new `snapshots` Spicepod configuration by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7384
- Update naming of `tool_use::document_similarity` and `vector_search` spans by @Jeadie in https://github.com/spiceai/spiceai/pull/7273
- fix: Update benchmark snapshots by @github-actions[bot] in https://github.com/spiceai/spiceai/pull/7354
- Make `ai` UDF a models only feature by @lukekim in https://github.com/spiceai/spiceai/pull/7387
- Add new `runtime_acceleration` crate; create `SnapshotManager`; implement `SnapshotManager::download_latest_snapshot` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7386
- Refactor 'VectorScanTableProvider' to use just 'VectorIndex::list_table_provider' by @Jeadie in https://github.com/spiceai/spiceai/pull/7318
- Fix embed logs by @Jeadie in https://github.com/spiceai/spiceai/pull/7382
- Enable spicepod dependencies in `testoperator` by @Jeadie in https://github.com/spiceai/spiceai/pull/7334
- `ai` UDF security and performance optimizations by @lukekim in https://github.com/spiceai/spiceai/pull/7392
- Wire up the snapshot download on dataset startup by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7389
- Implement initial snapshot creation logic in `SnapshotManager` by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7391
- Make `tool_use::table_schema` output model-friendly by @krinart in https://github.com/spiceai/spiceai/pull/7393
- Fix minor lint warnings by @lukekim in https://github.com/spiceai/spiceai/pull/7395
- Enable metadata columns in document-based object store datasets by @Jeadie in https://github.com/spiceai/spiceai/pull/7397
- Core dependencies of financebench by @Jeadie in https://github.com/spiceai/spiceai/pull/7400
- Add S3vector variant to financebench. by @Jeadie in https://github.com/spiceai/spiceai/pull/7399
- Set PostgreSQL unsupported_spice_action=string by default by @lukekim in https://github.com/spiceai/spiceai/pull/7398
- Use non-blocking connection check for verify_ns_lookup_and_tcp_connect by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7401
- Bump moka from 0.12.10 to 0.12.11 by @dependabot[bot] in https://github.com/spiceai/spiceai/pull/7340
- Bump tokio-postgres from 0.7.13 to 0.7.14 by @dependabot[bot] in https://github.com/spiceai/spiceai/pull/7344
- Bump azure_core from 0.27.0 to 0.28.0 by @dependabot[bot] in https://github.com/spiceai/spiceai/pull/7338
- Forbid `INSERT OVERWRITE` DML operations by @sgrebnov in https://github.com/spiceai/spiceai/pull/7402
- Make database connection pool sizes consistent by @lukekim in https://github.com/spiceai/spiceai/pull/7403
- Disable vector index only scans by @Jeadie in https://github.com/spiceai/spiceai/pull/7405
- Make CLI `--endpoint` and `--cloud` args & table output consistent by @lukekim in https://github.com/spiceai/spiceai/pull/7396
- Write new snapshots at the end of an accelerated refresh. by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7410
- Read and write partitioned S3 indexes by @kczimm in https://github.com/spiceai/spiceai/pull/7313
- Fix partial data writes in Iceberg data connector by @sgrebnov in https://github.com/spiceai/spiceai/pull/7411
- Remove nix by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7414
- Use DataFusion JoinSetTracer for async context propagation by @lukekim in https://github.com/spiceai/spiceai/pull/7416
- Implement cache invalidation for DML (INSERT INTO) operations by @sgrebnov in https://github.com/spiceai/spiceai/pull/7394
- Make cleanup disk GH action; use in integration tests by @Jeadie in https://github.com/spiceai/spiceai/pull/7418
- Move S3Vector to 'search' crate by @Jeadie in https://github.com/spiceai/spiceai/pull/7373
- Use LogicalPlan builder API for LogicalPlans by @Jeadie in https://github.com/spiceai/spiceai/pull/7408
- Use hive-style partitioned paths for DB snapshots by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7422
- Limit results from `SearchIndex::query_table_provider` by @Jeadie in https://github.com/spiceai/spiceai/pull/7421
- Delay initial readiness if snapshots are enabled with an append-mode refresh by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7425
- Disable snapshots by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7426
- Rewrite `ChunkedNonIndexVectorGeneration` to use LogicalPlanBuilder (instead of string formatting). by @Jeadie in https://github.com/spiceai/spiceai/pull/7413
- Fix for search field as metadata for chunked search indexes by @Jeadie in https://github.com/spiceai/spiceai/pull/7429
- Add feature is currently in preview warning for read_write access mode by @sgrebnov in https://github.com/spiceai/spiceai/pull/7440
- Add feature is currently in preview warning for snapshots by @sgrebnov in https://github.com/spiceai/spiceai/pull/7442
- Fix tracing so that ai_completions are parented under sql_query by @lukekim in https://github.com/spiceai/spiceai/pull/7415
- Disable acceleration refresh metrics by @krinart in https://github.com/spiceai/spiceai/pull/7450
- Enable snapshot acceleration by default by @phillipleblanc in https://github.com/spiceai/spiceai/pull/7451
- fix: partition name validation by @kczimm in https://github.com/spiceai/spiceai/pull/7452