Download Latest Version v0.4.0 source code.tar.gz (6.0 MB)
Email in envelope

Get an email when there's a new version of Stoolap

Home / v0.3.3
Name Modified Size InfoDownloads / Week
Parent folder
README.md 2026-02-28 3.4 kB
v0.3.3 source code.tar.gz 2026-02-28 3.8 MB
v0.3.3 source code.zip 2026-02-28 4.1 MB
Totals: 3 Items   7.9 MB 1

What's New in v0.3.3

Vector Search Engine

  • VECTOR(N) column type with packed binary storage for embedding vectors of any dimension
  • HNSW index for approximate nearest neighbor search with configurable parameters (m, ef_construction, ef_search, distance metric)
  • Distance functions: VEC_DISTANCE_L2, VEC_DISTANCE_COSINE, VEC_DISTANCE_IP, and <=> operator
  • Vector search optimizer with HNSW index path (O(log N)) and WHERE post-filtering, or parallel brute-force k-NN with fused distance+topK
  • Utility functions: VEC_DIMS, VEC_NORM, VEC_TO_TEXT

    :::sql CREATE TABLE documents (id INTEGER PRIMARY KEY, embedding VECTOR(384));

    CREATE INDEX idx_emb ON documents(embedding) USING HNSW WITH (m = 16, ef_construction = 200, metric = 'cosine');

    SELECT id, VEC_DISTANCE_COSINE(embedding, '[0.1, 0.2, ...]') AS dist FROM documents ORDER BY dist LIMIT 10;

  • EMBED() function generates 384-dimensional embeddings using a built-in sentence-transformer model (all-MiniLM-L6-v2), no external API calls needed
  • Enable with --features semantic at build time
  • Combine with HNSW indexing for end-to-end semantic search in pure SQL

    :::sql -- Generate embeddings at insert time INSERT INTO documents (title, embedding) VALUES ('Password Reset Guide', EMBED('How to recover your account'));

    -- Semantic search with a natural language query SELECT title, VEC_DISTANCE_COSINE(embedding, EMBED('forgot my login')) AS dist FROM documents ORDER BY dist LIMIT 5;

ANN Benchmark

Self-contained benchmark binary on the public Fashion-MNIST dataset (60,000 vectors, 784 dimensions, 10,000 queries, single-core, full SQL path):

Recall QPS p95 Latency Speedup vs brute-force
95.0% 10,410 0.12 ms 733x
99.0% 6,700 0.19 ms 472x
99.9% 4,159 0.33 ms 293x
100% 913 1.59 ms 64x

See ANN Benchmarks for the full report.

:::bash
# Run the benchmark yourself
RAYON_NUM_THREADS=1 cargo run --release --example ann_benchmark \
  --features ann-benchmark -- --sweep --runs 5 --max-queries 10000

Other Changes

  • VACUUM statement and PRAGMA VACUUM for manual cleanup of deleted rows and old versions
  • Value::Extension refactoring: Value::Json replaced by tag-in-data pattern keeping Value at 16 bytes with room for future types
  • Lexer fast path: zero-alloc string literal parsing for non-escaped strings

Bug Fixes

  • Fix #[inline(always)] + #[target_feature] build error on x86_64
  • Stabilize flaky CI tests (HNSW recall with deterministic PRNG, table name collision, perf threshold)
  • Fix 38 broken relative links across docs (converted to Jekyll link tags)

Docs & Website

  • Full website redesign with vector search spotlight, search modal (Cmd/Ctrl+K), accessibility improvements
  • Blog post: Vector and Semantic Search in SQL
  • Playground: vector search tables and query chips
  • Python driver documentation

Full Changelog: https://github.com/stoolap/stoolap/compare/v0.3.2...v0.3.3

Source: README.md, updated 2026-02-28