This release consists of significant updates, including the introduction of Sparse Encoder models, the new methods `encode_query` and `encode_document`, multi-processing support in `encode`, the `Router` module for asymmetric models, custom learning rates for parameter groups, composite loss logging, and various small improvements and bug fixes.
Install this version with:
:::bash
# Training + Inference
pip install sentence-transformers[train]==5.0.0
# Inference only, use one of:
pip install sentence-transformers==5.0.0
pip install sentence-transformers[onnx-gpu]==5.0.0
pip install sentence-transformers[onnx]==5.0.0
pip install sentence-transformers[openvino]==5.0.0
[!TIP] Our Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 blogpost is an excellent place to learn about finetuning sparse embedding models!
[!NOTE] This release is designed to be fully backwards compatible, meaning that you should be able to upgrade from older versions to v5.x without any issues. If you are running into issues when upgrading, feel free to open an issue. Also see the Migration Guide for changes that we would recommend.
## Sparse Encoder models
The Sentence Transformers v5.0 release introduces Sparse Embedding models, also known as Sparse Encoders. These models generate high-dimensional embeddings, often with 30,000+ dimensions, of which typically fewer than 1% are non-zero. This is in contrast to standard dense embedding models, which produce low-dimensional embeddings (e.g., 384, 768, or 1024 dimensions) where all values are non-zero.
Usually, each active dimension (i.e. a dimension with a non-zero value) in a sparse embedding corresponds to a specific token in the model's vocabulary, allowing for interpretability. This means that you can, for example, see exactly which words/tokens are important in an embedding, and inspect exactly which words/tokens cause two texts to be deemed similar.
Let's have a look at naver/splade-v3, a strong sparse embedding model, as an example:
:::python
from sentence_transformers import SparseEncoder
# Download from the 🤗 Hub
model = SparseEncoder("naver/splade-v3")
# Run inference
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 30522)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 32.4323, 5.8528, 0.0258],
# [ 5.8528, 26.6649, 0.0302],
# [ 0.0258, 0.0302, 24.0839]])
# Let's decode our embeddings to be able to interpret them
decoded_embeddings = model.decode(embeddings, top_k=10)
for decoded, sentence in zip(decoded_embeddings, sentences):
    print(f"Sentence: {sentence}")
    print(f"Decoded: {decoded}")
    print()
Sentence: The weather is lovely today.
Decoded: [('weather', 2.754288673400879), ('today', 2.610959529876709), ('lovely', 2.431990623474121), ('currently', 1.5520408153533936), ('beautiful', 1.5046082735061646), ('cool', 1.4664798974990845), ('pretty', 0.8986214995384216), ('yesterday', 0.8603134155273438), ('nice', 0.8322536945343018), ('summer', 0.7702118158340454)]
Sentence: It's so sunny outside!
Decoded: [('outside', 2.6939032077789307), ('sunny', 2.535827398300171), ('so', 2.0600898265838623), ('out', 1.5397940874099731), ('weather', 1.1198079586029053), ('very', 0.9873268604278564), ('cool', 0.9406591057777405), ('it', 0.9026399254798889), ('summer', 0.684999406337738), ('sun', 0.6520509123802185)]
Sentence: He drove to the stadium.
Decoded: [('stadium', 2.7872302532196045), ('drove', 1.8208855390548706), ('driving', 1.6665740013122559), ('drive', 1.5565159320831299), ('he', 1.4721972942352295), ('stadiums', 1.449463129043579), ('to', 1.0441515445709229), ('car', 0.7002660632133484), ('visit', 0.5118278861045837), ('football', 0.502326250076294)]
In this example, the embeddings are 30,522-dimensional vectors, where each dimension corresponds to a token in the model's vocabulary. The `decode` method returned the top 10 tokens with the highest values in each embedding, allowing us to interpret which tokens contribute most to the embedding.
We can even determine the intersection or overlap between embeddings, which is very useful for understanding why two texts are deemed similar or dissimilar:
:::python
# Let's also compute the intersection/overlap of the first two embeddings
intersection_embedding = model.intersection(embeddings[0], embeddings[1])
decoded_intersection = model.decode(intersection_embedding)
print(decoded_intersection)
Decoded: [('weather', 3.0842742919921875), ('cool', 1.379457712173462), ('summer', 0.5275946259498596), ('comfort', 0.3239051103591919), ('sally', 0.22571465373039246), ('julian', 0.14787325263023376), ('nature', 0.08582140505313873), ('beauty', 0.0588383711874485), ('mood', 0.018594780936837196), ('nathan', 0.000752730411477387)]
And if we think the embeddings are too big, we can limit the maximum number of active dimensions like so:
:::python
from sentence_transformers import SparseEncoder
# Download from the 🤗 Hub
model = SparseEncoder("naver/splade-v3")  # You can also set max_active_dims here instead of in encode()
# Run inference
documents = [
    "UV-A light, specifically, is what mainly causes tanning, skin aging, and cataracts, UV-B causes sunburn, skin aging and skin cancer, and UV-C is the strongest, and therefore most effective at killing microorganisms. Again – single words and multiple bullets.",
    "Answers from Ronald Petersen, M.D. Yes, Alzheimer's disease usually worsens slowly. But its speed of progression varies, depending on a person's genetic makeup, environmental factors, age at diagnosis and other medical conditions. Still, anyone diagnosed with Alzheimer's whose symptoms seem to be progressing quickly — or who experiences a sudden decline — should see his or her doctor.",
    "Bell's palsy and Extreme tiredness and Extreme fatigue (2 causes) Bell's palsy and Extreme tiredness and Hepatitis (2 causes) Bell's palsy and Extreme tiredness and Liver pain (2 causes) Bell's palsy and Extreme tiredness and Lymph node swelling in children (2 causes)",
]
embeddings = model.encode_document(documents, max_active_dims=64)
print(embeddings.shape)
# (3, 30522)
# Print the sparsity of the embeddings
sparsity = model.sparsity(embeddings)
print(sparsity)
# {'active_dims': 64.0, 'sparsity_ratio': 0.9979031518249132}
<details><summary>Click to see that it has minimal impact on scores</summary>

:::python
from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("naver/splade-v3")  # You can also set max_active_dims here instead of in encode()

# Run inference
queries = ["what causes aging fast"]
documents = [
    "UV-A light, specifically, is what mainly causes tanning, skin aging, and cataracts, UV-B causes sunburn, skin aging and skin cancer, and UV-C is the strongest, and therefore most effective at killing microorganisms. Again – single words and multiple bullets.",
    "Answers from Ronald Petersen, M.D. Yes, Alzheimer's disease usually worsens slowly. But its speed of progression varies, depending on a person's genetic makeup, environmental factors, age at diagnosis and other medical conditions. Still, anyone diagnosed with Alzheimer's whose symptoms seem to be progressing quickly — or who experiences a sudden decline — should see his or her doctor.",
    "Bell's palsy and Extreme tiredness and Extreme fatigue (2 causes) Bell's palsy and Extreme tiredness and Hepatitis (2 causes) Bell's palsy and Extreme tiredness and Liver pain (2 causes) Bell's palsy and Extreme tiredness and Lymph node swelling in children (2 causes)",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)

# Determine the sparsity
query_sparsity = model.sparsity(query_embeddings)
document_sparsity = model.sparsity(document_embeddings)
print(query_sparsity, document_sparsity)
# {'active_dims': 28.0, 'sparsity_ratio': 0.9990826289233995} {'active_dims': 174.6666717529297, 'sparsity_ratio': 0.9942773516888497}

# Calculate the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[11.3767, 10.8296, 4.3457]], device='cuda:0')

# Again with a smaller max_active_dims
smaller_document_embeddings = model.encode_document(documents, max_active_dims=64)

# Determine the sparsity for the smaller document embeddings
smaller_document_sparsity = model.sparsity(smaller_document_embeddings)
print(query_sparsity, smaller_document_sparsity)
# {'active_dims': 28.0, 'sparsity_ratio': 0.9990826289233995} {'active_dims': 64.0, 'sparsity_ratio': 0.9979031518249132}

# Print the similarity scores for the smaller document embeddings
smaller_similarities = model.similarity(query_embeddings, smaller_document_embeddings)
print(smaller_similarities)
# tensor([[10.1311, 9.8360, 4.3457]], device='cuda:0')
# Very similar to the scores for the full document embeddings!

</details>

### Are they any good?
A big question is: how do sparse embedding models stack up against the "standard" dense embedding models, and what kind of performance can you expect when combining the two?
For this, I ran a variation of our hybrid_search.py evaluation script, with:
- The NanoMSMARCO dataset (a subset of the MS MARCO eval split)
- Qwen/Qwen3-Embedding-0.6B dense embedding model
- naver/splade-v3-doc sparse embedding model, inference-free for queries
- Alibaba-NLP/gte-reranker-modernbert-base reranker
This resulted in the following evaluation:
| Dense | Sparse | Reranker | NDCG@10 | MRR@10 | MAP |
|---|---|---|---|---|---|
| x | | | 65.33 | 57.56 | 57.97 |
| | x | | 67.34 | 59.59 | 59.98 |
| x | x | | 72.39 | 66.99 | 67.59 |
| x | | x | 68.37 | 62.76 | 63.56 |
| | x | x | 69.02 | 63.66 | 64.44 |
| x | x | x | 68.28 | 62.66 | 63.44 |
Here, the sparse embedding model actually already outperforms the dense one, but the real magic happens when combining the two: hybrid search. In our case, we used Reciprocal Rank Fusion to merge the two rankings.
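For reference, here is a minimal sketch of what Reciprocal Rank Fusion does; the document IDs and the conventional `k=60` constant are illustrative assumptions, not taken from the actual hybrid_search.py script:

:::python
# Minimal Reciprocal Rank Fusion (RRF) sketch: a document's fused score is the
# sum of 1 / (k + rank) over every ranking in which it appears.
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

dense_ranking = ["doc_b", "doc_a", "doc_c"]   # hypothetical dense retrieval ranking
sparse_ranking = ["doc_a", "doc_c", "doc_d"]  # hypothetical sparse retrieval ranking
print(reciprocal_rank_fusion([dense_ranking, sparse_ranking]))
# ['doc_a', 'doc_c', 'doc_b', 'doc_d']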
Rerankers also help improve the performance of the dense or sparse model here, but hurt the performance of the hybrid search, as its performance is already beyond what the reranker can achieve.
[!NOTE] The naver/splade-v3-doc model was trained on the MS MARCO training set, so this is in-domain performance, much like what you might expect if you finetune on your own data.
### Resources
Check out the following links to get a better feel for what Sparse Encoders are, how they work, what architectures exist, how to use them, what pretrained models exist, how to finetune them, and more:
- Blogpost:
- Documentation:
- Models:
### Update Stats
The introduction of SparseEncoder has been one of the largest updates to Sentence Transformers, introducing all of the following:
- Code:
  - New Trainer, Training Arguments, Data Collator, and Model Card generation + template, with backwards compatibility
  - 4 new and 1 refactored module to support at least 3 model archetypes: SPLADE, Inference-free SPLADE, and CSR
  - 12 new losses
  - 9 new evaluators
  - 1 new Callback
  - 4 example integrations with Elasticsearch, OpenSearch, Qdrant, and Seismic
- Tests:
- Docs:
## New methods: `encode_query` and `encode_document`

Sentence Transformers v5.0 introduces two new core methods to the `SentenceTransformer` and `SparseEncoder` classes: `encode_query` and `encode_document`.
These methods are specialized versions of `encode` that differ in exactly two ways:

- If no `prompt_name` or `prompt` is provided, they use a predefined "query"/"document" prompt, if available in the model's `prompts` dictionary (example).
- They set the `task` to "query"/"document". If the model has a `Router` module, the "query"/"document" task type is used to route the input through the appropriate submodules.
In short, if you use `encode_query` and `encode_document`, you can be sure that you're using the model's predefined prompts and the correct route (if the model has multiple routes).

If you are unsure whether you should use `encode`, `encode_query`, or `encode_document`, your best bet is to use `encode_query` and `encode_document` for Information Retrieval tasks with a clear query and document/passage distinction, and `encode` for all other tasks. Note that `encode` is the most general method and can be used for any task, including Information Retrieval, and that if the model was not trained with predefined prompts and/or task types, all three methods return identical embeddings.
See for example this snippet, which automatically uses the “query” prompt stored in the Qwen3-Embedding-0.6B model config.
:::python
from sentence_transformers import SentenceTransformer
# Load the model
model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")
# The queries and documents to embed
queries = [
    "What is the capital of China?",
    "Explain gravity",
]
documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
]
# Encode the queries and documents
query_embeddings = model.encode_query(queries)  # Equivalent to model.encode(queries, prompt_name="query")
document_embeddings = model.encode_document(documents)
# Compute the (cosine) similarity between the query and document embeddings
similarity = model.similarity(query_embeddings, document_embeddings)
print(similarity)
# tensor([[0.7646, 0.1414],
# [0.1355, 0.6000]])
- Documentation: Migration Guide
- Documentation: SentenceTransformer.encode
- Documentation: SentenceTransformer.encode_query
- Documentation: SentenceTransformer.encode_document
## `encode_multi_process` absorbed by `encode`

The `encode` method (and by extension the `encode_query` and `encode_document` methods) can now be used directly for multi-processing/multi-GPU encoding, instead of having to use `encode_multi_process`.

Previously, you had to manually start a multi-processing pool, call `encode_multi_process`, and stop the pool:
:::python
from sentence_transformers import SentenceTransformer
def main():
model = SentenceTransformer("all-mpnet-base-v2")
texts = ["The weather is so nice!", "It's so sunny outside.", ...]
pool = model.start_multi_process_pool(["cpu", "cpu", "cpu", "cpu"])
embeddings = model.encode_multi_process(texts, pool, chunk_size=512)
model.stop_multi_process_pool(pool)
print(embeddings.shape)
# => (4000, 768)
if __name__ == "__main__":
main()
Now you can just pass a list of devices as `device` to `encode`:
:::python
from sentence_transformers import SentenceTransformer
def main():
model = SentenceTransformer("all-mpnet-base-v2")
texts = ["The weather is so nice!", "It's so sunny outside.", ...]
embeddings = model.encode(texts, device=["cpu", "cpu", "cpu", "cpu"], chunk_size=512)
print(embeddings.shape)
# => (4000, 768)
if __name__ == "__main__":
main()
The multi-processing can be configured using these parameters:

- `device`: If a list of devices is given, multi-processing is started using those devices. The devices can be e.g. "cpu", but also different GPUs.
- `pool`: You can still use `start_multi_process_pool` and `stop_multi_process_pool` to create and stop a multi-processing pool, allowing you to reuse the pool across multiple `encode` calls via the `pool` argument; see the sketch after this list.
- `chunk_size`: When you use multi-processing with n devices, the inputs are subdivided into chunks, and those chunks are spread across the n processes. The chunk size can be defined here, although it's optional. It can have a minor impact on processing speed and memory usage, but is much less important than the `batch_size` argument.
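A minimal sketch of reusing a pool across multiple `encode` calls (the model name, texts, and device list are just examples):

:::python
from sentence_transformers import SentenceTransformer

def main():
    model = SentenceTransformer("all-mpnet-base-v2")
    # Start the pool once, reuse it for several encode calls, then stop it
    pool = model.start_multi_process_pool(["cpu", "cpu", "cpu", "cpu"])
    first_embeddings = model.encode(["The weather is so nice!"], pool=pool)
    second_embeddings = model.encode(["It's so sunny outside."], pool=pool)
    model.stop_multi_process_pool(pool)
    print(first_embeddings.shape, second_embeddings.shape)
    # => (1, 768) (1, 768)

if __name__ == "__main__":
    main()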
- Documentation: Migration Guide
- Documentation: SentenceTransformer.encode
## Router module
The Sentence Transformers v5.0 release has refactored the `Asym` module into the `Router` module. The previous implementation wasn't straightforward to use with the other components of the library; we've improved heavily on this to make the integration seamless. This module allows you to create asymmetric models that apply different modules depending on the specified route (often "query" or "document").
Notably, you can use the `task` argument in `model.encode` to specify which route to use, and the `model.encode_query` and `model.encode_document` convenience methods automatically specify `task="query"` and `task="document"`, respectively.
See opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill for an example of a model that uses a `Router` to specify different modules for queries vs. documents. Its router_config.json specifies that the query route uses an efficient `SparseStaticEmbedding` module, while the document route uses the more expensive standard SPLADE modules: `MLMTransformer` with `SpladePooling`.

Usage is very straightforward with the new `encode_query` and `encode_document` methods:
:::python
from sentence_transformers import SparseEncoder
# Download from the 🤗 Hub
model = SparseEncoder("opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill")
print(model)
# SparseEncoder(
#   (0): Router(
#     (query_0_SparseStaticEmbedding): SparseStaticEmbedding({'frozen': True}, dim=30522, tokenizer=DistilBertTokenizerFast)
#     (document_0_MLMTransformer): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'DistilBertForMaskedLM'})
#     (document_1_SpladePooling): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
#   )
# )
# Run inference
queries = ["what causes aging fast"]
documents = [
    "UV-A light, specifically, is what mainly causes tanning, skin aging, and cataracts, UV-B causes sunburn, skin aging and skin cancer, and UV-C is the strongest, and therefore most effective at killing microorganisms. Again – single words and multiple bullets.",
    "Answers from Ronald Petersen, M.D. Yes, Alzheimer's disease usually worsens slowly. But its speed of progression varies, depending on a person's genetic makeup, environmental factors, age at diagnosis and other medical conditions. Still, anyone diagnosed with Alzheimer's whose symptoms seem to be progressing quickly — or who experiences a sudden decline — should see his or her doctor.",
    "Bell's palsy and Extreme tiredness and Extreme fatigue (2 causes) Bell's palsy and Extreme tiredness and Hepatitis (2 causes) Bell's palsy and Extreme tiredness and Liver pain (2 causes) Bell's palsy and Extreme tiredness and Lymph node swelling in children (2 causes)",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 30522] [3, 30522]
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[12.0820, 6.5648, 5.0988]])
Note that if you wish to train a model with a `Router`, you must specify the `router_mapping` training argument, which maps dataset column names to `Router` routes. This lets the Trainer know which route to use for each dataset column.
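As a rough sketch (the column names "query" and "answer" and the output directory are hypothetical; use your own dataset's columns):

:::python
from sentence_transformers.sparse_encoder import SparseEncoderTrainingArguments

args = SparseEncoderTrainingArguments(
    output_dir="models/splade-router",
    # Map each dataset column to the Router route that should process it
    router_mapping={
        "query": "query",
        "answer": "document",
    },
)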
Note also that any models using `Asym` still work as before.
- Documentation: Sentence Transformer > Modules > Router
## InputModule and Module modules

Alongside introducing some new modules and refactoring the `Asym` module into the `Router` module, we also introduced two new "superclass" modules: `Module` and `InputModule`. The former is the new base class of all modules, and the latter is the base class of all modules that are also responsible for tokenization (i.e. for processing inputs).
The documentation describes which methods still need to be implemented when you subclass one of these, and also which convenience methods are available for you to use already. It should certainly simplify the creation of custom modules.
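As a minimal sketch of a custom non-input module: the module name and its behavior are made up here, and the sketch assumes only that a subclass overrides `forward` over the standard feature dictionary; a real module may also need the save/load methods described in the documentation.

:::python
import torch
from sentence_transformers.models import Module

class L2Normalize(Module):
    """Toy custom module that L2-normalizes the sentence embedding."""

    def forward(self, features: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]:
        features["sentence_embedding"] = torch.nn.functional.normalize(
            features["sentence_embedding"], p=2, dim=1
        )
        return features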
- Documentation: Sentence Transformer > Modules > Module
- Documentation: Sentence Transformer > Modules > InputModule
## Custom Learning Rates for parameter groups

With the introduction of the `Router` module, it's becoming much simpler to train a "two-tower model" where the query and document encoders differ considerably, for example a regular Sentence Transformer for the document encoder and a Static Embedding model for the query encoder.

In such settings, it's worthwhile to set different learning rates for different parts of the model. Because of this, v5.0 adds a `learning_rate_mapping` parameter to the Training Arguments classes. This mapping maps regular expressions over parameter names to learning rates, e.g.:
:::python
args = SentenceTransformerTrainingArguments(
    ...,
    learning_rate=2e-5,
    learning_rate_mapping={"StaticEmbedding.*": 1e-3},
)
Using these training arguments, the learning rate for every parameter whose name matches the regular expression is 1e-3, while all other parameters have a learning rate of 2e-5. Note that we use `re.search` to determine whether a parameter matches the regular expression, not `re.match` or `re.fullmatch`.
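A quick illustration of the difference (the parameter name is hypothetical):

:::python
import re

# re.search matches the pattern anywhere in the parameter name, so a nested
# parameter still receives the custom learning rate:
name = "router.query_0_StaticEmbedding.embedding.weight"
print(bool(re.search(r"StaticEmbedding.*", name)))  # True
print(bool(re.match(r"StaticEmbedding.*", name)))   # False: match anchors at the start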
## Training with composite losses
Many models are trained with just one loss, or perhaps one loss per dataset. In those cases, all losses are nicely logged in both the terminal and third-party logging tools (e.g. Weights & Biases, TensorBoard, etc.).
But if you’re using one loss that has multiple components, e.g. a SpladeLoss which sums the losses from FlopsLoss and a SparseMultipleNegativesRankingLoss behind the scenes, then you’re often left guessing whether the various loss components are balanced or not: perhaps one of the two is responsible for 90% of the total loss?
As of the v5.0 release, your loss classes can output dictionaries of loss components. The Trainer will sum them and train like normal, but each of the components will also be logged individually! In short, you can see the various loss components in addition to the final loss itself in your logs.
:::python
class SpladeLoss(nn.Module):
    ...

    def forward(
        self, sentence_features: Iterable[dict[str, torch.Tensor]], labels: torch.Tensor | None = None
    ) -> dict[str, torch.Tensor]:
        # Compute embeddings using the model
        embeddings = [self.model(sentence_feature)["sentence_embedding"] for sentence_feature in sentence_features]
        ...
        return {
            "base_loss": base_loss,
            "document_regularizer_loss": corpus_loss * self.document_regularizer_weight,
            "query_regularizer_loss": query_loss * self.query_regularizer_weight,
        }
## Small improvements
- Allow training with custom batch samplers and multi-dataset batch samplers (#3162)
- Gradient checkpointing was fixed for CrossEncoder models (#3331)
- Added the `sif_coefficient`, `token_remove_pattern`, and `quantize_to` parameters from Model2Vec to `StaticEmbedding.from_distillation(...)` (#3349)
- Added examples for semantic search using OpenSearch and Sentence Transformers (#3369)
- Added caching support to `mine_hard_negatives` (#3338)
- Added prompts support to `mine_hard_negatives` (#3334)
- You can now pass `truncate_dim` to `encode` (and `encode_query`, `encode_document`) instead of exclusively being able to set it when initializing the `SentenceTransformer`; see the sketch after this list.
- You can now access the underlying `transformers` model with `model.transformers_model`; this works for `SentenceTransformer`, `CrossEncoder`, and `SparseEncoder`, and is also shown in the sketch below.
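A small sketch of those last two improvements; the model name is just an example, and truncation assumes a model whose embeddings remain meaningful when truncated (e.g. a Matryoshka model):

:::python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

# Truncate the 1024-dimensional embeddings to 256 dimensions at encode time
embeddings = model.encode(["The weather is lovely today."], truncate_dim=256)
print(embeddings.shape)
# => (1, 256)

# Access the underlying transformers model directly
print(type(model.transformers_model))
# => <class 'transformers.models.bert.modeling_bert.BertModel'>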
See our Migration Guide for more details on the changes, as well as the documentation as a whole.
## All Changes
- [`docs`] Point to v4.1 new docs pages in index.html by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3328
- [`ci`] Attempt to avoid 429 Client Error in CI by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3342
- [`fix`, `cross-encoder`] Propagate the gradient checkpointing to the transformer model by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3331
- Fix broken links by @shacharmirkin in https://github.com/UKPLab/sentence-transformers/pull/3340
- [fix] add dtype property for fsdp2 by @meshidenn in https://github.com/UKPLab/sentence-transformers/pull/3337
- [`tests`] Update test based on M2V version by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3354
- [`docs`] Add two useful recommendations to the docs by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3353
- [`refactor`] Refactor module loading; introduce Module subclass by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3345
- feat: expose new model2vec parameters by @stephantul in https://github.com/UKPLab/sentence-transformers/pull/3349
- Reload all modules when loading the best saved checkpoint, not just the transformer one by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3360
- HF tokenizer support for word embeddings by @talbaumel in https://github.com/UKPLab/sentence-transformers/pull/3362
- Add OpenSearch-based semantic search example by @zhichao-aws in https://github.com/UKPLab/sentence-transformers/pull/3369
- Add embedding cache mechanism to avoid redundant recomputation by @daegonYu in https://github.com/UKPLab/sentence-transformers/pull/3338
- Allow passing a custom batch sampler to the trainer by @alonme in https://github.com/UKPLab/sentence-transformers/pull/3162
- fix: device name in multi-node ddp by @sasakiyori in https://github.com/UKPLab/sentence-transformers/pull/3373
- Propagate local_files_only to model card to avoid verifying dataset/base model by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3374
- Correcting TripletEvaluator.py Docstring by @johneckberg in https://github.com/UKPLab/sentence-transformers/pull/3379
- feature - mine hard negatives working with prompts by @GivAlz in https://github.com/UKPLab/sentence-transformers/pull/3334
- [`tests`] Improve robustness of model shape assertion in model2vec test by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3391
- Move precision validation before embedding computation by @mvairav in https://github.com/UKPLab/sentence-transformers/pull/3385
- [`fix`] Use transformers Peft integration instead of manual get_peft_model call by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3405
- [`v5`] Add support for Sparse Embedding models by @arthurbr11 in https://github.com/UKPLab/sentence-transformers/pull/3401
- [`docs`] Fix formatting of docstring arguments in SpladeRegularizerWeightSchedulerCallback by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3408
- [`fix`] Update .gitignore by @arthurbr11 in https://github.com/UKPLab/sentence-transformers/pull/3409
- [`fix`] Remove hub_kwargs in SparseStaticEmbedding.from_json in favor of more explicit kwargs by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3407
- [`docs`] Update collections links by @arthurbr11 in https://github.com/UKPLab/sentence-transformers/pull/3410
## New Contributors
- @shacharmirkin made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3340
- @meshidenn made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3337
- @talbaumel made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3362
- @zhichao-aws made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3369
- @alonme made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3162
- @sasakiyori made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3373
- @GivAlz made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3334
- @mvairav made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3385
- @arthurbr11 made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3401
## Thanks
I especially want to thank the following teams and individuals for their contributions to this release, small and large, in no particular order:
- Amazon OpenSearch, for being receptive to an integration and working together with us on the documentation and blogpost
- NAVER, for being receptive to an integration of your excellent SPLADE models
- Qdrant, for assisting with semantic search of sparse embeddings
- Prithivi Da, for being receptive to an integration of your excellent Apache 2.0 SPLADE models
- CSR authors, for working with us to integrate your architecture and open-sourcing your models alongside the integration
- Elastic, for assisting with semantic search of sparse embeddings
- IBM, for being receptive to an integration of your Sparse model
Apologies if I forgot anyone. And finally a big thanks to Arthur Bresnu, who led a lot of the work on this release. I wouldn't have been able to introduce Sparse Encoders in this fashion, in this timeline, without his excellent work.
Full Changelog: https://github.com/UKPLab/sentence-transformers/compare/v4.1.0...v5.0.0