This release consists of significant updates, including the introduction of Sparse Encoder models, the new methods `encode_query` and `encode_document`, multi-processing support in `encode`, the `Router` module for asymmetric models, custom learning rates for parameter groups, composite loss logging, and various small improvements and bug fixes.
Install this version with:
:::bash
# Training + Inference
pip install sentence-transformers[train]==5.0.0
# Inference only, use one of:
pip install sentence-transformers==5.0.0
pip install sentence-transformers[onnx-gpu]==5.0.0
pip install sentence-transformers[onnx]==5.0.0
pip install sentence-transformers[openvino]==5.0.0
[!TIP] Our Training and Finetuning Sparse Embedding Models with Sentence Transformers v5 blogpost is an excellent place to learn about finetuning sparse embedding models!
[!NOTE] This release is designed to be fully backwards compatible, meaning that you should be able to upgrade from older versions to v5.x without any issues. If you are running into issues when upgrading, feel free to open an issue. Also see the Migration Guide for changes that we would recommend.
## Sparse Encoder models
The Sentence Transformers v5.0 release introduces Sparse Embedding models, also known as Sparse Encoders. These models generate high-dimensional embeddings, often with 30,000+ dimensions, of which typically fewer than 1% are non-zero. This is in contrast to standard dense embedding models, which produce low-dimensional embeddings (e.g., 384, 768, or 1024 dimensions) where all values are non-zero.
Usually, each active dimension (i.e. a dimension with a non-zero value) in a sparse embedding corresponds to a specific token in the model's vocabulary, allowing for interpretability. This means that you can, for example, see exactly which words/tokens are important in an embedding, and inspect exactly which words/tokens cause two texts to be deemed similar.
Let's have a look at naver/splade-v3, a strong sparse embedding model, as an example:
:::python
from sentence_transformers import SparseEncoder
# Download from the 🤗 Hub
model = SparseEncoder("naver/splade-v3")
# Run inference
sentences = [
    "The weather is lovely today.",
    "It's so sunny outside!",
    "He drove to the stadium.",
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 30522)
# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities)
# tensor([[ 32.4323, 5.8528, 0.0258],
# [ 5.8528, 26.6649, 0.0302],
# [ 0.0258, 0.0302, 24.0839]])
# Let's decode our embeddings to be able to interpret them
decoded_embeddings = model.decode(embeddings, top_k=10)
for decoded, sentence in zip(decoded_embeddings, sentences):
    print(f"Sentence: {sentence}")
    print(f"Decoded: {decoded}")
    print()
Sentence: The weather is lovely today.
Decoded: [('weather', 2.754288673400879), ('today', 2.610959529876709), ('lovely', 2.431990623474121), ('currently', 1.5520408153533936), ('beautiful', 1.5046082735061646), ('cool', 1.4664798974990845), ('pretty', 0.8986214995384216), ('yesterday', 0.8603134155273438), ('nice', 0.8322536945343018), ('summer', 0.7702118158340454)]
Sentence: It's so sunny outside!
Decoded: [('outside', 2.6939032077789307), ('sunny', 2.535827398300171), ('so', 2.0600898265838623), ('out', 1.5397940874099731), ('weather', 1.1198079586029053), ('very', 0.9873268604278564), ('cool', 0.9406591057777405), ('it', 0.9026399254798889), ('summer', 0.684999406337738), ('sun', 0.6520509123802185)]
Sentence: He drove to the stadium.
Decoded: [('stadium', 2.7872302532196045), ('drove', 1.8208855390548706), ('driving', 1.6665740013122559), ('drive', 1.5565159320831299), ('he', 1.4721972942352295), ('stadiums', 1.449463129043579), ('to', 1.0441515445709229), ('car', 0.7002660632133484), ('visit', 0.5118278861045837), ('football', 0.502326250076294)]
In this example, the embeddings are 30,522-dimensional vectors, where each dimension corresponds to a token in the model's vocabulary. The `decode` method returned the top 10 tokens with the highest values in each embedding, allowing us to interpret which tokens contribute most to the embedding.
We can even determine the intersection or overlap between embeddings, which is very useful for understanding why two texts are deemed similar or dissimilar:
:::python
# Let's also compute the intersection/overlap of the first two embeddings
intersection_embedding = model.intersection(embeddings[0], embeddings[1])
decoded_intersection = model.decode(intersection_embedding)
print(decoded_intersection)
Decoded: [('weather', 3.0842742919921875), ('cool', 1.379457712173462), ('summer', 0.5275946259498596), ('comfort', 0.3239051103591919), ('sally', 0.22571465373039246), ('julian', 0.14787325263023376), ('nature', 0.08582140505313873), ('beauty', 0.0588383711874485), ('mood', 0.018594780936837196), ('nathan', 0.000752730411477387)]
And if we think the embeddings are too big, we can limit the maximum number of active dimensions like so:
:::python
from sentence_transformers import SparseEncoder
# Download from the 🤗 Hub
model = SparseEncoder("naver/splade-v3")  # You can also set max_active_dims here instead of in encode()
# Run inference
documents = [
    "UV-A light, specifically, is what mainly causes tanning, skin aging, and cataracts, UV-B causes sunburn, skin aging and skin cancer, and UV-C is the strongest, and therefore most effective at killing microorganisms. Again – single words and multiple bullets.",
    "Answers from Ronald Petersen, M.D. Yes, Alzheimer's disease usually worsens slowly. But its speed of progression varies, depending on a person's genetic makeup, environmental factors, age at diagnosis and other medical conditions. Still, anyone diagnosed with Alzheimer's whose symptoms seem to be progressing quickly — or who experiences a sudden decline — should see his or her doctor.",
    "Bell's palsy and Extreme tiredness and Extreme fatigue (2 causes) Bell's palsy and Extreme tiredness and Hepatitis (2 causes) Bell's palsy and Extreme tiredness and Liver pain (2 causes) Bell's palsy and Extreme tiredness and Lymph node swelling in children (2 causes)",
]
embeddings = model.encode_document(documents, max_active_dims=64)
print(embeddings.shape)
# (3, 30522)
# Print the sparsity of the embeddings
sparsity = model.sparsity(embeddings)
print(sparsity)
# {'active_dims': 64.0, 'sparsity_ratio': 0.9979031518249132}
<details><summary>Click to see that it has minimal impact on scores</summary>

:::python
from sentence_transformers import SparseEncoder

# Download from the 🤗 Hub
model = SparseEncoder("naver/splade-v3")  # You can also set max_active_dims here instead of in encode()

# Run inference
queries = ["what causes aging fast"]
documents = [
    "UV-A light, specifically, is what mainly causes tanning, skin aging, and cataracts, UV-B causes sunburn, skin aging and skin cancer, and UV-C is the strongest, and therefore most effective at killing microorganisms. Again – single words and multiple bullets.",
    "Answers from Ronald Petersen, M.D. Yes, Alzheimer's disease usually worsens slowly. But its speed of progression varies, depending on a person's genetic makeup, environmental factors, age at diagnosis and other medical conditions. Still, anyone diagnosed with Alzheimer's whose symptoms seem to be progressing quickly — or who experiences a sudden decline — should see his or her doctor.",
    "Bell's palsy and Extreme tiredness and Extreme fatigue (2 causes) Bell's palsy and Extreme tiredness and Hepatitis (2 causes) Bell's palsy and Extreme tiredness and Liver pain (2 causes) Bell's palsy and Extreme tiredness and Lymph node swelling in children (2 causes)",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)

# Determine the sparsity
query_sparsity = model.sparsity(query_embeddings)
document_sparsity = model.sparsity(document_embeddings)
print(query_sparsity, document_sparsity)
# {'active_dims': 28.0, 'sparsity_ratio': 0.9990826289233995} {'active_dims': 174.6666717529297, 'sparsity_ratio': 0.9942773516888497}

# Calculate the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[11.3767, 10.8296, 4.3457]], device='cuda:0')

# Again with a smaller max_active_dims
smaller_document_embeddings = model.encode_document(documents, max_active_dims=64)

# Determine the sparsity for the smaller document embeddings
smaller_document_sparsity = model.sparsity(smaller_document_embeddings)
print(query_sparsity, smaller_document_sparsity)
# {'active_dims': 28.0, 'sparsity_ratio': 0.9990826289233995} {'active_dims': 64.0, 'sparsity_ratio': 0.9979031518249132}

# Print the similarity scores for the smaller document embeddings
smaller_similarities = model.similarity(query_embeddings, smaller_document_embeddings)
print(smaller_similarities)
# tensor([[10.1311, 9.8360, 4.3457]], device='cuda:0')
# Very similar to the scores for the full document embeddings!

</details>

### Are they any good?
A big question is: how do sparse embedding models stack up against the "standard" dense embedding models, and what kind of performance can you expect when combining the two?
For this, I ran a variation of our hybrid_search.py evaluation script, with:
- The NanoMSMARCO dataset (a subset of the MS MARCO eval split)
- Qwen/Qwen3-Embedding-0.6B dense embedding model
- naver/splade-v3-doc sparse embedding model, inference-free for queries
- Alibaba-NLP/gte-reranker-modernbert-base reranker
This resulted in the following evaluation:
| Dense | Sparse | Reranker | NDCG@10 | MRR@10 | MAP |
|---|---|---|---|---|---|
| x | | | 65.33 | 57.56 | 57.97 |
| | x | | 67.34 | 59.59 | 59.98 |
| x | x | | 72.39 | 66.99 | 67.59 |
| x | | x | 68.37 | 62.76 | 63.56 |
| | x | x | 69.02 | 63.66 | 64.44 |
| x | x | x | 68.28 | 62.66 | 63.44 |
Here, the sparse embedding model actually already outperforms the dense one, but the real magic happens when combining the two: hybrid search. In our case, we used Reciprocal Rank Fusion to merge the two rankings.
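For reference, here is a minimal sketch of what Reciprocal Rank Fusion does; the document IDs and the conventional `k=60` constant are illustrative assumptions, not taken from the actual hybrid_search.py script:

:::python
# Minimal Reciprocal Rank Fusion (RRF) sketch: a document's fused score is the
# sum of 1 / (k + rank) over every ranking in which it appears.
def reciprocal_rank_fusion(rankings: list[list[str]], k: int = 60) -> list[str]:
    scores: dict[str, float] = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    # Highest fused score first
    return sorted(scores, key=scores.get, reverse=True)

dense_ranking = ["doc_b", "doc_a", "doc_c"]   # hypothetical dense retrieval ranking
sparse_ranking = ["doc_a", "doc_c", "doc_d"]  # hypothetical sparse retrieval ranking
print(reciprocal_rank_fusion([dense_ranking, sparse_ranking]))
# ['doc_a', 'doc_c', 'doc_b', 'doc_d']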
Rerankers also help improve the performance of the dense or sparse model here, but hurt the performance of the hybrid search, as its performance is already beyond what the reranker can achieve.
[!NOTE] The naver/splade-v3-doc model was trained on the MS MARCO training set, so this is in-domain performance, much like what you might expect if you finetune on your own data.
### Resources
Check out the following links to get a better feel for what Sparse Encoders are, how they work, what architectures exist, how to use them, what pretrained models exist, how to finetune them, and more:
- Blogpost:
- Documentation:
- Models:
### Update Stats
The introduction of SparseEncoder has been one of the largest updates to Sentence Transformers, introducing all of the following:
- Code:
  - New Trainer, Training Arguments, Data Collator, and Model Card generation + template, with backwards compatibility
  - 4 new and 1 refactored module to support at least 3 model archetypes: SPLADE, Inference-free SPLADE, and CSR
  - 12 new losses
  - 9 new evaluators
  - 1 new Callback
  - 4 example integrations with Elasticsearch, OpenSearch, Qdrant, and Seismic
- Tests:
- Docs:
## New methods: `encode_query` and `encode_document`

Sentence Transformers v5.0 introduces two new core methods to the `SentenceTransformer` and `SparseEncoder` classes: `encode_query` and `encode_document`.
These methods are specialized versions of `encode` that differ in exactly two ways:

- If no `prompt_name` or `prompt` is provided, they use a predefined "query"/"document" prompt, if available in the model's `prompts` dictionary (example).
- They set the `task` to "query"/"document". If the model has a `Router` module, the "query"/"document" task type is used to route the input through the appropriate submodules.
In short, if you use `encode_query` and `encode_document`, you can be sure that you're using the model's predefined prompts and the correct route (if the model has multiple routes).

If you are unsure whether you should use `encode`, `encode_query`, or `encode_document`, your best bet is to use `encode_query` and `encode_document` for Information Retrieval tasks with a clear query and document/passage distinction, and `encode` for all other tasks. Note that `encode` is the most general method and can be used for any task, including Information Retrieval, and that if the model was not trained with predefined prompts and/or task types, all three methods return identical embeddings.
See for example this snippet, which automatically uses the “query” prompt stored in the Qwen3-Embedding-0.6B model config.
:::python
from sentence_transformers import SentenceTransformer
# Load the model
model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")
# The queries and documents to embed
queries = [
    "What is the capital of China?",
    "Explain gravity",
]
documents = [
    "The capital of China is Beijing.",
    "Gravity is a force that attracts two bodies towards each other. It gives weight to physical objects and is responsible for the movement of planets around the sun.",
]
# Encode the queries and documents
query_embeddings = model.encode_query(queries)  # Equivalent to model.encode(queries, prompt_name="query")
document_embeddings = model.encode_document(documents)
# Compute the (cosine) similarity between the query and document embeddings
similarity = model.similarity(query_embeddings, document_embeddings)
print(similarity)
# tensor([[0.7646, 0.1414],
# [0.1355, 0.6000]])
- Documentation: Migration Guide
- Documentation: SentenceTransformer.encode
- Documentation: SentenceTransformer.encode_query
- Documentation: SentenceTransformer.encode_document
## `encode_multi_process` absorbed by `encode`

The `encode` method (and by extension the `encode_query` and `encode_document` methods) can now be used directly for multi-processing/multi-GPU encoding, instead of having to use `encode_multi_process`.

Previously, you had to manually start a multi-processing pool, call `encode_multi_process`, and stop the pool:
:::python
from sentence_transformers import SentenceTransformer
def main():
model = SentenceTransformer("all-mpnet-base-v2")
texts = ["The weather is so nice!", "It's so sunny outside.", ...]
pool = model.start_multi_process_pool(["cpu", "cpu", "cpu", "cpu"])
embeddings = model.encode_multi_process(texts, pool, chunk_size=512)
model.stop_multi_process_pool(pool)
print(embeddings.shape)
# => (4000, 768)
if __name__ == "__main__":
main()
Now you can just pass a list of devices as `device` to `encode`:
:::python
from sentence_transformers import SentenceTransformer
def main():
model = SentenceTransformer("all-mpnet-base-v2")
texts = ["The weather is so nice!", "It's so sunny outside.", ...]
embeddings = model.encode(texts, device=["cpu", "cpu", "cpu", "cpu"], chunk_size=512)
print(embeddings.shape)
# => (4000, 768)
if __name__ == "__main__":
main()
The multi-processing can be configured using these parameters:

- `device`: If a list of devices is given, multi-processing is started using those devices. The devices can be e.g. "cpu", but also different GPUs.
- `pool`: You can still use `start_multi_process_pool` and `stop_multi_process_pool` to create and stop a multi-processing pool, allowing you to reuse the pool across multiple `encode` calls via the `pool` argument; see the sketch after this list.
- `chunk_size`: When you use multi-processing with n devices, the inputs are subdivided into chunks, and those chunks are spread across the n processes. The chunk size can be defined here, although it's optional. It can have a minor impact on processing speed and memory usage, but is much less important than the `batch_size` argument.
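A minimal sketch of reusing a pool across multiple `encode` calls (the model name, texts, and device list are just examples):

:::python
from sentence_transformers import SentenceTransformer

def main():
    model = SentenceTransformer("all-mpnet-base-v2")
    # Start the pool once, reuse it for several encode calls, then stop it
    pool = model.start_multi_process_pool(["cpu", "cpu", "cpu", "cpu"])
    first_embeddings = model.encode(["The weather is so nice!"], pool=pool)
    second_embeddings = model.encode(["It's so sunny outside."], pool=pool)
    model.stop_multi_process_pool(pool)
    print(first_embeddings.shape, second_embeddings.shape)
    # => (1, 768) (1, 768)

if __name__ == "__main__":
    main()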
- Documentation: Migration Guide
- Documentation: SentenceTransformer.encode
## Router module
The Sentence Transformers v5.0 release has refactored the `Asym` module into the `Router` module. The previous implementation wasn't straightforward to use with the other components of the library; we've improved heavily on this to make the integration seamless. This module allows you to create asymmetric models that apply different modules depending on the specified route (often "query" or "document").
Notably, you can use the `task` argument in `model.encode` to specify which route to use, and the `model.encode_query` and `model.encode_document` convenience methods automatically specify `task="query"` and `task="document"`, respectively.
See opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill for an example of a model that uses a `Router` to specify different modules for queries vs. documents. Its router_config.json specifies that the query route uses an efficient `SparseStaticEmbedding` module, while the document route uses the more expensive standard SPLADE modules: `MLMTransformer` with `SpladePooling`.

Usage is very straightforward with the new `encode_query` and `encode_document` methods:
:::python
from sentence_transformers import SparseEncoder
# Download from the 🤗 Hub
model = SparseEncoder("opensearch-project/opensearch-neural-sparse-encoding-doc-v2-distill")
print(model)
# SparseEncoder(
#   (0): Router(
#     (query_0_SparseStaticEmbedding): SparseStaticEmbedding({'frozen': True}, dim=30522, tokenizer=DistilBertTokenizerFast)
#     (document_0_MLMTransformer): MLMTransformer({'max_seq_length': 512, 'do_lower_case': False, 'architecture': 'DistilBertForMaskedLM'})
#     (document_1_SpladePooling): SpladePooling({'pooling_strategy': 'max', 'activation_function': 'relu', 'word_embedding_dimension': 30522})
#   )
# )
# Run inference
queries = ["what causes aging fast"]
documents = [
    "UV-A light, specifically, is what mainly causes tanning, skin aging, and cataracts, UV-B causes sunburn, skin aging and skin cancer, and UV-C is the strongest, and therefore most effective at killing microorganisms. Again – single words and multiple bullets.",
    "Answers from Ronald Petersen, M.D. Yes, Alzheimer's disease usually worsens slowly. But its speed of progression varies, depending on a person's genetic makeup, environmental factors, age at diagnosis and other medical conditions. Still, anyone diagnosed with Alzheimer's whose symptoms seem to be progressing quickly — or who experiences a sudden decline — should see his or her doctor.",
    "Bell's palsy and Extreme tiredness and Extreme fatigue (2 causes) Bell's palsy and Extreme tiredness and Hepatitis (2 causes) Bell's palsy and Extreme tiredness and Liver pain (2 causes) Bell's palsy and Extreme tiredness and Lymph node swelling in children (2 causes)",
]
query_embeddings = model.encode_query(queries)
document_embeddings = model.encode_document(documents)
print(query_embeddings.shape, document_embeddings.shape)
# [1, 30522] [3, 30522]
# Get the similarity scores for the embeddings
similarities = model.similarity(query_embeddings, document_embeddings)
print(similarities)
# tensor([[12.0820, 6.5648, 5.0988]])
Note that if you wish to train a model with a `Router`, you must specify the `router_mapping` training argument, which maps dataset column names to `Router` routes. This lets the Trainer know which route to use for each dataset column.
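As a rough sketch (the column names "query" and "answer" and the output directory are hypothetical; use your own dataset's columns):

:::python
from sentence_transformers.sparse_encoder import SparseEncoderTrainingArguments

args = SparseEncoderTrainingArguments(
    output_dir="models/splade-router",
    # Map each dataset column to the Router route that should process it
    router_mapping={
        "query": "query",
        "answer": "document",
    },
)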
Note also that any models using `Asym` still work as before.
- Documentation: Sentence Transformer > Modules > Router
## InputModule and Module modules

Alongside introducing some new modules and refactoring the `Asym` module into the `Router` module, we also introduced two new "superclass" modules: `Module` and `InputModule`. The former is the new base class of all modules, and the latter is the base class of all modules that are also responsible for tokenization (i.e. for processing inputs).
The documentation describes which methods still need to be implemented when you subclass one of these, and also which convenience methods are available for you to use already. It should certainly simplify the creation of custom modules.
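As a minimal sketch of a custom non-input module: the module name and its behavior are made up here, and the sketch assumes only that a subclass overrides `forward` over the standard feature dictionary; a real module may also need the save/load methods described in the documentation.

:::python
import torch
from sentence_transformers.models import Module

class L2Normalize(Module):
    """Toy custom module that L2-normalizes the sentence embedding."""

    def forward(self, features: dict[str, torch.Tensor]) -> dict[str, torch.Tensor]:
        features["sentence_embedding"] = torch.nn.functional.normalize(
            features["sentence_embedding"], p=2, dim=1
        )
        return features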
- Documentation: Sentence Transformer > Modules > Module
- Documentation: Sentence Transformer > Modules > InputModule
## Custom Learning Rates for parameter groups

With the introduction of the `Router` module, it's becoming much simpler to train a "two-tower model" where the query and document encoders differ considerably, for example a regular Sentence Transformer for the document encoder and a Static Embedding model for the query encoder.

In such settings, it's worthwhile to set different learning rates for different parts of the model. Because of this, v5.0 adds a `learning_rate_mapping` parameter to the Training Arguments classes. This mapping maps regular expressions over parameter names to learning rates, e.g.:
:::python
args = SentenceTransformerTrainingArguments(
    ...,
    learning_rate=2e-5,
    learning_rate_mapping={"StaticEmbedding.*": 1e-3},
)
Using these training arguments, the learning rate for every parameter whose name matches the regular expression is 1e-3, while all other parameters have a learning rate of 2e-5. Note that we use `re.search` to determine whether a parameter matches the regular expression, not `re.match` or `re.fullmatch`.
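A quick illustration of the difference (the parameter name is hypothetical):

:::python
import re

# re.search matches the pattern anywhere in the parameter name, so a nested
# parameter still receives the custom learning rate:
name = "router.query_0_StaticEmbedding.embedding.weight"
print(bool(re.search(r"StaticEmbedding.*", name)))  # True
print(bool(re.match(r"StaticEmbedding.*", name)))   # False: match anchors at the start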
## Training with composite losses
Many models are trained with just one loss, or perhaps one loss per dataset. In those cases, all losses are nicely logged in both the terminal and third-party logging tools (e.g. Weights & Biases, TensorBoard, etc.).
But if you’re using one loss that has multiple components, e.g. a SpladeLoss which sums the losses from FlopsLoss and a SparseMultipleNegativesRankingLoss behind the scenes, then you’re often left guessing whether the various loss components are balanced or not: perhaps one of the two is responsible for 90% of the total loss?
As of the v5.0 release, your loss classes can output dictionaries of loss components. The Trainer will sum them and train like normal, but each of the components will also be logged individually! In short, you can see the various loss components in addition to the final loss itself in your logs.
:::python
class SpladeLoss(nn.Module):
    ...

    def forward(
        self, sentence_features: Iterable[dict[str, torch.Tensor]], labels: torch.Tensor | None = None
    ) -> dict[str, torch.Tensor]:
        # Compute embeddings using the model
        embeddings = [self.model(sentence_feature)["sentence_embedding"] for sentence_feature in sentence_features]
        ...
        return {
            "base_loss": base_loss,
            "document_regularizer_loss": corpus_loss * self.document_regularizer_weight,
            "query_regularizer_loss": query_loss * self.query_regularizer_weight,
        }
## Small improvements
- Allow training with custom batch samplers and multi-dataset batch samplers (#3162)
- Gradient checkpointing was fixed for CrossEncoder models (#3331)
- Added the `sif_coefficient`, `token_remove_pattern`, and `quantize_to` parameters from Model2Vec to `StaticEmbedding.from_distillation(...)` (#3349)
- Added examples for semantic search using OpenSearch and Sentence Transformers (#3369)
- Added caching support to `mine_hard_negatives` (#3338)
- Added prompts support to `mine_hard_negatives` (#3334)
- You can now pass `truncate_dim` to `encode` (and `encode_query`, `encode_document`) instead of exclusively being able to set it when initializing the `SentenceTransformer`; see the sketch after this list.
- You can now access the underlying `transformers` model with `model.transformers_model`; this works for `SentenceTransformer`, `CrossEncoder`, and `SparseEncoder`, and is also shown in the sketch below.
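A small sketch of those last two improvements; the model name is just an example, and truncation assumes a model whose embeddings remain meaningful when truncated (e.g. a Matryoshka model):

:::python
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("mixedbread-ai/mxbai-embed-large-v1")

# Truncate the 1024-dimensional embeddings to 256 dimensions at encode time
embeddings = model.encode(["The weather is lovely today."], truncate_dim=256)
print(embeddings.shape)
# => (1, 256)

# Access the underlying transformers model directly
print(type(model.transformers_model))
# => <class 'transformers.models.bert.modeling_bert.BertModel'>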
See our Migration Guide for more details on the changes, as well as the documentation as a whole.
## All Changes
- [`docs`] Point to v4.1 new docs pages in index.html by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3328
- [`ci`] Attempt to avoid 429 Client Error in CI by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3342
- [`fix`, `cross-encoder`] Propagate the gradient checkpointing to the transformer model by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3331
- Fix broken links by @shacharmirkin in https://github.com/UKPLab/sentence-transformers/pull/3340
- [fix] add dtype property for fsdp2 by @meshidenn in https://github.com/UKPLab/sentence-transformers/pull/3337
- [`tests`] Update test based on M2V version by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3354
- [`docs`] Add two useful recommendations to the docs by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3353
- [`refactor`] Refactor module loading; introduce Module subclass by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3345
- feat: expose new model2vec parameters by @stephantul in https://github.com/UKPLab/sentence-transformers/pull/3349
- Reload all modules when loading the best saved checkpoint, not just the transformer one by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3360
- HF tokenizer support for word embeddings by @talbaumel in https://github.com/UKPLab/sentence-transformers/pull/3362
- Add OpenSearch-based semantic search example by @zhichao-aws in https://github.com/UKPLab/sentence-transformers/pull/3369
- Add embedding cache mechanism to avoid redundant recomputation by @daegonYu in https://github.com/UKPLab/sentence-transformers/pull/3338
- Allow passing a custom batch sampler to the trainer by @alonme in https://github.com/UKPLab/sentence-transformers/pull/3162
- fix: device name in multi-node ddp by @sasakiyori in https://github.com/UKPLab/sentence-transformers/pull/3373
- Propagate local_files_only to model card to avoid verifying dataset/base model by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3374
- Correcting TripletEvaluator.py Docstring by @johneckberg in https://github.com/UKPLab/sentence-transformers/pull/3379
- feature - mine hard negatives working with prompts by @GivAlz in https://github.com/UKPLab/sentence-transformers/pull/3334
- [`tests`] Improve robustness of model shape assertion in model2vec test by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3391
- Move precision validation before embedding computation by @mvairav in https://github.com/UKPLab/sentence-transformers/pull/3385
- [`fix`] Use transformers Peft integration instead of manual get_peft_model call by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3405
- [`v5`] Add support for Sparse Embedding models by @arthurbr11 in https://github.com/UKPLab/sentence-transformers/pull/3401
- [`docs`] Fix formatting of docstring arguments in SpladeRegularizerWeightSchedulerCallback by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3408
- [`fix`] Update .gitignore by @arthurbr11 in https://github.com/UKPLab/sentence-transformers/pull/3409
- [`fix`] Remove hub_kwargs in SparseStaticEmbedding.from_json in favor of more explicit kwargs by @tomaarsen in https://github.com/UKPLab/sentence-transformers/pull/3407
- [`docs`] Update collections links by @arthurbr11 in https://github.com/UKPLab/sentence-transformers/pull/3410
## New Contributors
- @shacharmirkin made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3340
- @meshidenn made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3337
- @talbaumel made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3362
- @zhichao-aws made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3369
- @alonme made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3162
- @sasakiyori made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3373
- @GivAlz made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3334
- @mvairav made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3385
- @arthurbr11 made their first contribution in https://github.com/UKPLab/sentence-transformers/pull/3401
## Thanks
I especially want to thank the following teams and individuals for their contributions to this release, small and large, in no particular order:
- Amazon OpenSearch, for being receptive to an integration and working together with us on the documentation and blogpost
- NAVER, for being receptive to an integration of your excellent SPLADE models
- Qdrant, for assisting with semantic search of sparse embeddings
- Prithivi Da, for being receptive to an integration of your excellent Apache 2.0 SPLADE models
- CSR authors, for working with us to integrate your architecture and open-sourcing your models alongside the integration
- Elastic, for assisting with semantic search of sparse embeddings
- IBM, for being receptive to an integration of your Sparse model
Apologies if I forgot anyone. And finally a big thanks to Arthur Bresnu, who led a lot of the work on this release. I wouldn't have been able to introduce Sparse Encoders in this fashion, in this timeline, without his excellent work.
Full Changelog: https://github.com/UKPLab/sentence-transformers/compare/v4.1.0...v5.0.0