| Name | Modified | Size |
|---|---|---|
| 2.13.4 source code.tar.gz | 2026-05-16 | 10.8 MB |
| 2.13.4 source code.zip | 2026-05-16 | 13.1 MB |
| README.md | 2026-05-16 | 2.0 kB |
## 2.13.4 (2026-05-16)
### Documentation

- docs: Update documentation for caching by default in `task.calculate_descriptive_statistics()` (#4676)
  - Update documentation for caching by default in `task.calculate_descriptive_statistics()`
  - linter
  - remove extra logging
  - fix indentation
  - Apply suggestions from code review

    Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me> (0884d1e)
### Fix

- fix: add AMI as clustering metric (#4654)
  - feat: always compute AMI alongside V-measure in clustering

    Adds Adjusted Mutual Information (AMI) as a clustering metric alongside V-measure. Both metrics are computed on every bootstrap iteration via the `_METRIC_FUNCS` registry, so tasks get both scores without any opt-in.

    The `v_measures` result key is retained for backward compatibility; AMI scores are exposed under `ami_scores` / `ami` / `ami_std`.

    Motivated by review on PR #4609 (HumanConceptsClustering), whose source paper (arXiv:2505.17117) uses AMI as its main metric.
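The registry pattern described above can be sketched as follows. This is a minimal illustration using scikit-learn's metric functions, not mteb's actual `_METRIC_FUNCS` implementation; the `METRIC_FUNCS` and `score_clustering` names are hypothetical.

```python
from sklearn.metrics import adjusted_mutual_info_score, v_measure_score

# Hypothetical registry: every entry is computed on each bootstrap
# iteration, so adding a metric here exposes it to all tasks.
METRIC_FUNCS = {
    "v_measure": v_measure_score,
    "ami": adjusted_mutual_info_score,
}

def score_clustering(true_labels, pred_labels):
    """Return every registered metric for one bootstrap iteration."""
    return {name: fn(true_labels, pred_labels) for name, fn in METRIC_FUNCS.items()}

truth = [0, 0, 1, 1, 2, 2]
preds = [1, 1, 0, 0, 2, 2]  # same partition, permuted cluster ids
scores = score_clustering(truth, preds)
# Both metrics are permutation-invariant, so a relabelled perfect
# clustering still scores 1.0 on each.
```

Keeping the metric functions in a single mapping means the bootstrap loop never has to know which metrics exist, which is what makes the both-by-default behaviour cheap to maintain.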
  - refactor: move `metric_funcs` inside `_evaluate_clustering_bootstrapped`

    The mapping is only used inside the bootstrap loop, so scoping it to the function reads more naturally and keeps the module top level lean.
  - test: cover v_measure + ami in clustering evaluation

    Adds focused tests for the clustering evaluation path:
    - Direct unit tests on `_evaluate_clustering_bootstrapped`: the registry returns both metrics, perfect labels score near 1, and random labels give AMI near 0 (chance correction).
    - End-to-end test that runs `mteb.evaluate` on `MockClusteringFastTask` and asserts the result dict exposes `v_measure`, `v_measure_std`, `v_measures`, `ami`, `ami_std`, and `ami_scores`.
  - simplify

    Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com> (26a44b3)
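The chance-correction property the tests rely on can be demonstrated directly with scikit-learn. This is a standalone sketch, not the mteb test suite; the sample size, cluster count, and seed are arbitrary choices.

```python
import numpy as np
from sklearn.metrics import adjusted_mutual_info_score, v_measure_score

rng = np.random.default_rng(0)
truth = rng.integers(0, 10, size=200)         # 200 points, 10 true clusters
random_preds = rng.integers(0, 10, size=200)  # labels unrelated to truth

ami = adjusted_mutual_info_score(truth, random_preds)
vm = v_measure_score(truth, random_preds)
# AMI subtracts the mutual information expected by chance, so random
# assignments score near zero; V-measure has no such correction and
# retains a positive bias that grows with the number of clusters.
```

This bias is exactly why reporting AMI alongside V-measure matters: on tasks with many clusters, V-measure alone can make a random baseline look deceptively competent.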