RuFlo - Browse /v3.10.30 at SourceForge.net


Name	Modified	Size	Downloads / Week
README.md	2026-05-31	2.1 kB	0
v3.10.30 -- 4-dataset BEIR (rank 3_11 mean) + config-divergence finding source code.tar.gz	2026-05-31	25.4 MB	1
v3.10.30 -- 4-dataset BEIR (rank 3_11 mean) + config-divergence finding source code.zip	2026-05-31	28.5 MB	0
Totals: 3 Items		54.0 MB	1

What ships

4th BEIR dataset (SciDocs) joins NFCorpus + SciFact + ArguAna. New finding: no single pipeline wins everywhere.

SciDocs results

Pipeline	nDCG@10	Rank / delta
dense alone (BGE-base)	0.211	2/11
Lucene RRF (no rerank)	0.203	-0.008 vs dense alone (RRF hurt)

On SciDocs, dense alone (BGE-base, 0.211) is behind only BGE-large (335M, 0.225). It beats BM25, GTR-XL (1.2B), and every other published baseline in the 11-system comparison.
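For readers unfamiliar with RRF (Reciprocal Rank Fusion), here is a minimal sketch of the textbook formulation, where each document scores the sum of 1 / (k + rank) over the input rankings. This is not necessarily how ruflo implements it: the constant k=60, the choice of one lexical plus one dense ranking as inputs, and the document ids are all assumptions for illustration.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse ranked lists of doc ids; each doc scores sum(1 / (k + rank))."""
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by doc id for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


lexical_ranking = ["d3", "d1", "d2"]  # hypothetical BM25-style ranking
dense_ranking = ["d1", "d2", "d4"]    # hypothetical dense ranking
fused = reciprocal_rank_fusion([lexical_ranking, dense_ranking])
print(fused)  # ['d1', 'd2', 'd3', 'd4']
```

Because RRF uses ranks only, a weaker input ranking gets the same vote as a stronger one. That is one plausible reason fusion can cost a few points on a corpus where the dense ranking is already clearly better.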

4-dataset mean leaderboard

System	Params	NFCorpus	SciFact	ArguAna	SciDocs	Mean
BGE-large (published)	335M	0.380	0.722	0.636	0.225	0.491
SPLADE++ (published)	110M	0.347	0.704	0.521	0.159	0.433
ruflo best (per-dataset)	110M	0.358	0.683	0.432	0.211	0.421
GTR-XL	1.2B	0.343	0.662	0.439	0.174	0.405
GenQ	110M	0.319	0.644	0.493	0.143	0.400
BM25 (Lucene published)	—	0.325	0.679	0.397	0.158	0.390

Rank 3 of 11 on the 4-dataset mean. Beats GTR-XL with roughly one tenth of the parameters (110M vs 1.2B). Loses only to SPLADE++ (-0.012, basically tied) and BGE-large (-0.070, mostly the ArguAna gap).
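The means and gaps can be re-derived from the per-dataset numbers. The sketch below uses only the six systems listed in the table (the other five of the 11 are not shown here), and the variable names are illustrative.

```python
# nDCG@10 per dataset, copied from the leaderboard table:
# (NFCorpus, SciFact, ArguAna, SciDocs)
scores = {
    "BGE-large": (0.380, 0.722, 0.636, 0.225),
    "SPLADE++": (0.347, 0.704, 0.521, 0.159),
    "ruflo best": (0.358, 0.683, 0.432, 0.211),
    "GTR-XL": (0.343, 0.662, 0.439, 0.174),
    "GenQ": (0.319, 0.644, 0.493, 0.143),
    "BM25": (0.325, 0.679, 0.397, 0.158),
}

# Unweighted mean over the four datasets, then sort best-first.
means = {name: sum(vals) / len(vals) for name, vals in scores.items()}
leaderboard = sorted(means, key=means.get, reverse=True)

for rank, name in enumerate(leaderboard, start=1):
    print(f"{rank}. {name:<11} {means[name]:.3f}")

# How much of the mean gap to BGE-large comes from ArguAna alone.
arguana_gap = (scores["BGE-large"][2] - scores["ruflo best"][2]) / 4
total_gap = means["BGE-large"] - means["ruflo best"]
print(f"ArguAna accounts for {arguana_gap:.3f} of the {total_gap:.3f} mean gap")
```

ArguAna alone contributes about 0.051 of the 0.070 gap to BGE-large, which is what "mostly the ArguAna gap" refers to.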

The config-divergence finding

After 4 datasets, no single pipeline wins everywhere:

Dataset	Best config	What hurts
NFCorpus	Lucene + RRF + CE rerank	nothing
SciFact	Lucene + RRF + CE rerank	nothing
ArguAna	Lucene + RRF (no CE)	CE rerank actively hurts
SciDocs	dense alone	RRF hurt by 0.008

The four datasets split across three distinct best configs; only NFCorpus and SciFact agree. Auto-pipeline-selection would need a per-corpus calibrator (cheap, doesn't need GPU; tracked).
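As a rough idea of what a per-corpus calibrator could look like: score each candidate config on a small labelled sample of the corpus and keep the winner. This is a hypothetical sketch, not the tracked design. `select_pipeline` and its input shape are invented for illustration, and the example reuses the two SciDocs numbers from the results table in place of real calibration scores.

```python
def select_pipeline(calibration_scores):
    """Pick the config with the highest calibration nDCG@10.

    calibration_scores maps config name -> nDCG@10 on a calibration sample.
    Returns (best_config, margin over the runner-up).
    """
    if not calibration_scores:
        raise ValueError("need at least one config to choose from")
    ranked = sorted(calibration_scores.items(), key=lambda kv: kv[1], reverse=True)
    best, best_score = ranked[0]
    # With a single candidate there is no runner-up, so the margin is 0.
    runner_up = ranked[1][1] if len(ranked) > 1 else best_score
    return best, best_score - runner_up


# SciDocs numbers from the results table, standing in for calibration scores.
scidocs = {"dense alone": 0.211, "Lucene RRF (no rerank)": 0.203}
config, margin = select_pipeline(scidocs)
print(config, round(margin, 3))  # dense alone 0.008
```

A real calibrator would need to score on queries held out from the reported test set, and a margin as small as 0.008 may be within noise on a small sample.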

Honest limits

Only 4 of 18 BEIR datasets are covered. The 0.421 mean is suggestive, not a full BEIR average.
Zero-shot: NFCorpus and ArguAna train splits remain unused.
Five of the larger BEIR datasets (TREC-COVID, FiQA, HotpotQA, NQ, DBPedia, all >50k docs) remain GPU-gated.

Install

    npx ruflo@3.10.30    # latest / alpha / v3alpha all aligned

Full ADR: v3/docs/adr/ADR-091-scidocs-and-config-divergence.md

Source: README.md, updated 2026-05-31

RuFlo Files

The leading agent orchestration platform for Claude
