Sorry for my late reply; I had dental surgery, as David knows.

@David: OK, I see why you used it. We thought of leveraging mzML for metadata, but we still have to define how, if testing goes well.

This week I can't work on this project, but next week I will. So, I'll let you know as soon as possible!



Hi Sara,

Actually, this format is really used to store the mz data of a single
run. We have other schemas for lab experiments.
The instrument and run tables each hold a single record.
This is not conventional, but it is the way I found to organize the data.

The goal of this schema was very close to the one you solved with the Rtree.
I effectively divide the mz acquisition range into slices (by default one
run_slice per amu).
Scans are cut into "scan slices", and each one is linked to the
corresponding run_slice.
mz data points (mz_list and intensity_list) are stored in the scan_slice
in a binary structure (consecutive DOUBLE numbers).
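
To make the slicing concrete, here is a minimal Python sketch of how a scan
could be cut into scan slices and packed. It is only an illustration, not the
actual implementation: the function names, the floor(mz / slice_width) rule
for choosing a slice, and the little-endian byte order are all assumptions.

```python
import math
import struct
from collections import defaultdict


def slice_scan(mz_list, intensity_list, slice_width=1.0):
    """Group a scan's data points by run-slice index.

    Assumption: a point belongs to slice floor(mz / slice_width), so the
    default width of 1.0 gives one slice per mass unit.
    """
    slices = defaultdict(lambda: ([], []))
    for mz, intensity in zip(mz_list, intensity_list):
        index = math.floor(mz / slice_width)
        slices[index][0].append(mz)
        slices[index][1].append(intensity)
    return dict(slices)


def pack_doubles(values):
    """Serialize numbers as consecutive 8-byte doubles.

    Assumption: little-endian byte order; the real format may differ.
    """
    return struct.pack(f"<{len(values)}d", *values)


def unpack_doubles(blob):
    """Inverse of pack_doubles: decode consecutive 8-byte doubles."""
    return list(struct.unpack(f"<{len(blob) // 8}d", blob))
```

Each (mz, intensity) pair of lists returned by slice_scan would become one
scan_slice row, with both lists packed by pack_doubles before being stored.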

Using the indexing mechanism of SQLite on the run_slice and
scan_slice tables, it is thus possible to make fast range queries on the mz data.
However, some post-processing mz filtering has to be done on the data
contained in the retrieved slices to keep only the wanted data points.
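
A range query followed by the post-filtering step might look like the sketch
below, using Python's sqlite3 module. Only the table names run_slice and
scan_slice and the fields mz_list and intensity_list come from the description
above; every other column name, the index definitions, the half-open slice
bounds, and the little-endian packing are assumptions made for illustration.

```python
import sqlite3
import struct

# Hypothetical schema: all column names except mz_list and intensity_list
# are assumptions.
SCHEMA = """
CREATE TABLE run_slice (
    id       INTEGER PRIMARY KEY,
    begin_mz REAL NOT NULL,   -- inclusive lower bound (assumed)
    end_mz   REAL NOT NULL    -- exclusive upper bound (assumed)
);
CREATE TABLE scan_slice (
    id             INTEGER PRIMARY KEY,
    scan_id        INTEGER NOT NULL,
    run_slice_id   INTEGER NOT NULL REFERENCES run_slice(id),
    mz_list        BLOB NOT NULL,
    intensity_list BLOB NOT NULL
);
CREATE INDEX run_slice_mz_idx ON run_slice (begin_mz, end_mz);
CREATE INDEX scan_slice_run_slice_idx ON scan_slice (run_slice_id);
"""


def create_schema(conn):
    """Create the two tables and their indexes."""
    conn.executescript(SCHEMA)


def unpack_doubles(blob):
    """Decode consecutive little-endian 8-byte doubles (assumed layout)."""
    return struct.unpack(f"<{len(blob) // 8}d", blob)


def query_mz_range(conn, mz_min, mz_max):
    """Return (scan_id, mz, intensity) tuples with mz_min <= mz <= mz_max."""
    # Step 1: fetch every scan slice whose run slice overlaps the range.
    rows = conn.execute(
        """
        SELECT ss.scan_id, ss.mz_list, ss.intensity_list
        FROM scan_slice AS ss
        JOIN run_slice AS rs ON rs.id = ss.run_slice_id
        WHERE rs.begin_mz <= ? AND rs.end_mz > ?
        ORDER BY ss.scan_id, rs.begin_mz
        """,
        (mz_max, mz_min),
    )
    # Step 2: post-filter, because the first and last slices usually
    # extend beyond the requested range.
    points = []
    for scan_id, mz_blob, intensity_blob in rows:
        pairs = zip(unpack_doubles(mz_blob), unpack_doubles(intensity_blob))
        for mz, intensity in pairs:
            if mz_min <= mz <= mz_max:
                points.append((scan_id, mz, intensity))
    return points
```

The SQL only narrows the search down to whole slices; the loop afterwards is
what discards the points that lie outside the requested bounds.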