Hi David,

Sorry for the late reply, I've been very busy.
I haven't set up the Java test classes yet. It won't take much time, but I couldn't find any free time; maybe this weekend...

As far as I know, the number of data points depends on the experimental setup and on the data resolution. It should be roughly:

((mz_f - mz_i) / data_resolution) * (scan_number_f - scan_number_i) * data_density

data_resolution: e.g., 0.001 Da (sampling step along the m/z dimension)

data_density: I have only seen low-density files (no more than 10%), but my experience is limited. I guess they can reach higher densities; what have you seen?
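
As a rough illustration of the estimate above, here is a minimal Python sketch. The function name and the example values (an m/z range of 400 to 1600, 10000 scans) are hypothetical and chosen only to show the order of magnitude; data_density is assumed to be a fraction between 0 and 1.

```python
def estimate_data_points(mz_i, mz_f, data_resolution,
                         scan_number_i, scan_number_f, data_density):
    """Rough estimate of the number of data points in a run.

    data_density is assumed to be a fraction in [0, 1], i.e. the share
    of sampled m/z positions that actually hold a data point.
    """
    mz_positions = (mz_f - mz_i) / data_resolution   # sampling steps along m/z
    scan_count = scan_number_f - scan_number_i       # scans in the run
    return mz_positions * scan_count * data_density


# Hypothetical run: 400-1600 m/z, 0.001 Da step, 10000 scans, 10% density.
estimate = estimate_data_points(400.0, 1600.0, 0.001, 0, 10000, 0.10)
print(f"about {estimate:.2e} data points")
```

With these assumed values the estimate is on the order of 10^9 data points, so even low-density files could mean a very large number of inserts if each point were stored as its own row.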


From: David Bouyssié <david.bouyssie@ipbs.fr>
To: proteowizard-mzrtree@lists.sourceforge.net
Sent: Thu 18 November 2010, 13:33:46
Subject: Re: [proteowizard-mzrtree] next week working on it

Hi !

I have some time to work on this project.

I have some questions for Sara:
- Have you set up some Java classes to test the feasibility? If so, can you share them?
- Do you know the order of magnitude of the inserts that have to be done on the data table for a big raw file (i.e. the number of data points)? I think this could be an issue.


On 11/11/2010 16:29, Sara Nasso wrote:

Sorry for my late reply, but I had dental surgery, as David knows.

@David: OK, I see why you used it. We thought of leveraging mzML for metadata, but we still have to define how, if testing goes well.

This week I can't work on this project, but next week I will. So, I'll let you know as soon as possible!



Hi Sara,

Actually, this format is really used to store the mz data of a single
run. We have other schemas for lab experiments.
The instrument and run tables each hold a single record.
This is not conventional, but it is the way I
found to set the data.

The goal of this schema was very close to the one you solved with the Rtree.
I effectively divide the mz acquisition range into slices (by default one
run_slice per amu, i.e. 1 Da).
Scans are cut into "scan slices", which are linked to the
corresponding run_slice.
The mz data points (mz_list and intensity_list) are stored in the scan_slice
in a binary structure (consecutive DOUBLE numbers).

Using SQLite's indexing mechanism on the run_slice and scan_slice
tables, it is thus possible to run fast range queries on the mz data.
However, some post-processing mz filtering has to be applied to the data
contained in the retrieved slices to keep only the wanted data points.
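
The slicing scheme described above can be sketched roughly as follows, using Python's built-in sqlite3 and struct modules. This is only an illustration under assumptions, not the actual schema: the table names run_slice and scan_slice and the mz_list/intensity_list blobs come from the description, while the column names (begin_mz, end_mz, scan_id, run_slice_id), the helper functions and the little-endian packing are hypothetical.

```python
import sqlite3
import struct


def pack_doubles(values):
    # Consecutive little-endian DOUBLE numbers (byte order is an assumption).
    return struct.pack(f"<{len(values)}d", *values)


def unpack_doubles(blob):
    return list(struct.unpack(f"<{len(blob) // 8}d", blob))


def build_db(scans, slice_width=1.0):
    """scans: dict mapping scan id -> list of (mz, intensity) pairs."""
    db = sqlite3.connect(":memory:")
    # Hypothetical columns; only the table names come from the description.
    db.execute("CREATE TABLE run_slice ("
               "id INTEGER PRIMARY KEY, begin_mz REAL, end_mz REAL)")
    db.execute("CREATE TABLE scan_slice ("
               "scan_id INTEGER, run_slice_id INTEGER, "
               "mz_list BLOB, intensity_list BLOB)")
    db.execute("CREATE INDEX idx_run_slice_mz ON run_slice (begin_mz, end_mz)")
    db.execute("CREATE INDEX idx_scan_slice_run ON scan_slice (run_slice_id)")
    for scan_id, points in scans.items():
        # Cut each scan into pieces, one per slice of width slice_width.
        by_slice = {}
        for mz, intensity in points:
            by_slice.setdefault(int(mz // slice_width), []).append((mz, intensity))
        for slice_id, pts in by_slice.items():
            db.execute("INSERT OR IGNORE INTO run_slice VALUES (?, ?, ?)",
                       (slice_id, slice_id * slice_width,
                        (slice_id + 1) * slice_width))
            db.execute("INSERT INTO scan_slice VALUES (?, ?, ?, ?)",
                       (scan_id, slice_id,
                        pack_doubles([p[0] for p in pts]),
                        pack_doubles([p[1] for p in pts])))
    return db


def query_mz_range(db, mz_min, mz_max):
    # Step 1: fetch every scan slice whose run_slice overlaps the range.
    rows = db.execute(
        "SELECT s.scan_id, s.mz_list, s.intensity_list "
        "FROM scan_slice s JOIN run_slice r ON r.id = s.run_slice_id "
        "WHERE r.end_mz > ? AND r.begin_mz <= ? "
        "ORDER BY s.scan_id, r.begin_mz", (mz_min, mz_max))
    # Step 2: post-filter the decoded points, since slices are coarser
    # than the requested range.
    hits = []
    for scan_id, mz_blob, intensity_blob in rows:
        for mz, intensity in zip(unpack_doubles(mz_blob),
                                 unpack_doubles(intensity_blob)):
            if mz_min <= mz <= mz_max:
                hits.append((scan_id, mz, intensity))
    return hits
```

Storing one row per scan slice rather than one row per data point keeps the number of inserts far below the number of data points, at the cost of the post-filtering step in query_mz_range.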

