[proteowizard-developer] Bruker CompassXtract Reader in ProteoWizard

Hi all,

The Caprioli lab here at Vanderbilt now has a license for CompassXtract, 
Bruker's data access SDK, which is equivalent to Thermo's XDK or the 
MassLynx DAL. The license permits redistribution of the binary (DLL) 
with our applications, which in a moderately broad interpretation should 
include ProteoWizard, since our applications depend on it. I am now 
developing a Reader to plug it into pwiz, but it is not clear to me 
exactly how it should work given Bruker's strange directory 
structure(s). I am not intimately familiar with Bruker data, but from 
what I have gleaned it can take one of three forms: YEP, FID, or BAF.

Should there be a separate Reader_* implementation for each of the 
forms, e.g. Reader_Bruker_YEP, Reader_Bruker_FID, Reader_Bruker_BAF, or 
should they all be supported under a single Reader_Bruker? With a single 
reader, Reader::identify() could be a little tricky, because the same 
reader would need to return more than two kinds of values (currently the 
result is either an empty string or a non-empty one).

Also, at least in the FID case and probably in the YEP and BAF cases as 
well, the API treats the format like the MassLynx DAL does, i.e. the 
directory itself is the source. With FID data this is particularly 
interesting because each FID is a single spectrum with a separate 
directory structure. It may be possible and desirable in some cases to 
automatically read in multiple FID sources under a single directory to 
create the "run" of spectra that is intrinsic to mzML and ProteoWizard. 
Thoughts on this?

CC'd to spctools-dev because I expect they'll be interested.

-Matt