From: Brian P. <bri...@in...> - 2012-04-16 21:57:44
For zero removal I imagine we'd add a method "MS_thresholding" to the record; what about zero adding?

On Mon, Apr 16, 2012 at 1:38 PM, Brian Pratt <bri...@in...> wrote:
> I like it.
>
> On Mon, Apr 16, 2012 at 1:36 PM, Matthew Chambers
> <mat...@gm...> wrote:
>> I'm fine with that except "NonFlanking" might be better as "Extra" since there's a non-obvious
>> "ToNonZeroSample" inference needed to understand the former.
>>
>> -Matt
>>
>>
>> On 4/16/2012 1:56 PM, Brian Pratt wrote:
>>> Still unconvinced, but if one goes that route, rather than
>>>   zeroSamples [exclude|include[=flankingZeroCount]] [ms-levels int_set]
>>> it might be more understandable as
>>>   zeroSamples [removeNonFlanking|addMissing[=flankingZeroCount]]
>>>   [ms-levels int_set]
>>>
>>>
>>> On Mon, Apr 16, 2012 at 11:43 AM, Matthew Chambers
>>> <mat...@gm...> wrote:
>>>> On 4/16/2012 1:06 PM, Brian Pratt wrote:
>>>>> Hi Matt,
>>>>>
>>>>> Thanks for your help on this.
>>>>>
>>>>>> I do still like the idea of combining the include/exclude feature in a single filter
>>>>> The more I think about it, filtering out something and making an
>>>>> educated guess at putting something in don't strike me as symmetrical.
>>>>> I think combining them will be confusing to users. And zero sample
>>>>> filling isn't actually what I've been asked to work on, anyway.
>>>> Adding and removing zero samples are definitely symmetrical as far as I'm concerned. Their purposes
>>>> (reducing file size vs. making popular data filters like Savitzky-Golay work) are not. Since what
>>>> they /do/ is converse, they make sense in a single filter. And if only for the sake of reducing
>>>> filter bloat, I'd like to see them together. It's bad enough having "precursorRefine" and
>>>> "precursorRecalculation" :P
>>>>
>>>>
>>>>>> The ms levels argument should be an IntegerSet like the other filters, not a space delimited set.
>>>>> It is an IntegerSet.
>>>>> An IntegerSet can contain multiple intervals,
>>>>> and an interval can have a width of one. So "1-3" and "1 2 3" are
>>>>> equivalent, but more importantly you can specify "1 3" if you want to
>>>>> leave "2" out of things. I was thinking that MSConvertGUI would be
>>>>> more useful if we didn't have two text box controls for layer ranges,
>>>>> but rather just a single one - more like the "Pages" control found in
>>>>> most Print... dialog boxes where you can say "1 3 5-7" etc.
>>>> Oops, I was mistaken in thinking IntegerSet handled multiple intervals in comma-delimited form. It
>>>> is indeed space delimited. So as long as the ms level set is the last or only argument, it's easy to
>>>> pass that part of the string to IntegerSet.
>>>>
>>>> -Matt
>>>>
>>>>
>>>>>
>>>>> On Fri, Apr 13, 2012 at 7:16 PM, Matt Chambers
>>>>> <mat...@gm...> wrote:
>>>>>> Hi Brian,
>>>>>>
>>>>>> I do still like the idea of combining the include/exclude feature in a
>>>>>> single filter, although I admit that calling something a "filter" when
>>>>>> it adds data is odd (then again it's not really changing data, in the
>>>>>> sense of changing signal/noise). If we do it that way then the verb can
>>>>>> be the argument exclude/include. I don't think the "NonFlanking" is
>>>>>> necessary. The ms levels argument should be an IntegerSet like the other
>>>>>> filters, not a space delimited set. Are you skipping already centroided
>>>>>> scans? You can skip those for an include mode, and remove all 0s for the
>>>>>> exclude mode.
>>>>>>
>>>>>> If we follow the noun convention of the current filters, it would look
>>>>>> reasonable as:
>>>>>>   --filter "zeroSamples include 2-3"
>>>>>>
>>>>>> The include mode would simply call ZeroSampleFiller. The problem there
>>>>>> is that it really needs a count argument: I don't know that it's ever
>>>>>> necessary to have ALL zero samples for any signal processing algorithm,
>>>>>> just a large enough set of samples around each data cluster.
>>>>>> So to handle that, how about:
>>>>>>   zeroSamples [exclude|include[=flankingZeroCount]] [ms-levels int_set]
>>>>>>
>>>>>> I realize the =flankingZeroCount optional part makes parsing a bit
>>>>>> harder, so if you want to leave that out and pass some default value
>>>>>> like 10 and wash your hands of the affair I'm fine with it. Have you made
>>>>>> your filter using a generic array processing model like
>>>>>> ZeroSampleFiller? I think that's best since it can be used for both
>>>>>> chromatograms and spectra. Also, the first and last sample of a full
>>>>>> scan profile mode spectrum should always be 0, so treat those as edge cases.
>>>>>>
>>>>>> Thanks!
>>>>>> -Matt
>>>>>>
>>>>>>
>>>>>> On 4/13/2012 12:20 PM, Brian Pratt wrote:
>>>>>>> OK, newest msconvert commandline option:
>>>>>>>
>>>>>>> # remove non-flanking zero value samples
>>>>>>> msconvert data.RAW --filter "NonFlankingZeroSamples"
>>>>>>>
>>>>>>> # remove non-flanking zero value samples in MS2 and MS3 only
>>>>>>> msconvert data.RAW --filter "NonFlankingZeroSamples 2 3"
>>>>>>>
>>>>>>> I am not in love with the name - the general (but not wholly
>>>>>>> consistent) theme seems to be don't use verbs in the name, the verb
>>>>>>> "filter" is implied. But I think "RemoveNonFlankingZeros" might be
>>>>>>> more descriptive. Opinions? The working name "NonFlankingZeroSamples"
>>>>>>> is me trying to be consistent with the somewhat related idea of
>>>>>>> ZeroSampleFiller, but as this is user-facing I don't know if that
>>>>>>> naming consistency is important.
>>>>>>>
>>>>>>> Note that it doesn't try to be clever, it just looks for runs of 0s in
>>>>>>> the intensity lists and nukes the middles, along with the
>>>>>>> corresponding mz entries. Doesn't think about distance from sample to
>>>>>>> sample or anything.
>>>>>>>
>>>>>>> This makes our output not quite as small as ReAdW, but more useful for
>>>>>>> downstream work like peakpicking.
>>>>>>> And you could use our existing
>>>>>>> filtering to simply eliminate all zeros, ReAdW-style, if that's what
>>>>>>> you wanted.
>>>>>>>
>>>>>>> examples:
>>>>>>> <start of file>
>>>>>>> 1 ms1 350.0013 0.00
>>>>>>> 1 ms1 350.0025 0.00
>>>>>>> 1 ms1 350.0038 0.00
>>>>>>> 1 ms1 350.0050 0.00
>>>>>>> 1 ms1 350.5251 0.00
>>>>>>> 1 ms1 350.5264 0.00
>>>>>>> 1 ms1 350.5276 0.00
>>>>>>> 1 ms1 350.5289 0.00
>>>>>>> 1 ms1 350.5301 3677.97
>>>>>>> 1 ms1 350.5313 5696.06
>>>>>>> becomes
>>>>>>> <start of file>
>>>>>>> 1 ms1 350.5289 0.00
>>>>>>> 1 ms1 350.5301 3677.97
>>>>>>> 1 ms1 350.5313 5696.06
>>>>>>>
>>>>>>> and
>>>>>>> 1 ms1 350.6033 5567.16
>>>>>>> 1 ms1 350.6045 3515.25
>>>>>>> 1 ms1 350.6057 0.00
>>>>>>> 1 ms1 350.6070 0.00
>>>>>>> 1 ms1 350.6082 0.00
>>>>>>> 1 ms1 350.6095 0.00
>>>>>>> 1 ms1 350.8030 0.00
>>>>>>> 1 ms1 350.8042 0.00
>>>>>>> 1 ms1 350.8055 0.00
>>>>>>> 1 ms1 350.8067 0.00
>>>>>>> 1 ms1 350.8080 1414.26
>>>>>>> 1 ms1 350.8092 5596.06
>>>>>>> becomes
>>>>>>> 1 ms1 350.6033 5567.16
>>>>>>> 1 ms1 350.6045 3515.25
>>>>>>> 1 ms1 350.6057 0.00
>>>>>>> 1 ms1 350.8067 0.00
>>>>>>> 1 ms1 350.8080 1414.26
>>>>>>> 1 ms1 350.8092 5596.06
>>>>>>>
>>>>>>> end of file behavior is mirror of start of file behavior.
>>>>>>>
>>>>>>> Brian
>>>>
>>>> ------------------------------------------------------------------------------
>>>> For Developers, A Lot Can Happen In A Second.
>>>> Boundary is the first to Know...and Tell You.
>>>> Monitor Your Applications in Ultra-Fine Resolution. Try it FREE!
>>>> http://p.sf.net/sfu/Boundary-d2dvs2
>>>> _______________________________________________
>>>> proteowizard-developer mailing list
>>>> pro...@li...
>>>> https://lists.sourceforge.net/lists/listinfo/proteowizard-developer
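
For readers following the thread, the zero-removal rule described in the 4/13 message (find runs of 0s in the intensity list, drop the middles along with the matching m/z entries, keep the zeros that flank real signal) can be sketched roughly as below. This is only an illustration in Python, not the ProteoWizard C++ code: the function name `remove_non_flanking_zeros` and the `flank` parameter are hypothetical, and the start/end-of-file handling is an assumption read off the before/after listings in the thread.

```python
from typing import List, Sequence, Tuple


def remove_non_flanking_zeros(
    mz: Sequence[float],
    intensity: Sequence[float],
    flank: int = 1,
) -> Tuple[List[float], List[float]]:
    """Drop zero-intensity samples that are not within `flank` positions
    of a non-zero sample. Hypothetical helper, not an msconvert API.

    Like the filter described in the thread, this looks only at array
    positions; it does not consider m/z spacing between samples.
    """
    if len(mz) != len(intensity):
        raise ValueError("mz and intensity must have the same length")
    if flank < 0:
        raise ValueError("flank must be non-negative")

    n = len(intensity)
    keep = [False] * n
    for i, value in enumerate(intensity):
        if value != 0.0:
            # Mark the non-zero sample and its neighbours on both sides.
            lo = max(0, i - flank)
            hi = min(n, i + flank + 1)
            for j in range(lo, hi):
                keep[j] = True

    kept_mz = [m for m, k in zip(mz, keep) if k]
    kept_intensity = [v for v, k in zip(intensity, keep) if k]
    return kept_mz, kept_intensity


if __name__ == "__main__":
    # Second before/after listing from the thread.
    mz = [350.6033, 350.6045, 350.6057, 350.6070, 350.6082, 350.6095,
          350.8030, 350.8042, 350.8055, 350.8067, 350.8080, 350.8092]
    intensity = [5567.16, 3515.25, 0.0, 0.0, 0.0, 0.0,
                 0.0, 0.0, 0.0, 0.0, 1414.26, 5596.06]
    for m, v in zip(*remove_non_flanking_zeros(mz, intensity)):
        print(f"{m:.4f} {v:.2f}")
```

With `flank=1` this keeps the same six rows as the second listing above, and on the start-of-file listing it keeps only the last leading zero plus the two non-zero samples. A larger `flank` would correspond loosely to the `flankingZeroCount` idea discussed for the add/include mode, though that mode inserts samples rather than keeping existing ones.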