From: Harry M. <man...@ho...> - 2001-02-02 19:33:24
Hi Tony,

I think there is a bug in the CyberT interface (or R code). Michael Pear, who has been stress testing the whole of GeneX, is generating C+E data that he has loaded into his GeneX database and is trying to analyze it via the CyberT C+E tool. He found that

===>
2) There appears to be a "minrep" level set to 2 when calling "doitall" that is to prevent t-testing on too small a number of replicates. However, it seems that there is still a "p" value and fold computed for spots that violate this minimum requirement...am I missing something? We are finding that some of our best "p" values are for spots that do not satisfy the "minrep" requirement.
<===

Have I gotten this right below? And if so, there is a mismatch between what I understand to be happening and what the doitall script is doing. Is my suggestion below the correct approach or is there a deeper bit that I'm missing?

I've partially tracked it down to bad/misleading documentation at minimum. The values that he's trying to analyze have negative #s in them. He currently has the option of indicating that they are below background (in which case they are set to 0). That's what this means in the C+E form:

==
Values less than [ ] will be set to 0 and ignored in calculations.
(Leave blank to include all values).
==

It turns out that the value is actually NOT ignored in calculations. It's USED in the calculations AS ZERO, as ZERO is a valid number for this calculation, so a spot whose replicates were zeroed still counts them toward "minrep". I think the interface should allow you to set the number to the cutoff (or even another number) or to set it to 'NA' (which indicates a true missing value - see below).

However, you CAN tell the program to IGNORE the values by setting negative values to 'NA' (as is described in the help, but that's off the main page). That results in what you would expect for the above case.

This is also reasonably easy to fix: the above FORM widget should be rephrased as:

==
Leave the following value blank to include all values.
Values less than [___] will be set to
  <> that value and included in calculations as that value
  <> NA and ignored in further calculations.
where <> is a radio-type button.
==

This should also be available from both the DB data entry and the upload data entry.

Thanks
hjm

Harry Mangalam wrote:
>
> Hi Michael,
>
> Is this from data passed in directly from a database query, or did you upload it from a file? The difference is that if it comes in from the DB, some variables are already set (the minrep param is one of them I think).
>
> However the fact that minrep is getting set to 2 and that's getting passed to doitall is weird. I'm checking now..
>
> And attached to this is the replacement 'dendro.jar' file that may fix your applet problem. (actually this has been deleted b/c of the size restriction on the list - I sent it and then I as admin got the request to approve it... proving the point that idiots rule!)
>
> hjm
>
> Michael Pear wrote:
> >
> > > > 2) There appears to be a "minrep" level set to 2 when calling "doitall" that is to prevent t-testing on too
> > > > small a number of replicates. However, it seems that there is still a "p" value and fold computed for spots
> > > > that violate this minimum requirement...am I missing something? We are finding that some of our best
> > > > "p" values are for spots that do not satisfy the "minrep" requirement.
> >
> > > Hmm - that shouldn't be the case - it should refuse to calculate these. No wonder that the best p's are with these - they'd tend to have less variability. This is with the C+E cybert?
> >
> > Yes, this is with C+E cybert. I have attached a snippet from the output of one of the runs.
> > See, for example, the line "125", which gives a p value of ~.08, but has only one replicate from the
> > control set.
> >
> > Michael Pear
> >

--
Cheers, Harry
Harry J Mangalam -- (949) 856 2847 (v&f) -- hj...@nc... || man...@ho...
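
To make the zero-versus-NA distinction above concrete, here is a minimal sketch in Python (not the R code that CyberT actually uses). The function names, the sample intensities, and the way the minrep check counts replicates are all assumptions made for illustration; doitall may well count replicates differently.

```python
import math

def apply_cutoff(values, cutoff, fill):
    """Replace every value below `cutoff` with `fill`.

    `fill` is either a number (e.g. 0.0) or math.nan to mark the
    value as missing. A cutoff of None leaves the values untouched.
    """
    if cutoff is None:
        return list(values)
    return [fill if v < cutoff else v for v in values]

def valid_replicates(values):
    """Return only the replicates that are not missing (NaN)."""
    return [v for v in values if not math.isnan(v)]

def passes_minrep(control, experimental, minrep=2):
    """Hypothetical minrep check: both groups need >= minrep valid values."""
    return (len(valid_replicates(control)) >= minrep
            and len(valid_replicates(experimental)) >= minrep)

# Made-up intensities for one spot; two control replicates are negative.
control = [-12.0, -3.5, 410.0]
experimental = [388.0, 402.0, 395.0]

as_zero = apply_cutoff(control, 0.0, fill=0.0)     # zeros are still valid numbers
as_na = apply_cutoff(control, 0.0, fill=math.nan)  # NaN marks a missing value

print(passes_minrep(as_zero, experimental))  # zeros count toward minrep
print(passes_minrep(as_na, experimental))    # only one valid control replicate
```

With the zero fill, the spot keeps three "valid" control replicates and slips past the minrep check; with the NA fill, only one control replicate remains and the spot is rejected, which is the behaviour Michael expected.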