From: Harry M. <man...@ho...> - 2001-01-27 06:29:40
Hi Michael,
OK - I've found it, altho it's a bit bizarre...
The CyberT C&E program tests for a param called TOOLOW, and if it's not set, it is effectively undef, or zero in a numeric comparison. Therefore anything less than 0 (when it's not set) gets set to zero.
In the DB file data transfer, for some reason, TOOLOW never tests as 0 - it's undef but not zero (Jason can probably explain this).
I've patched CyberT to behave the way it should if TOOLOW is supposed to be undef, but it's deeper than this.
The upshot is that in the file upload, the R lib never sees the negative numbers - they're always caught and converted to 0. In the DB pass-thru, the negatives get passed to R, where it chokes, so the R lib has to be patched to fix this.
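A rough sketch of the difference between the two paths, in Python rather than the actual CyberT Perl. The function name clamp_low, the treat_unset_as_zero switch and the sample values are made up for illustration, and replacing low values with the floor itself (rather than always 0) is an assumption:

```python
def clamp_low(values, toolow=None, treat_unset_as_zero=True):
    """Replace every value below the TOOLOW floor with the floor itself.

    toolow=None models an unset parameter. With treat_unset_as_zero=True an
    unset floor behaves like 0 (the file-upload behaviour described above);
    with False, an unset floor means no clamping at all (the DB pass-thru).
    """
    if toolow is None:
        if not treat_unset_as_zero:
            return list(values)  # no floor: negatives pass through untouched
        toolow = 0
    return [toolow if v < toolow else v for v in values]


raw = [120.0, -3.5, 0.0, 48.2]                        # hypothetical readings
upload_path = clamp_low(raw)                          # negative is set to 0
db_path = clamp_low(raw, treat_unset_as_zero=False)   # negative reaches R
```

In the first call the -3.5 never gets as far as the R lib; in the second it does, which is the case that blows up.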
You get the weird bug of the day badge for this one.
The Paired CyberT won't be affected as it gets passed ratios, which will NOT be negative (unless you force them to be by insisting that they be formed by using these straight raw numbers...?)
So on to the author...
Tony, running the attached data set thru CyberT C+E, we get the following error for the reasons stated above. The reason for including negative numbers is that sometimes people want, for their own reasons, to include even obviously bad data if it more accurately reflects the numbers coming off the scanner. So doitall() has to support this and it's choking here:
> library(hdarray,lib.loc='/var/genex/local/lib/R/library')
> ss <- read.table("/var/genex/tmp/genex/hda1874980579171/data",sep="\t")
> doitall(ss, 2, 4, 5, 7, 7, 0.25, 2)
Error in cov(x, y, use = use) : missing observations in cov/cor
There's no cov() in doitall, but there are lots of var() calls, which I think share the same error message. I'll debug if you can't, but you can do this 20x faster than I.
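One possible mechanism, sketched in Python rather than R and not checked against the hdarray source: if the data are log-transformed somewhere along the way (an assumption), a negative reading turns into a missing value, and a variance call that doesn't skip missing values then fails. The helper names safe_log and variance_ignoring_missing and the replicate values below are hypothetical:

```python
import math


def safe_log(value):
    """Natural log that yields NaN for non-positive input instead of raising."""
    return math.log(value) if value > 0 else math.nan


def variance_ignoring_missing(values):
    """Sample variance over the non-NaN entries; None if fewer than two remain."""
    kept = [v for v in values if not math.isnan(v)]
    if len(kept) < 2:
        return None  # not enough observations to estimate a variance
    mean = sum(kept) / len(kept)
    return sum((v - mean) ** 2 for v in kept) / (len(kept) - 1)


replicates = [150.0, -12.0, 98.0]            # one negative scanner reading
logged = [safe_log(v) for v in replicates]   # the negative becomes NaN
result = variance_ignoring_missing(logged)   # uses the two valid values only

# A gene whose replicates are all zero or negative has nothing left to use.
all_bad = variance_ignoring_missing([safe_log(v) for v in [-1.0, 0.0, -7.0]])
```

A patch along these lines would also need to decide what to report for a gene like all_bad, where no usable observations remain in a group.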
Can you give it a try?
HJM
Michael Pear wrote:
>
> Hi Harry,
>
> Things work for the Hatfield data set, but not for
> the one I submitted (ShearStress). I don't think the library location was
> an issue, there must be something where the library stuff is
> set during install. I can see a different error if I mess up the path info..
>
> Any problem with our data being "integer" data?
> I'm getting the error
>
> Error in cov(x, y, use = use) : missing observations in cov/cor
> In addition: There were 50 or more warnings (use warnings() to see the first 50)
> Execution halted
>
> Retrieves the columns of data with no problem. Any idea of what to look for in the data that would
> be considered "missing" observations? What if
> there are 0 observations for the control or experiment group because
> of negative/0 values for all the replicates?
>
> Let me know what I can test. Is there a debug flag I can turn on in
> R to trace the execution?
>
> Regards,
>
> Michael Pear
>
> ----- Original Message -----
> From: "Harry Mangalam" <man...@ho...>
> To: "Michael Pear" <mic...@ho...>
> Sent: Friday, January 26, 2001 6:40 PM
> Subject: Re: Data with negative numbers...
>
> > Hi Michael,
> >
> > I *WAS* joking about the insane locations..:) - and yup - we haven't seen those odd locations, so
> > while the R libs were picked up in my install, I'm assuming that they were picked up from alternative
> > 'std' locations (altho I tried to wipe them out to see if it still worked and it still did...).
> >
> > Oh well, did the below fix work?
> >
> > hjm
> >
> >
> > Michael Pear wrote:
> > >
> > > Hi Harry,
> > > I found an rpm for R v 1.2 and downloaded that from the site
> > > in the INSTALL notes. Didn't do anything special other than that.
> > >
> > > The hdarray pieces were installed through the install-all.pl script.
> > > I didn't do anything outside of that script. Perhaps because I selected
> > > the "unusual" route of installing under one root directory, you've not
> > > seen this before out of the install-all.pl script.
> > >
> > > M. Pear
> > >
> > > ----- Original Message -----
> > > From: "Harry Mangalam" <man...@ho...>
> > > To: "Michael Pear" <mic...@ho...>; "genexdev at SF" <gen...@li...>
> > > Sent: Friday, January 26, 2001 6:21 PM
> > > Subject: Re: Data with negative numbers...
> > >
> > > > Hi Michael,
> > > >
> > > > I think the reason is even simpler than this. Your installation of R doesn't know where to find
> > > > the proper libraries. R by default looks for them in /usr/local/lib/R,
> > > > but you installed it so that your R libs are in:
> > > > /var/genex1.0/local/lib/R/library/hdarray/R/hdarray
> > > >
> > > > The answer is to tell R how to find it, which is done via the lib.loc arg to the library call.
> > > > I never saw this before as my libs are stored in non-insane / system places, so I've just added
> > > > that bit and checked it in to CVS.
> > > >
> > > > if you want to change it in place, the mods are (for your install):
> > > >
> > > > in /your/cgi-bin/genex/cybert/
> > > >
> > > > CyberTDB-6.2.C+E.pl:352:
> > > > print CMD "library(hdarray, lib.loc = '/var/genex1.0/local/lib/R/library')\n";
> > > >
> > > > CyberTDB-6.2.paired.pl:407:
> > > > print CMD "library(hdarray, lib.loc = '/var/genex1.0/local/lib/R/library')\n";
> > > >
> > > > The way to debug this is to look in the hda.. dir created in your genex temporary dir. This
> > > > dir typically contains a bunch of stuff:
> > > >
> > > > -rw-r--r-- 1 apache apache 489 Jan 26 19:36 Rcommandfile
> > > > -rw-r--r-- 1 apache apache 162 Jan 26 19:36 Rerror
> > > > -rw-r--r-- 1 apache apache 37825 Jan 26 19:36 data
> > > > -rw-r--r-- 1 apache apache 174 Jan 26 19:36 xginput
> > > >
> > > > The Rcommandfile is the file containing the R commands that R tried to run. You can start R and
> > > > run them one by one, just pasting them into the R shell. That's how I found that it couldn't find
> > > > your hdarray lib (rather than the ubiquitous munge4R permission drecko). The Rerror file contains
> > > > the R errors, also useful sometimes when things appear to run but don't get finished.
> > > >
> > > > The data file is... the data that R has to chew on. If it's 0 length or munged, we have a
> > > > possible DB comms problem.
> > > >
> > > > xginput is the input for xgobi and is a funny file that is created in 2 parts. The 1st is the
> > > > header and is about 150 bytes, depending on the # of headers that are being written. Then the bulk
> > > > of the data is written by the munge4R.pl file, which slices/dices/sorts/scrunches the data from the
> > > > DB around until it's extruded in a form that xgobi can digest. The fact that it's short implies
> > > > that the data didn't get writ.
> > > >
> > > > A complete family of files in the hda... dir would look like:
> > > >
> > > > -rw-r--r-- 1 nobody nobody 275934 Jan 26 18:01 ALLGENES
> > > > -rw-r--r-- 1 nobody nobody 240552 Jan 26 18:01 CyberT.ps
> > > > -rw-r--r-- 1 nobody nobody 497 Jan 26 18:01 Rcommandfile
> > > > -rw-r--r-- 1 nobody nobody 301 Jan 26 18:01 Rerror
> > > > -rw-r--r-- 1 nobody nobody 275778 Jan 26 18:01 allgene.txt
> > > > -rw-r--r-- 1 nobody nobody 34423 Jan 26 18:01 data
> > > > -rw-r--r-- 1 nobody nobody 38187 Jan 26 18:01 temp
> > > > -rw-r--r-- 1 nobody nobody 240552 Jan 26 18:01 temp.ps
> > > > -rw-r--r-- 1 nobody nobody 44045 Jan 26 18:01 xginput
> > > > -rw-r--r-- 1 nobody nobody 19417 Jan 26 18:01 xgobi-980560894.tar.gz
> > > >
> > > > These are mostly self-explanatory. Lemme know if you need more.
> > > >
> > > > Michael Pear wrote:
> > > > >
> > > > > Permissions were ok on merge4R.pl...mergem wasn't
> > > > > execute so I changed that.
> > > > >
> > > > > Logins were lost with our reinstall. It is probably best for
> > > > > you to work through me on this right now...and helps me
> > > > > become more aware of things. If it gets too tedious we'll
> > > > > get you a login.
> > > > >
> > > > > Regards,
> > > > >
> > > > > Michael Pear
> > > > >
> > > > > ----- Original Message -----
> > > > > From: "Harry Mangalam" <man...@ho...>
> > > > > To: "Michael Pear" <mic...@ho...>
> > > > > Sent: Friday, January 26, 2001 4:44 PM
> > > > > Subject: Re: Data with negative numbers...
> > > > >
> > > > > > Hi Associate GeneX God,
> > > > > >
> > > > > > The problem looks to be in the transition from the DB retrieval to the analytical tool and I bet
> > > > > > it's the permissions on munge4R.pl. I just tracked that bugger down on some other machines - check
> > > > > > it - it needs to be rx-all. It's fixed in the latest CVS, but you can just touch it gently and fix
> > > > > > it in place.
> > > > > >
> > > > > > You (properly) locked down your machine - the old login you gave me doesn't work anymore, so I
> > > > > > couldn't log in and see if this was the case or not. If you want me to be able to do that, call me
> > > > > > with a login and password. Based on your past experience, please don't send it to me by email.
> > > > > >
> > > > > > Phones are still better for some things...
> > > > > >
> > > > > > hjm
> > > > > >
> > > > > >
> > > > > > > Michael Pear wrote:
> > > > > > >
> > > > > > > Hi Harry,
> > > > > > >
> > > > > > > I've got some local data loaded. It is primary raw data with 3 replicates
> > > > > > > at 4 data points. When I run CyberT I get no results, possibly because of
> > > > > > > the negative numbers.
> > > > > > >
> > > > > > > I get the following message:
> > > > > > >
> > > > > > > Error in cov(x, y, use = use) : missing observations in cov/cor
> > > > > > > In addition: There were 50 or more warnings (use warnings() to see the first 50)
> > > > > > > Execution halted
> > > > > > >
> > > > > > > Is my interpretation correct? I imagine a good thing to consider is handling
> > > > > > > "off" data like this. Clearly, needs a more sophisticated background correction.
> > > > > > > Would some sort of filter be appropriate in the analysis to process those
> > > > > > > things that can be processed? How are zero values handled?
> > > > > >
> > > > > > Re: the analyses, yes, I'm going to have to come up with some better data filtering soon to handle
> > > > > > this - in fact that was what my REAL job is supposed to be, not a SW developer (can you tell?!??).
> > > > > > What I'll be doing soon hopefully is changing the interface so that it a) gives you an indication
> > > > > > of what kind of data it is and what parameters it will need to allow normalization and then b) allows
> > > > > > the user to apply this to the arrays under consideration.
> > > > > >
> > > > > > > Also, is my interpretation correct that I will end up with ratios (log?) of the
> > > > > > > experimental to control? I'll poke through the doc, but I'm a bit sketchy
> > > > > > > at this point on how the ratios are figured.
> > > > > >
> > > > > > Yes, you'll end up with ratios (called fold I believe) in the output.
> > > > > >
> > > > > > > I've been wondering about the background issue. It seems to me that if
> > > > > > > some gene's signal is within background, then you are basically saying
> > > > > > > that there is no detectable signal. Perhaps a stringent test of a gene
> > > > > > > being expressed relative to this is to pick a background threshold and
> > > > > > > set the gene with no signal to this threshold in evaluating the ratios.
> > > > > > > Thoughts?
> > > > > >
> > > > > > That's not bad in the absence of other data. This is the aching question - how do you calculate a
> > > > > > ratio in the absence of a numerator (or worse, denominator)? It may be that the better way of
> > > > > > representing this in the case of missing values is to just give an indicator of MAX or MIN or the
> > > > > > absolute value of the detectable value. It's also possible to use the rank of the gene's expression
> > > > > > where it is detectable to give it a weighting.
> > > > > >
> >
> > --
> > Cheers,
> > Harry
> >
> > Harry J Mangalam -- (949) 856 2847 (v&f) -- hj...@nc... || man...@ho...
--
Cheers,
Harry
Harry J Mangalam -- (949) 856 2847 (v&f) -- hj...@nc... || man...@ho...