From: Elisabetta M. <man...@pc...> - 2007-09-12 23:52:53
Hi Dave,
the LoadBatchArrayResults plugin wants to know the cel protocols because it
enters 2 entries in RAD.Quantification per assay: one for the .CEL
quantification and one for the probe set quantification. What's entered in
Quantification is just the protocol references (e.g. references to entries
in RAD.Protocol describing the CEL 4, MAS 5, RMA protocols) and the uri with
the path to the actual data files on the fileserver.

Then LoadBatchArrayResults calls LoadSimpleArrayResults, which actually takes
care of entering the quantified data in views of RAD.ElementResultImp or
RAD.CompositeElementResultImp. Now, the latter plugin will definitely
populate views such as AffymetrixMAS4, AffymetrixMAS5 and RMAExpress, which
correspond to probe set quantified data. I believe from earlier
correspondence that Junmin (here cc-ed), who wrote LoadSimpleArrayResults,
said that it doesn't support loading of AffymetrixCEL. But I see this view
mentioned in the code of that plugin, so I'm deferring to Junmin to
double-check on that.

The plugin *only accepts text files* as data files. So files like the
Metrics files (the .txt counterparts of the .CHP MAS4/5 files) will do, as
well as RMA-like text files. I believe with GCOS it is possible to export
the data as metrics (txt) files corresponding to quantifications using the
MAS5 algorithm.

Elisabetta

---
On Wed, 12 Sep 2007, Dave Hau wrote:

> Elisabetta,
>
> Thanks for your and John Brestelli's (via personal email) very informative
> replies. They are very helpful indeed.
>
> Regarding loading .CEL files (probe cell data, not probe set data), John
> mentioned the plugin GUS::Community::Plugin::LoadBatchArrayResults which I
> had noticed too. The help page for this plugin mentions a number of
> quantification protocols supported including mas4/mas5 (Affymetrix MAS 4.0
> and 5.0 Probe Set quantification protocol) and cel4/cel5 (Affymetrix MAS 4.0
> and 5.0 Probe Cell quantification protocol).
> It seems that cel4/cel5 would
> correspond to the .CEL files I need to load (i.e. probe *cell* data). Is this
> correct? I was wondering because you mentioned in your reply that there's no
> plugin available for loading probe cell data.
>
> Also, in the Affymetrix file format description document (
> http://www.affymetrix.com/support/developer/AffxFileFormats.ZIP ), two file
> formats are described: Version 3 files (text data) generated by the MAS
> software, and version 4 files (binary data) generated by the GCOS software.
> So both cel4 and cel5 for the plugin would correspond to Version 3 files,
> right? That means the LoadBatchArrayResults plugin does not support the
> Version 4 (binary) file format, correct?
>
> Thanks again for your help.
>
> Best regards,
> Dave Hau
>
>
> Elisabetta Manduchi wrote:
>>
>> Hi Dave,
>> let me clarify GUS vs Affy.
>> Affymetrix quantified results are of two types, corresponding to 2
>> different levels of analysis:
>>
>> (i) probe-cell level results (e.g. from .CEL files), which contain
>> intensity values for each individual probe cell on the chip; and
>> (ii) probe-set level results (e.g. obtained from MAS4 or MAS 5 and in the
>> .CHP files, or from RMA or gcRMA) which contain *summarized* intensities
>> for probe sets on the chip.
>>
>> The GUS schema in principle supports storage of both:
>>
>> (i) the probe cell results would go into a view of RAD.ElementResultImp (in
>> fact there is a view to this end called RAD.AffymetrixCEL);
>> (ii) the probe set results would go to a view of
>> RAD.CompositeElementResultImp. For the latter, currently we have views to
>> accommodate MAS4 or 5 (RAD.AffymetrixMAS4 or RAD.AffymetrixMAS5) and
>> RMA/gcRMA results (RAD.RMAExpress, which will actually be renamed RAD.RMA
>> in the next GUS release).
>>
>> Now, here at CBIL, we do not store or support loading of the .CEL file data
>> in the database, because we really only use the probe-set level results in
>> our applications, so we have no need to store .CEL in the db.
>> So the way we do it is as follows:
>> * for every Affymetrix assay, we have TWO related quantifications, one
>> corresponding to the .CEL quantification and the other corresponding to
>> whatever summarization quantification was created (e.g. with MAS4, MAS5,
>> RMA);
>> * we place 2 entries in RAD.Quantification, one pointing to the uri of the
>> .CEL file (which we keep on our server) and one pointing to the uri of the
>> probe-set level result file
>> * we however do not store the data from the .CEL file in RAD.AffymetrixCEL
>> * we only store the data from the probe-set level results in one of the
>> RAD.CompositeElementResultImp views mentioned above.
>>
>> The current plugin in GUS::Supported, as Junmin mentioned in the posting
>> you are referring to, can be used to populate the data for the probe-set
>> level results. As far as I know, we do not currently have a plugin to store
>> the .CEL files in the db.
>> So the db allows for the latter, but you'd have to write your own plugin.
>> We didn't find it useful to store .CEL results in GUS, but again this depends
>> on the type of applications you might be interested in.
>> Hope this helps,
>> Elisabetta
>>
>>
>> On Tue, 28 Aug 2007, Dave Hau wrote:
>>
>>> I would like to import a number of Affymetrix .CEL files into the GUS
>>> database, which was installed from top of trunk from the GUS svn
>>> repository. The CEL files each have some text headers, and then binary
>>> data afterwards. So I suppose they are in CEL Version 4 format.
>>>
>>> Doing some search on previous posts, I came across this one:
>>>
>>> http://sourceforge.net/mailarchive/message.php?msg_id=Pine.LNX.4.61.0512141526200.18143%40hera.pcbi.upenn.edu
>>>
>>> It seems that at the time of the post (12/2005), the way these .CEL
>>> files would be imported was that the headers would go to one of the
>>> Affymetrix views (AffymetrixMAS4 or AffymetrixMAS5 or AffymetrixCEL),
>>> the actual file would sit in the file system, and we'd insert a row to
>>> the RAD.Quantification table with a URI pointing to the location of the
>>> .CEL file.
>>>
>>> Also, looking through the different plugins in both the Supported and
>>> Community folders, it seems LoadBatchArrayResults supports the cel4
>>> format. Is this the plugin I should use?
>>>
>>> Any help would be much appreciated. Thanks.
>>>
>>> Best regards,
>>> Dave Hau
>>
>
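
Since the thread turns on whether a given .CEL file is a Version 3 (text) or Version 4 (binary) file, and the plugin only accepts text files, a quick check before attempting a load may help. The following is only a rough sketch in Python and is not part of GUS: the function names are made up for illustration, and the byte signatures are assumptions taken from my reading of the Affymetrix file format document (a Version 3 file starts with a "[CEL]" section header; a Version 4 file starts with two little-endian 32-bit integers, 64 and 4).

```python
import struct


def cel_format(header: bytes) -> str:
    """Classify the leading bytes of a CEL file.

    Returns 'text-v3', 'binary-v4' or 'unknown'. The signatures used here
    are assumptions based on the Affymetrix file format description, not
    something taken from the GUS plugins.
    """
    # Version 3 files are plain text and begin with a "[CEL]" section line.
    if header.lstrip().startswith(b"[CEL]"):
        return "text-v3"
    # Version 4 files begin with a magic number (64) and a version (4),
    # both stored as little-endian 32-bit integers.
    if len(header) >= 8:
        magic, version = struct.unpack("<ii", header[:8])
        if magic == 64 and version == 4:
            return "binary-v4"
    return "unknown"


def cel_file_format(path: str) -> str:
    """Read the first few bytes of the file at `path` and classify them."""
    with open(path, "rb") as handle:
        return cel_format(handle.read(16))
```

Only a file reported as 'text-v3' would be a text data file in the sense discussed above; anything else would need to be exported or converted to text first.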