From: Elisabetta M. <man...@pc...> - 2007-09-14 14:30:20
Hi Dave,

in line:

> Thanks Junmin and Elisabetta for your helpful comments.
>
> The consensus not to load CEL files into the database - is it because we only
> query for probe set data based on the gene, but not for probe cell data? If I

Yes, typically people query the summarized results at the probe set level.

> store the CEL file in the filesystem and only store a file URI in the
> database, does RAD provide a way to run summarization algorithms (e.g. RMA,
> Plier) on those files?

Not currently. RAD provides the database where the results of such algorithms can be stored. One could certainly write a plugin that goes to the .CEL file indicated by the URI and then runs the summarization algorithm of choice on it. However, we do not currently have any such plugin in Supported or Community.

> Can I load multiple sets of probe set data for a
> single set of probe cell data (e.g. one for RMA, one for Plier)?

Certainly. You would create as many entries in RAD.Quantification as the number of summarization protocols you run (e.g. MAS 5, RMA, Plier) on the same .CEL file; each such entry will point to the appropriate summarization protocol. You would additionally have a quantification referring to the .CEL file itself. In RAD.RelatedQuantification you can connect the .CEL quantification to each of the summarization quantifications that used that .CEL file. Then you can load the results of the summarization algorithms into the corresponding views of RAD.CompositeElementResultImp. Currently we have views for MAS4, MAS5, RMAExpress (which will simply be renamed RMA in the next release, and which accommodates RMA, gcRMA, etc.) and MOID. But it's easy to create additional views of the same table in your own instance to accommodate other summarization programs.
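As a rough illustration of the layout described above, here is a minimal sketch using an in-memory SQLite database. It is not the actual RAD schema: the table and column names (quantification, related_quantification, protocol, uri, associated_quantification_id), the protocol label "Affymetrix CEL" and the helper functions are all simplified assumptions. It only shows the shape of the idea: one quantification for the .CEL file, one per summarization protocol, and a link row from the .CEL quantification to each summarization.

```python
import sqlite3


def build_example(cel_uri, protocols):
    """Create one .CEL quantification plus one quantification per protocol.

    Hypothetical, simplified stand-ins for RAD.Quantification and
    RAD.RelatedQuantification; the real tables have many more columns.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE quantification (
            quantification_id INTEGER PRIMARY KEY,
            protocol          TEXT NOT NULL,
            uri               TEXT
        );
        CREATE TABLE related_quantification (
            quantification_id            INTEGER NOT NULL
                REFERENCES quantification (quantification_id),
            associated_quantification_id INTEGER NOT NULL
                REFERENCES quantification (quantification_id)
        );
        """
    )
    # The quantification that refers to the .CEL file by URI.
    cur = conn.execute(
        "INSERT INTO quantification (protocol, uri) VALUES (?, ?)",
        ("Affymetrix CEL", cel_uri),
    )
    cel_id = cur.lastrowid
    # One quantification per summarization protocol, each linked to the .CEL one.
    for protocol in protocols:
        cur = conn.execute(
            "INSERT INTO quantification (protocol, uri) VALUES (?, NULL)",
            (protocol,),
        )
        conn.execute(
            "INSERT INTO related_quantification VALUES (?, ?)",
            (cel_id, cur.lastrowid),
        )
    conn.commit()
    return conn, cel_id


def summarizations_for(conn, cel_id):
    """List the protocols of all quantifications linked to a .CEL quantification."""
    rows = conn.execute(
        """
        SELECT q.protocol
        FROM related_quantification r
        JOIN quantification q
          ON q.quantification_id = r.associated_quantification_id
        WHERE r.quantification_id = ?
        ORDER BY q.quantification_id
        """,
        (cel_id,),
    ).fetchall()
    return [row[0] for row in rows]
```

Calling build_example with a made-up URI such as "file:///data/sample1.CEL" and the protocols MAS 5, RMA and Plier gives four quantification rows in total, and summarizations_for then returns the three protocols linked to that .CEL quantification. In a real GUS instance the rows would be loaded through the RAD plugins rather than by direct inserts like these.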
> Also, according to the instructions in the RAD website on how to load a
> complete microarray study into the GUS database, the first step mentions
> "Further array annotation can be loaded via
> GUS::Community::Plugin::InsertArray2DbRefAndNaSeq. I tried to run this
> plugin, but got this error:
>
> FATAL: Can't locate GUS/Model/RAD/CompositeElementDbRef.pm in @INC
>
> Do you know where I can find this CompositeElementDbRef.pm file?

I think this is because the tables RAD.(Composite)ElementDbRef and RAD.(Composite)ElementNASequence were added after the last official GUS release. They are scheduled for the next GUS release (which probably won't occur in the near future). We have added them to our own instance of GUS at CBIL. So, if you want to use these tables, you first need to add those 4 tables to your db instance (you can find the latest sql for GUS in the GusSchema svn at https://www.cbil.upenn.edu/svn/gus/GusSchema/trunk/Definition/config/gus_schema.xml). (Note that this also contains other modifications made to tables after the 3.5 GUS release.) Then you need to populate Core.TableInfo with entries for these new tables. Finally, you need to rebuild GUS, forcing a rebuild of the objects. This way the code generator will see the new tables and create the corresponding objects, including the one you are referring to above.

> I would like to load the annotation file I obtained from the Affymetrix
> website for the HG-U133_Plus_2 array into the GUS database. What's the best
> way to go about this?

There are multiple choices for where to store array annotation at the moment.

1. RAD.CompositeElementDbRef and RAD.CompositeElementNASequence have been added to more quickly annotate Affy data with Entrez Gene and RefSeq info respectively.

2. Another possibility is to use the external_database_release_id and source_id pair in RAD.ShortOligoFamily to point to one preferred annotation for each probe set (but you would have to choose one).

3. Another, less structured possibility is to use RAD.CompositeElementAnnotation, where you use the attribute 'name' to denote the type of annotation (e.g. "Entrez Gene", "RefSeq", etc.) and the attribute 'value' for the annotation itself (e.g. the Entrez Gene id or the RefSeq id). This is less structured, but it will allow you to load as many annotations as you like.

Elisabetta