From: Elisabetta M. <man...@pc...> - 2004-05-05 00:55:54
Hi Sucheta,

my understanding is that GCOS is a wrapper around MAS 5, i.e. it has a db
layer which, in particular, stores the output of the MAS 5 probe analysis
algorithm. The output you describe below is indeed basically what you'll
get in a Metrics file, and the measurements you show do fit the
AffymetrixMAS5 view (see the schema browser,
http://www.gusdb.org/cgi-bin/schemaBrowser, for a description of this view).

As for the plugin, the one in question is the ArrayResultLoader (not the
ArrayLoader, which is instead used to load information about the array
itself). The ArrayResultLoader is very generic and will work with any view
of either ElementResultImp or CompositeElementResultImp (so it can be used
even if new views are created to accommodate new quantification software).
You can look at the details in the documentation at
http://www.gusdb.org/documentation/plugins/GUS-RAD-Plugin-ArrayResultLoader.html,
but the main part is the one concerning the format of the data_file input
to the plugin. The header should contain columns whose names are spelled
exactly as the relevant attributes of the view one needs to populate, plus
additional identifier column(s) (which in the case of Affy would just be a
column with the Affy_id and header "name"). For the AffymetrixMAS5 view the
data_file would have a header like (case sensitive):

name signal detection detection_p_value stat_pairs stat_pairs_used

(not all these fields are mandatory). If there are additional columns with
other names they can stay but will be ignored.

Elisabetta

P.S. For RAD-specific emails you might want to subscribe to the
SourceForge mailing list at
https://lists.sourceforge.net/lists/listinfo/gusdev-rad-issues. I've cc'ed
it in this reply as I thought it could be useful to other RAD users.

On Tue, 4 May 2004, Sucheta Tripathy wrote:

> Hi Elisabetta,
>
> Thanks for all the info!!
>
> We are currently using a software called gene chip operating software (GCOS)
> from affy instead of MAS5. Now I am not sure if it generates the matrix
> files from the .CHP file. It takes either the .cel or the .dat file and
> gives an output in a four column format as below:
> Affy_id Y1_Signal Y1_Detection Descriptions
> AFFX-MurIL2_at 5.9 A "M16762 Mouse interleukin 2 (IL-2) gene, exon 4"
>
> Please tell me if files of this type are among the files that you upload
> to RAD3.
>
> What about the ArrayLoader plugin? What input file does it take?
>
> Thanks
>
> Sucheta
>
> At 12:17 PM 5/4/2004 -0400, you wrote:
>
> >Hi Sucheta,
> >
> >1. .CHP files. These are not in text format, but through MAS 5.0 one can
> >generate their corresponding text files, i.e. the Metrics files. These can
> >be loaded into RAD. More precisely, minor parsing of the Metrics files is
> >needed in order to use them as input to the ArrayResultLoader
> >(basically just the header needs to be parsed a bit, see the doc for
> >this plugin).
> >We have recently written another plugin, the BatchArrayResult loader,
> >which loads in batch the results for a group of assays. For this plugin
> >there is no need to parse the Metrics files; all that is needed is an XML
> >configuration file. We have not deposited this plugin into the Sanger cvs
> >repository yet, but we can make it available upon request in the meantime.
> >
> >2. .CEL files. Until recently these were in text format and in principle
> >could be loaded into RAD using the ArrayResultLoader (and in fact, we have
> >loaded a few in our instance of RAD). However, we have now opted not to
> >load the probe cell data into RAD. We keep the .CEL files on our
> >filesystem and store in RAD a pointer to their location (see point 4
> >below).
> >
> >3. For image files like .DAT files (and similarly for .tif files), we
> >store in RAD a pointer to their location, and store the files themselves
> >in our filesystem.
> >
> >4. Summarizing how we currently deal with a given assay, with files
> >myassay.DAT, myassay.CEL, myassay.CHP (and myassay_Metrics.txt):
> >
> >a. The assay has one child in RAD3.Acquisition and the
> >RAD3.Acquisition.uri of this points to the location of the .DAT file in
> >our filesystem.
> >
> >b. The acquisition in (a) has TWO children in RAD3.Quantification (and
> >these are related in RAD3.RelatedQuantification): one corresponds to the
> >.CEL quantification (i.e. the probe cell analysis) and its uri points to
> >the location of the .CEL file in our system; the other corresponds to the
> >.CHP (probe set) analysis and its uri points to the location of the
> >corresponding Metrics file (or .CHP file) in our file system.
> >
> >c. We load the results corresponding to the .CHP file (from the Metrics
> >file) into the AffymetrixMAS5 view of CompositeElementResultImp. Every
> >entry in the latter will have quantification_id pointing to the
> >quantification_id of the quantification in (b) corresponding to the probe
> >set analysis (.CHP).
> >
> >Hope this clarifies things a bit,
> >Elisabetta
> >
> >On Tue, 4 May 2004, Sucheta Tripathy wrote:
> >
> > > Hello Elisabetta and Sam,
> > >
> > > I am new to RAD modules in GUS. Now we have some microarray data, which is
> > > available in 3 different file formats (.cel, .dat, .CHP). Which of these 3
> > > files can be uploaded to RAD tables? None of the files is in text format.
> > >
> > > Thanks in advance.
> > >
> > > Sucheta
> > >
> > > At 07:55 PM 11/16/2003 -0800, sam wang wrote:
> > > >Dear Elisabetta,
> > > >
> > > >Thank you very much for your great help! Now I am trying according to
> > > >your suggestion, though it seems a little complicated :-)
> > > >
> > > >Thanks again,
> > > >Sammy
> > > >
> > > >Elisabetta Manduchi <man...@pc...> wrote:
> > > >Dear Sam,
> > > >regarding the part of your question referring to the manufacturer_id
> > > >error, if you look at the schemaBrowser for RAD3.Array at
> > > >http://www.gusdb.org/cgi-bin/schemaBrowser?db=CBILBLD&table=RAD3::Array&path=RAD3::Array
> > > >
> > > >you should see that manufacturer_id and platform_type_id are nonnullable.
> > > >Thus a valid manufacturer_id and a valid platform_type_id must be given
> > > >when running the ArrayLoader plugin (see also the plugin documentation at
> > > >http://www.gusdb.org/documentation/plugins/GUS-RAD-Plugin-ArrayLoader.html).
> > > >This means that before you can load an array, you will need to have
> > > >entered the manufacturer in SRes.Contact (this can also be done
> > > >via the StudyAnnotator Contact form) and you should have in
> > > >RAD3.OntologyEntry
> > > >an entry that describes the platform type. MGED has an ontology available
> > > >that we use to populate RAD3.OntologyEntry for terms referring to
> > > >platform types, namely the ontology for class=TechnologyType. The idea is
> > > >that in RAD3.OntologyEntry we store the instances of this class,
> > > >all of which have category set to "TechnologyType". Similarly for
> > > >SubstrateType: an array entry in RAD3.Array has a substrate_type_id that
> > > >points to a term in RAD3.OntologyEntry describing the substrate type. For
> > > >this we use the instances of the MGED SubstrateType class, that is, we
> > > >enter in RAD3.OntologyEntry an entry for each such instance, all having
> > > >category set to "SubstrateType".
> > > >To get the MGED instances (individuals)
> > > >for TechnologyType and SubstrateType, you can go to
> > > >http://mged.sourceforge.net/ontologies/MGEDontology.php#TechnologyType and
> > > >http://mged.sourceforge.net/ontologies/MGEDontology.php#SubstrateType
> > > >I believe that, even though substrate_type_id is nullable in RAD3.Array,
> > > >the ArrayLoader might require that this value be provided in the arguments
> > > >anyway. Junmin, cc'ed in this email, can confirm this, as he is the author
> > > >of the ArrayLoader plugin.
> > > >Elisabetta
> > > >
> > > >On Fri, 14 Nov 2003, sam wang wrote:
> > > >
> > > > > Dear friends,
> > > > >
> > > > > After I downloaded the new version of DbiDbHandle.pm,
> > > > > the plugin gets a few steps further, but there is still no data
> > > > > in my database. Besides, there is an error "ERROR a
> > > > > VALID -- manufacturer_id must be on the commandline
> > > > > manufacturer_id = 0". I don't know what it means, or whether
> > > > > it is the reason I can't upload data. If so, how
> > > > > can I correct it?
> > > > >
> > > > > Thank you very much for your help!
> > > > > Sammy
> > > > >
> > > > >
> > > > > please check the following error message:
> > > > >
> > > > > [oracle@GUS oracle]$ ga GUS::RAD::Plugin::ArrayLoader
> > > > > --cfg_file
> > > > > /opt/gus/projects/RAD/DataLoad/config/ArrayLoader.cfg
> > > > > --data_file /home/oracle/RADDOTSdataset/assay5quan3
> > > > > --manufacturer_id 0 --platform_type_id 0
> > > > > Reading properties from
> > > > > /opt/gus/gus_home/config/GUS-PluginMgr.prop
> > > > > Reading properties from /home/oracle/.gus.properties
> > > > > DBI subclasses 'GUS::ObjRelP::DbiDbHandle::db' and
> > > > > ::st are not setup, RootClass ignored at
> > > > > /opt/gus/gus_home/lib/perl/GUS/ObjRelP/DbiDatabase.pm
> > > > > line 152
> > > > > DBI subclasses 'GUS::ObjRelP::DbiDbHandle::db' and
> > > > > ::st are not setup, RootClass ignored at
> > > > > /opt/gus/gus_home/lib/perl/GUS/ObjRelP/DbiDatabase.pm
> > > > > line 152
> > > > > DBI subclasses 'GUS::ObjRelP::DbiDbHandle::db' and
> > > > > ::st are not setup, RootClass ignored at
> > > > > /opt/gus/gus_home/lib/perl/GUS/ObjRelP/DbiDatabase.pm
> > > > > line 152
> > > > > DBI subclasses 'GUS::ObjRelP::DbiDbHandle::db' and
> > > > > ::st are not setup, RootClass ignored at
> > > > > /opt/gus/gus_home/lib/perl/GUS/ObjRelP/DbiDatabase.pm
> > > > > line 152
> > > > > Fri Nov 14 19:43:23 2003 ALGINVID 3
> > > > > Fri Nov 14 19:43:23 2003 ARGS algoinvo
> > > > > 1
> > > > > Fri Nov 14 19:43:23 2003 ARGS cfg_file
> > > > > /opt/gus/projects/RAD/DataLoad/config/ArrayLoader.cfg
> > > > > Fri Nov 14 19:43:23 2003 ARGS comment
> > > > > Fri Nov 14 19:43:23 2003 ARGS commit
> > > > > Fri Nov 14 19:43:23 2003 ARGS data_file
> > > > > /home/oracle/RADDOTSdataset/assay5quan3
> > > > > Fri Nov 14 19:43:23 2003 ARGS debug
> > > > > Fri Nov 14 19:43:23 2003 ARGS group
> > > > > Fri Nov 14 19:43:23 2003 ARGS gusconfigfile
> > > > > /home/oracle/.gus.properties
> > > > > Fri Nov 14 19:43:23 2003 ARGS
> > > > > manufacturer_id 0
> > > > > Fri Nov 14 19:43:23 2003 ARGS noWarning
> > > > > Fri Nov 14 19:43:23 2003 ARGS
> > > > > platform_type_id 0
> > > > > Fri Nov 14 19:43:23 2003 ARGS project
> > > > > Fri Nov 14 19:43:23 2003 ARGS protocol_id
> > > > > Fri Nov 14 19:43:23 2003 ARGS restart
> > > > > Fri Nov 14 19:43:23 2003 ARGS sqlVerbose
> > > > > Fri Nov 14 19:43:23 2003 ARGS
> > > > > substrate_type_id
> > > > > Fri Nov 14 19:43:23 2003 ARGS testnumber
> > > > > Fri Nov 14 19:43:23 2003 ARGS usage
> > > > > Fri Nov 14 19:43:23 2003 ARGS user
> > > > > Fri Nov 14 19:43:23 2003 ARGS verbose
> > > > > Fri Nov 14 19:43:23 2003 ARGS veryVerbose
> > > > > Fri Nov 14 19:43:23 2003 COMMIT commit off
> > > > > Fri Nov 14 19:43:23 2003 ERROR a VALID --
> > > > > manufacturer_id must be on the commandline
> > > > > manufacturer_id = 0
> > > > > Fri Nov 14 19:43:23 2003 RESULT
> > > > > [oracle@GUS oracle]$
> > > > >
>
> >-------------------------------------------------------
> >This SF.Net email is sponsored by: Oracle 10g
> >Get certified on the hottest thing ever to hit the market... Oracle 10g.
> >Take an Oracle 10g class now, and we'll give you the exam FREE.
> >http://ads.osdn.com/?ad_id=3149&alloc_id=8166&op=click
> >_______________________________________________
> >Gusdev-gusdev mailing list
> >Gus...@li...
> >https://lists.sourceforge.net/lists/listinfo/gusdev-gusdev
>

--
Elisabetta Manduchi
Computational Biology and Informatics Laboratory
Center for Bioinformatics
University of Pennsylvania
1428 Blockley Hall
423 Guardian Drive
Philadelphia, PA 19104-6021

phone: 215-573-4408
fax: 215 573-3111
email: man...@pc...
web: http://www.cbil.upenn.edu/~manduchi
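
The header renaming described in the reply at the top of this thread (GCOS column names replaced by the attribute names of the AffymetrixMAS5 view, with the identifier column headed "name") could be sketched roughly as below. This is only an illustration under assumptions: the function name rewrite_header and the COLUMN_MAP mapping are hypothetical, the tab delimiter is a guess about the GCOS export, and the mapping of Y1_Signal and Y1_Detection to signal and detection is inferred from the example row in the thread. The exact input format the ArrayResultLoader expects should be checked against its documentation. The Descriptions column is left as is, since columns with other names are ignored.

```python
import io

# Hypothetical mapping from the GCOS export's column names to the
# attribute names of the AffymetrixMAS5 view (an assumption, not from
# the plugin documentation).
COLUMN_MAP = {
    "Affy_id": "name",
    "Y1_Signal": "signal",
    "Y1_Detection": "detection",
}


def rewrite_header(src, dst, column_map=COLUMN_MAP, delimiter="\t"):
    """Copy src to dst, renaming header columns found in column_map.

    Only the first line is changed; data rows are copied verbatim.
    The delimiter is assumed to be a tab.
    """
    header = src.readline().rstrip("\r\n").split(delimiter)
    renamed = [column_map.get(col, col) for col in header]
    dst.write(delimiter.join(renamed) + "\n")
    for line in src:
        dst.write(line)


if __name__ == "__main__":
    # Sample built from the row quoted in the thread.
    sample = (
        "Affy_id\tY1_Signal\tY1_Detection\tDescriptions\n"
        'AFFX-MurIL2_at\t5.9\tA\t"M16762 Mouse interleukin 2 (IL-2) gene, exon 4"\n'
    )
    out = io.StringIO()
    rewrite_header(io.StringIO(sample), out)
    print(out.getvalue(), end="")
```

With real files, the same function can be called with two handles opened via open(), one for reading the GCOS export and one for writing the data_file.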