From: Tony Z. <fz...@vb...> - 2005-09-05 13:49:02
Hi Chris,

Thanks for your reply. InsertSequenceFeatures.pm is helpful. I have more
questions about it.

1. Does InsertSequenceFeatures have the same (or more) functionality as
   GBParser? Should we use InsertSequenceFeatures (supported) instead of
   GBParser (community)?

2. To use InsertSequenceFeatures, should we define the "mapFile" (XML)
   ourselves and pass it to InsertSequenceFeatures as an argument?

3. There seem to be many ways to use GUS to handle the relationship
   "DNA->gene->CDS->AASEQUENCE". The documentation of
   InsertSequenceFeatures.pm is not very straightforward to me. Could you
   or other GUS people suggest a method?

For example, I have an example GenBank file below.

LOCUS       NC_006932            2124241 bp    DNA     circular BCT 08-APR-2005
DEFINITION  Brucella abortus biovar 1 str. 9-941 chromosome I, complete
            sequence.
ACCESSION   NC_006932
VERSION     NC_006932.1  GI:62288991
KEYWORDS    .
SOURCE      Brucella abortus biovar 1 str. 9-941
  ORGANISM  Brucella abortus biovar 1 str. 9-941
            Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales;
            Brucellaceae; Brucella.
REFERENCE   1  (bases 1 to 2124241)
  AUTHORS   Halling,S.M., Peterson-Burch,B.D., Bricker,B.J., Zuerner,R.L.,
            Qing,Z., Li,L.L., Kapur,V., Alt,D.P. and Olsen,S.C.
  TITLE     Completion of the genome sequence of Brucella abortus and
            comparison to the highly similar genomes of Brucella melitensis
            and Brucella suis
  JOURNAL   J. Bacteriol. 187 (8), 2715-2726 (2005)
   PUBMED   15805518
REFERENCE   2  (bases 1 to 2124241)
  AUTHORS   .
  CONSRTM   NCBI Genome Project
  TITLE     Direct Submission
  JOURNAL   Submitted (06-APR-2005) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
REFERENCE   3  (bases 1 to 2124241)
  AUTHORS   Halling,S.M., Bricker,B.J., Alt,D.P., Peterson-Burch,B.D.,
            Zuerner,R.L., Olsen,S.C., Whipple,D.L., Zhang,Q., Li,L.-L. and
            Kapur,V.
  TITLE     Direct Submission
  JOURNAL   Submitted (03-FEB-2004) ARS, USDA, National Animal Disease
            Center, 2300 N. Dayton, P.O. Box 70, Ames, IA 50010, USA
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from AE017223.
            COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     source          1..2124241
                     /organism="Brucella abortus biovar 1 str. 9-941"
                     /mol_type="genomic DNA"
                     /strain="9-941"
                     /db_xref="taxon:262698"
                     /chromosome="I"
                     /biovar="1"
     gene            784..2274
                     /gene="dnaA"
                     /locus_tag="BruAb1_0001"
                     /db_xref="GeneID:3339217"
     CDS             784..2274
                     /gene="dnaA"
                     /locus_tag="BruAb1_0001"
                     /note="similar to BR0001, chromosomal replication
                     initiator protein DnaA"
                     /codon_start=1
                     /transl_table=11
                     /product="DnaA, chromosomal replication initiator
                     protein"
                     /protein_id="YP_220785.1"
                     /db_xref="GI:62288992"
                     /db_xref="GeneID:3339217"
                     /translation="MKMDSAVSEEAFERLTAKLKARVGGEIYSSWFGRLKLDDISKSI
                     VRLSVPTAFLRSWINNHYSELLTELWQEENPQILKVEVVVRGVSRVVRSAAPAETCDN
                     AEAKPAVTPREKMVFPVGQSFGGQSLGEKRGSAVVAESAAATGAVLGSPLDPRYTFDT
                     FVDGASNRVALAAARTIAEAGSSAVRFNPLFIHASVGLGKTHLLQAIAAAALQRQEKA
                     RVVYLTAEYFMWRFATAIRDNNALSFKEQLRDIDLLVIDDMQFLQGKSIQHEFCHLLN
                     TLLDSAKQVVVAADRAPSELESLDVRVRSRLQGGVALEVAAPDYEMRLEMLRRRLASA
                     QCEDASLDIGEEILAHVARTVTGSGRELEGAFNQLLFRQSFEPNISIDRVDELLGHLT
                     RAGEPKRIRIEEIQRIVARHYNVSKQDLLSNRRTRTIVKPRQVAMYLAKMMTPRSLPE
                     IGRRFGGRDHTTVLHAVRKIEDLVGADTKLAQELELLKRLINDQAA"
...
ORIGIN
        1 ttttccacac ttatccacag ggcgcgggcg ggactcggtt gcccctctga gtcaagcata
...

I consider myself a newbie to GUS, and the above questions may seem naive
to many of you. However, your help is highly appreciated.

Thanks again.

- Tony

Chris Stoeckert wrote:

> Hi Tony,
> A beta release of the plugin InsertSequenceFeatures.pm is now
> available for loading GenBank records (as well as other formats). The
> semantics of how to use the GUS tables for this purpose are encoded in
> that plugin. Is that what you were asking?
> Chris
>
> On Sep 2, 2005, at 10:31 AM, Tony Zhang wrote:
>
>> This is an old topic and it may have been discussed many times. Still, I
>> would like to get suggestions again about how to use GUS to store DNA,
>> CDS, and the corresponding protein sequence. Suppose I have one original
>> record in GenBank format. Thanks.
>>
>> - Tony
>
> Chris Stoeckert, Ph.D.
> Research Associate Professor, Dept. of Genetics
> 1415 Blockley Hall, Center for Bioinformatics
> 423 Guardian Dr., University of Pennsylvania
> Philadelphia, PA 19104
> Ph: 215-573-4409 FAX: 215-573-3111
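
For readers following the "DNA->gene->CDS->AASEQUENCE" question, the relationship in the example record can be sketched outside GUS. The following is only an illustration in Python, not the InsertSequenceFeatures plugin and not any GUS API: the names Feature, parse_features, link_gene_to_cds and expected_protein_length are hypothetical, and the parser assumes simple "start..end" locations and single-line qualifiers, as in the excerpt it uses. It pairs the gene and CDS features through their shared /locus_tag and derives the protein length implied by the CDS coordinates.

```python
from dataclasses import dataclass, field

# Excerpt of the feature table from the record in this message
# (single-line qualifiers only; this is an assumption of the sketch).
FEATURE_TABLE = """\
     gene            784..2274
                     /gene="dnaA"
                     /locus_tag="BruAb1_0001"
                     /db_xref="GeneID:3339217"
     CDS             784..2274
                     /gene="dnaA"
                     /locus_tag="BruAb1_0001"
                     /codon_start=1
                     /transl_table=11
                     /protein_id="YP_220785.1"
"""


@dataclass
class Feature:
    """One feature-table entry (hypothetical helper type, not a GUS class)."""
    key: str
    start: int
    end: int
    qualifiers: dict = field(default_factory=dict)


def parse_features(text):
    """Parse feature lines with plain 'start..end' locations."""
    features = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith("/"):
            # Qualifier line: attach it to the most recent feature.
            name, _, value = stripped[1:].partition("=")
            features[-1].qualifiers.setdefault(name, []).append(value.strip('"'))
        else:
            # Feature line: key followed by its location.
            key, location = stripped.split()
            start, end = location.split("..")
            features.append(Feature(key, int(start), int(end)))
    return features


def link_gene_to_cds(features):
    """Group features by their shared locus_tag, keyed by feature type."""
    by_locus = {}
    for feat in features:
        tag = feat.qualifiers["locus_tag"][0]
        by_locus.setdefault(tag, {})[feat.key] = feat
    return by_locus


def expected_protein_length(cds):
    """Residues encoded by a CDS, assuming it ends with a stop codon."""
    codon_start = int(cds.qualifiers.get("codon_start", ["1"])[0])
    coding_bases = cds.end - cds.start + 1 - (codon_start - 1)
    return coding_bases // 3 - 1


features = parse_features(FEATURE_TABLE)
record = link_gene_to_cds(features)["BruAb1_0001"]
print(record["gene"].start, record["gene"].end)
print(record["CDS"].qualifiers["protein_id"][0])
print(expected_protein_length(record["CDS"]))  # 1491 bases -> 496 residues
```

The gene and CDS rows share the same coordinates here, so the locus_tag is what ties the gene to its coding region, and the protein_id on the CDS is the handle for the amino acid sequence. How those three levels map onto GUS tables is decided by the plugin and its mapping file, which this sketch does not attempt to model.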