Re: [GUSDEV] DNA -> CDS -> AASEQUENCE

Hi Chris,

Thanks for your reply.

InsertSequenceFeatures.pm is helpful. I have a few more questions about it.
1. Does InsertSequenceFeatures have the same (or more) functionality as 
GBParser? Should we use InsertSequenceFeatures (supported) instead of 
GBParser (community)?
2. To use InsertSequenceFeatures, should we define the "mapFile" (XML) 
ourselves and pass it to the plugin as an argument?
3. There seem to be many ways to use GUS to represent the 
relationship "DNA->gene->CDS->AASEQUENCE", and the documentation of 
InsertSequenceFeatures.pm is not very straightforward to me. Could 
you or other GUS people suggest a method? For example, I have 
a GenBank file like the one below.

LOCUS       NC_006932            2124241 bp    DNA     circular BCT 08-APR-2005
DEFINITION  Brucella abortus biovar 1 str. 9-941 chromosome I, complete
            sequence.
ACCESSION   NC_006932
VERSION     NC_006932.1  GI:62288991
KEYWORDS    .
SOURCE      Brucella abortus biovar 1 str. 9-941
  ORGANISM  Brucella abortus biovar 1 str. 9-941
            Bacteria; Proteobacteria; Alphaproteobacteria; Rhizobiales;
            Brucellaceae; Brucella.
REFERENCE   1  (bases 1 to 2124241)
  AUTHORS   Halling,S.M., Peterson-Burch,B.D., Bricker,B.J., Zuerner,R.L.,
            Qing,Z., Li,L.L., Kapur,V., Alt,D.P. and Olsen,S.C.
  TITLE     Completion of the genome sequence of Brucella abortus and
            comparison to the highly similar genomes of Brucella melitensis and
            Brucella suis
  JOURNAL   J. Bacteriol. 187 (8), 2715-2726 (2005)
   PUBMED   15805518
REFERENCE   2  (bases 1 to 2124241)
  AUTHORS   .
  CONSRTM   NCBI Genome Project
  TITLE     Direct Submission
  JOURNAL   Submitted (06-APR-2005) National Center for Biotechnology
            Information, NIH, Bethesda, MD 20894, USA
REFERENCE   3  (bases 1 to 2124241)
  AUTHORS   Halling,S.M., Bricker,B.J., Alt,D.P., Peterson-Burch,B.D.,
            Zuerner,R.L., Olsen,S.C., Whipple,D.L., Zhang,Q., Li,L.-L. and
            Kapur,V.
  TITLE     Direct Submission
  JOURNAL   Submitted (03-FEB-2004) ARS, USDA, National Animal Disease Center,
            2300 N. Dayton, P.O. Box 70, Ames, IA 50010, USA
COMMENT     PROVISIONAL REFSEQ: This record has not yet been subject to final
            NCBI review. The reference sequence was derived from AE017223.
            COMPLETENESS: full length.
FEATURES             Location/Qualifiers
     source          1..2124241
                     /organism="Brucella abortus biovar 1 str. 9-941"
                     /mol_type="genomic DNA"
                     /strain="9-941"
                     /db_xref="taxon:262698"
                     /chromosome="I"
                     /biovar="1"
     gene            784..2274
                     /gene="dnaA"
                     /locus_tag="BruAb1_0001"
                     /db_xref="GeneID:3339217"
     CDS             784..2274
                     /gene="dnaA"
                     /locus_tag="BruAb1_0001"
                     /note="similar to BR0001, chromosomal replication
                     initiator protein DnaA"
                     /codon_start=1
                     /transl_table=11
                     /product="DnaA, chromosomal replication initiator protein"
                     /protein_id="YP_220785.1"
                     /db_xref="GI:62288992"
                     /db_xref="GeneID:3339217"
                     /translation="MKMDSAVSEEAFERLTAKLKARVGGEIYSSWFGRLKLDDISKSI
                     VRLSVPTAFLRSWINNHYSELLTELWQEENPQILKVEVVVRGVSRVVRSAAPAETCDN
                     AEAKPAVTPREKMVFPVGQSFGGQSLGEKRGSAVVAESAAATGAVLGSPLDPRYTFDT
                     FVDGASNRVALAAARTIAEAGSSAVRFNPLFIHASVGLGKTHLLQAIAAAALQRQEKA
                     RVVYLTAEYFMWRFATAIRDNNALSFKEQLRDIDLLVIDDMQFLQGKSIQHEFCHLLN
                     TLLDSAKQVVVAADRAPSELESLDVRVRSRLQGGVALEVAAPDYEMRLEMLRRRLASA
                     QCEDASLDIGEEILAHVARTVTGSGRELEGAFNQLLFRQSFEPNISIDRVDELLGHLT
                     RAGEPKRIRIEEIQRIVARHYNVSKQDLLSNRRTRTIVKPRQVAMYLAKMMTPRSLPE
                     IGRRFGGRDHTTVLHAVRKIEDLVGADTKLAQELELLKRLINDQAA"
...

ORIGIN      
        1 ttttccacac ttatccacag ggcgcgggcg ggactcggtt gcccctctga gtcaagcata
...
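
To make question 3 concrete, here is a minimal standalone sketch of the gene -> CDS -> protein chain as it appears in the record above. It is plain Python, not GUS code, and it makes no claim about how InsertSequenceFeatures or GBParser work internally or which GUS tables they fill. The names RECORD, parse_features and simple_span are hypothetical, the /translation value is truncated to its first two lines, and only plain "start..end" locations are handled.

```python
import re

# Trimmed copy of the dnaA features from the record above.
# ASSUMPTION: the /translation value is cut to its first two lines for brevity.
RECORD = """\
     gene            784..2274
                     /gene="dnaA"
                     /locus_tag="BruAb1_0001"
     CDS             784..2274
                     /gene="dnaA"
                     /locus_tag="BruAb1_0001"
                     /codon_start=1
                     /protein_id="YP_220785.1"
                     /translation="MKMDSAVSEEAFERLTAKLKARVGGEIYSSWFGRLKLDDISKSI
                     VRLSVPTAFLRSWINNHYSELLTELWQEENPQILKVEVVVRGVSRVVRSAAPAETCDN"
"""


def parse_features(text):
    """Return a list of {'key', 'location', 'qualifiers'} dicts (simple cases only)."""
    features = []
    qualifier = None  # name of the qualifier currently being read
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        indent = len(line) - len(line.lstrip())
        if indent <= 10:
            # Feature keys are shallowly indented; qualifiers sit much deeper.
            key, location = stripped.split(None, 1)
            features.append({"key": key, "location": location, "qualifiers": {}})
            qualifier = None
        elif stripped.startswith("/") and features:
            name, _, value = stripped[1:].partition("=")
            features[-1]["qualifiers"][name] = value.strip('"')
            qualifier = name
        elif qualifier and features:
            # Continuation of a multi-line value; sequences are joined without spaces.
            sep = "" if qualifier == "translation" else " "
            features[-1]["qualifiers"][qualifier] += sep + stripped.strip('"')
    return features


def simple_span(location):
    """Parse a plain 'start..end' location (1-based, inclusive); no join/complement."""
    match = re.fullmatch(r"(\d+)\.\.(\d+)", location)
    if match is None:
        raise ValueError(f"unsupported location: {location}")
    return int(match.group(1)), int(match.group(2))


features = parse_features(RECORD)
gene = next(f for f in features if f["key"] == "gene")
cds = next(f for f in features if f["key"] == "CDS")

start, end = simple_span(cds["location"])
cds_length = end - start + 1       # nucleotides covered by the CDS
expected_aa = cds_length // 3 - 1  # residues, ASSUMING the CDS ends in a stop codon

# One entry per level of the chain, tied together by the shared locus_tag.
chain = {
    "locus_tag": cds["qualifiers"]["locus_tag"],
    "gene_span": simple_span(gene["location"]),
    "cds_span": (start, end),
    "protein_id": cds["qualifiers"]["protein_id"],
    "protein_prefix": cds["qualifiers"]["translation"][:10],
}
print(chain, cds_length, expected_aa)
```

The point of the sketch is only that the gene and CDS share a locus_tag and a span on the DNA, and that the CDS carries the protein identifier and sequence. Whatever loading method is recommended would need to preserve those three links.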

I consider myself a newbie to GUS and the above questions may be naive to many of you. However, your help is highly appreciated.

Thanks again.

- Tony

Chris Stoeckert wrote:

> Hi Tony,
> A beta release of the plugin InsertSequenceFeatures.pm is now 
> available for loading GenBank records (as well as other formats). The 
> semantics of how to use the GUS tables for this purpose are encoded in 
> that plugin. Is that what you were asking?
> Chris
>
> On Sep 2, 2005, at 10:31 AM, Tony Zhang wrote:
>
>> This is an old topic and may have been discussed many times. Still, I 
>> would like suggestions on how to use GUS to store DNA, 
>> CDS, and the corresponding protein sequence. Suppose I have one original 
>> record in GenBank format. Thanks.
>>
>> - Tony
>>
>
> Chris Stoeckert, Ph.D.
> Research Associate Professor, Dept. of Genetics
> 1415 Blockley Hall, Center for Bioinformatics
> 423 Guardian Dr., University of Pennsylvania
> Philadelphia, PA 19104
> Ph: 215-573-4409 FAX: 215-573-3111