From: Scott C. <sc...@sc...> - 2010-08-26 18:33:56
Hi Anja,

I'm cc'ing this to the JBrowse list so they can chime in too.

The wonderful thing about JBrowse is that it is quite flexible in what it can display; that is also a downside, though, as it might require you to create the underlying graphics if native support isn't there. Since I don't really understand what you would want the image to look like, would it be possible for you to create a sketch of what you want to see? That would make it a lot easier to say whether what you want to do is possible, easy or hard.

Scott

On Thu, Aug 26, 2010 at 2:21 PM, Anja Friedrich <fri...@ho...> wrote:
> Hi Scott,
>
> thanks for your explanation. I will go through it.
> As I told J my request to add nested_tandem_repeat to SO was accepted. My
> problem is that I have 3 different repeat regions with 2 motifs. I want to
> display the regions and the motifs in JBrowse. I was wondering how I have to
> create the single lines. One line for the region and 1 for each motif? For
> all 3 repeat regions of course... Would that work?
>
> Cheers,
> Anja
>
>
>> Date: Thu, 26 Aug 2010 14:10:58 -0400
>> Subject: Re: [Gmod-schema] nested tandem repeats/gff
>> From: sc...@sc...
>> To: jm...@vc...
>> CC: fri...@ho...; gmo...@li...
>>
>> Hi Anja,
>>
>> I'm going to put comments below in response to both you and J.
>>
>> Scott
>>
>>
>> On Thu, Aug 26, 2010 at 11:54 AM, J.M.P. Alves <jm...@vc...> wrote:
>> > Hallo,
>> >
>> > A few possibilities I see, although I am not positive they are the
>> > actual problem:
>> >
>> > > 2 NTR program nested_repeat 831 1720 . . . ID=ID1;Name=ID1
>> >
>> > Are "nested_repeat" and "repeat_fragment" part of the sequence ontology
>> > vocabulary? I think you can use anything you want on the second column
>> > there ("program"), but the 3rd one is restricted.
>>
>> In my instance of Chado (a few weeks old at the most), SO has both
>> terms, so that isn't likely the problem, but it's a good idea to
>> check.
>> A query like this would tell you if it's present:
>>
>> select cvterm.* from cvterm join cv using (cv_id)
>> where cv.name='sequence' and cvterm.name='nested_repeat';
>>
>> >
>> > NTR is the name of the sequence (as in the FASTA file), right?
>>
>> Right, if a feature named NTR isn't already in the database, you will
>> have problems. From the error message you are getting, I'm guessing
>> it's already there (or it would have complained about that), but
>> perhaps I'm misremembering the order of error messaging. I would
>> suggest getting rid of the sequence-region directive, as unless you
>> have a fairly recent checkout of bioperl-live it will cause problems,
>> and in any event, will never be supported for defining a feature (and
>> this one isn't properly formed anyway--there's no start value and "bp"
>> isn't part of the spec). Instead, add a full gff line:
>>
>> NTR . contig 1 5428 . . . ID=NTR;Name=NTR
>>
>> Looking at your sample GFF again, it looks to me like you want these
>> features to reside on a feature called "taro", is that right? Or is
>> there a feature called NTR? If the contig/chromosome/whatever is
>> called taro, then you should replace the text in the first column of
>> the gff with "taro" and create a GFF line for it, like I did for NTR
>> above.
>>
>> >
>> > Another possible problem:
>> >
>> > > 3 NTR program repeat_fragment 1505 1553 . + . Parent=ID1
>> >
>> > I don't know if this is the case, but I thought every line had to have
>> > an ID attribute, e.g.:
>> >
>> > NTR program repeat_fragment 1505 1553 . + . ID=rf1;Parent=ID1
>>
>> Not so: ID tags are only needed in two cases:
>>
>> 1. To identify a feature so it can be referred to later to show
>> parentage (as Anja did in the sample GFF) and
>>
>> 2. To identify a reference sequence so it can be referred to in column
>> one (this is NOT part of the GFF3 spec, but life will work a lot
>> better with Chado if reference sequences look like
>> "ID=chr1;Name=chr1...")
>>
>> Scott
>>
>> >
>> > I hope some of these ideas help.
>> >
>> > J
>> >
>> > Anja Friedrich wrote:
>> >> Hi all,
>> >>
>> >> not sure if my earlier mail got through, because I was writing from a
>> >> different e-mail address.
>> >>
>> >> I tried to load nested tandem repeats into chado. As this feature doesn't
>> >> exist yet for gff3 I tried to get around it:
>> >>
>> >> 0 ##gff-version 3
>> >> 1 ##sequence-region taro 5428 bp
>> >> 2 NTR program nested_repeat 831 1720 . . . ID=ID1;Name=ID1
>> >> 3 NTR program repeat_fragment 1505 1553 . + . Parent=ID1
>> >> 4 NTR program repeat_fragement 473 483 . + . Parent=ID1
>> >>
>> >> But I get this error message:
>> >>
>> >> anou@anou-laptop:~$ gmod_bulk_load_gff3.pl --organism Taro --gfffile
>> >> taro.gff
>> >> Command line argument used for root
>> >> Preparing data for inserting into the chado database
>> >> (This may take a while ...)
>> >>
>> >> --------------------- WARNING ---------------------
>> >> MSG: Calling end without a defined start position
>> >> ---------------------------------------------------
>> >> Use of uninitialized value $featuretype in pattern match (m//) at
>> >> /usr/local/bin/gmod_bulk_load_gff3.pl line 808, <GEN0> line 1.
>> >> Use of uninitialized value $featuretype in pattern match (m//) at
>> >> /usr/local/bin/gmod_bulk_load_gff3.pl line 809, <GEN0> line 1.
>> >>
>> >> ------------- EXCEPTION: Bio::Root::Exception -------------
>> >> MSG: no cvterm for
>> >> STACK: Error::throw
>> >> STACK: Bio::Root::Root::throw
>> >> /usr/local/share/perl/5.10.1/Bio/Root/Root.pm:368
>> >> STACK: Bio::GMOD::DB::Adapter::get_type
>> >> /usr/local/share/perl/5.10.1/Bio/GMOD/DB/Adapter.pm:4579
>> >> STACK: /usr/local/bin/gmod_bulk_load_gff3.pl:838
>> >> -----------------------------------------------------------
>> >>
>> >> Does anyone have an idea? Can't I add the feature like this?
>> >>
>> >> Cheers,
>> >> Anja
>> >>
>> >> _______________________________________________
>> >> Gmod-schema mailing list
>> >> Gmo...@li...
>> >> https://lists.sourceforge.net/lists/listinfo/gmod-schema
>> >
>> > --
>> > -------------------------------
>> > João Marcelo Pereira Alves (J)
>> > Post-doctoral fellow
>> > MCV / VCU - Richmond, VA
>> > http://bioinfo.lpb.mic.vcu.edu
>> > f. 1-804-828-3897
>> >
>>
>> --
>> ------------------------------------------------------------------------
>> Scott Cain, Ph. D. scott at scottcain dot net
>> GMOD Coordinator (http://gmod.org/) 216-392-3087
>> Ontario Institute for Cancer Research
>
--
------------------------------------------------------------------------
Scott Cain, Ph. D.
scott at scottcain dot net
GMOD Coordinator (http://gmod.org/) 216-392-3087
Ontario Institute for Cancer Research
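
The checks discussed in this thread can be roughed out as a small pre-flight script to run before gmod_bulk_load_gff3.pl. This is only a sketch under stated assumptions, not part of GMOD or BioPerl: the function names `parse_attributes` and `check_gff3` are made up for illustration, the `KNOWN_TYPES` set is a hand-written stand-in for the cvterm query quoted above, and the sample uses NTR as the reference sequence with the sequence-region directive replaced by a contig line, as suggested. It checks three things: that column 3 holds a known term, that the name in column 1 has its own feature line with a matching ID, and that every Parent points at an ID defined in the file.

```python
def parse_attributes(column9):
    """Split a GFF3 attribute column such as 'ID=ID1;Name=ID1' into a dict."""
    pairs = (item.split("=", 1) for item in column9.split(";") if "=" in item)
    return {key.strip(): value.strip() for key, value in pairs}


def check_gff3(text, known_types):
    """Return (line_number, message) tuples for problems found in GFF3 text.

    known_types is a hypothetical stand-in for the terms in the Chado
    'sequence' cv; in practice it would come from a cvterm query.
    """
    rows = []
    ids = set()
    problems = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, directives and comments
        columns = line.split("\t")
        if len(columns) != 9:
            problems.append((number, "expected 9 tab-separated columns"))
            continue
        attributes = parse_attributes(columns[8])
        if "ID" in attributes:
            ids.add(attributes["ID"])
        rows.append((number, columns[0], columns[2], attributes))

    # Second pass, so that an ID may be defined after the line that uses it.
    for number, seqid, feature_type, attributes in rows:
        if feature_type not in known_types:
            problems.append((number, f"type '{feature_type}' is not a known term"))
        if seqid not in ids:
            problems.append(
                (number, f"no feature line with ID={seqid} for the reference sequence")
            )
        for parent in filter(None, attributes.get("Parent", "").split(",")):
            if parent not in ids:
                problems.append((number, f"Parent={parent} does not match any ID"))
    return problems


# Sample based on the GFF in this thread, tab-separated, with a contig line
# for the reference sequence in place of the sequence-region directive.
SAMPLE = "\n".join([
    "##gff-version 3",
    "\t".join(["NTR", ".", "contig", "1", "5428", ".", ".", ".", "ID=NTR;Name=NTR"]),
    "\t".join(["NTR", "program", "nested_repeat", "831", "1720", ".", ".", ".", "ID=ID1;Name=ID1"]),
    "\t".join(["NTR", "program", "repeat_fragment", "1505", "1553", ".", "+", ".", "Parent=ID1"]),
    "\t".join(["NTR", "program", "repeat_fragement", "473", "483", ".", "+", ".", "Parent=ID1"]),
])

# Assumption: a hand-written list, not the real contents of the ontology.
KNOWN_TYPES = {"contig", "nested_repeat", "repeat_fragment"}

if __name__ == "__main__":
    for number, message in check_gff3(SAMPLE, KNOWN_TYPES):
        print(f"line {number}: {message}")
```

With this sample the only line reported is line 5, whose type is spelled repeat_fragement rather than repeat_fragment. Whether that spelling contributed to the loader error above is not established in the thread; the script only shows that a term lookup would not find it.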