From: Scott C. <sc...@sc...> - 2024-01-11 23:01:42
Another thing I should have mentioned when I said this looked like GTF: GTF is
frequently called GFF2.5. What you posted would also pass as GFF 2.

On Thu, Jan 11, 2024 at 2:45 PM Scott Cain <sc...@sc...> wrote:

> Hi Hans,
>
> I'd say the problem is primarily that the snippet you've shown isn't GFF3;
> it looks much more like GTF (GFF3 would have "tag=value;tag2=value2" in the
> ninth column, as opposed to 'tag "value"; tag2 "value2"' in the ninth
> column). While JBrowse 2 does support GTF, it has some drawbacks, the
> biggest of which is that there isn't an indexed form of it, so the tracks
> will load very slowly since JBrowse has to load the entire file and parse
> it in order to draw any portion of it. If you had GFF3, each of the exon
> features would have a "Parent" tag that pointed to the ID of the parent
> transcript.
>
> The other thing that jumps out at me for the data snippet you provided is
> that all of the transcripts appear to share the same start and end
> coordinates as their child exons, so they would only show up as individual
> exons anyway. I would guess that elsewhere in your GTF file you have more
> complicated examples with transcripts that have multiple exon children.
>
> So, it's hard to say what I would expect to see without a better example
> of your GTF, but you probably want to look at generating GFF3 anyway, so
> that you can take advantage of tabix indexing of the GFF3 files.
>
> Of course, feel free to follow up with more questions or example data and
> we can figure out where to go from there.
>
> Scott
>
>
> On Thu, Jan 11, 2024 at 2:29 PM Hans Vasquez-Gross <hva...@un...>
> wrote:
>
>> Hello All,
>>
>> I have the output from isoseq collapse then pigeon index to create a
>> sorted .gff3 file for a new assembly. Currently, this gff3 file has
>> transcript and exon definitions. However, when I load this track data on
>> JBrowse2, it shows the transcripts as one large unit and the exons as a
>> separate unit. It doesn't seem to correctly render the intron/exon
>> boundaries. The annotation track is on top in yellow and the isoseq_reads
>> bam file is below.
>>
>> Example data:
>> ##pacbio-collapse-version 1.0
>> ##date Thu Jan 11 00:10:30 2024 UTC
>> ctg_p_c_003493_0_75000_89999 PacBio transcript 11587 12122 . - . gene_id "PB.32086"; transcript_id "PB.32086.1";
>> ctg_p_c_003493_0_75000_89999 PacBio exon 11587 12122 . - . gene_id "PB.32086"; transcript_id "PB.32086.1";
>> ctg_p_c_033075_0 PacBio transcript 20564 22031 . + . gene_id "PB.31043"; transcript_id "PB.31043.1";
>> ctg_p_c_033075_0 PacBio exon 20564 22031 . + . gene_id "PB.31043"; transcript_id "PB.31043.1";
>> ctg_p_c_033075_0 PacBio transcript 20564 21887 . + . gene_id "PB.31043"; transcript_id "PB.31043.2";
>> ctg_p_c_033075_0 PacBio exon 20564 21887 . + . gene_id "PB.31043"; transcript_id "PB.31043.2";
>> ctg_p_c_033075_0 PacBio transcript 20564 21758 . + . gene_id "PB.31043"; transcript_id "PB.31043.3";
>> ctg_p_c_033075_0 PacBio exon 20564 21758 . + . gene_id "PB.31043"; transcript_id "PB.31043.3";
>>
>>
>> Any suggestions?
>>
>> Thank you,
>> -Hans
>>
>> --
>>
>> *Hans Vasquez-Gross, Ph.D*
>>
>> Bioinformatics Scientist,
>> Nevada Bioinformatics Center
>>
>> https://www.unr.edu/bioinformatics
>>
>> hva...@un...
>>
>> _______________________________________________
>> Gmod-ajax mailing list
>> Gmo...@li...
>> https://lists.sourceforge.net/lists/listinfo/gmod-ajax
>>
>
>
> --
> ------------------------------------------------------------------------
> Scott Cain, Ph. D.  scott at scottcain dot net
> GMOD Project Manager (http://gmod.org/)  216-392-3087
> WormBase Developer (http://wormbase.org/)
> Alliance of Genome Resources Group Leader (http://alliancegenome.org/)
> VirusSeq Project Manager (https://virusseq-dataportal.ca/)
> Human Cancer Models Initiative Project Manager (https://hcmi-searchable-catalog.nci.nih.gov/)
>

--
------------------------------------------------------------------------
Scott Cain, Ph. D.  scott at scottcain dot net
GMOD Project Manager (http://gmod.org/)  216-392-3087
WormBase Developer (http://wormbase.org/)
Alliance of Genome Resources Group Leader (http://alliancegenome.org/)
VirusSeq Project Manager (https://virusseq-dataportal.ca/)
Human Cancer Models Initiative Project Manager (https://hcmi-searchable-catalog.nci.nih.gov/)
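
P.S. To make the ninth-column difference concrete, here is a minimal Python sketch of rewriting GTF-style attributes into GFF3-style ones with ID/Parent tags. The function name gtf_to_gff3_line is hypothetical, and the sketch assumes every record carries gene_id and transcript_id as in the example data. It is an illustration of the format difference, not a substitute for an established converter, and the output would still need to be sorted, compressed, and tabix-indexed separately.

```python
import re

# Matches GTF-style attribute pairs such as: gene_id "PB.31043";
ATTR_RE = re.compile(r'(\w+) "([^"]*)"')


def gtf_to_gff3_line(line: str) -> str:
    """Rewrite one GTF-style record with GFF3-style attributes (sketch only)."""
    # Split into the nine columns; the ninth keeps its internal spaces.
    cols = line.rstrip("\n").split(None, 8)
    if len(cols) != 9:
        raise ValueError(f"expected 9 columns, got {len(cols)}")

    attrs = dict(ATTR_RE.findall(cols[8]))
    feature_type = cols[2]
    # Assumption: transcript_id is present on every record, as in the example.
    transcript_id = attrs["transcript_id"]

    if feature_type == "transcript":
        # The transcript gets an ID that its exons can point to.
        new_attrs = [f"ID={transcript_id}", f"gene_id={attrs['gene_id']}"]
    elif feature_type == "exon":
        # Each exon names its parent transcript.
        new_attrs = [f"Parent={transcript_id}"]
    else:
        # Other feature types: just switch to tag=value syntax.
        new_attrs = [f"{key}={value}" for key, value in attrs.items()]

    return "\t".join(cols[:8] + [";".join(new_attrs)])


if __name__ == "__main__":
    record = ('ctg_p_c_033075_0 PacBio exon 20564 22031 . + . '
              'gene_id "PB.31043"; transcript_id "PB.31043.1";')
    print(gtf_to_gff3_line(record))
```

With a Parent tag on each exon, a viewer can group the exons under their transcript instead of drawing them as separate features.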