Re: [Gmod-ajax] Requirements for a GFF3 to render properly?

Hi Hans,

I'd say the problem is primarily that the snippet you've shown isn't GFF3;
it looks much more like GTF. GFF3 would have "tag=value;tag2=value2" in the
ninth column, as opposed to GTF's 'tag "value"; tag2 "value2"'. While
JBrowse 2 does support GTF, it has some drawbacks, the biggest of which is
that there isn't an indexed form of it, so the tracks will load very slowly
because JBrowse has to load and parse the entire file in order to draw any
portion of it. If you had GFF3, each of the exon features would have a
"Parent" tag that pointed to the ID of the parent transcript.
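
To make the difference in the ninth column concrete, here is a minimal
Python sketch that rewrites GTF-style attributes as GFF3-style ones. It is
only an illustration, not a full converter: the function name
gtf_line_to_gff3 is made up for this example, it assumes the real file is
tab-delimited with nine columns (the spaces in the quoted snippet are
presumably from the mail formatting), and it assumes transcript_id values
are unique and need no GFF3 escaping.

```python
import re

# Matches GTF-style attribute pairs such as: gene_id "PB.32086";
GTF_ATTR = re.compile(r'(\S+) "([^"]*)";?')


def gtf_line_to_gff3(line):
    """Rewrite one GTF feature line with GFF3-style attributes (sketch only)."""
    stripped = line.rstrip("\n")
    cols = stripped.split("\t")
    # Pass comments/directives and anything that is not 9 columns through as-is.
    if stripped.startswith("#") or len(cols) != 9:
        return stripped
    attrs = dict(GTF_ATTR.findall(cols[8]))
    transcript_id = attrs.get("transcript_id", "")
    if cols[2] == "transcript":
        # The transcript becomes the parent: its ID is the transcript_id.
        new_attrs = [f"ID={transcript_id}"]
        if "gene_id" in attrs:
            new_attrs.append(f"gene_id={attrs['gene_id']}")
    else:
        # Exons (and any other child features) point back at their transcript.
        new_attrs = [f"Parent={transcript_id}"]
    cols[8] = ";".join(new_attrs)
    return "\t".join(cols)
```

Applied to each line of the file, this would give every exon a Parent tag
that matches the ID of its transcript, which is the relationship described
above. A dedicated conversion tool is likely a better choice for real data.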

The other thing that jumps out at me for the data snippet you provided is
that all of the transcripts appear to have the same start and end
coordinates as their child exons, so each would only show up as a single
exon anyway. I would guess that elsewhere in your GTF file you have more
complicated examples with transcripts that have multiple exon children.
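
One quick way to check for those multi-exon cases is to count exon lines
per transcript_id. The sketch below does that in Python; the function name
exons_per_transcript is made up for this example, and it again assumes a
tab-delimited, nine-column GTF.

```python
import re
from collections import Counter

# Pulls the transcript_id value out of a GTF attribute column.
TRANSCRIPT_ID = re.compile(r'transcript_id "([^"]*)"')


def exons_per_transcript(gtf_lines):
    """Count exon features for each transcript_id in an iterable of GTF lines."""
    counts = Counter()
    for line in gtf_lines:
        cols = line.rstrip("\n").split("\t")
        # Skip comments, malformed lines, and non-exon features.
        if line.startswith("#") or len(cols) != 9 or cols[2] != "exon":
            continue
        match = TRANSCRIPT_ID.search(cols[8])
        if match:
            counts[match.group(1)] += 1
    return counts
```

Transcripts with a count greater than one are the ones where intron/exon
structure should be visible once the parent/child relationships are in
place.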

So, it's hard to say what I would expect to see without a better example of
your GTF, but you probably want to look at generating GFF3 anyway, so that
you can take advantage of tabix indexing of the GFF3 files.
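
Tabix expects the file to be sorted by sequence name and start position
before it is compressed with bgzip and indexed (for example with
"tabix -p gff"). As a rough sketch of that sorting step only, the Python
below keeps header/comment lines at the top and orders the features by
column 1 and then numerically by column 4. The function name
sort_gff3_lines is made up for this example, and it assumes a tab-delimited
file small enough to hold in memory.

```python
def sort_gff3_lines(lines):
    """Return GFF3 lines with '#' lines first, then features sorted for tabix."""
    headers, features = [], []
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # drop blank lines
        if line.startswith("#"):
            headers.append(line)
        else:
            features.append(line)

    def sort_key(feature):
        cols = feature.split("\t")
        # Column 1 is the sequence name, column 4 is the 1-based start.
        return (cols[0], int(cols[3]))

    # list.sort is stable, so a transcript listed before its exon stays first
    # when both share the same sequence name and start.
    features.sort(key=sort_key)
    return headers + features
```

The sorted lines can then be written back out, compressed with bgzip, and
indexed with tabix.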

Of course, feel free to follow up with more questions or example data and
we can figure out where to go from there.

Scott

On Thu, Jan 11, 2024 at 2:29 PM Hans Vasquez-Gross <hva...@un...>
wrote:

> Hello All,
>
> I have the output from isoseq collapse then pigeon index to create a
> sorted .gff3 file for a new assembly. Currently, this gff3 file has
> transcript and exon definitions.  However, when I load this track data on
> JBrowse2, it shows the transcripts as one large unit and the exons as a
> separate unit. It doesn't seem to correctly render the intron/exon
> boundaries. The annotation track is on top in yellow and the isoseq_reads
> bam file is below.
>
> Example data:
> ##pacbio-collapse-version 1.0
> ##date Thu Jan 11 00:10:30 2024 UTC
> ctg_p_c_003493_0_75000_89999 PacBio transcript 11587 12122 . - . gene_id "PB.32086"; transcript_id "PB.32086.1";
> ctg_p_c_003493_0_75000_89999 PacBio exon 11587 12122 . - . gene_id "PB.32086"; transcript_id "PB.32086.1";
> ctg_p_c_033075_0 PacBio transcript 20564 22031 . + . gene_id "PB.31043"; transcript_id "PB.31043.1";
> ctg_p_c_033075_0 PacBio exon 20564 22031 . + . gene_id "PB.31043"; transcript_id "PB.31043.1";
> ctg_p_c_033075_0 PacBio transcript 20564 21887 . + . gene_id "PB.31043"; transcript_id "PB.31043.2";
> ctg_p_c_033075_0 PacBio exon 20564 21887 . + . gene_id "PB.31043"; transcript_id "PB.31043.2";
> ctg_p_c_033075_0 PacBio transcript 20564 21758 . + . gene_id "PB.31043"; transcript_id "PB.31043.3";
> ctg_p_c_033075_0 PacBio exon 20564 21758 . + . gene_id "PB.31043"; transcript_id "PB.31043.3";
>
>
> Any suggestions?
>
> Thank you,
> -Hans
>
> --
>
> *Hans Vasquez-Gross, Ph.D*
>
> Bioinformatics Scientist,
> Nevada Bioinformatics Center
>
> https://www.unr.edu/bioinformatics
>
> hva...@un...
>
> _______________________________________________
> Gmod-ajax mailing list
> Gmo...@li...
> https://lists.sourceforge.net/lists/listinfo/gmod-ajax
>

-- 
------------------------------------------------------------------------
Scott Cain, Ph. D.                             scott at scottcain dot net
GMOD Project Manager (http://gmod.org/)
216-392-3087
WormBase Developer (http://wormbase.org/)
Alliance of Genome Resources Group Leader (http://alliancegenome.org/)
VirusSeq Project Manager (https://virusseq-dataportal.ca/)
Human Cancer Models Initiative Project Manager (https://hcmi-searchable-catalog.nci.nih.gov/)