From: Hans Vasquez-G. <hva...@un...> - 2024-01-18 20:28:04
Hi Scott,

Thank you! I converted the file from GTF to GFF with the AGAT utility. Now the tracks are rendering properly.

Best,
-Hans

________________________________
From: Scott Cain <sc...@sc...>
Sent: Thursday, January 11, 2024 3:01 PM
To: Hans Vasquez-Gross <hva...@un...>
Cc: gmo...@li... <gmo...@li...>
Subject: Re: [Gmod-ajax] Requirements for a GFF3 to render properly?

[EXTERNAL EMAIL]

Another thing I should have mentioned when I said this looked like GTF: GTF is frequently called GFF2.5. What you posted would also pass as GFF 2.

On Thu, Jan 11, 2024 at 2:45 PM Scott Cain <sc...@sc...> wrote:

Hi Hans,

I'd say the problem is primarily that the snippet you've shown isn't GFF3; it looks much more like GTF (GFF3 would have "tag=value;tag2=value2" in the ninth column, as opposed to 'tag "value"; tag2 "value2"' in the ninth column). While JBrowse 2 does support GTF, it has some drawbacks, the biggest of which is that there isn't an indexed form of it, so the tracks will load very slowly, since JBrowse has to load the entire file and parse it in order to draw any portion of it.

If you had GFF3, each of the exon features would have a "Parent" tag that pointed to the ID of the parent transcript. The other thing that jumps out at me in the data snippet you provided is that all of the transcripts appear to share the same start and end coordinates as their child exons, so they would only show up as individual exons anyway. I would guess that elsewhere in your GTF file you have more complicated examples with transcripts that have multiple exon children.

So, it's hard to say what I would expect to see without a better example of your GTF, but you probably want to look at generating GFF3 anyway, so that you can take advantage of tabix indexing of the GFF3 files.

Of course, feel free to follow up with more questions or example data and we can figure out where to go from there.

Scott

On Thu, Jan 11, 2024 at 2:29 PM Hans Vasquez-Gross <hva...@un...> wrote:

Hello All,

I have the output from isoseq collapse then pigeon index to create a sorted .gff3 file for a new assembly. Currently, this gff3 file has transcript and exon definitions. However, when I load this track data in JBrowse 2, it shows the transcripts as one large unit and the exons as a separate unit. It doesn't seem to correctly render the intron/exon boundaries. The annotation track is on top in yellow and the isoseq_reads bam file is below.

Example data:

##pacbio-collapse-version 1.0
##date Thu Jan 11 00:10:30 2024 UTC
ctg_p_c_003493_0_75000_89999 PacBio transcript 11587 12122 . - . gene_id "PB.32086"; transcript_id "PB.32086.1";
ctg_p_c_003493_0_75000_89999 PacBio exon 11587 12122 . - . gene_id "PB.32086"; transcript_id "PB.32086.1";
ctg_p_c_033075_0 PacBio transcript 20564 22031 . + . gene_id "PB.31043"; transcript_id "PB.31043.1";
ctg_p_c_033075_0 PacBio exon 20564 22031 . + . gene_id "PB.31043"; transcript_id "PB.31043.1";
ctg_p_c_033075_0 PacBio transcript 20564 21887 . + . gene_id "PB.31043"; transcript_id "PB.31043.2";
ctg_p_c_033075_0 PacBio exon 20564 21887 . + . gene_id "PB.31043"; transcript_id "PB.31043.2";
ctg_p_c_033075_0 PacBio transcript 20564 21758 . + . gene_id "PB.31043"; transcript_id "PB.31043.3";
ctg_p_c_033075_0 PacBio exon 20564 21758 . + . gene_id "PB.31043"; transcript_id "PB.31043.3";

Any suggestions?

Thank you,
-Hans

--
Hans Vasquez-Gross, Ph.D
Bioinformatics Scientist, Nevada Bioinformatics Center
https://www.unr.edu/bioinformatics
hva...@un...

_______________________________________________
Gmod-ajax mailing list
Gmo...@li...
https://lists.sourceforge.net/lists/listinfo/gmod-ajax

--
------------------------------------------------------------------------
Scott Cain, Ph. D.
scott at scottcain dot net
GMOD Project Manager (http://gmod.org/)
216-392-3087
WormBase Developer (http://wormbase.org/)
Alliance of Genome Resources Group Leader (http://alliancegenome.org/)
VirusSeq Project Manager (https://virusseq-dataportal.ca/)
Human Cancer Models Initiative Project Manager (https://hcmi-searchable-catalog.nci.nih.gov/)
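
The ninth-column difference Scott describes (GTF's 'tag "value";' pairs versus GFF3's "tag=value" pairs, with each exon carrying a "Parent" tag that points to its transcript's ID) can be sketched in Python roughly as follows. This is only an illustration of the two attribute styles, not how AGAT performs the conversion. The function names are hypothetical, and the sketch assumes tab-separated, nine-column input in which every transcript and exon line carries a transcript_id.

```python
import re

# Matches GTF-style attribute pairs such as: gene_id "PB.31043";
GTF_ATTR = re.compile(r'(\S+)\s+"([^"]*)"')


def parse_gtf_attributes(text):
    """Turn 'tag "value"; tag2 "value2";' into a dict (hypothetical helper)."""
    return dict(GTF_ATTR.findall(text))


def gtf_line_to_gff3(line):
    """Rewrite column nine of one GTF feature line in GFF3 style.

    Assumption: transcripts become features with ID=<transcript_id>,
    and exons point at them with Parent=<transcript_id>.
    """
    cols = line.rstrip("\n").split("\t")
    if len(cols) != 9:
        raise ValueError("expected 9 tab-separated columns")
    attrs = parse_gtf_attributes(cols[8])
    transcript_id = attrs["transcript_id"]
    if cols[2] == "transcript":
        parts = [f"ID={transcript_id}"]
        if "gene_id" in attrs:
            # Kept as an ordinary attribute; no gene feature exists to be a Parent.
            parts.append(f"gene_id={attrs['gene_id']}")
    elif cols[2] == "exon":
        parts = [f"Parent={transcript_id}"]
    else:
        # Other feature types: just switch to tag=value syntax.
        parts = [f"{key}={value}" for key, value in attrs.items()]
    cols[8] = ";".join(parts)
    return "\t".join(cols)


if __name__ == "__main__":
    exon = "\t".join([
        "ctg_p_c_033075_0", "PacBio", "exon", "20564", "22031", ".", "+", ".",
        'gene_id "PB.31043"; transcript_id "PB.31043.1";',
    ])
    print(gtf_line_to_gff3(exon))
```

For real data, a dedicated converter such as AGAT (which Hans used) is the safer route. Once the file is valid GFF3, sorting it, compressing it with bgzip, and indexing it with `tabix -p gff` gives the indexed form Scott mentions.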