Hi Nowlan, I really appreciate all your help with this. It's working great now. Best, Stephanie
Hi Stephanie, You will need to load the genome sequence file from Dryad (american_eel_genome_v5.fasta) as a custom genome in IGB instead of the genome sequence file from NCBI (GCA_001606085.1_ASM160608v1_genomic.fna.gz). Then load the gff file I previously attached (american_eel_genome_v5.sorted.gff.gz). I have attached a screenshot of what this looks like in my IGB. I don't know too much about the American eel genome, but the sequence and annotation on NCBI (mentioned in your Biostars post) is formatted...
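One way to check in advance whether a sequence file and an annotation file belong together is to compare the sequence names they use. The following is a minimal sketch, not part of IGB: the function names are hypothetical, and it assumes that each FASTA header's first whitespace-delimited token is the sequence name and that the first tab-separated GFF column holds the matching name.

```python
def fasta_names(lines):
    """Collect sequence names: the first token after '>' on each header line."""
    names = set()
    for line in lines:
        if line.startswith(">"):
            tokens = line[1:].split()
            if tokens:
                names.add(tokens[0])
    return names


def gff_seqids(lines):
    """Collect the values found in column 1 of the GFF feature lines."""
    ids = set()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # anything after this directive is embedded sequence
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        ids.add(line.split("\t", 1)[0])
    return ids


def unmatched_seqids(fasta_lines, gff_lines):
    """Return GFF sequence names that have no counterpart in the FASTA."""
    return sorted(gff_seqids(gff_lines) - fasta_names(fasta_lines))
```

If the returned list is non-empty, the annotation refers to sequences the genome file does not define under that name, which is one possible reason for annotations failing to show up. For real files, pass open file handles (for example from `gzip.open(path, "rt")` for .gz files) instead of lists.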
Hi Nowlan, I loaded the recommended file (american_eel_genome_v5.sorted.gff.gz) in IGB, and I have taken a screenshot of the results and attached it to this comment. This is my first time viewing or attempting to interpret a gene annotation, but aren't I supposed to see bands of colour, not base pair letters? Thanks, Stephanie
Hi Stephanie, The issue appears to be with the gff file you are trying to view. The file does not appear to contain gene annotations mapped to a genome. The Anguilla rostrata (American eel) annotation is available from Dryad. If you unpack the download, there is a file called american_eel_genome_v5.gff that appears to contain the annotation. I sorted, compressed, and indexed the file (attached). Try loading it (american_eel_genome_v5.sorted.gff.gz) in IGB and let me know if it works for you. Best, N...
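For readers wondering what "sorted" means here: position-based indexes generally expect features ordered by sequence name and then by start coordinate. The post does not say which tools were used, so the following is only a pure-Python sketch of the sorting step under that assumption. The function name is hypothetical, and compression and indexing are left out.

```python
def sort_gff(lines):
    """Return GFF lines with comment lines first, then the features
    ordered by sequence name (column 1) and start position (column 4)."""
    headers, features = [], []
    for line in lines:
        if not line.strip():
            continue  # drop blank lines
        if line.startswith("#"):
            headers.append(line)  # directives and comments keep their order
        else:
            features.append(line)

    def sort_key(line):
        cols = line.split("\t")
        return (cols[0], int(cols[3]))

    return headers + sorted(features, key=sort_key)
```

Note that this moves every comment line to the top, so it is not suitable for files that rely on mid-file directives.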
Are there a lot of reference sequences mentioned in the GFF file? I would recommend opening the same file sequence in an IDE with a debugger to see where the hang-up occurs. It would be nice if IGB could handle the various issues that come up with NCBI gff files. NCBI is a major clearinghouse for genomic data that many people use. Regards, Ann On Feb 20, 2019, at 10:46 AM, Nowlan Freese nfreese@users.sourceforge.net wrote: Hi Stephanie, The heap size at the bottom right has to do with...
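A quick way to answer the question about reference sequences is to count the distinct names in the first GFF column. This is a minimal sketch using only the Python standard library. The function name is hypothetical, and it assumes a tab-separated GFF whose first column is the sequence name.

```python
from collections import Counter


def count_features_per_seqid(lines):
    """Count how many feature lines refer to each sequence name."""
    counts = Counter()
    for line in lines:
        if line.startswith("##FASTA"):
            break  # embedded sequence data follows; stop counting
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines, comments, and directives
        counts[line.split("\t", 1)[0]] += 1
    return counts
```

`len(count_features_per_seqid(handle))` then gives the number of distinct reference sequences, and `.most_common(5)` lists the ones with the most features.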
Hi Stephanie, The heap size at the bottom right has to do with the amount of memory (RAM) being used by IGB. By default, IGB will use 25% of your computer's available memory (this can be changed if needed). IGB manages its memory usage automatically, so there's nothing you need to do. The only concern would be if you are loading lots of data at once and running out of available memory. Nothing in IGB should take many hours; retrieving chromosomes normally takes a few seconds. How much memory do you have...
Hi Nowlan, Thanks for the info. I loaded the genome as a fna file, then the annotation as a .gff file, which was recommended to me on another online forum. The program has been "retrieving chromosomes" for many hours. Do you know why this might be? Also, do you know what the "maximum heap size" is? It's displayed in the bottom right corner of the IGB interface, and the proportion of max heap size being used continues to change as the program attempts to retrieve chromosomes. Thanks again!
Hi Stephanie, To load a custom genome, IGB is looking for sequence files such as fasta, fna, or 2bit. If the genome for your data is not available, and you do not have a sequence file, you can drag and drop the .gff file directly into IGB and then click Load Data. Let me know if that helps. Nowlan
The IGB website says their program supports many different file formats, including gff. I have saved a genome from NCBI in several different file formats to my computer (using Ubuntu), but when I try to open a custom genome in IGB, it only recognizes the .fna file, even though I have the genome saved as .gbff and .gff in the same folder. It's like the program won't recognize these file types saved on my computer. Does anyone know why this might be happening? Thanks in advance.