Name | Modified | Size | Downloads / Week |
---|---|---|---|
ncbi_refseq_complete_microbes.fna.gz | 2012-12-07 | 2.3 GB | |
taxids.txt | 2012-12-07 | 582.8 kB | |
README.txt | 2012-12-07 | 886 Bytes | |
ncbi_refseq_complete_viruses.fna.gz | 2012-12-07 | 30.1 MB | |
ncbi_refseq_complete_protozoa.fna.gz | 2012-12-07 | 93.4 MB | |
Totals: 5 Items | | 2.4 GB | 0 |
This directory contains files useful for GAAS (http://sourceforge.net/projects/gaas).

The *.fna.gz files are gzip-compressed FASTA nucleic acid sequence files obtained from NCBI RefSeq on Dec 7, 2012 (RefSeq Release 56) (ftp://ftp.ncbi.nih.gov/refseq/release/). All sequences whose title contained any of the following words were removed: shotgun, contig, partial, end, part. The resulting FASTA files therefore contain complete nucleic acid sequences (from viruses, microbes and protozoa).

The taxids.txt.gz file is gzipped and contains the accession number, taxon ID and taxon name of the sequences present in the FASTA files, based on the NCBI taxonomy (ftp://ftp.ncbi.nih.gov/pub/taxonomy/). The taxon IDs of the following types of sequences were removed unless the main genome was also present: plasmid, transposon, chloroplast, plastid, mitochondrion, apicoplast, macronuclear, cyanelle and kinetoplast.
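The title-word filter described above could be sketched roughly as follows. This is not the script that produced these files. It is a minimal illustration in Python, and it assumes that the words were matched case-insensitively and as whole words (so that "end" does not reject a title containing "Legend"). The README does not state the actual matching rule. The function names `read_fasta`, `is_complete`, `filter_complete` and `open_fasta` are hypothetical.

```python
import gzip
import io
import re

# Words listed in the README. Whole-word, case-insensitive matching is an
# ASSUMPTION; the README does not say how titles were matched.
EXCLUDED_WORDS = ("shotgun", "contig", "partial", "end", "part")
EXCLUDED_RE = re.compile(
    r"\b(?:" + "|".join(EXCLUDED_WORDS) + r")\b", re.IGNORECASE
)


def read_fasta(handle):
    """Yield (title, sequence) pairs from a text-mode FASTA handle."""
    title, chunks = None, []
    for line in handle:
        line = line.strip()
        if line.startswith(">"):
            if title is not None:
                yield title, "".join(chunks)
            title, chunks = line[1:], []
        elif title is not None and line:
            chunks.append(line)
    if title is not None:
        yield title, "".join(chunks)


def is_complete(title):
    """Return True if the title contains none of the excluded words."""
    return EXCLUDED_RE.search(title) is None


def filter_complete(handle):
    """Yield only the records whose title passes is_complete()."""
    for title, seq in read_fasta(handle):
        if is_complete(title):
            yield title, seq


def open_fasta(path):
    """Open a plain or gzip-compressed FASTA file in text mode."""
    if str(path).endswith(".gz"):
        return gzip.open(path, "rt")
    return open(path, "r")


if __name__ == "__main__":
    # Made-up records for illustration only; not taken from the real files.
    demo = io.StringIO(
        ">seq1 Example virus, complete genome\nACGT\nAC\n"
        ">seq2 Example bacterium, partial sequence\nGG\n"
    )
    for title, seq in filter_complete(demo):
        print(title, len(seq))
```

To run the same filter over one of the files listed above, pass `open_fasta("ncbi_refseq_complete_viruses.fna.gz")` to `filter_complete`. Because these files were already filtered, most or all records would be expected to pass, although a different matching rule in the original pipeline could give different results.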