A simple command-line utility that calculates the sizes of biological sequences (DNA or protein) in a single- or multi-FASTA file. It reports average sequence size, GC (or methionine) content, N50, N90, N95, the number of N's, and total bases, and can also report codon GC content if requested.
Features
- sequence sizes (DNA or protein)
- GC content, in percentage (for each sequence and overall weighted average)
- methionine content, in absolute number and percentage (protein only)
- codon GC content (DNA only)
- multi FASTA input files
- reports average sequence size, total nucleotides, N50, N90, and N95
- by default, report shows sequence names sorted in descending size order
- report is tab-delimited text with results from one FASTA entry per line
- gzip-compressed input supported
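The statistics listed above can be illustrated with a short Python sketch. This is not mfsizes' own code. The function names (`read_fasta`, `nx`, `gc_percent`, `weighted_gc`) are hypothetical, and two details are assumptions: ambiguous bases such as N are counted in the GC denominator, and gzip input is detected from a `.gz` file extension. The tool itself may handle both differently.

```python
import gzip
from typing import Iterable, Iterator, List, Tuple


def read_fasta(path: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from a plain or gzip-compressed FASTA file.

    Assumption: compression is detected from the ".gz" extension.
    """
    opener = gzip.open if path.endswith(".gz") else open
    name, chunks = None, []
    with opener(path, "rt") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            if line.startswith(">"):
                if name is not None:
                    yield name, "".join(chunks)
                name, chunks = line[1:].strip(), []
            elif name is not None:
                chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def nx(lengths: Iterable[int], percent: int) -> int:
    """Return Nx: the length L such that sequences of length >= L
    together cover at least `percent` percent of all bases."""
    ordered = sorted(lengths, reverse=True)
    total = sum(ordered)
    running = 0
    for length in ordered:
        running += length
        # Integer comparison avoids floating-point rounding at the threshold.
        if running * 100 >= total * percent:
            return length
    return 0


def gc_percent(seq: str) -> float:
    """GC content of one sequence as a percentage of its full length.

    Assumption: N and other ambiguity codes count toward the length.
    """
    if not seq:
        return 0.0
    upper = seq.upper()
    return 100.0 * (upper.count("G") + upper.count("C")) / len(upper)


def weighted_gc(seqs: Iterable[str]) -> float:
    """Overall GC percentage weighted by sequence length
    (total G+C bases divided by total bases)."""
    gc = 0
    total = 0
    for seq in seqs:
        upper = seq.upper()
        gc += upper.count("G") + upper.count("C")
        total += len(upper)
    return 100.0 * gc / total if total else 0.0


def report(records: List[Tuple[str, str]]) -> str:
    """Tab-delimited report, one entry per line, longest sequence first."""
    rows = sorted(records, key=lambda r: len(r[1]), reverse=True)
    return "\n".join(
        f"{name}\t{len(seq)}\t{gc_percent(seq):.2f}" for name, seq in rows
    )
```

For lengths 2 through 10 (54 bases in total), the three longest sequences sum to 27, which is half the total, so `nx(range(2, 11), 50)` returns 8. The weighted GC average differs from a plain mean of per-sequence percentages because longer sequences contribute more bases to the total.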
Categories
Bio-Informatics
License
GNU General Public License version 3.0 (GPLv3)