A simple command-line utility that calculates the sizes of biological sequences (DNA or protein) in a FASTA or multi-FASTA file. It reports average sequence size, GC content (or methionine content for protein), N50, N90, N95, the number of Ns, and total bases, and can optionally report GC content by codon.
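
For readers unfamiliar with N50, N90, and N95, the following is a minimal Python sketch of how such statistics are commonly defined. It is not taken from mfsizes (which is written in Perl); the function name `nx` and the exact tie-breaking rule are assumptions made for illustration.

```python
def nx(lengths, percent):
    """Return the Nx statistic for a list of sequence lengths.

    Assumed definition (the common one, not confirmed for mfsizes):
    the largest length L such that sequences of length >= L together
    contain at least `percent` percent of all bases.
    """
    if not lengths:
        raise ValueError("no sequence lengths given")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence down, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        # Integer comparison avoids floating-point rounding issues.
        if running * 100 >= percent * total:
            return length
    return min(lengths)  # only reached if percent > 100


# Hypothetical example data: five sequences, 300 bases in total.
lengths = [100, 80, 60, 40, 20]
print(nx(lengths, 50))  # 80
print(nx(lengths, 90))  # 40
print(nx(lengths, 95))  # 20
```

With these lengths, the two longest sequences (100 + 80 = 180 bases) already cover half of the 300 total bases, so N50 is 80.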

Features

  • sequence sizes (DNA or protein)
  • GC content, in percentage (for each sequence and overall weighted average)
  • methionine content, in absolute number and percentage (protein only)
  • codon GC content (DNA only)
  • multi FASTA input files
  • reports average sequence size, total nucleotides, N50, N90, and N95
  • by default, report shows sequence names sorted in descending size order
  • report is tab-delimited text with results from one FASTA entry per line
  • gzip-compressed input supported
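
The features above can be illustrated with a short Python sketch: it reads FASTA records (optionally gzip-compressed), computes each sequence's size and GC percentage, and emits one tab-delimited line per entry, sorted by descending size. This is an illustration under stated assumptions, not the mfsizes implementation: the function names (`open_fasta`, `read_fasta`, `gc_percent`, `report_rows`), the column order, the two-decimal formatting, and the choice to divide GC counts by the full sequence length (Ns included) are all hypothetical.

```python
import gzip
import io


def open_fasta(path):
    """Open a FASTA file as text, detecting gzip by its magic bytes."""
    with open(path, "rb") as fh:
        magic = fh.read(2)
    if magic == b"\x1f\x8b":
        return gzip.open(path, "rt")
    return open(path, "rt")


def read_fasta(handle):
    """Yield (name, sequence) pairs from a text handle of FASTA records."""
    name, chunks = None, []
    for line in handle:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if name is not None:
                yield name, "".join(chunks)
            fields = line[1:].split()
            name, chunks = (fields[0] if fields else ""), []
        elif name is not None:
            chunks.append(line)
    if name is not None:
        yield name, "".join(chunks)


def gc_percent(seq):
    """GC content as a percentage of the full sequence length (assumption)."""
    if not seq:
        return 0.0
    upper = seq.upper()
    return 100.0 * (upper.count("G") + upper.count("C")) / len(seq)


def report_rows(records):
    """Return tab-delimited rows (name, size, GC%), largest sequence first."""
    ordered = sorted(records, key=lambda rec: len(rec[1]), reverse=True)
    return [f"{name}\t{len(seq)}\t{gc_percent(seq):.2f}" for name, seq in ordered]


def overall_gc(records):
    """Length-weighted average GC%: total G+C over total bases."""
    return gc_percent("".join(seq for _, seq in records))


# Hypothetical in-memory input standing in for a multi-FASTA file.
demo = io.StringIO(">a example\nGGCC\nAT\n>b\nATAT\n>c\nGCGCGCGC\n")
records = list(read_fasta(demo))
for row in report_rows(records):
    print(row)
```

For a file on disk, `read_fasta(open_fasta(path))` would supply the records in the same way. Computing the overall GC figure from pooled counts rather than averaging the per-sequence percentages is what makes it a weighted average.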

Categories

Bio-Informatics

License

GNU General Public License version 3.0 (GPLv3)

Additional Project Details

Intended Audience

Science/Research

User Interface

Command-line

Programming Language

Perl

Related Categories

Perl Bio-Informatics Software

Registered

2011-05-20