As the major biological sequence databases (e.g. DDBJ, EMBL-Bank, GenBank, RefSeq and UniProtKB) continue to grow, there is an increasing need for simple tools that can reformat the flat-file formats used by these databases into FASTA sequence format for use with other tools.
While many multi-purpose sequence reformatting tools are available (e.g. EMBOSS and Readseq), their generic support for many sequence formats and their extensive feature sets limit their performance compared to older dedicated tools. Unfortunately, the older tools have issues on modern platforms (e.g. support for files >2GB, library dependencies and binary compatibility).
The 'x2fasta' project aims to provide a collection of highly efficient sequence reformatting tools based on the programs provided with WU-BLAST, maintaining command-line and output compatibility with these tools where possible.
The currently available programs are:
Complete source code can be found in the x2fasta Subversion repository (see Code).
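
To illustrate the kind of flat-file to FASTA reformatting described above, here is a minimal Python sketch that converts GenBank-style entries. It is not taken from the x2fasta source and makes no claim to match the output of the x2fasta or WU-BLAST programs: the function name genbank_to_fasta, the header layout (accession followed by the definition text) and the 60-column line width are assumptions made for this example, and only the ACCESSION, DEFINITION and ORIGIN sections are interpreted.

```python
import io
import re


def genbank_to_fasta(handle, width=60):
    """Yield one FASTA record (as a string) per GenBank-style entry in handle.

    Illustrative sketch only: the header layout is an assumption and is not
    the format written by the x2fasta or WU-BLAST tools.
    """
    accession, definition, residues, section = "", [], [], None
    for raw in handle:
        line = raw.rstrip("\r\n")
        if line.startswith("//"):
            # End of entry: emit the record and reset the state.
            if accession or residues:
                seq = "".join(residues)
                header = ">" + " ".join([accession] + definition).strip()
                body = [seq[i:i + width] for i in range(0, len(seq), width)]
                yield "\n".join([header] + body) + "\n"
            accession, definition, residues, section = "", [], [], None
        elif not line.strip():
            continue  # ignore blank lines
        elif not line[0].isspace():
            # A keyword in column 1 starts a new top-level section.
            parts = line.split(None, 1)
            section = parts[0]
            value = parts[1].strip() if len(parts) > 1 else ""
            if section == "ACCESSION" and value:
                accession = value.split()[0]
            elif section == "DEFINITION":
                definition.append(value)
        elif section == "DEFINITION":
            definition.append(line.strip())  # continuation line
        elif section == "ORIGIN":
            # Sequence lines carry a position number and spaced blocks.
            residues.append(re.sub(r"[^A-Za-z]", "", line))


# Made-up entry used only to exercise the function.
SAMPLE = """\
LOCUS       TEST0001                  70 bp    DNA     linear   SYN 01-JAN-2000
DEFINITION  Hypothetical example sequence,
            for illustration only.
ACCESSION   TEST0001
ORIGIN
        1 acgtacgtac gtacgtacgt acgtacgtac gtacgtacgt acgtacgtac gtacgtacgt
       61 acgtacgtac
//
"""

for record in genbank_to_fasta(io.StringIO(SAMPLE)):
    print(record, end="")
```

Because the function reads its input line by line and holds only one entry in memory at a time, the same approach can be applied to very large files by passing an open file object instead of the in-memory sample.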