Warning against using phylip2fasta.sh
BBMap short read aligner, and other bioinformatic tools.
The script phylip2fasta.sh removes gap symbols (-) from multiple sequence alignments. This is undocumented as far as I can tell: nothing about it is mentioned when issuing phylip2fasta.sh -h.
The input for this script is a file in the "phylip format", a commonly used format for multiple sequence alignments (MSA), in which all sequences have the same length. Length differences between the underlying sequences are commonly accommodated by using "gap characters", very often the dash (-). It makes no sense to remove those when converting a phylip MSA to an MSA in Fasta format: without the gaps the sequences no longer have equal length, and the output is no longer an alignment.
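To illustrate the behavior I would expect, here is a minimal Python sketch of a gap-preserving conversion. It is not how phylip2fasta.sh is implemented. The function name phylip_to_fasta and the example taxa are made up for illustration, and the sketch assumes a simple sequential (non-interleaved) phylip file with one "name sequence" record per line.

```python
def phylip_to_fasta(phylip_text: str) -> str:
    """Convert sequential PHYLIP text to FASTA, keeping gap characters.

    Hypothetical helper for illustration only; assumes one record per line,
    with the name and the sequence separated by whitespace.
    """
    # Ignore blank lines; the first remaining line is the PHYLIP header.
    lines = [ln for ln in phylip_text.splitlines() if ln.strip()]
    n_seqs, n_sites = (int(x) for x in lines[0].split()[:2])

    records = []
    for ln in lines[1:1 + n_seqs]:
        name, seq = ln.split(None, 1)
        # Remove whitespace inside the sequence only; '-' gaps are kept.
        seq = "".join(seq.split())
        if len(seq) != n_sites:
            raise ValueError(f"{name}: expected {n_sites} sites, got {len(seq)}")
        records.append(f">{name}\n{seq}")
    return "\n".join(records) + "\n"


example = """3 8
taxonA ACGT--GT
taxonB ACGTACGT
taxonC AC--ACGT
"""

fasta = phylip_to_fasta(example)
print(fasta, end="")
```

With the gaps kept, every sequence in the Fasta output still has the length stated in the phylip header, so the result remains a valid alignment.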
Suggestion:
Change the default behavior so that gap symbols are NOT removed, or at least mention the removal in the built-in documentation.