Welcome to Pontos, version 3.0
What is Pontos
Pontos is an easy-to-use, graphical Java program that calculates uncorrected distance (or similarity) matrices from DNA sequence alignments in PHYLIP format. It also creates ["difference" alignments] from regular ones (and vice versa).
It can handle gaps and ambiguities in different ways, explicitly and in a manner that is easy to cite for reproducibility.
Gaps can be:
- all used;
- all ignored;
- ignored only at the ends of the sequences, in a pairwise manner;
- ignored only at the ends, but globally (in effect trimming the whole alignment to the region covered by every sequence, i.e. from the latest sequence start to the earliest sequence end).
(Gaps shared by the two sequences being compared are always ignored.)
See [Gap modes explained] for a graphical representation of the gap handling modes.
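The four gap modes listed above can be illustrated with a minimal Java sketch. This is not the Pontos source code: the class, enum and method names are hypothetical, the sequences are assumed to be aligned (equal length), uppercase, with `-` as the gap character, and ambiguities are not handled. A gap facing a base is counted as a full difference, and gaps shared by both sequences are skipped in every mode.

```java
final class GapModeSketch {
    // Hypothetical names for the four gap modes described in the text.
    enum GapMode { USE_ALL, IGNORE_ALL, IGNORE_TERMINAL_PAIRWISE, IGNORE_TERMINAL_GLOBAL }

    // Index of the first non-gap character, or the length if the sequence is all gaps.
    static int firstBase(String s) {
        int i = 0;
        while (i < s.length() && s.charAt(i) == '-') i++;
        return i;
    }

    // Index just past the last non-gap character, or 0 if the sequence is all gaps.
    static int endOfBases(String s) {
        int i = s.length();
        while (i > 0 && s.charAt(i - 1) == '-') i--;
        return i;
    }

    // Half-open column window [start, end) in which none of the given
    // sequences is inside one of its terminal gaps.
    static int[] window(String... seqs) {
        int start = 0, end = Integer.MAX_VALUE;
        for (String s : seqs) {
            start = Math.max(start, firstBase(s));
            end = Math.min(end, endOfBases(s));
        }
        return new int[] {start, end};
    }

    // Returns {differences, positionsCompared} for sequences i and j of the alignment.
    static int[] count(String[] alignment, int i, int j, GapMode mode) {
        String a = alignment[i], b = alignment[j];
        int[] w;
        switch (mode) {
            case IGNORE_TERMINAL_PAIRWISE: w = window(a, b); break;       // trim per pair
            case IGNORE_TERMINAL_GLOBAL:   w = window(alignment); break;  // trim for all sequences
            default:                       w = new int[] {0, a.length()};
        }
        int differences = 0, compared = 0;
        for (int col = w[0]; col < w[1]; col++) {
            char x = a.charAt(col), y = b.charAt(col);
            boolean gapX = x == '-', gapY = y == '-';
            if (gapX && gapY) continue;                                 // shared gaps: always ignored
            if ((gapX || gapY) && mode == GapMode.IGNORE_ALL) continue; // any gap: ignored
            compared++;
            if (x != y) differences++;                                  // gap vs. base is a full difference
        }
        return new int[] {differences, compared};
    }

    public static void main(String[] args) {
        String[] alignment = {"--ACGT-A", "ACACCTTA", "ACAC-T--"};
        for (GapMode mode : GapMode.values()) {
            int[] r = count(alignment, 0, 1, mode);
            System.out.println(mode + ": " + r[0] + " differences over " + r[1] + " positions");
        }
    }
}
```

For the first two sequences of this toy alignment, the sketch counts 4 differences over 8 positions when all gaps are used, 1 over 5 when all gaps are ignored, 2 over 6 with pairwise terminal trimming, and 1 over 4 with global terminal trimming (the third sequence ends two columns early).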
Ambiguities (things like R, Y, N, W, etc. in a DNA sequence) can be:
- always considered different;
- considered partially different (e.g. R would be 0.5 different from A or G);
- ignored in each pairwise comparison;
- removed globally, by dropping every column that shows any ambiguity.
[Weights for ambiguities] are calculated using simple probability rules.
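As an illustration of what such a probability rule might look like, here is a minimal Java sketch. It is an assumption, not the documented Pontos method: each ambiguity code is taken to stand for its possible bases with equal probability, the two positions are treated as independent, and the partial difference is one minus the probability that both positions hold the same base. The class and method names are hypothetical. The rule reproduces the example given above (R against A or G gives 0.5), but other cases, such as R against R, may be weighted differently by Pontos.

```java
final class AmbiguityWeightSketch {
    // IUPAC nucleotide codes and the bases each may stand for,
    // encoded as bit masks (A=1, C=2, G=4, T=8).
    private static final String CODES = "ACGTRYSWKMBDHVN";
    private static final int[] MASKS = {1, 2, 4, 8, 5, 10, 6, 9, 12, 3, 14, 13, 11, 7, 15};

    static int mask(char code) {
        int idx = CODES.indexOf(Character.toUpperCase(code));
        if (idx < 0) throw new IllegalArgumentException("Not an IUPAC nucleotide code: " + code);
        return MASKS[idx];
    }

    // ASSUMPTION: every base allowed by a code is equally likely and the two
    // positions are independent, so P(same) = |X ∩ Y| / (|X| * |Y|).
    static double difference(char x, char y) {
        int mx = mask(x), my = mask(y);
        double pSame = (double) Integer.bitCount(mx & my)
                / (Integer.bitCount(mx) * Integer.bitCount(my));
        return 1.0 - pSame;
    }

    public static void main(String[] args) {
        char[][] pairs = {{'R', 'A'}, {'R', 'C'}, {'N', 'A'}, {'R', 'Y'}};
        for (char[] p : pairs) {
            System.out.println(p[0] + " vs " + p[1] + ": " + difference(p[0], p[1]));
        }
    }
}
```

Under these assumptions R against A gives 0.5, R against C gives 1.0 (no shared base), and N against A gives 0.75.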
Pontos was written on Linux, but should run on any system where Java works. Pontos is licensed under the GPL version 3.
The only program that I have found whose feature set comes close to the one implemented in Pontos is EMBOSS' distmat.
Pros of distmat
Compared to Pontos, in my understanding distmat might be considered better in that it:
- is easy to automate from the command line, if that is what you need and you know how to do it;
- allows for other types of distance measures, like Jukes-Cantor, Kimura 2-parameter, etc.;
- allows arbitrary weight assignment to gaps, while Pontos always considers gaps as a full difference;
- allows the choice of which codon positions to use in calculations;
- calculates protein distances;
- last but not least, has had a wide audience and a lot of testing over many years, which has ironed out bugs.
Cons of distmat compared to Pontos
Compared to Pontos, in my understanding distmat might be considered worse in that it:
- is harder to use for most biologists, being exclusively command-line driven;
- only calculates distances, not similarities;
- does not allow the user to specify that output values should be percentages;
- does not allow the user to choose the number of decimal places in the output matrix;
- does not allow for the output of sequence positions used in each calculation;
- does not generate detailed output on the actual number of differences found and the number of positions compared (the two numbers whose ratio is the distance);
- allows only a single weight to be assigned to all gaps, and does not differentiate between terminal and internal gaps -- it is therefore possible to ignore all gaps (like Pontos can do) by giving a weight of 0 to gaps (the default of distmat), but to ignore terminal gaps the user must manually give the start and end coordinates for the calculation (using -sbegin and -send);
- has fewer options on how to handle gaps, e.g. either globally or in each pairwise comparison;
- is not completely explicit about how different types of ambiguity comparisons are handled (the documentation only gives one example, a comparison of M and A giving a difference score of 0.5);
- has fewer options on how to handle ambiguous positions, e.g. either globally or in each pairwise comparison.
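The output options mentioned in the list above (distance or similarity, fraction or percentage, number of decimal places) all derive from the same two numbers: the differences found and the positions compared. The following Java sketch shows one way to produce such values. It is only an illustration: the class and method names are hypothetical, and similarity is assumed to be one minus the distance.

```java
import java.util.Locale;

final class OutputValueSketch {
    // Formats one matrix cell from the two raw counts.
    // ASSUMPTION: similarity = 1 - distance; a pair with no comparable
    // positions is reported as "NaN".
    static String format(int differences, int positionsCompared,
                         boolean similarity, boolean percent, int decimals) {
        if (positionsCompared == 0) return "NaN";
        double value = (double) differences / positionsCompared; // uncorrected distance
        if (similarity) value = 1.0 - value;
        if (percent) value *= 100.0;
        // Locale.ROOT keeps the decimal separator a dot on every system.
        return String.format(Locale.ROOT, "%." + decimals + "f", value);
    }

    public static void main(String[] args) {
        System.out.println(format(2, 6, false, false, 4)); // distance as a fraction
        System.out.println(format(2, 6, true, true, 2));   // similarity as a percentage
    }
}
```

With 2 differences over 6 positions, the first call yields 0.3333 and the second 66.67.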