I've been using the mauve-parser.pl and calculate-dxy.pl scripts and identified one possible bug and one thing I'd like clarified:
1.) From mauve-parser.pl, I'm consistently getting output along the lines of "Not fitting character states in chromosome KB667555; at position 266212; G not equal to C". However, the script seems to run to completion, and the downstream output of calculate-dxy.pl is sensible and includes KB667555. Can you give me any more information about this warning so I can determine how to fix my pipeline?
2.) The output of calculate-dxy.pl has six columns: the chromosome, the position, three columns for distance, and a sixth with numbers ranging from 0.0 to 29.67. All the rows where the distance columns are "na" have either 0.0 or 0.01 in this sixth column, and all the rows where the distance columns have a value have at least 0.60 in this column, so I thought it might be the fraction covered (I did not change the default min-covered-fraction); however, the maximum is well over 1. The help for calculate-dxy.pl only documents five columns. Is this sixth column supposed to be there, is it sensible, and if so, what is it?
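In case it helps to reproduce the pattern described in point 2, here is a minimal sketch of how the sixth column could be summarized separately for "na" rows and rows with distance values. It assumes the output is tab-separated with the columns in the order listed above; the function name `summarize_sixth_column` and the example rows are made up for illustration and are not taken from the real output.

```python
import csv
import io


def summarize_sixth_column(lines):
    """Return the (min, max) of column six, split by whether the
    three distance columns (columns 3-5) are all "na".

    Assumes tab-separated rows: chromosome, position, three distance
    columns, then the undocumented sixth column.
    """
    groups = {"na": [], "value": []}
    for row in csv.reader(lines, delimiter="\t"):
        if len(row) < 6:
            continue  # skip blank or truncated lines
        distances = row[2:5]
        key = "na" if all(d == "na" for d in distances) else "value"
        groups[key].append(float(row[5]))
    # Only report groups that actually occurred in the input.
    return {k: (min(v), max(v)) for k, v in groups.items() if v}


# Made-up rows for illustration only; real rows come from calculate-dxy.pl.
example = io.StringIO(
    "KB667555\t500\tna\tna\tna\t0.01\n"
    "KB667555\t1500\tna\tna\tna\t0.0\n"
    "KB667555\t2500\t0.012\t0.034\t0.029\t0.60\n"
    "KB667555\t3500\t0.015\t0.031\t0.027\t29.67\n"
)
print(summarize_sixth_column(example))
```

With real output, an open file handle can be passed in place of the `io.StringIO` object.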
Thanks.
To follow up:
As often seems to happen, once I submitted this ticket I discovered that I had confused the reference and the outgroup in the input to mauve-parser.pl. After correcting that mistake, the sixth column of calculate-dxy.pl has values of 0.0 and 0.6-1.0. Can you confirm that this is indeed coverage? And if so, how is it calculated? The calculate-dxy.pl script refers to "the minimum fraction of a window being between min-coverage and max-coverage in ALL populations," which doesn't seem to apply here.
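To make the quoted definition concrete, here is a minimal sketch of one literal reading of it: the fraction of positions in a window at which every population's coverage lies between the minimum and maximum. This is only an interpretation of the help text, not the script's actual code; the function name `covered_fraction`, the inclusive bounds, and the example numbers are all assumptions.

```python
def covered_fraction(site_coverages, min_coverage, max_coverage):
    """Fraction of positions in a window where ALL populations have
    coverage within [min_coverage, max_coverage].

    site_coverages: one tuple per position, holding one coverage value
    per population. Inclusive bounds are an assumption.
    """
    if not site_coverages:
        return 0.0
    usable = sum(
        1
        for site in site_coverages
        if all(min_coverage <= c <= max_coverage for c in site)
    )
    return usable / len(site_coverages)


# Hypothetical window of five positions in two populations.
window = [(10, 12), (3, 15), (20, 18), (50, 9), (11, 11)]
print(covered_fraction(window, min_coverage=4, max_coverage=30))  # 0.6
```

A fraction defined this way can never exceed 1, which would fit the 0.0 and 0.6-1.0 values seen after the correction but not the earlier maximum of 29.67.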
Last edit: rrlove 2015-12-02