Hi, what does the new -band option on shuffle_pads do? Also, is there documentation on any of these anywhere?
It effectively implements Gene Myers' ReAligner algorithm. It started out using a different but similar technique; after finding his paper I tweaked it to use the same scoring function.
The band size is the width of the band used in the dynamic programming alignment. Keeping it small is fast, but it limits how far a pad can move in a single pass.
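To illustrate what a band means in general, here is a minimal sketch of a banded dynamic programming alignment. It is not the shuffle_pads code: it uses plain unit-cost edit distance between two strings rather than the ReAligner scoring function, and the function name `banded_edit_distance` is made up for this example. Only cells within `band` of the main diagonal are filled in, so the work grows with the band width rather than with the full matrix, and any alignment that needs to stray further than `band` from the diagonal is simply not found.

```python
def banded_edit_distance(a, b, band):
    """Unit-cost edit distance using only DP cells with |i - j| <= band.

    Returns float('inf') if no alignment fits inside the band.
    Illustrative only; not the scoring used by shuffle_pads.
    """
    INF = float("inf")
    n, m = len(a), len(b)
    if abs(n - m) > band:
        return INF  # the final cell (n, m) lies outside the band

    # Row 0: aligning the empty prefix of a against b[:j] costs j insertions.
    prev = {j: j for j in range(0, min(m, band) + 1)}

    for i in range(1, n + 1):
        cur = {}
        for j in range(max(0, i - band), min(m, i + band) + 1):
            best = prev.get(j, INF) + 1  # a[i-1] against a gap
            if j > 0:
                # a[i-1] against b[j-1]: match costs 0, mismatch costs 1
                best = min(best, prev.get(j - 1, INF) + (a[i - 1] != b[j - 1]))
                # b[j-1] against a gap
                best = min(best, cur.get(j - 1, INF) + 1)
            cur[j] = best
        prev = cur

    return prev.get(m, INF)
```

For example, `banded_edit_distance("ACGTACGT", "ACGT", 2)` cannot reach the end cell at all because the two strings differ in length by four, while a band of 4 finds the alignment with four gaps. That is the sense in which a narrow band limits how far things can shift in one pass.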
The basic premise is this: take a reading out of the assembly and realign it against the multiple sequence alignment that forms the contig at that point (minus the sequence you're realigning). Repeat this for every reading in the contig. If the net result has fewer disagreements with the consensus than before, then the pass improved something, so start all over again and keep cycling until no further improvement is possible.
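The loop described above can be sketched roughly as follows. This is a simplified illustration under several assumptions, not the actual implementation: every reading is assumed to span the same fixed set of columns, `*` is used as the pad character, no new columns are ever inserted, the alignment is unbanded, and the cost of putting a character in a column is just the number of other readings that disagree with it. All function names are hypothetical.

```python
from collections import Counter

PAD = "*"  # assumed pad character for this sketch


def column_counts(reads, skip):
    """Per-column character counts over every read except index `skip`."""
    return [Counter(r[c] for k, r in enumerate(reads) if k != skip)
            for c in range(len(reads[0]))]


def realign(bases, counts):
    """Re-place `bases` into len(counts) columns, choosing pad positions
    that minimise disagreements with the other reads."""
    n, m = len(bases), len(counts)
    others = sum(counts[0].values()) if m else 0
    INF = float("inf")

    def cost(col, ch):
        return others - counts[col][ch]  # other reads that disagree with ch

    # score[i][j]: best cost of placing the first i bases in the first j columns
    score = [[INF] * (m + 1) for _ in range(n + 1)]
    score[0][0] = 0
    for j in range(1, m + 1):
        for i in range(0, min(n, j) + 1):
            pad = score[i][j - 1] + cost(j - 1, PAD)
            base = score[i - 1][j - 1] + cost(j - 1, bases[i - 1]) if i else INF
            score[i][j] = min(pad, base)

    # Trace back from the last column to recover the padded read.
    out, i = [], n
    for j in range(m, 0, -1):
        if i and score[i][j] == score[i - 1][j - 1] + cost(j - 1, bases[i - 1]):
            out.append(bases[i - 1])
            i -= 1
        else:
            out.append(PAD)
    return "".join(reversed(out))


def disagreements(reads):
    """Total characters that differ from the majority call in their column."""
    return sum(len(col) - Counter(col).most_common(1)[0][1]
               for col in zip(*reads))


def shuffle(reads):
    """Realign every read in turn; repeat whole passes while the score improves."""
    reads = list(reads)
    best = disagreements(reads)
    while True:
        trial = list(reads)
        for k in range(len(trial)):
            bases = trial[k].replace(PAD, "")
            trial[k] = realign(bases, column_counts(trial, k))
        score = disagreements(trial)
        if score >= best:
            return reads  # the last pass did not help; keep the previous state
        reads, best = trial, score
```

Given `["ACG*T", "AC*GT", "ACG*T"]`, which starts with two disagreements, `shuffle` returns three readings with the pad in the same column and no disagreements left. Because each reading is stripped of its pads and aligned from scratch, pads are effectively removed and re-added rather than slid along.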
It's pretty brute force, but does do a good job of improving the alignments and is miles better than the old pad shuffling code. (The very idea of moving pads about is flawed - the optimal alignment may require removing pads or adding new ones.)