GP suggested optimisations

wlangdon
2013-01-21
2013-06-05
  • wlangdon
    2013-01-21

    bt2_io.cpp line 622: remove the loop for(uint32_t i = 0; i < offsLenSampled; i++)
       since it only invokes assert (many times).
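
    As a rough illustration of the point about the assert-only loop, here is a minimal sketch (not Bowtie 2's code; checkOffsets, offs and limit are made-up names) of guarding such a loop so it is only compiled into debug builds. With NDEBUG defined, assert expands to nothing, so the loop body is empty either way; the explicit guard just makes that intent visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for a sanity check over sampled offsets.
// Returns how many loop iterations were actually executed, so the
// cost of the check is visible to the caller.
std::size_t checkOffsets(const std::vector<uint32_t>& offs, uint32_t limit) {
    std::size_t iterations = 0;
#ifndef NDEBUG
    // Debug build: visit every element and assert on it.
    for (std::size_t i = 0; i < offs.size(); i++) {
        assert(offs[i] < limit);
        ++iterations;
    }
#else
    // Release build: the check (and its loop) is compiled out entirely.
    (void)offs;
    (void)limit;
#endif
    return iterations;
}
```

    In a release build the function does no work at all, which is the effect the suggested loop removal has.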

    sa_rescomb.cpp line 50: for(size_t i = 0; i < satup_->offs.size(); i++) {
    The suggestion is simply to remove the loop which sets up needResolving.
    This works in some cases, but perhaps more widely it hints that this loop is expensive and is not
    useful. How often is needResolving > 0? Could it be maintained in step with satup_ rather
    than being counted by this expensive loop?
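
    One way the "maintained in step" idea could look is sketched below. This is only an assumption about the shape of the data: OffsetTable, kUnresolved and set() are hypothetical names, not Bowtie 2's satup_ type. The count is adjusted whenever an element changes state, so reading it is O(1) instead of a scan.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed sentinel marking an offset that still needs resolving.
constexpr uint32_t kUnresolved = 0xffffffffu;

// Hypothetical container that keeps needResolving up to date on every write.
class OffsetTable {
public:
    explicit OffsetTable(std::size_t n)
        : offs_(n, kUnresolved), needResolving_(n) {}

    // Store a value, adjusting the running count only on a state change.
    void set(std::size_t i, uint32_t value) {
        const bool wasUnresolved = (offs_[i] == kUnresolved);
        const bool isUnresolved = (value == kUnresolved);
        if (wasUnresolved && !isUnresolved) --needResolving_;
        if (!wasUnresolved && isUnresolved) ++needResolving_;
        offs_[i] = value;
    }

    // O(1) read of the maintained count.
    std::size_t needResolving() const { return needResolving_; }

    // The loop-based count, kept only as a cross-check.
    std::size_t countByLoop() const {
        std::size_t n = 0;
        for (std::size_t i = 0; i < offs_.size(); i++) {
            if (offs_[i] == kUnresolved) ++n;
        }
        return n;
    }

private:
    std::vector<uint32_t> offs_;
    std::size_t needResolving_;
};
```

    Whether this pays off depends on how often offs is written compared with how often the count is needed.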

    sa_rescomb.cpp line 69: for(size_t j = 0; j < satup_->offs.size(); j++) {
    Again the suggestion is simply to remove the loop, which again works in some cases.
    However, again, perhaps it suggests the code is not as efficient as it might be. We
    have two nested loops (indexes i and j) which scan two lists, one setting found__ and
    the other clearing it. How often is it cleared? Maybe there is a better way to find the
    intersection of the two lists?
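
    For the intersection question, a hash set is one possible alternative to the nested i/j scan. The sketch below is generic and assumes the two lists hold plain uint32_t values; intersect is a made-up helper, not a function in sa_rescomb.cpp. Expected cost is O(|a| + |b|) rather than O(|a| * |b|).

```cpp
#include <cstdint>
#include <unordered_set>
#include <vector>

// Returns the elements of a that also occur in b, preserving a's order.
std::vector<uint32_t> intersect(const std::vector<uint32_t>& a,
                                const std::vector<uint32_t>& b) {
    // Build a lookup set from b once.
    std::unordered_set<uint32_t> inB(b.begin(), b.end());
    std::vector<uint32_t> out;
    for (uint32_t x : a) {
        // Single expected-constant-time membership test per element of a.
        if (inB.count(x) != 0) out.push_back(x);
    }
    return out;
}
```

    For the very short lists that may be typical here, the nested loops could still be faster in practice, so this would need measuring.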

    aligner_swsse_ee_u8.cpp line 707 vh = _mm_max_epu8(vh, vf);
    aligner_swsse_ee_u8.cpp line 766 pvFStore += ROWSTRIDE;
    aligner_swsse_ee_u8.cpp line 772 _mm_store_si128(pvHStore, vh);
    aligner_swsse_ee_u8.cpp line 778 ve = _mm_max_epu8(ve, vh);

    The change at line 707 suggests that vh = _mm_max_epu8(vh, vf); can be removed. I guess replacing it with vmax = vlo; has no effect
    since vmax has already been set to vlo. (Could the compiler spot this and remove the second, redundant assignment?)
    However, I am guessing vh = _mm_max_epu8(vh, vf); can be removed
    because the values in vf are seldom bigger than those in ve (used in the previous line, line 706)
    or vh.
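
    The guess that vf is seldom bigger than vh could be checked by counting. The sketch below is a portable scalar model, not the SSE code itself: Lanes stands in for __m128i, maxEpu8 mimics the lane-wise unsigned 8-bit maximum that _mm_max_epu8 computes, and lanesChanged (a made-up helper) counts the lanes in which vh = max(vh, vf) would actually alter vh.

```cpp
#include <algorithm>
#include <array>
#include <cstddef>
#include <cstdint>

// Scalar stand-in for a 128-bit register holding sixteen unsigned bytes.
using Lanes = std::array<uint8_t, 16>;

// Lane-wise unsigned 8-bit maximum, modelling _mm_max_epu8.
Lanes maxEpu8(const Lanes& a, const Lanes& b) {
    Lanes r{};
    for (std::size_t i = 0; i < r.size(); i++) {
        r[i] = std::max(a[i], b[i]);
    }
    return r;
}

// Number of lanes where vf > vh, i.e. where max(vh, vf) differs from vh.
std::size_t lanesChanged(const Lanes& vh, const Lanes& vf) {
    std::size_t n = 0;
    for (std::size_t i = 0; i < vh.size(); i++) {
        if (vf[i] > vh[i]) ++n;
    }
    return n;
}
```

    If such a counter stayed at zero across a test run, dropping the max would not change the output for those inputs, though that alone would not show it is safe in general.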

    Perhaps removing line 766 compensates for the deletion of line 772.
    (If so perhaps also deleting line 773 would be clearer.)

    Replacing line 772 with vh = _mm_max_epu8(vh, vf); is effectively the
    same as deleting it (since line 772 now simply repeats line 771).
    (Could the compiler spot this and remove the second, redundant
    instruction?)

    Removing line 778, ve = _mm_max_epu8(ve, vh);, suggests that
    pvEStore is not smaller than vh.

    Line numbers refer to the initial release (2.0.0-beta2).
    Bill