#16 opt: mmx mo-comp & misc

Status: closed-accepted
Owner: nobody
Labels: None
Priority: 5
Updated: 2005-04-13
Created: 2005-02-25
Creator: michael_f
Private: No

- Use MMX for motion compensation. Although the code
only uses MMX registers, it requires SSE2, so you need
to #define HAVE_SSE2 to enable it.

- Some other tweaks to mot_comp, mostly to improve
cache locality.

- Clamp the motion compensation output to the
(0,1020) bounds, so only I frames need to be
clipped in frame_decompress.cpp.
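
To illustrate the last point, here is a minimal scalar sketch of clamping inside the compensation pass. It is not the code from the patch: the function name `AddAndClamp` and the use of 16-bit samples are assumptions made for the example. The idea is that the prediction is added and the result clamped to the 0..1020 range in one pass, so inter frames need no separate clipping pass afterwards.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Hypothetical helper, not taken from the patch: adds a motion-compensated
// prediction into the output row and clamps each sample to [0, 1020]
// in the same pass.
inline void AddAndClamp(std::int16_t* out, const std::int16_t* pred, std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
    {
        // Widen to int so the sum cannot overflow 16 bits before clamping.
        const int sum = static_cast<int>(out[i]) + static_cast<int>(pred[i]);
        out[i] = static_cast<std::int16_t>(std::min(1020, std::max(0, sum)));
    }
}
```

The cost is one extra min/max per sample in the inner loop, in exchange for not touching the whole frame again later.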

This includes the contents of my previous patch (making
array rows contiguous in memory).
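
For readers unfamiliar with the layout change, the following sketch shows what "contiguous rows" means in practice. `ContiguousArray2D` and `SumAll` are hypothetical names for this example and are not the project's TwoDArray class; the sketch only assumes that all rows share a single allocation, so consecutive rows sit next to each other in memory and can be walked with a plain pointer.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a 2-D array class; not the project's TwoDArray.
// All rows live in one allocation, so row y starts at data + y * width.
class ContiguousArray2D
{
public:
    ContiguousArray2D(std::size_t height, std::size_t width)
        : m_width(width), m_height(height), m_data(height * width, 0) {}

    // Pointer to the first element of row y; rows are adjacent in memory.
    int* Row(std::size_t y) { return m_data.data() + y * m_width; }
    const int* Row(std::size_t y) const { return m_data.data() + y * m_width; }

    std::size_t Width() const { return m_width; }
    std::size_t Height() const { return m_height; }

private:
    std::size_t m_width;
    std::size_t m_height;
    std::vector<int> m_data;
};

// Sums every element by walking each row with a raw pointer
// rather than indexing element by element.
inline long SumAll(const ContiguousArray2D& a)
{
    long total = 0;
    for (std::size_t y = 0; y < a.Height(); ++y)
    {
        const int* p = a.Row(y);
        const int* const end = p + a.Width();
        for (; p != end; ++p)
            total += *p;
    }
    return total;
}
```

With separately allocated rows, moving from one row to the next can jump to an unrelated address; with a single block the next row begins exactly `Width()` elements later.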

Profiled on an Athlon64 3200 under Windows 2000, built with the VS.NET compiler.

Total:
unopt 49126 samples
opt 33809 samples

CompensateBlock + children:
unopt 14986
opt 3966

CompensateComponent + ClipComponent:
unopt 7545
opt 3098

1 sample ~= 0.1 ms.

Discussion

  • michael_f

    michael_f - 2005-02-25
     
  • michael_f

    michael_f - 2005-03-01

    Additional patch

     
  • michael_f

    michael_f - 2005-03-01


    Notes for additional patch (patch2.txt):
    - Use SSE2 to accumulate and clip motion-compensation output.
    - Includes cache prefetch hints to try to reduce cache
    pollution a little.

    Apply on top of the main patch. As with the main patch, you
    need to define HAVE_SSE2 to get any benefit.

    The routine this optimises isn't a big bottleneck so
    performance improvements are modest.
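
As a rough illustration of the approach described in these notes (this is not the contents of patch2.txt), the sketch below accumulates one buffer into another with SSE2 and clips to the 0..1020 range, with a prefetch hint on the source data. The function name `AccumulateAndClip`, the 16-bit sample type, the prefetch distance of 64 samples and the choice of the NTA hint are all assumptions. The SSE2 path is compiled only when HAVE_SSE2 is defined; otherwise the scalar loop does all the work.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

#ifdef HAVE_SSE2
#include <xmmintrin.h>  // _mm_prefetch, _MM_HINT_NTA
#include <emmintrin.h>  // SSE2 integer intrinsics
#endif

// Hypothetical routine, not the code from patch2.txt: accumulates `src` into
// `dst` and clips each 16-bit sample to [0, 1020].
inline void AccumulateAndClip(std::int16_t* dst, const std::int16_t* src, std::size_t n)
{
    std::size_t i = 0;
#ifdef HAVE_SSE2
    const __m128i lo = _mm_setzero_si128();
    const __m128i hi = _mm_set1_epi16(1020);
    for (; i + 8 <= n; i += 8)
    {
        // Hint that source data further ahead will be read once, to limit
        // cache pollution. The distance of 64 samples is an arbitrary choice.
        if (i + 64 < n)
            _mm_prefetch(reinterpret_cast<const char*>(src + i + 64), _MM_HINT_NTA);

        __m128i d = _mm_loadu_si128(reinterpret_cast<const __m128i*>(dst + i));
        const __m128i s = _mm_loadu_si128(reinterpret_cast<const __m128i*>(src + i));
        d = _mm_adds_epi16(d, s);                     // saturating 16-bit add
        d = _mm_min_epi16(_mm_max_epi16(d, lo), hi);  // clip to [0, 1020]
        _mm_storeu_si128(reinterpret_cast<__m128i*>(dst + i), d);
    }
#endif
    // Scalar path: handles the tail, or everything when HAVE_SSE2 is undefined.
    for (; i < n; ++i)
    {
        const int sum = static_cast<int>(dst[i]) + static_cast<int>(src[i]);
        dst[i] = static_cast<std::int16_t>(std::min(1020, std::max(0, sum)));
    }
}
```

The saturating add gives the same result as the widened scalar sum here because the clip range lies well inside the 16-bit range.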

     
  • Anuradha Suraparaju


    Partially applied this patch.

    The following changes were applied:
    - Tweaks to mot_comp.cpp to improve cache locality, and changes
    to mot_comp.cpp to use pointer arithmetic instead of the index
    operator when using TwoDArray objects.
    Modified file: mot_comp.cpp
    New revision: 1.16; previous revision: 1.15

    The following changes were rejected:
    - Clipping inter frames while performing motion compensation.
    An MMX/SSE2 implementation of Frame::ClipComponent was written
    instead.
    Modified file: frame.cpp
    New revision: 1.10; previous revision: 1.9

    The following changes were postponed:
    - SSE2 implementation of motion compensation. Even though a
    modest decoding speedup is achieved, this change is postponed
    pending investigation into why encoding speed is greatly
    reduced when both the MMX optimisation (in motion estimation)
    and the SSE2 optimisation (in motion compensation) are enabled.

     
  • Anuradha Suraparaju

    • status: open --> closed-accepted
     
