#4 Improved whirlpool hash performance

Milestone: None
Status: closed-fixed
Owner: nobody
Labels: None
Priority: 5
Updated: 2014-08-04
Created: 2014-07-08
Creator: And Sch
Private: No

This patch lets gcc optimize the Whirlpool hash compression function better and increases speed noticeably. Try it out. I attached only the new rhash_whirlpool_process_block function, since that is all that has changed.

I also created an even faster version, but it requires a conversion to little endian and compiling with -funroll-loops, which isn't that safe.

100 MB file benchmark
old version: 0.895s
new version: 0.730s
little endian -funroll-loops (not posted): 0.662s
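
To put the timings above in relative terms, here is a minimal sketch in C. The function names are hypothetical and are not part of RHash; the only inputs assumed are the wall-clock times quoted in the benchmark.

```c
#include <assert.h>

/* Percentage by which the run time shrinks when going from
 * old_seconds to new_seconds. Hypothetical helper, not RHash code. */
double time_reduction_percent(double old_seconds, double new_seconds)
{
    assert(old_seconds > 0.0);
    return (old_seconds - new_seconds) / old_seconds * 100.0;
}

/* Speedup factor: how many times faster the new version runs. */
double speedup_factor(double old_seconds, double new_seconds)
{
    assert(new_seconds > 0.0);
    return old_seconds / new_seconds;
}
```

Called with 0.895 and 0.730, these give a run-time reduction of about 18% (a speedup factor of roughly 1.23). With 0.895 and 0.662 the reduction is about 26%.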

1 Attachments

Discussion

  • Aleksey - 2014-07-08

    Thanks for the patch. I'll check it later, when I have time :)

     
  • And Sch - 2014-07-23

    Here is a newer version that is a bit faster and uses one fewer variable. I also recommend increasing rhash's buffer size for slightly better performance on all hashes.

     
  • And Sch - 2014-08-03

    I managed to squeeze a bit more performance out of it. I think this is the most that can be done short of assembly, which I do have, but it's not portable.

     
  • Aleksey - 2014-08-03

    Running several benchmarks on a Core i7 under Win7 produced the following results.

    Compilers used: MinGW 32/64-bit, and 64-bit MS VC 2013.
    Notation: "orig" - the current version from git;
    "optN" - the N-th optimization from this thread (see above);
    "openssl" - the OpenSSL Whirlpool implementation (used for comparison).

    gcc (mingw) 4.7.0 x64
    orig: WHIRLPOOL 256 MiB total in 2,199 sec, 116,429 MBps, CPB=26,84
    opt1: WHIRLPOOL 256 MiB total in 1,820 sec, 140,629 MBps, CPB=22,21
    opt2: WHIRLPOOL 256 MiB total in 1,812 sec, 141,289 MBps, CPB=22,17
    opt3: WHIRLPOOL 256 MiB total in 1,791 sec, 142,911 MBps, CPB=21,85
    openssl: WHIRLPOOL 256 MiB total in 1,944 sec, 131,659 MBps, CPB=23,41

    gcc (mingw) 4.8.1 32-bit
    orig: WHIRLPOOL 256 MiB total in 4,636 sec, 55,223 MBps, CPB=56,53
    opt1: WHIRLPOOL 256 MiB total in 4,230 sec, 60,525 MBps, CPB=51,64
    opt2: WHIRLPOOL 256 MiB total in 4,176 sec, 61,297 MBps, CPB=51,00
    opt3: WHIRLPOOL 256 MiB total in 3,809 sec, 67,200 MBps, CPB=46,52
    openssl: WHIRLPOOL 256 MiB total in 1,869 sec, 136,970 MBps, CPB=22,52

    msvc 13 x64
    orig: WHIRLPOOL 256 MiB total in 1,885 sec, 135,811 MBps, CPB=22,62
    opt1: WHIRLPOOL 256 MiB total in 1,802 sec, 142,042 MBps, CPB=21,72
    opt2: WHIRLPOOL 256 MiB total in 1,804 sec, 141,868 MBps, CPB=22,06
    opt3: WHIRLPOOL 256 MiB total in 2,325 sec, 110,113 MBps, CPB=28,42
    openssl: WHIRLPOOL 256 MiB total in 1,931 sec, 132,567 MBps, CPB=23,40

    Summary: the 2nd and 3rd optimizations are the best for 32/64-bit GCC,
    while the 3rd one degrades Whirlpool performance under MS VC.
    So it's better to switch to the 2nd optimization.
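
For readers checking the figures above, here is a rough sketch in C of how the columns relate. It assumes that "MBps" in the output means MiB per second and that "CPB" means CPU cycles per byte. Both are readings of the output rather than confirmed details of the benchmark, and the function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Throughput in MiB per second for `bytes` hashed in `seconds`. */
double throughput_mib_per_sec(uint64_t bytes, double seconds)
{
    assert(seconds > 0.0);
    return (double)bytes / (1024.0 * 1024.0) / seconds;
}

/* Clock rate (Hz) implied by a cycles-per-byte figure:
 * total cycles = cycles_per_byte * bytes, spread over `seconds`. */
double implied_clock_hz(uint64_t bytes, double seconds, double cycles_per_byte)
{
    assert(seconds > 0.0);
    return cycles_per_byte * (double)bytes / seconds;
}
```

For the "orig" row under gcc x64 (256 MiB in 2.199 s, CPB 26.84), the first function returns about 116.4 MiB/s, close to the reported value. The second implies a clock of roughly 3.3 GHz, which is plausible for a Core i7. Small differences from the printed numbers are expected because the printed times are rounded.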

     
  • Aleksey - 2014-08-04

    The second optimization was partly incorporated into RHash 1.3.3.

     
  • Aleksey - 2014-08-04
    • status: open --> closed-fixed
     
