From: Jos v.d.V. <jo...@us...> - 2004-12-27 22:01:15
Update of /cvsroot/win32forth/win32forth/src/lib/fmacro
In directory sc8-pr-cvs1.sourceforge.net:/tmp/cvs-serv10329/src/lib/fmacro

Modified Files:
    mm_fw_fm.f
Log Message:
- Jos: Updated de results since Win32Forth and fmacro.f were updated

Index: mm_fw_fm.f
===================================================================
RCS file: /cvsroot/win32forth/win32forth/src/lib/fmacro/mm_fw_fm.f,v
retrieving revision 1.1
retrieving revision 1.2
diff -C2 -d -r1.1 -r1.2
*** mm_fw_fm.f 21 Dec 2004 00:19:10 -0000 1.1
--- mm_fw_fm.f 27 Dec 2004 22:01:04 -0000 1.2
***************
*** 7,42 ****
  \ * March 17th, 2003 J.v.d.Ven: Changed for the updated fmacro.f
  \ * May 12th, 2003 J.v.d.Ven: Changed DDOT DAXPY do-WARNER() and DO-MAENO.
! \ *
  \ A changed MM benchmark from: http://home.iae.nl/users/mhx/mm.fw
  ((
- 80x80 mm - normal algorithm                       27.30 MFlops,  14.61 ticks/flop, 0.037 s
- 80x80 mm - blocking, factor of 20                 18.45 MFlops,  21.62 ticks/flop, 0.055 s
- 80x80 mm - transposed B matrix                    34.96 MFlops,  11.41 ticks/flop, 0.029 s
- 80x80 mm - Robert's algorithm                     36.67 MFlops,  10.87 ticks/flop, 0.027 s
- 80x80 mm - T. Maeno's algorithm, subarray 20x20   11.72 MFlops,  34.01 ticks/flop, 0.087 s
- 80x80 mm - D. Warner's algorithm, subarray 20x20  13.73 MFlops,  29.05 ticks/flop, 0.074 s
-
-
!
! ALL-TESTS \ Using the orginal code and Win32Forth CLK 400 MHz
! 80x80 mm - normal algorithm                        5.93 MFlops,  67.36 ticks/flop, 0.172 s
! 80x80 mm - blocking, factor of 20                  2.86 MFlops, 139.54 ticks/flop, 0.357 s
! 80x80 mm - transposed B matrix                     5.38 MFlops,  74.25 ticks/flop, 0.190 s
! 80x80 mm - Robert's algorithm                      5.58 MFlops,  71.58 ticks/flop, 0.183 s
! 80x80 mm - T. Maeno's algorithm, subarray 20x20    2.25 MFlops, 177.50 ticks/flop, 0.454 s
! 80x80 mm - D. Warner's algorithm, subarray 20x20   2.37 MFlops, 168.39 ticks/flop, 0.431 s
- ALL-TESTS \ Using Win32Forth, fmacro.f date May 12th, 2003 and FSL-Utilities_1.04 CLK 400 MHz
! 80x80 mm - normal algorithm                       27.00 MFlops,  14.81 ticks/flop, 0.037 s
! 80x80 mm - blocking, factor of 20                 18.30 MFlops,  21.85 ticks/flop, 0.055 s
! 80x80 mm - transposed B matrix                    34.76 MFlops,  11.50 ticks/flop, 0.029 s
! 80x80 mm - Robert's algorithm                     36.44 MFlops,  10.97 ticks/flop, 0.028 s
! 80x80 mm - T. Maeno's algorithm, subarray 20x20   10.96 MFlops,  36.47 ticks/flop, 0.093 s
! 80x80 mm - D. Warner's algorithm, subarray 20x20  12.61 MFlops,  31.70 ticks/flop, 0.081 s
  ))
--- 7,34 ----
  \ * March 17th, 2003 J.v.d.Ven: Changed for the updated fmacro.f
  \ * May 12th, 2003 J.v.d.Ven: Changed DDOT DAXPY do-WARNER() and DO-MAENO.
! \ * December 27th, 2004 J.v.d.Ven: Updated de results since Win32Forth and fmacro.f were updated
  \ A changed MM benchmark from: http://home.iae.nl/users/mhx/mm.fw
  ((
! ALL-TESTS \ Using the orginal code and Win32Forth Version: 6.09 CLK 400 MHz
! 80x80 mm - normal algorithm                        5.36 MFlops,  74.19 ticks/flop, 0.190 s
! 80x80 mm - blocking, factor of 20                  3.46 MFlops, 114.85 ticks/flop, 0.295 s
! 80x80 mm - transposed B matrix                     5.17 MFlops,  76.83 ticks/flop, 0.197 s
! 80x80 mm - Robert's algorithm                      5.04 MFlops,  78.85 ticks/flop, 0.202 s
! 80x80 mm - T. Maeno's algorithm, subarray 20x20    3.19 MFlops, 124.60 ticks/flop, 0.320 s
! 80x80 mm - D. Warner's algorithm, subarray 20x20   2.66 MFlops, 149.06 ticks/flop, 0.383 s
!
! ALL-TESTS \ Using Win32Forth Version: 6.09, fmacro.f date December 27th, 2004 and FSL-Utilities_1.04 CLK 400 MHz
! 80x80 mm - normal algorithm                       48.76 MFlops,   8.20 ticks/flop, 0.020 s
! 80x80 mm - blocking, factor of 20                 19.22 MFlops,  20.80 ticks/flop, 0.053 s
! 80x80 mm - transposed B matrix                    35.08 MFlops,  11.40 ticks/flop, 0.029 s
! 80x80 mm - Robert's algorithm                     36.64 MFlops,  10.91 ticks/flop, 0.027 s
! 80x80 mm - T. Maeno's algorithm, subarray 20x20   12.41 MFlops,  32.21 ticks/flop, 0.082 s
! 80x80 mm - D. Warner's algorithm, subarray 20x20  15.25 MFlops,  26.22 ticks/flop, 0.067 s
  ))