From: Brad K. <bra...@ki...> - 2006-07-18 15:37:28
|
Hello, all:

I've completed the changes to make netlib thread safe along with all vnl calls to it. Details of the new v3p_netlib library are covered in v3p/netlib/README.

All non-constant statics have been removed. Previously static data that had to be persistent across calls to netlib functions (like the iterations of lbfgs) has been replaced by working-space struct arguments allocated on the stack in vnl code before calling. A few one-time-initialized statics are set up at program load time by static initialization (when there is only one thread). The activator pattern in vnl has been replaced by userdata arguments for the callbacks.

I'll still watch the dashboard for a few more days in case there are some problems with today's last few changes. After that I think we can declare vnl thread safe. The real test will come when I merge the new vnl into ITK and its multi-threaded filters use vnl/netlib algorithms.

-Brad
|
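For readers unfamiliar with the two techniques mentioned above (working-space structs in place of statics, and userdata arguments in place of a global "active object"), here is a minimal sketch in C. It is not the v3p_netlib code: every name in it (demo_workspace, demo_minimize_step, and so on) is hypothetical, and the trivial gradient-descent step only stands in for a real routine such as lbfgs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical working-space struct: holds the state a routine would
   otherwise keep in function-level statics between calls. */
typedef struct {
  int iteration;   /* number of steps taken so far */
  double step;     /* fixed step size used by every step */
} demo_workspace;

/* Hypothetical callback type: the caller's context travels in 'userdata'
   instead of being found through a global pointer. */
typedef double (*demo_gradient_fn)(double x, void* userdata);

void demo_workspace_init(demo_workspace* ws, double step)
{
  assert(ws != NULL);
  ws->iteration = 0;
  ws->step = step;
}

/* One descent step. All persistent state lives in *ws, so two threads
   using two different workspaces never share data. */
double demo_minimize_step(demo_workspace* ws, double x,
                          demo_gradient_fn grad, void* userdata)
{
  assert(ws != NULL && grad != NULL);
  ws->iteration += 1;
  return x - ws->step * grad(x, userdata);
}

/* Example callback: gradient of (x - c)^2, with c supplied via userdata. */
double demo_quadratic_gradient(double x, void* userdata)
{
  const double* center = (const double*)userdata;
  return 2.0 * (x - *center);
}

/* Caller side: the workspace is allocated on the stack before calling. */
double demo_minimize(double x0, double center, int steps)
{
  demo_workspace ws;
  double x = x0;
  int i;
  demo_workspace_init(&ws, 0.25);
  for (i = 0; i < steps; ++i)
    x = demo_minimize_step(&ws, x, demo_quadratic_gradient, &center);
  return x;
}
```

Because neither the workspace nor the callback context is global, concurrent calls to demo_minimize from different threads do not interfere with each other, which is the property the change described above is after.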
From: Ian S. <ian...@st...> - 2006-07-18 15:47:41
|
Brad Thank-you very much for putting this huge effort into netlib. We will all benefit from the fixing of one of VXL's major weaknesses. Ian. |
From: Brendan M. <mc...@cs...> - 2006-07-18 21:56:49
|
Awesome. Thanks Brad. I'll start organising a new release in a few days after things settle down. -- Cheers, Brendan. ---------------------------------------------------------------------------- Brendan McCane Email: mc...@cs... Department of Computer Science Phone: +64 3 479 8588/8578. University of Otago Fax: +64 3 479 8529 Box 56, Dunedin, New Zealand. |
From: Brad K. <bra...@ki...> - 2006-07-28 13:19:54
|
Brendan McCane wrote: > Awesome. Thanks Brad. I'll start organising a new release in a few days > after things settle down. I've just committed a few more fixes discovered when I updated vxl in ITK. We may want to wait another day to see these fixes on the vxl dashboard. -Brad |
From: David C. <dav...@ma...> - 2006-07-31 10:52:55
|
Hi Brad/Brendan,

I've been having a few problems with the new "v3p_netlib" libraries using MS Visual Studio 2003 under WinXP. The debug version works fine and causes no problems. However, when creating release binaries a few strange things happen.

1. Some of the "v3p_netlib" C files take ages to compile (i.e. compared to the debug time). Specifically the following files:

   zlarfx.c
   zgemm.c
   dhgeqz.c
   ztrmm.c
   hqr2.c

2. When running any release application which accesses the v3p_netlib library (statically compiled), for example "netlib_slamch_test", the DOS terminal window appears, but output is only produced in it about 5 minutes later.

   The debug version of "netlib_slamch_test" has no problems; it starts producing output immediately.

My current workaround is to build the entire release version of "v3p_netlib" using the "Disabled (/Od)" optimisation option (set manually in VS). This speeds up the compilation of the "v3p_netlib" library and allows release binaries to be built without the 5 minute start delay, but presumably the resulting code is slower.

I think this needs fixing before any new VXL release. I'm not sure how to go about fixing this myself :-).

Regards,
David

ps I've made sure my VXL code is up to date this morning, and I observed the same behaviour on a fresh VXL build on my home PC over the weekend.

ps2 I've noticed that there is a VXL "WinXP_msvc-7.1_Debug" build but not a "WinXP_msvc-7.1_Release" build on the VXL dashboard. However, the above problem would not necessarily be noticed by the dashboard, as the compilation and test would complete eventually.

ps3 In the CMakeLists.txt file within "v3p_netlib", optimisation is switched off for some individual files when using gcc. Maybe some exceptions need to be added for the VS compiler too. I've tried disabling optimisation for some files, but can currently only remove the time delay by disabling optimisation for all of v3p_netlib.

--
Dr David Cristinacce
Research Associate
Imaging Science and Biomedical Engineering,
University of Manchester, M13 9PT, UK.
dav...@ma...
http://mimban.smb.man.ac.uk/
tel: +44 161 275 6871
fax: +44 161 275 5145
|
From: Brad K. <bra...@ki...> - 2006-07-31 15:07:45
|
David Cristinacce wrote:
> 1. Some of the "v3p_netlib" c files take ages to compile (ie compared to
> the debug time). Specifically the following files:-
>
> zlarfx.c
> zgemm.c
> dhgeqz.c
> ztrmm.c
> hqr2.c

This is probably because the optimizer is spending a lot of time on these files. Many of these functions have been in vxl for a long time in the old netlib library. They've just been re-converted from Fortran and moved into the new library. Did this happen before?

> 2. When running any release applications which access the v3p_netlib
> library (statically compiled), for example "netlib_slamch_test". The dos
> terminal window appears, but only about 5 mins later is any output
> produced in the terminal window.
>
> The debug version of "netlib_slamch_test" has no problems, it starts
> producing output immediately.
>
> My current work around is to build the entire release version of
> "v3p_netlib" using the "Disabled (/Od)" optimisation option (set
> manually in VS). This speeds up the compilation of the "v3p_netlib"
> libarary and allows release binaries to be built without the 5 minute
> start delay, but presumably the resulting code is slower.

The slamch.c and dlamch.c files have had optimization disabled for GCC because they seemed to be getting into infinite loops. I bet the loops were not actually infinite but just very long, and the VS optimizer is making the same mistake.

Try adding this at the top of those files for your build:

#ifdef _MSC_VER
# pragma optimize ("",off)
#endif

If that solves the problem then we can commit the fix. However, in that case two compilers will have made the same optimization mistake, so it would be interesting to try to figure out why.

Thanks,
-Brad
|
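To show where such a pragma sits relative to the code it affects, here is a small sketch. The function demo_machine_epsilon is a hypothetical stand-in for the kind of precision-probing loop found in slamch.c and dlamch.c, not the actual netlib code, and the volatile qualifier is an addition of this sketch (it forces every sum to be rounded to double), not something proposed in the thread. The closing `#pragma optimize ("", on)` restores the optimization settings given on the command line, so only the functions between the two pragmas are affected.

```c
#include <assert.h>
#include <float.h>

/* MSVC only: switch off all optimizations for the functions that follow. */
#ifdef _MSC_VER
# pragma optimize ("", off)
#endif

/* Hypothetical stand-in for a lamch-style loop: halve eps until adding
   eps/2 to 1 no longer changes the stored result. */
double demo_machine_epsilon(void)
{
  volatile double sum;  /* stored after every addition, so rounded to double */
  double eps = 1.0;
  for (;;) {
    sum = 1.0 + eps / 2.0;
    if (sum <= 1.0)
      break;
    eps /= 2.0;
  }
  assert(eps > 0.0);
  return eps;
}

/* MSVC only: restore the optimization settings from the command line. */
#ifdef _MSC_VER
# pragma optimize ("", on)
#endif
```

On an IEEE 754 double implementation this loop stops at the same value as DBL_EPSILON from <float.h>. Whether the slow start-up reported above comes from a loop of this shape is not established in this thread; the sketch only illustrates the pragma placement.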
From: Ian S. <ian...@st...> - 2006-07-31 16:40:09
|
Brad King wrote:
> The slamch.c and dlamch.c files have had optimization disabled for GCC
> because they seemed to be getting into infinite loops. I bet the loops
> were not actually infinite but just very long, and the VS optimizer is
> making the same mistake.

The optimisation disabling doesn't appear to have worked on the ISBE dashboard testing machine. "vnl_algo_test_all test_algo" is taking a very long time to finish, looped in dlamch.o

It appears that "-O" has been added to the build line.

clue_rbt@paine:~/dart_client/vxl_cont/bin/v3p/netlib> rm dlamch.o
clue_rbt@paine:~/dart_client/vxl_cont/bin/v3p/netlib> make -n dlamch.o
echo "Building object file dlamch.o..."
gcc -o dlamch.o -Dv3p_netlib_EXPORTS -O0 -g -O -fPIC
 -I/home/clue_rbt/dart_client/vxl_cont/bin/vcl
 -I/home/clue_rbt/dart_client/vxl_cont/vxl/vcl
 -I/home/clue_rbt/dart_client/vxl_cont/bin/core
 -I/home/clue_rbt/dart_client/vxl_cont/vxl/core
 -I/home/clue_rbt/dart_client/vxl_cont/vxl/v3p/netlib -DDART_BUILD
 -DVXL_WARN_DEPRECATED -DVXL_WARN_DEPRECATED_ONCE -DV3P_NETLIB_SRC -c
 /home/clue_rbt/dart_client/vxl_cont/vxl/v3p/netlib/blas/dlamch.c

Manually compiling dlamch without the extra -O allows vnl_algo_test_all test_algo to run quickly.

gcc -o dlamch.o -Dv3p_netlib_EXPORTS -O0 -g -fPIC
 -I/home/clue_rbt/dart_client/vxl_cont/bin/vcl
 -I/home/clue_rbt/dart_client/vxl_cont/vxl/vcl
 -I/home/clue_rbt/dart_client/vxl_cont/bin/core
 -I/home/clue_rbt/dart_client/vxl_cont/vxl/core
 -I/home/clue_rbt/dart_client/vxl_cont/vxl/v3p/netlib -DDART_BUILD
 -DVXL_WARN_DEPRECATED -DVXL_WARN_DEPRECATED_ONCE -DV3P_NETLIB_SRC -c
 /home/clue_rbt/dart_client/vxl_cont/vxl/v3p/netlib/blas/dlamch.c

The relevant settings from my CMakeCache are

CMAKE_CXX_FLAGS:STRING=-g -O2
CMAKE_C_FLAGS:STRING=-g -O

It seems that CMake (v2.0.2 and others) is adding the "-O0" command in the wrong place, and that rather than overriding the default "-O", the default "-O" is overriding the explicit "-O0". I appear to have come across this behaviour before judging by the comment on line 296 of an older version of netlib's CMakeLists.txt file
http://vxl.cvs.sourceforge.net/vxl/vxl/v3p/netlib/CMakeLists.txt?annotate=1.31

Is this something I am doing wrong? I assume you have got this to work.

Ian.
|
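The behaviour described above matches GCC's documented rule that when several -O options are given, the last one on the command line is the one that takes effect, so "-O0 -g -O" compiles at -O. One way to confirm from inside a source file which case applies is GCC's predefined __OPTIMIZE__ macro, which is defined in all optimizing compilations. The helper names below are hypothetical and are not part of v3p_netlib; this is only a diagnostic sketch.

```c
#include <assert.h>

/* Returns 1 if this file was compiled by GCC with optimization enabled
   (any level above -O0), and 0 otherwise. */
int demo_built_with_optimization(void)
{
#if defined(__GNUC__) && defined(__OPTIMIZE__)
  return 1;
#else
  return 0;
#endif
}

/* Hypothetical guard for a file that must be built unoptimized: aborts
   in a debug build if a later -O flag overrode the intended -O0. */
void demo_require_unoptimized(void)
{
  assert(!demo_built_with_optimization());
}
```

Calling demo_built_with_optimization() from a test driver linked against the object file would report whether the explicit "-O0" actually survived the rest of the command line.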
From: Brad K. <bra...@ki...> - 2006-07-31 17:37:50
|
Ian Scott wrote: > Is this something I am doing wrong? I assume you have got this to work. It works with CMake 2.4 and I forgot to test it with earlier versions. I'll look at it when I get a chance. -Brad |
From: Brad K. <bra...@ki...> - 2006-08-01 17:57:28
|
Brad King wrote: > It works with CMake 2.4 and I forgot to test it with earlier versions. > I'll look at it when I get a chance. I've just added code to simply remove all GCC optimization flags from CMAKE_C_FLAGS* variables for CMake versions lower than 2.4. That will make sure the -O0 works. -Brad |
From: Brad K. <bra...@ki...> - 2006-07-31 17:41:26
|
David Cristinacce wrote:
> Hi Brad,
>
> I'm not sure how long "v3p_netlib" took to compile before. In the past
> I've just compiled it alongside the rest of VXL and not paid much
> attention. Maybe optimising the code just takes time.
>
> However, I think your fix for "slamch.c and "dlamch.c" has worked! :-)
>
> I've added the pragma option for these two files and then recompiled
> "v3p_netlib". This has removed the 5 minute delay in starting release
> applications using v3p_netlib. Do you want me to commit this fix for
> these two files now?
>
> The only other thing to note is that "zlarfx.c" takes about 10mins to
> compile with optimisation switched on and 2 seconds without optimisation
> (3Ghz WinXP machine). Do you think the pragma switch should be added to
> this file also?! Or is the optimisation worth the wait at compile time?

Yes, please commit these optimization "fixes" for all files in which you find them useful. However you may not actually need to disable all optimizations. See here for disabling each kind of optimization:

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/vclang/html/_predir_optimize.asp

Since slamch.c and dlamch.c do some numerical computation in a loop, the "Improve floating point consistency" optimization catches my eye and looks dangerous. Try just this:

#pragma optimize ("p", off)

instead of all optimizations. If that doesn't work please try the other types one at a time.

Thanks,
-Brad
|
From: David C. <dav...@ma...> - 2006-08-01 08:45:19
|
Hi Brad,

The optimisation which is causing the start delay problems with msvc is this one:

option g = Enable global optimizations

Therefore I've added the following lines:

#ifdef _MSC_VER
# pragma optimize ("g",off)
#endif

to the following files:

slamch.c
dlamch.c
zlarfx.c

and committed them to the repository. I think it's fixed now.

Thanks for your help!

David

--
Dr David Cristinacce
Research Associate
Imaging Science and Biomedical Engineering,
University of Manchester, M13 9PT, UK.
dav...@ma...
http://mimban.smb.man.ac.uk/
tel: +44 161 275 6871
fax: +44 161 275 5145
|
From: David C. <dav...@ma...> - 2006-07-31 15:54:05
|
Hi Brad,

I'm not sure how long "v3p_netlib" took to compile before. In the past I've just compiled it alongside the rest of VXL and not paid much attention. Maybe optimising the code just takes time.

However, I think your fix for "slamch.c" and "dlamch.c" has worked! :-)

I've added the pragma option for these two files and then recompiled "v3p_netlib". This has removed the 5 minute delay in starting release applications using v3p_netlib. Do you want me to commit this fix for these two files now?

The only other thing to note is that "zlarfx.c" takes about 10 minutes to compile with optimisation switched on and 2 seconds without optimisation (3 GHz WinXP machine). Do you think the pragma switch should be added to this file also? Or is the optimisation worth the wait at compile time?

Thanks for the quick and helpful response,
David

--
Dr David Cristinacce
Research Associate
Imaging Science and Biomedical Engineering,
University of Manchester, M13 9PT, UK.
dav...@ma...
http://mimban.smb.man.ac.uk/
tel: +44 161 275 6871
fax: +44 161 275 5145
|