From: Patrick G. <pa...@mn...> - 2011-07-25 16:21:11
|
Hi Patrik, Thank you, it seems to have fixed the issue of extremely long compilation times. I need to do more thorough testing, but the interprocedural optimizations (-ip and -ipo) do not seem to work correctly any longer :-( I will let you know when I have done more testing. Cheers, Patrick On 07/22/2011 05:22 PM, Patrik Jonsson wrote: > Hi Patrick, > > Can you try the latest update? I've now completed the compile-time > selection of evaluation routines for TinyVector and TinyMatrix so it > doesn't instantiate the full evaluation in those cases. I also removed > some cases of "#pragma forceinline recursive" which apparently is not > recognized by v11 but is by v12. With these changes, compile time for > my code was cut in half and for the more complicated test cases from > 18+hours to 2 minutes. Hopefully this will fix your out of memory > error too. > > cheers, > > /Patrik |
From: Paul H. <pph...@gm...> - 2011-07-24 05:21:13
|
Hi Patrik, thanks for the update, I could now successfully compile blitz and my program (ifort 11.1 and ifort 12.0, excluding the last update). I tried to reproduce some of my previous simulations with blitz-0.10 and compare them to blitz-0.9; the results are slightly different, but surprisingly blitz-0.10 gives the better results :D. It seems that rounding errors are better handled with the new version. Do you have an idea why? Here are the graphs: [ Roughly, the graphs show the mode powers (FFT transformation) of the domain for a gyro-kinetic simulation; for blitz-0.10 the fluctuations are less dominant, I will try to find out why ... ] (blitz-0.9 , gcc-4.1, 14961s) http://www.2shared.com/photo/kUQKsMuK/110724_Blitz_blitz09_gcc4_1_Po.html (blitz-0.9 , ifort-11.1, 14567s) http://www.2shared.com/photo/y5NwLfbz/110724_Blitz_blitz09_ifort11_1.html (blitz-0.10, ifort-11.1, 16668s) http://www.2shared.com/photo/g-HHg0tb/110724_Blitz_blitz10_ifort11_1.html (blitz-0.10, ifort-12.0, 17673s) http://www.2shared.com/photo/XTDBctsW/110724_Blitz_blitz10_ifort12_0.html It also seems that the new version is around 10% slower! All programs were compiled with -O3, and for blitz-0.10 I had a simd width of 16. Do you also have some real-world benchmarking of your code? I wonder if I chose some wrong compile parameters. thanks again and best wishes, Paul |
From: Patrik J. <co...@fa...> - 2011-07-22 16:22:25
|
Hi Patrick, Can you try the latest update? I've now completed the compile-time selection of evaluation routines for TinyVector and TinyMatrix so it doesn't instantiate the full evaluation in those cases. I also removed some cases of "#pragma forceinline recursive" which apparently is not recognized by v11 but is by v12. With these changes, compile time for my code was cut in half and for the more complicated test cases from 18+hours to 2 minutes. Hopefully this will fix your out of memory error too. cheers, /Patrik |
From: Patrik J. <co...@fa...> - 2011-07-22 16:20:18
|
On Fri, Jul 22, 2011 at 9:57 AM, Patrick Guio <pa...@mn...> wrote: > > Hi Patrik, > > I am compiling with the latest update (4) of icpc 12 (12.0.4 20110427). > Unfortunately I cannot use v11.1 of icpc any longer as I have the > following compatibility problem with gcc described here > http://origin-software.intel.com/en-us/forums/showthread.php?t=74691&p=2&o=d&s=lr > > I have now compared compilation of one of my simulation with the "old" > separated ET for Array and TinyVectors and the "new" unified one. > I think there are several problems to be addressed with the "new" ET > machinery before it can be accepted. > > Compiling my code with the same (optimised) options using the "old" ET > of blitz takes 2 mins while it takes 25 mins with the "new" one. > As far as I remember it used to compile also in about a few minutes with > icpc 11.0 and 11.1. Yes, there's some interaction between the new code and v12 that makes compilation very slow. It's not simply the new code, because compiling with v11 (which I used when developing it) does not have this problem. My hunch is that it has to do with inlining. Could you try commenting out all lines with '#pragma forceinline' and see how that changes your compilation times? In general, however, I think compilation times will go up, because the "new ET" machinery is more complicated than the old one that TinyVector used to use. > There seems to be a big difference of the counts of loops that can and > can't be vectorized according to the reported diagnostics between old > and new ET: > LOOP WAS VECTORIZED: old 70, new 946 > loop was not vectorized: old 2192, new 4277 > But I am not sure how to interpret that. Any idea? 
Well, there are many loops in the code, and the only one that's vectorized is the inner loop for unit stride stack evaluations, so it makes sense both that the number of loops has gone up and that most of them are not vectorized (because they are not inner loops or use the construct that allows the compiler to vectorize). The count has probably gone up because the new code has more paths depending on the alignment, stride and type of expression, and the vectorized array expressions redirects to TinyVector evaluations, so there are recursive loops. > Running a simple test case with the new version returns different > numbers compared to the old version. On another test case the new ET > crashes while the old ET returns what is expected. > I am looking into this with compiling with debug option but with -g -O2 > takes even longer to compile, over 1.5 hours now and still not finished. All tests in the testsuite pass for me (except storage but that's because the iterators don't work for arrays with non-ascending storage). If you have your own tests that don't pass, please add them to the testsuite. cheers, /Patrik |
From: Patrick G. <pa...@mn...> - 2011-07-22 14:58:33
|
The only place I use TinyMatrix is in another code that I am not able to compile due to these memory issues. On 07/22/2011 03:52 PM, Patrik Jonsson wrote: > By the way, in your tests that take very long to compile, are you by > any chance using TinyMatrix? As of now, there is no code path that > does simplified evaluation of the TinyMatrix class, so I think that's > why the multicomponent test cases (which tests things like Arrays of > TinyMatrix of TinyVector) take so long to compile. > > cheers, > > /Patrik > > On Fri, Jul 22, 2011 at 10:26 AM, Patrik Jonsson > <co...@fa...> wrote: >> On Fri, Jul 22, 2011 at 9:57 AM, Patrick Guio <pa...@mn...> wrote: >>> >>> Hi Patrik, >>> >>> I am compiling with the latest update (4) of icpc 12 (12.0.4 20110427). >>> Unfortunately I cannot use v11.1 of icpc any longer as I have the >>> following compatibility problem with gcc described here >>> http://origin-software.intel.com/en-us/forums/showthread.php?t=74691&p=2&o=d&s=lr >>> >>> I have now compared compilation of one of my simulation with the "old" >>> separated ET for Array and TinyVectors and the "new" unified one. >>> I think there are several problems to be addressed with the "new" ET >>> machinery before it can be accepted. >>> >>> Compiling my code with the same (optimised) options using the "old" ET >>> of blitz takes 2 mins while it takes 25 mins with the "new" one. >>> As far as I remember it used to compile also in about a few minutes with >>> icpc 11.0 and 11.1. >> >> Yes, there's some interaction between the new code and v12 that makes >> compilation very slow. It's not simply the new code, because compiling >> with v11 (which I used when developing it) does not have this problem. >> My hunch is that it has to do with inlining. Could you try commenting >> out all lines with '#pragma forceinline' and see how that changes your >> compilation times? 
>> >> In general, however, I think compilation times will go up, because the >> "new ET" machinery is more complicated than the old one that >> TinyVector used to use. >> >>> There seems to be a big difference of the counts of loops that can and >>> can't be vectorized according to the reported diagnostics between old >>> and new ET: >>> LOOP WAS VECTORIZED: old 70, new 946 >>> loop was not vectorized: old 2192, new 4277 >>> But I am not sure how to interpret that. Any idea? >> >> Well, there are many loops in the code, and the only one that's >> vectorized is the inner loop for unit stride stack evaluations, so it >> makes sense both that the number of loops has gone up and that most of >> them are not vectorized (because they are not inner loops or use the >> construct that allows the compiler to vectorize). >> >> The count has probably gone up because the new code has more paths >> depending on the alignment, stride and type of expression, and the >> vectorized array expressions redirects to TinyVector evaluations, so >> there are recursive loops. >> >>> Running a simple test case with the new version returns different >>> numbers compared to the old version. On another test case the new ET >>> crashes while the old ET returns what is expected. >>> I am looking into this with compiling with debug option but with -g -O2 >>> takes even longer to compile, over 1.5 hours now and still not finished. >> >> All tests in the testsuite pass for me (except storage but that's >> because the iterators don't work for arrays with non-ascending >> storage). If you have your own tests that don't pass, please add them >> to the testsuite. >> >> cheers, >> >> /Patrik >> |
From: Patrik J. <co...@fa...> - 2011-07-22 14:58:25
|
By the way, in your tests that take very long to compile, are you by any chance using TinyMatrix? As of now, there is no code path that does simplified evaluation of the TinyMatrix class, so I think that's why the multicomponent test cases (which tests things like Arrays of TinyMatrix of TinyVector) take so long to compile. cheers, /Patrik On Fri, Jul 22, 2011 at 10:26 AM, Patrik Jonsson <co...@fa...> wrote: > On Fri, Jul 22, 2011 at 9:57 AM, Patrick Guio <pa...@mn...> wrote: >> >> Hi Patrik, >> >> I am compiling with the latest update (4) of icpc 12 (12.0.4 20110427). >> Unfortunately I cannot use v11.1 of icpc any longer as I have the >> following compatibility problem with gcc described here >> http://origin-software.intel.com/en-us/forums/showthread.php?t=74691&p=2&o=d&s=lr >> >> I have now compared compilation of one of my simulation with the "old" >> separated ET for Array and TinyVectors and the "new" unified one. >> I think there are several problems to be addressed with the "new" ET >> machinery before it can be accepted. >> >> Compiling my code with the same (optimised) options using the "old" ET >> of blitz takes 2 mins while it takes 25 mins with the "new" one. >> As far as I remember it used to compile also in about a few minutes with >> icpc 11.0 and 11.1. > > Yes, there's some interaction between the new code and v12 that makes > compilation very slow. It's not simply the new code, because compiling > with v11 (which I used when developing it) does not have this problem. > My hunch is that it has to do with inlining. Could you try commenting > out all lines with '#pragma forceinline' and see how that changes your > compilation times? > > In general, however, I think compilation times will go up, because the > "new ET" machinery is more complicated than the old one that > TinyVector used to use. 
> >> There seems to be a big difference of the counts of loops that can and >> can't be vectorized according to the reported diagnostics between old >> and new ET: >> LOOP WAS VECTORIZED: old 70, new 946 >> loop was not vectorized: old 2192, new 4277 >> But I am not sure how to interpret that. Any idea? > > Well, there are many loops in the code, and the only one that's > vectorized is the inner loop for unit stride stack evaluations, so it > makes sense both that the number of loops has gone up and that most of > them are not vectorized (because they are not inner loops or use the > construct that allows the compiler to vectorize). > > The count has probably gone up because the new code has more paths > depending on the alignment, stride and type of expression, and the > vectorized array expressions redirects to TinyVector evaluations, so > there are recursive loops. > >> Running a simple test case with the new version returns different >> numbers compared to the old version. On another test case the new ET >> crashes while the old ET returns what is expected. >> I am looking into this with compiling with debug option but with -g -O2 >> takes even longer to compile, over 1.5 hours now and still not finished. > > All tests in the testsuite pass for me (except storage but that's > because the iterators don't work for arrays with non-ascending > storage). If you have your own tests that don't pass, please add them > to the testsuite. > > cheers, > > /Patrik > |
From: Patrick G. <pa...@mn...> - 2011-07-22 14:56:14
|
On 07/22/2011 03:26 PM, Patrik Jonsson wrote: > On Fri, Jul 22, 2011 at 9:57 AM, Patrick Guio <pa...@mn...> wrote: >> Hi Patrik, >> >> I am compiling with the latest update (4) of icpc 12 (12.0.4 20110427). >> Unfortunately I cannot use v11.1 of icpc any longer as I have the >> following compatibility problem with gcc described here >> http://origin-software.intel.com/en-us/forums/showthread.php?t=74691&p=2&o=d&s=lr >> >> I have now compared compilation of one of my simulation with the "old" >> separated ET for Array and TinyVectors and the "new" unified one. >> I think there are several problems to be addressed with the "new" ET >> machinery before it can be accepted. >> >> Compiling my code with the same (optimised) options using the "old" ET >> of blitz takes 2 mins while it takes 25 mins with the "new" one. >> As far as I remember it used to compile also in about a few minutes with >> icpc 11.0 and 11.1. > > Yes, there's some interaction between the new code and v12 that makes > compilation very slow. It's not simply the new code, because compiling > with v11 (which I used when developing it) does not have this problem. > My hunch is that it has to do with inlining. Could you try commenting > out all lines with '#pragma forceinline' and see how that changes your > compilation times? Ok then you should perhaps contact intel about that problem? Or perhaps you have already done it? What about GNU g++? Is there also a large difference in compilation time between the two versions? > In general, however, I think compilation times will go up, because the > "new ET" machinery is more complicated than the old one that > TinyVector used to use. I am not sure what other people mean but I don't think it is acceptable to have a compilation time 10 x larger? Note that also after 2 hours the compilation I started with -g -O exits with the same fatal error concerning memory problem! So it is difficult to sort out why the code is crashing... 
> >> There seems to be a big difference of the counts of loops that can and >> can't be vectorized according to the reported diagnostics between old >> and new ET: >> LOOP WAS VECTORIZED: old 70, new 946 >> loop was not vectorized: old 2192, new 4277 >> But I am not sure how to interpret that. Any idea? > > Well, there are many loops in the code, and the only one that's > vectorized is the inner loop for unit stride stack evaluations, so it > makes sense both that the number of loops has gone up and that most of > them are not vectorized (because they are not inner loops or use the > construct that allows the compiler to vectorize). > > The count has probably gone up because the new code has more paths > depending on the alignment, stride and type of expression, and the > vectorized array expressions redirects to TinyVector evaluations, so > there are recursive loops. > >> Running a simple test case with the new version returns different >> numbers compared to the old version. On another test case the new ET >> crashes while the old ET returns what is expected. >> I am looking into this with compiling with debug option but with -g -O2 >> takes even longer to compile, over 1.5 hours now and still not finished. > > All tests in the testsuite pass for me (except storage but that's > because the iterators don't work for arrays with non-ascending > storage). If you have your own tests that don't pass, please add them > to the testsuite. Sorry what I meant by simple test case is a run of my simulation code in a simple configuration which I can compare with published results, and they differ in the new version with full optimization while it works fine with the old one. I am not sure really what to do Cheers, Patrick > cheers, > > /Patrik |
From: Patrick G. <pa...@mn...> - 2011-07-22 13:57:29
|
Hi Patrik, I am compiling with the latest update (4) of icpc 12 (12.0.4 20110427). Unfortunately I cannot use v11.1 of icpc any longer as I have the following compatibility problem with gcc described here http://origin-software.intel.com/en-us/forums/showthread.php?t=74691&p=2&o=d&s=lr I have now compared compilation of one of my simulations with the "old" separated ET for Array and TinyVectors and the "new" unified one. I think there are several problems to be addressed with the "new" ET machinery before it can be accepted. Compiling my code with the same (optimised) options using the "old" ET of blitz takes 2 mins while it takes 25 mins with the "new" one. As far as I remember it also used to compile in a few minutes with icpc 11.0 and 11.1. There seems to be a big difference in the counts of loops that can and can't be vectorized according to the reported diagnostics between old and new ET: LOOP WAS VECTORIZED: old 70, new 946 loop was not vectorized: old 2192, new 4277 But I am not sure how to interpret that. Any idea? Running a simple test case with the new version returns different numbers compared to the old version. On another test case the new ET crashes while the old ET returns what is expected. I am looking into this by compiling with the debug option, but with -g -O2 it takes even longer to compile, over 1.5 hours now and still not finished. As I reported previously, another code which used to compile now exits with a fatal compilation error: Out of memory asking for 8200. Does anyone have similar experience? Patrik, do you have any idea how to proceed? Cheers, Patrick On 07/22/2011 01:06 AM, Patrik Jonsson wrote: > Hmm, this wouldn't be with icpc 12, would it? I've noticed that 12 > seems a lot slower than 11.1, to the point that the "multicomponent" > test case which takes 30s to compile w v11, had not completed after 18 > hours(!) on v12. 
> > But yes, I think the reason it takes longer is that currently it still > instantiates the full expression evaluation even for tinyvector-only > expressions that don't use it. This should be an easy improvement. > > /P. > > On Thu, Jul 21, 2011 at 6:02 PM, Patrick Guio <pa...@mn...> wrote: >> >> Ok using a constructor fix it but I have another problem. I got the >> following message when compiling >> >> Fatal compilation error: Out of memory asking for 8200. >> xiar: error #10014: problem during multi-file optimization compilation >> (code 1) >> xiar: error #10014: problem during multi-file optimization compilation >> (code 1) >> >> and the code used to compile earlier with the same options ("-xSSE4.2 >> -ansi -std=c++0x -O3 -ipo -restrict -vec-report1 -no-prec-div >> -no-ansi-alias"). I also noticed that the compilation time is much >> longer. Could there be more overhead with the new machinery? >> >> Best, >> Patrick >> >> On 07/21/2011 10:04 PM, Patrik Jonsson wrote: >>> On Thu, Jul 21, 2011 at 4:59 PM, Patrick Guio <pa...@mn...> wrote: >>>> >>>> Hi again Patrik, >>>> >>>> I have another problem with another code, it seems that the last >>>> constructor does not compile any longer >>>> >>>> blitz::Array<double,3> F(10,10,10); >>>> blitz::TinyVector<int, 3> size(F.shape()); >>>> blitz::Array<double,3> F1(size+1); >>>> >>>> Any idea? >>> >>> because size+1 is an expression and not a tinyvector, probably. >>> >>> you'll have to use TinyVector<int,3>(size+1) to evaluate it, just like >>> has already been necessary for arrays for a long time. the alternative >>> is to add versions of the Array constructor taking expressions, but >>> that'll be messy... especially since those expressions also can be >>> array expressions. >>> >>> /P. >> >> |
From: Patrick G. <pa...@mn...> - 2011-07-19 15:31:29
|
Dear all, On 07/13/2011 02:59 PM, Patrik Jonsson wrote: > On Wed, Jul 13, 2011 at 3:57 AM, Paul Hilscher <pph...@gm...> wrote: > >> Also it seems that the configure script does not accept or overwrites >> CXXFLAGS parameters, is this desired ? Yes, the default is --enable-cxx-flags-preset. In this case the defaults are loaded from the file m4/ac_cxx_flags_preset.m4 > Yes, unless --disable-cxx-flags-preset is used. This has been the case > since as long as I remember, but whether it's desired is more > questionable to me. It seems that even with the autodetected options, > the CXXFLAGS should be *added* to the compiler flags, not just > overridden, but I don't know why it works like it does. > I think it makes sense to have a fixed flags preset. There would be two ways to add CXXFLAGS (prepend and append); which one would be more appropriate? You might even end up with non-compatible options... If someone wants to fiddle with other flags it is always possible to run configure in the following way: configure --disable-cxx-flags-preset CXX=your_compiler CXXFLAGS=your_flags Cheers, Patrick |
From: Patrik J. <co...@fa...> - 2011-07-13 14:00:12
|
On Wed, Jul 13, 2011 at 3:57 AM, Paul Hilscher <pph...@gm...> wrote: > Dear Patrick, > welcome back, hoped you enjoyed your trip. > Thank you very much for your reply and clarifying my questions. > > Concerning the compilation : >> >> (1). It seems uintptr_t is *optional* in C99 and in C++0x. It has >> worked on all machines I've tried on, though. What compiler and system >> are you using? >> > > I tried to compile with g++-4.5/Ubuntu 11.04 on my machine and on our > supercomputer with icc (11.1) / RHEL 4 , > compilation failed on both machines with same error. I compiled using a > fresh version from VCS head (mercurial). > Could it be that not all files have been updated to VCS ? Weird. I've certainly tried it with intel 11.1 and it worked without a problem. I guess we'll have to add a configure check for this and, like Julian suggested, use size_t if uintptr_t isn't available. > Also it seems that the configure script does not accept or overwrites > CXXFLAGS parameters, is this desired ? Yes, unless --disable-cxx-flags-preset is used. This has been the case since as long as I remember, but whether it's desired is more questionable to me. It seems that even with the autodetected options, the CXXFLAGS should be *added* to the compiler flags, not just overridden, but I don't know why it works like it does. cheers, /Patrik |
From: Paul H. <pph...@gm...> - 2011-07-13 07:57:42
|
Dear Patrick, welcome back, I hope you enjoyed your trip. Thank you very much for your reply and for clarifying my questions. Concerning the compilation: > (1). It seems uintptr_t is *optional* in C99 and in C++0x. It has > worked on all machines I've tried on, though. What compiler and system > are you using? > > I tried to compile with g++-4.5/Ubuntu 11.04 on my machine and on our supercomputer with icc (11.1) / RHEL 4; compilation failed on both machines with the same error. I compiled using a fresh version from the VCS head (mercurial). Could it be that not all files have been updated in the VCS? Also it seems that the configure script does not accept or overwrites CXXFLAGS parameters; is this desired? thanks and best wishes, Paul |
From: Patrik J. <co...@fa...> - 2011-07-12 20:38:42
|
Hi Paul et al., Sorry for the delay, I was traveling. (1). It seems uintptr_t is *optional* in C99 and in C++0x. It has worked on all machines I've tried on, though. What compiler and system are you using? Incidentally, I know of no architecture that has a simd width of 4. AFAIK, all SSE implementations have a 16-byte width, while AVX processors have 32 bytes. configure is a generated file and as such should not be included in the vcs. (2). Regarding alignment, it's more complicated. Current x86 processors have separate simd instructions to load aligned and unaligned data. If the array operands are fully aligned on simd boundaries, the compiler can simply issue aligned loads and stores without overhead. This is the "aligned" case. "unaligned" means that the start of the array operands is not simd aligned. In this case, we must first do a few scalar operations until we hit the simd boundary, and can then use the aligned loads. This extra loop introduces some overhead, but for large arrays the performance is essentially identical to the aligned case. "misaligned" means that the different array operands have different alignments. In this case, there is no way to align the loads, and the compiler must issue unaligned loads/stores for the entire array length. (This is why it is advantageous to pad the lowest rank dimension of higher-dimensional arrays to the simd width, because otherwise different stripes have different alignment.) However, in all of these cases the actual flops, once the data has made it into the simd registers, are vectorized. The alignment only affects loads/stores. You can test the no-vectorization performance by just omitting the --enable-simd-width argument to configure. For large arrays, it will essentially be 1/2 or 1/4 of the vectorized performance for doubles and floats, respectively. cheers, /Patrik On Mon, Jul 4, 2011 at 11:18 PM, Paul Hilscher <pph...@gm...> wrote: > Hi Patrick, Hi all, > great work Patrick. 
I just wanted to try out your modifications in my code > but unfortunately > I could not go past the compilation of the blitz library (hg tip version). > Here is the output. > (1) > hg clone ssh://sta...@bl.../hgroot/blitz/blitz > cd blitz > automake --add-missing > autoreconf > --- is there a reason why the configure file is not included in the hg trunk > ? > ./configure --prefix=$PWD --enable-simd-width=4 > make lib &> log > attached is the log. I am using gcc-4.5.2 on Ubuntu. For the > undefined ‘uintptr_t’ I can fixed this > by including <stdint.h> but still some parameters are shadowed thus > compilation failed. > Do you see where is the problem ? > (2) > For your benchmark cases, I assume that roughly speaking misaligned means no > vectorization at all, > unaligned means vectorization with some overhead and aligned means full > vectorization support. > Now for most benchmark cases it seems that the difference between aligned > and misaligned is quite low, > for some benchmarks misaligned is even faster than aligned. Does this means > that performance improvement > due to vectorization is negligible ? This seems to be somewhat in contrast > to your previous results, e.g. > http://governator.ucsc.edu/filer/blitzbench/blitzcomp.html. > Did you found any speedup for your code when comparing blitz compiled with > and without vectorization support ? > thanks again for your amazing work and best wishes, > Paul > > > > On Fri, Jul 1, 2011 at 1:02 PM, Patrik Jonsson <co...@fa...> > wrote: >> >> Hi everyone, >> >> There is a new page with loop plots for r1845 that I just pushed at >> http://governator.ucsc.edu/filer/blitzbench_r1845/blitzcomp.html. >> >> There are descriptions on those pages but the short summary is that >> blitz performance with icpc is now pretty much comparable to Fortran >> (compiled with ifort) across all sizes. It's still a bit slower for >> small arrays, but not outrageously so. 
I have also successfully used >> the vectorized version in my radiation-transfer code and the results >> are the same, so it's passing some decidedly nontrivial tests in >> addition to the test suite. >> >> I'm about to go on vacation, so this will be the last update from me >> for a while. >> >> cheers, >> >> /Patrik >> >> >> ------------------------------------------------------------------------------ >> All of the data generated in your IT infrastructure is seriously valuable. >> Why? It contains a definitive record of application performance, security >> threats, fraudulent activity, and more. Splunk takes this data and makes >> sense of it. IT sense. And common sense. >> http://p.sf.net/sfu/splunk-d2d-c2 >> _______________________________________________ >> Blitz-devel mailing list >> Bli...@li... >> https://lists.sourceforge.net/lists/listinfo/blitz-devel > > |
From: Theodore P. <The...@in...> - 2011-07-02 12:30:14
|
On 06/12/2011 08:19 AM, Theodore Papadopoulo wrote: > On 06/11/2011 05:46 PM, Patrik Jonsson wrote: >> Hi Theo, > >> The way I discovered it was to try to use the iterators to fill the arrays >> in the "storage" testcase, IIRC the "C" array is a 2D, mixed >> ascending/descending array and if you create begin() and increment it, you >> get an out of bounds. Try this: > >> GeneralArrayStorage<2> storage; >> storage.ordering() = firstRank, secondRank; >> storage.base() = 0,0; >> storage.ascendingFlag() = false,true; > >> Array<int,2> C; >> C.setStorage(storage); >> C.resize(2,2); >> C(0,0)=0; >> C(0,1)=1; >> C(1,0)=2; >> C(1,1)=3; >> a2::iterator i=C.begin(); >> cout << *i << '\t' << endl; ++i; >> cout << *i << '\t' << endl; ++i; >> cout << *i << '\t' << endl; ++i; >> cout << *i << '\t' << endl; ++i; >> assert(i==C.end()); > > I'll have a look.... Sorry, it took me much more time than I expected to have a look. From your code, I created the attached test case... I see no problem with the assert in the just-updated blitz repo (from cvs). Indeed, as far as I remember, the end() is computed from the begin(), the ordering, strides, ... It indeed provides an out-of-bounds iterator (which may vary depending on the array characteristics), but which is always consistent with the iteration... Maybe the example you gave me is not the one you were thinking of, or maybe I have modified it in a way that makes the problem disappear (I had to, as the code snippet above does not compile...). Please provide me with a complete example showing the problem if the problem persists and if you want me to have a look at it. Again, sorry to have taken so long to examine the issue. All the best, Theo. 
-----BEGIN PGP SIGNATURE----- Version: GnuPG v1.4.11 (GNU/Linux) Comment: Using GnuPG with Fedora - http://enigmail.mozdev.org/ iEYEARECAAYFAk4PD0wACgkQEr8WrU8nPV2XugCghZn+9+8R46ZYBnTavhIVenU8 4pcAn3NFzbhU0ajVdGekj0GTUOiZyv1W =N6JE -----END PGP SIGNATURE----- |
From: Patrik J. <co...@fa...> - 2011-07-01 04:29:47
|
Hi everyone, There is a new page with loop plots for r1845 that I just pushed at http://governator.ucsc.edu/filer/blitzbench_r1845/blitzcomp.html. There are descriptions on those pages but the short summary is that blitz performance with icpc is now pretty much comparable to Fortran (compiled with ifort) across all sizes. It's still a bit slower for small arrays, but not outrageously so. I have also successfully used the vectorized version in my radiation-transfer code and the results are the same, so it's passing some decidedly nontrivial tests in addition to the test suite. I'm about to go on vacation, so this will be the last update from me for a while. cheers, /Patrik |
From: Julian C. <cum...@ca...> - 2011-06-23 18:22:53
|
The three repositories are listed there because I recently enabled both the svn repository and the mercurial repository service for blitz. However, the svn repository is currently empty, and the mercurial repository is somewhat ahead of the cvs repository contents. I am also concerned about this issue of having multiple blitz repository services going at sourceforge simultaneously and trying to keep them all going and synchronized. It seems like a nightmare to me. Earlier you were asking about continuing to use cvs. Were you mainly concerned with using the cvs repository or a cvs client? It appears that there are now many clients available that can work from more than one type of repository server. For example, there is a package called "hgsvn" that lets you checkout code from an svn repository and then work locally in mercurial before pushing your changes back into the svn server. There is also the git-cvs package for working locally in git from a cvs server. The existence of these packages makes me question the value of migrating the repository contents to more than one type of VCS server. If I had to pick just one repository server type, I would go with svn. It seems like cvs repositories can be cleanly converted to svn, and then all the other types of VCS clients can be served from an svn repository. I am not convinced that the developer pool for blitz is large enough to warrant the use of a distributed VCS, but we need to come to a group consensus on that issue (soon!). Regards, Julian C. On 06/23/2011 10:44 AM, Patrick Guio wrote: > Hej Patrik > > I understand but if you check the sourceforge site of Blitz, in the menu > code, there are instructions to access the code for mercurial, cvs and > svn. Does that mean that sourceforge internally maintain access to blitz > through the three version control systems? It is not obvious from what I > can see. 
> > Cheers, > Patrick > > > > On 06/23/2011 04:04 PM, Patrik Jonsson wrote: >> Hej Patrick, >> >> As far as I know, there are no tools converting TO cvs. Everyone's >> concerned with getting stuff out of cvs. >> >> cheers, >> >> /Patrik >> >> On Wed, Jun 22, 2011 at 3:59 AM, Patrick Guio<pa...@mn...> wrote: >>> Dear all, >>> >>> I just wonder to know whether moving from cvs to mercurial obsolete >>> definitely cvs? Is there a way to mirror changes in mercurial into cvs >>> automatically so that the head development can still be accessed via cvs? >>> >>> Best, >>> Patrick >>> > |
From: Patrik J. <co...@fa...> - 2011-06-23 18:22:39
|
On Thu, Jun 23, 2011 at 1:44 PM, Patrick Guio <pa...@mn...> wrote: > > Hej Patrik > > I understand but if you check the sourceforge site of Blitz, in the menu > code, there are instructions to access the code for mercurial, cvs and > svn. Does that mean that sourceforge internally maintain access to blitz > through the three version control systems? It is not obvious from what I > can see. No, those are totally separate, independent repositories. cheers, /Patrik |
From: Patrick G. <pa...@mn...> - 2011-06-23 17:45:03
|
Hej Patrik I understand, but if you check the sourceforge site of Blitz, in the Code menu there are instructions to access the code for mercurial, cvs and svn. Does that mean that sourceforge internally maintains access to blitz through the three version control systems? It is not obvious from what I can see. Cheers, Patrick On 06/23/2011 04:04 PM, Patrik Jonsson wrote: > Hej Patrick, > > As far as I know, there are no tools converting TO cvs. Everyone's > concerned with getting stuff out of cvs. > > cheers, > > /Patrik > > On Wed, Jun 22, 2011 at 3:59 AM, Patrick Guio <pa...@mn...> wrote: >> >> Dear all, >> >> I just wonder to know whether moving from cvs to mercurial obsolete >> definitely cvs? Is there a way to mirror changes in mercurial into cvs >> automatically so that the head development can still be accessed via cvs? >> >> Best, >> Patrick >> |
From: Patrik J. <co...@fa...> - 2011-06-23 15:04:37
|
Hej Patrick, As far as I know, there are no tools converting TO cvs. Everyone's concerned with getting stuff out of cvs. cheers, /Patrik On Wed, Jun 22, 2011 at 3:59 AM, Patrick Guio <pa...@mn...> wrote: > > Dear all, > > I just wonder to know whether moving from cvs to mercurial obsolete > definitely cvs? Is there a way to mirror changes in mercurial into cvs > automatically so that the head development can still be accessed via cvs? > > Best, > Patrick > |
From: Patrick G. <pa...@mn...> - 2011-06-22 08:24:21
|
Dear all, I just wonder whether moving from cvs to mercurial definitely obsoletes cvs. Is there a way to mirror changes in mercurial into cvs automatically, so that the head development can still be accessed via cvs? Best, Patrick |
From: Patrik J. <co...@fa...> - 2011-06-18 02:59:57
|
Hi all, I've made some optimizations that lowered the setup overhead for the evaluation loops. This improved the performance for small array sizes compared to the version from June 15. There is a new set of graphs: http://governator.ucsc.edu/filer/blitzbench2/blitzcomp.html I experimented with cutting out different parts of the evaluation setup. For only the evaluation loop, performance is very good all the way down to the vector width (so 2 for double). Essentially the sole cause for the drop in performance for <100 element arrays is the overhead in actually getting to the evaluation. This bodes well for the potential of a dedicated lightweight "Vector" class that is always unit stride and base-0, as that could bypass a lot of the runtime evaluation overhead. I think that's why the old Vector class is so fast. Such a class would be more useful now in that it would interoperate with Arrays just like the TinyVector now does, falling back to the general evaluation if necessary but still being fast in Vector-only expressions. It should not be much work to get such a class basically working, but I'm swiftly reaching the limit on how much time I can spend on this. cheers, /Patrik |
From: Patrik J. <co...@fa...> - 2011-06-17 00:57:34
|
One more thing: do not include any files except blitz/array.h (for example tinyvec-et.h). They will pull in the old ET machinery and things will clash horribly. > * At this point, *** arrays with lowest-rank dimension that is not an > even multiple of the simd width for the array type DO NOT WORK ***. > This dimension needs to be padded to an even multiple, otherwise the > storage gets all screwed up. You'll likely get an unaligned access > exception if you try this, or if the evaluation function can detect > the problem it just won't use the vectorized path. About this change: This WILL break existing functionality. Arrays of uneven length will now *never* be contiguous, and they will not have second-lowest rank stride equal to lowest rank length. (Though if you assert isStorageContiguous() you'll catch the problem.) The trickiest thing is that if you pass in pre-existing data using the neverDeleteData/deleteDataWhenDone constructors, you have to make sure those data are similarly padded. (And there is no way to assert this, since the length of the allocated data block is not specified, only the array shape.) This is unfortunate, but there is simply no way around it. Without aligned access, it's impossible to vectorize efficiently. However, it will still work as before if you don't specify a simd width, so those applications will not be any worse off than now. If anyone has a better idea, I'm all ears. /Patrik |
From: Patrik J. <co...@fa...> - 2011-06-16 22:09:04
|
Hi everyone, If you want to give my thing a try, I've pushed my Mercurial repo to sourceforge now. A few notes: * Enable the simd functionality with ./configure --enable-simd-width=16 (or 32 if you happen to have an avx processor). * At this point, *** arrays with lowest-rank dimension that is not an even multiple of the simd width for the array type DO NOT WORK ***. This dimension needs to be padded to an even multiple, otherwise the storage gets all screwed up. You'll likely get an unaligned access exception if you try this, or if the evaluation function can detect the problem it just won't use the vectorized path. * I haven't cleaned up stuff yet. There are many dead files that are no longer used (this was the case before, too). Let me know how it goes, /Patrik |
From: Julian C. <cum...@ca...> - 2011-06-16 20:27:02
|
Hi Patrik, I went ahead and enabled both the svn and mercurial source control services for the blitz project. Everyone who is listed as a blitz developer (7 people) has write access to these repositories, but everyone has read access, of course. It seems like svn is better supported in terms of sourceforge because it provides for email notification of commits similar to the cvs repo. I have created a new bli...@li... mailing list for this purpose, although it will take a few hours for the system to recognize this list and initiate archiving support. By the way, sourceforge also has a git repository feature that may be enabled for hosted projects, but I have not yet done this for blitz. I am a bit concerned about having so many repositories flying around without any proper mechanism for synchronization. If someone can suggest a good way out of this "my favorite VCS is the one we should use" mess, I am all ears. I do like having some update notification mechanism available, so that the "casual" blitz developer can stay informed of changes as they occur. I am not sure if sourceforge supports doing this whenever an "hg push" to their mercurial repository occurs. I have the same concern regarding git. What I would suggest is that Patrik go forward with his setup of the blitz mercurial repository from his own repository, so that others can at least test drive the changes to support vectorization and suggest any further improvements. As for the svn repository, I guess that I can go through the process of importing our existing cvs repo contents using the cvs2svn dump file tool. Patrik, would you then be able to use "hg convert" to bring your changes into the svn repository? I am not sure what your starting point was for the changes you made. Further thoughts or suggestions? Does this seem like a reasonable plan? I am hoping to avoid any loss of provenance data here and minimize the pain of bringing in the new work that has been done. Regards, Julian C. 
From: Patrik Jonsson [mailto:co...@fa...] Sent: Thursday, June 16, 2011 7:39 AM To: Paul Hilscher Cc: bli...@li... Subject: Re: [Blitz-devel] Vectorization success! On Thu, Jun 16, 2011 at 9:58 AM, Paul Hilscher <pph...@gm...> wrote: Hi Patrick, amazing job :) thanks a lot for it. I am looking forward to try your improvements on my machine too. Thanks! Is there any update for the cvs to svn/mercurial or git (see Andres mail) migration. I am not an expert on this issue, but Andre made some good points in favor of git. Maybe we should give it a try. Could you upload your changes to his https://gitorious.org/blitz-experimental/blitz-experimental repository ? Once we decided, we could than transfer back to the sourceforge repository. So I did these changes while test driving my mercurial blitz repository, so it's not straightforward to get the changes back to other version systems (except svn, which I gather hg convert can create). I've suggested to Julian that he turn on the sourceforge mercurial repo so I can push my version up there for now. (sourceforge supports running several VCSs at once.) But yeah, hopefully we can move on with the vcs migration soon. cheers, /Patrik |
From: Patrik J. <co...@fa...> - 2011-06-16 14:39:44
|
On Thu, Jun 16, 2011 at 9:58 AM, Paul Hilscher <pph...@gm...>wrote: > Hi Patrick, > > amazing job :) thanks a lot for it. I am looking forward to try your > improvements on my machine too. > Thanks! > > Is there any update for the cvs to svn/mercurial or git (see Andres mail) > migration. I am not an expert > on this issue, but Andre made some good points in favor of git. Maybe we > should give it a try. > Could you upload your changes to his > https://gitorious.org/blitz-experimental/blitz-experimental repository ? > Once we decided, we could than transfer back to the sourceforge repository. > So I did these changes while test driving my mercurial blitz repository, so it's not straightforward to get the changes back to other version systems (except svn, which I gather hg convert can create). I've suggested to Julian that he turn on the sourceforge mercurial repo so I can push my version up there for now. (sourceforge supports running several VCSs at once.) But yeah, hopefully we can move on with the vcs migration soon. cheers, /Patrik |
From: André A. <and...@id...> - 2011-06-16 14:14:34
|
Hello, If you would like to try that out, you will need to register with Gitorious and send me your login names so I can give you commit rights. André On Thu, Jun 16, 2011 at 3:58 PM, Paul Hilscher <pph...@gm...> wrote: > Hi Patrick, > amazing job :) thanks a lot for it. I am looking forward to try your > improvements on my machine too. > Is there any update for the cvs to svn/mercurial or git (see Andres mail) > migration. I am not an expert > on this issue, but Andre made some good points in favor of git. Maybe we > should give it a try. > Could you upload your changes to > his https://gitorious.org/blitz-experimental/blitz-experimental repository > ? > Once we decided, we could than transfer back to the sourceforge repository. > Cheers, > Paul > -- Dr. André Anjos Idiap Research Institute Centre du Parc - rue Marconi 19 CH-1920 Martigny, Suisse Phone: +41 27 721 7763 Fax: +41 27 721 7712 http://andreanjos.org |