From: Gregory Brunner <gbrunner@es...> - 2012-05-16 18:29:21

I exported a PLY file from a boxm2 model using "boxm2ExtractPointCloudProcess" and "boxm2ExportOrientedPointCloudProcess". As it passed over the blocks, it appeared that the process got slower. My model had ~800 blocks and exporting a PLY file took ~8 hours. I just thought it was suspicious that the process appeared to slow down as it traversed the blocks. Could there perhaps be a memory leak somewhere in these routines?

Greg

Gregory Brunner - Imagery Scientist
Esri - 3060 Little Hills Expressway - St. Charles, MO 63301-3751 - USA
T 636-949-6620, ext. 8557 - F 636-949-6735 - M 636-222-3818
gbrunner@... - http://www.esri.com
From: Ian Scott <scottim@im...> - 2012-05-16 15:12:42

On 16/05/2012 15:50, Friedmann Y. wrote:
> I am wondering why vectorized calcs in Matlab are faster than the same
> calculations in VXL, even using the pointer-wise code?

Ahh. I understand your question now. Probably due to non-aliasing assumptions and use of SSE SIMD extensions on x86. I also believe recent versions of Matlab can use the GPU for some work.

I and an intern tried to get vnl to use SSE2 intrinsics, but we gave up before it worked reliably across the variety of compilers & platforms in use on the dashboard. I've also had a go at adding non-alias directives - but without any detectable improvement.

If all your work can easily be coded in Matlab vectorised format, then use Matlab. If you need to use C++/VXL for other reasons, and you want to have a go at finishing the vnl/SSE stuff - I'll happily point you in the right direction.

Ian.
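[Editor's note: Ian's point about non-aliasing assumptions can be illustrated with a small sketch. This is hypothetical, not VXL's actual code; `square_elements` is an illustrative name. The `__restrict` qualifier is a GCC/Clang/MSVC extension (C99 spells it `restrict`; standard C++ has no portable equivalent).]

```cpp
#include <cstddef>

// Hypothetical element-wise kernel of the kind discussed in this thread.
// The __restrict qualifiers promise the compiler that src and dst never
// alias, so it is free to load, square, and store several floats per
// iteration using SSE. Without that promise it must assume a store to
// dst could change a later src element, which blocks auto-vectorisation.
void square_elements(const float* __restrict src,
                     float* __restrict dst,
                     std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i)
        dst[i] = src[i] * src[i];
}
```

Matlab's built-in vectorised routines can bake such assumptions in once, at the vendor's compile time, which is part of why they outrun a naive interpreted loop.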
From: Friedmann Y. <Y.Friedmann@sw...> - 2012-05-16 15:05:59

Thanks Rasmus, I will give it a try...

-----Original Message-----
From: Rasmus Reinhold Paulsen [mailto:rrp@...]
Sent: Wed 16/05/2012 15:37
To: Vxl-Users
Subject: Re: [Vxl-users] vectorise image

I cannot say if that is true - it would be a big surprise for me. In particular if you are a little careful in your C++ calling. However, GPU-optimized code etc. can do a lot today. Perhaps certain underlying routines in Matlab are optimized for multi-cores/GPUs etc.

What I do know is that you have to be careful how you measure your time. In particular, clock() is not good to use in small loops (granularity of 10 ms, if I remember correctly). Either call your routine 1000 times and measure the time outside, or use a better timer. I think there is one called something like QueryPerformanceCounter(). However, I am not following the recent trends... so my information might be a little stale...

Cheers,
Rasmus

From: Friedmann Y. [mailto:Y.Friedmann@...]
Sent: 16 May 2012 15:55
To: Ian Scott
Cc: Vxl-Users
Subject: Re: [Vxl-users] vectorise image

so how is it that the vectorised calcs are so much faster in matlab?

-----Original Message-----
From: Ian Scott [mailto:scottim@...]
Sent: Wed 16/05/2012 14:15
To: Friedmann Y.
Cc: Vxl-Users
Subject: Re: [Vxl-users] vectorise image

On 16/05/2012 13:49, Friedmann Y. wrote:
> So is it right to assume that when using vectors in MATLAB to do the
> same calculations, their 10 times higher efficiency is due to compiler
> optimization?
>
> Yasmin

It is a long time since I used matlab proper, but at that time it was not a compiled language. Octave, the GPL matlab-clone, behaves that way now. Everything was looked up on demand. Not just indexing, but even variable name dereferencing. Loop content was evaluated (and possibly even parsed) afresh every iteration. No compilation - therefore no opportunity for any optimisation.

Ian.
From: Friedmann Y. <Y.Friedmann@sw...> - 2012-05-16 14:53:11

I am wondering why vectorized calcs in Matlab are faster than the same calculations in VXL, even using the pointer-wise code?

-----Original Message-----
From: Wheeler, Frederick W (GE Global Research) [mailto:wheeler@...]
Sent: Wed 16/05/2012 15:32
To: Friedmann Y.
Cc: Vxl-Users
Subject: Re: [Vxl-users] vectorise image

Are you wondering why vectorized calcs in Matlab are faster than non-vectorized calculations in Matlab? Or are you wondering why vectorized calculations in Matlab are faster than the same calculations in VXL?

From: Friedmann Y. [mailto:Y.Friedmann@...]
Sent: Wednesday, May 16, 2012 9:55 AM
To: Ian Scott
Cc: Vxl-Users
Subject: Re: [Vxl-users] vectorise image

so how is it that the vectorised calcs are so much faster in matlab?

-----Original Message-----
From: Ian Scott [mailto:scottim@...]
Sent: Wed 16/05/2012 14:15
To: Friedmann Y.
Cc: Vxl-Users
Subject: Re: [Vxl-users] vectorise image

On 16/05/2012 13:49, Friedmann Y. wrote:
> So is it right to assume that when using vectors in MATLAB to do the
> same calculations, their 10 times higher efficiency is due to compiler
> optimization?
>
> Yasmin

It is a long time since I used matlab proper, but at that time it was not a compiled language. Octave, the GPL matlab-clone, behaves that way now. Everything was looked up on demand. Not just indexing, but even variable name dereferencing. Loop content was evaluated (and possibly even parsed) afresh every iteration. No compilation - therefore no opportunity for any optimisation.

Ian.
From: Rasmus Reinhold Paulsen <rrp@im...> - 2012-05-16 14:37:50

I cannot say if that is true - it would be a big surprise for me. In particular if you are a little careful in your C++ calling. However, GPU-optimized code etc. can do a lot today. Perhaps certain underlying routines in Matlab are optimized for multi-cores/GPUs etc.

What I do know is that you have to be careful how you measure your time. In particular, clock() is not good to use in small loops (granularity of 10 ms, if I remember correctly). Either call your routine 1000 times and measure the time outside, or use a better timer. I think there is one called something like QueryPerformanceCounter(). However, I am not following the recent trends... so my information might be a little stale...

Cheers,
Rasmus

From: Friedmann Y. [mailto:Y.Friedmann@...]
Sent: 16 May 2012 15:55
To: Ian Scott
Cc: Vxl-Users
Subject: Re: [Vxl-users] vectorise image

so how is it that the vectorised calcs are so much faster in matlab?

-----Original Message-----
From: Ian Scott [mailto:scottim@...]
Sent: Wed 16/05/2012 14:15
To: Friedmann Y.
Cc: Vxl-Users
Subject: Re: [Vxl-users] vectorise image

On 16/05/2012 13:49, Friedmann Y. wrote:
> So is it right to assume that when using vectors in MATLAB to do the
> same calculations, their 10 times higher efficiency is due to compiler
> optimization?
>
> Yasmin

It is a long time since I used matlab proper, but at that time it was not a compiled language. Octave, the GPL matlab-clone, behaves that way now. Everything was looked up on demand. Not just indexing, but even variable name dereferencing. Loop content was evaluated (and possibly even parsed) afresh every iteration. No compilation - therefore no opportunity for any optimisation.

Ian.
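[Editor's note: Rasmus's timing advice - repeat the work many times rather than timing one pass with coarse-grained clock() - can be sketched as below. This is a hypothetical helper using C++11 std::chrono, which post-dates this 2012 thread; QueryPerformanceCounter (Windows) or gettimeofday (POSIX) were the contemporary options he alludes to.]

```cpp
#include <chrono>

// Time a callable by running it `repetitions` times and dividing, so the
// timer's granularity is amortised over many calls. steady_clock is
// monotonic and far finer-grained than clock()'s ~10 ms ticks.
template <typename Work>
double seconds_per_call(Work work, int repetitions)
{
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < repetitions; ++i)
        work();
    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(stop - start).count() / repetitions;
}
```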
From: Wheeler, Frederick W (GE Global Research) <wheeler@ge...> - 2012-05-16 14:33:21

Are you wondering why vectorized calcs in Matlab are faster than non-vectorized calculations in Matlab? Or are you wondering why vectorized calculations in Matlab are faster than the same calculations in VXL?

From: Friedmann Y. [mailto:Y.Friedmann@...]
Sent: Wednesday, May 16, 2012 9:55 AM
To: Ian Scott
Cc: Vxl-Users
Subject: Re: [Vxl-users] vectorise image

so how is it that the vectorised calcs are so much faster in matlab?

-----Original Message-----
From: Ian Scott [mailto:scottim@...]
Sent: Wed 16/05/2012 14:15
To: Friedmann Y.
Cc: Vxl-Users
Subject: Re: [Vxl-users] vectorise image

On 16/05/2012 13:49, Friedmann Y. wrote:
> So is it right to assume that when using vectors in MATLAB to do the
> same calculations, their 10 times higher efficiency is due to compiler
> optimization?
>
> Yasmin

It is a long time since I used matlab proper, but at that time it was not a compiled language. Octave, the GPL matlab-clone, behaves that way now. Everything was looked up on demand. Not just indexing, but even variable name dereferencing. Loop content was evaluated (and possibly even parsed) afresh every iteration. No compilation - therefore no opportunity for any optimisation.

Ian.
From: Ian Scott <scottim@im...> - 2012-05-16 14:26:41

On 16/05/2012 14:55, Friedmann Y. wrote:
> so how is it that the vectorised calcs are so much faster in matlab?

Taking the example of

a = some_large_matrix .^ 2

This is cheap to parse and dispatch. The only interpreter-level operations are:

1. Parse line.
2. Look up variable "some_large_matrix" and store in internal register V1.
3. Look up/create variable "a" once and store in internal register V2.
4. Call internal function per_element_power(V1, 2, V2).

The internal function per_element_power will have been written in C (or C++, FORTRAN, or assembler) and will have been optimised by whichever compiler they used to compile MATLAB.

If you wrote the full loop version

for i=1:size(some_large_matrix,1)
  for j=1:size(some_large_matrix,2)
    a(i,j) = some_large_matrix(i,j) ^ 2;
  end
end

then the interpreter would repeatedly be asking:

1-9. Loop stuff.
10. Parse loop internals.
11. Look up variable some_large_matrix and store in v1.
12. Look up variable i and store in v2.
13. Look up variable j and store in v3.
14. Call dereference_matrix(v1, v2, v3, v4).
15-18. Same again for a, into v8.
19. Call internal function full_power(v4, 2, v8).
20-25. Some more loop stuff - jump back to line 4ish several thousand times.

I'm afraid these questions are getting a little far from VXL. I'd suggest reading a book, or taking a course on compilers and interpreters, if you want to know more.

Ian.
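[Editor's note: the cost breakdown Ian sketches can be mimicked in C++ as a toy model - not MATLAB's actual implementation. One opaque dispatch per element stands in for the interpreted scalar loop; a single tight loop over the whole array stands in for the built-in vectorised routine compiled and optimised once by MATLAB's vendors. `apply_per_element` and `square_whole_array` are illustrative names.]

```cpp
#include <vector>
#include <functional>

// Interpreted-style: one dynamic dispatch per element. The indirection
// through std::function prevents inlining, much as an interpreter's
// per-iteration name lookup and dispatch do.
void apply_per_element(const std::function<float(float)>& op,
                       std::vector<float>& v)
{
    for (float& x : v)
        x = op(x);
}

// Vectorised-style: one call for the whole array, with a tight loop the
// compiler can inline, unroll, and auto-vectorise.
void square_whole_array(std::vector<float>& v)
{
    for (float& x : v)
        x = x * x;
}
```

Both compute the same result; the difference is purely how much bookkeeping happens per element, which is the gap Ian's numbered lists describe.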
From: Friedmann Y. <Y.Friedmann@sw...> - 2012-05-16 13:57:01

so how is it that the vectorised calcs are so much faster in matlab?

-----Original Message-----
From: Ian Scott [mailto:scottim@...]
Sent: Wed 16/05/2012 14:15
To: Friedmann Y.
Cc: Vxl-Users
Subject: Re: [Vxl-users] vectorise image

On 16/05/2012 13:49, Friedmann Y. wrote:
> So is it right to assume that when using vectors in MATLAB to do the
> same calculations, their 10 times higher efficiency is due to compiler
> optimization?
>
> Yasmin

It is a long time since I used matlab proper, but at that time it was not a compiled language. Octave, the GPL matlab-clone, behaves that way now. Everything was looked up on demand. Not just indexing, but even variable name dereferencing. Loop content was evaluated (and possibly even parsed) afresh every iteration. No compilation - therefore no opportunity for any optimisation.

Ian.
From: Ian Scott <scottim@im...> - 2012-05-16 13:15:50

On 16/05/2012 13:49, Friedmann Y. wrote:
> So is it right to assume that when using vectors in MATLAB to do the
> same calculations, their 10 times higher efficiency is due to compiler
> optimization?
>
> Yasmin

It is a long time since I used matlab proper, but at that time it was not a compiled language. Octave, the GPL matlab-clone, behaves that way now. Everything was looked up on demand. Not just indexing, but even variable name dereferencing. Loop content was evaluated (and possibly even parsed) afresh every iteration. No compilation - therefore no opportunity for any optimisation.

Ian.
From: Friedmann Y. <Y.Friedmann@sw...> - 2012-05-16 12:52:43

So is it right to assume that when using vectors in MATLAB to do the same calculations, their 10 times higher efficiency is due to compiler optimization?

Yasmin

On 16/05/2012 12:29, Friedmann Y. wrote:
> Thanks Ian, thanks very much for your tips! To make my question clearer,
> here is a bit of my first and second attempts:
> in the first attempt I am accessing the pixels u(i,j) and in the second
> attempt I am using pointers to the pixels.
> One iteration in the first attempt takes 0.9 secs, and in the second
> attempt - 0.7 secs. Is this reasonable? I would have thought that the
> second way would be much much faster?

The ratio of those numbers is not out of line with what I would expect. I can't comment on their absolute value, since I don't know the size of your image, the complexity of your per-pixel calculation, or even the speed of your machine.

As I said previously, indexing is not that expensive if you use a good optimising compiler. Still, you can usually extract slightly higher performance by doing some pointer arithmetic yourself, which is what you found, and what vil_math tries to do. I would have expected that the costs of your loop are dominated by the floating point calculations, not the loop variant or the integer multiplication used during the indexing. These integer operations are cheap, and the compiler's optimiser might well be able to spot the similarity of all the pixel dereferences and avoid unnecessary indexing calculations.

If the 20% improvement in speed is an issue for you, then you can implement your code that way. But as Tony Hoare (allegedly) said, "Premature optimisation is the root of all evil." For example, in my world - large (>1Gb) 3D volume images and effectively random-access lookup to the image - the bottleneck often appears to be copying the data from main memory to and from the CPU's L3 cache. Fiddling around with the loop structure often has little effect, other than at best obscuring the meaning of the code. Often it introduces hard-to-find errors.

Ian.

PS Please keep these discussions on vxl-users. See the vxl-users policy for reasons.

> many thanks again for taking the time with this!
>
> Yasmin:
>
> first attempt:
>
> vil_image_view<float> func1(vil_image_view<float> g_image, int itot) {
>
>   float BS,CS(0.),DS(0.),ES(0.);
>   float FS(0.),GS(0.),HS(0.),IS(0.);
>   float JS(0.),KS(0.),LS(0.),MS(0.);
>
>   vil_image_view<float> u0,cog;
>   vil_copy_deep(g_image,u0);
>   vil_copy_deep(g_image,cog);
>
>   for (int it=1; it<itot+1; it++)
>   {
>     clock_t tStart = clock();
>     for (unsigned int j=1; j<g_image.nj()-1; j++)
>       for (unsigned int i=1; i<g_image.ni()-1; i++)
>       {
>         BS=(u0(i+1,j)+u0(i-1,j))/2.;
>         CS=(u0(i,j+1)+u0(i,j-1))/2.;
>         DS=(u0(i+1,j-1)+u0(i-1,j+1))/2.;
>         ES=(u0(i-1,j-1)+u0(i+1,j+1))/2.;
>         //etc
>       }
>     double timef = (double)(clock() - tStart)/CLOCKS_PER_SEC;
>     vcl_cout << it << '\t' << timef << " seconds" << vcl_endl;
>   }
>
>   return u0;
> }
>
> second attempt (using the math.h way of going over an image):
>
> void func2(vil_image_view<float>& imB, vil_image_view<float>& im_sum, int itot) {
>
>   vil_image_view<float> imA;
>   vil_copy_deep(imB,imA);
>   unsigned ni = imA.ni();  // width
>   unsigned nj = imA.nj();  // height
>   unsigned np = imA.nplanes();
>   im_sum.set_size(ni,nj,np);
>   vcl_ptrdiff_t istepA=imA.istep(), jstepA=imA.jstep(), pstepA=imA.planestep();
>   vcl_ptrdiff_t istepS=im_sum.istep(), jstepS=im_sum.jstep(), pstepS=im_sum.planestep();
>   vcl_ptrdiff_t istepB=imB.istep(), jstepB=imB.jstep(), pstepB=imB.planestep();
>   const float* planeA = imA.top_left_ptr();
>   const float* u11 = planeA+1+ni;
>   const float* planeB = imB.top_left_ptr();
>   const float* b11 = planeB+1+ni;
>   float* planeS = im_sum.top_left_ptr();
>   float* cog11 = planeS+1+ni;
>
>   float* BS=imA.top_left_ptr(), *CS=imA.top_left_ptr(), *DS=imA.top_left_ptr(), *ES=imA.top_left_ptr();
>   float* FS=imA.top_left_ptr(), *GS=imA.top_left_ptr(), *HS=imA.top_left_ptr(), *IS=imA.top_left_ptr();
>   float* JS=imA.top_left_ptr(), *KS=imA.top_left_ptr(), *LS=imA.top_left_ptr(), *MS=imA.top_left_ptr();
>   // vil_image_view<float> u0,cog;
>   const float two = 2.0;
>   // vil_copy_deep(imA,u0);
>   // vil_copy_deep(imA,cog);
>
>   for (int it=1; it<itot+1; it++)
>   {
>     clock_t tStart = clock();
>     for (unsigned p=0; p<np; ++p, planeA += pstepA, planeB += pstepB, planeS += pstepS) {
>       const float* rowA = u11;
>       const float* rowB = b11;
>       float* rowS = cog11;
>       for (unsigned j=0; j<nj-2; ++j, rowA += jstepA, rowB += jstepB, rowS += jstepS) {
>         const float* pixelA = rowA;
>         const float* pixelB = rowB;
>         float* pixelS = rowS;
>         for (unsigned i=0; i<ni-2; ++i, pixelA+=istepA, pixelB+=istepB, pixelS+=istepS) {
>           float aa = float(*(pixelA+1));
>           float bb = float(*(pixelA-1));
>
>           *BS=(aa+bb)/two;
>           *CS=(float(*(pixelA+ni))+float(*(pixelA-ni)))/two;
>           *DS=(float(*(pixelA+1-ni))+float(*(pixelA-1+ni)))/2.0f;
>           // DS=(u0(i+1,j-1)+u0(i-1,j+1))/2.;
>           *ES=(float(*(pixelA-1-ni))+float(*(pixelA+1+ni)))/2.0f;
>           // ES=(u0(i-1,j-1)+u0(i+1,j+1))/2.;
>
>           //etc...
>         }
From: Ian Scott <scottim@im...> - 2012-05-16 12:29:27

On 16/05/2012 12:29, Friedmann Y. wrote:
> Thanks Ian, thanks very much for your tips! To make my question clearer,
> here is a bit of my first and second attempts:
> in the first attempt I am accessing the pixels u(i,j) and in the second
> attempt I am using pointers to the pixels.
> One iteration in the first attempt takes 0.9 secs, and in the second
> attempt - 0.7 secs. Is this reasonable? I would have thought that the
> second way would be much much faster?

The ratio of those numbers is not out of line with what I would expect. I can't comment on their absolute value, since I don't know the size of your image, the complexity of your per-pixel calculation, or even the speed of your machine.

As I said previously, indexing is not that expensive if you use a good optimising compiler. Still, you can usually extract slightly higher performance by doing some pointer arithmetic yourself, which is what you found, and what vil_math tries to do. I would have expected that the costs of your loop are dominated by the floating point calculations, not the loop variant or the integer multiplication used during the indexing. These integer operations are cheap, and the compiler's optimiser might well be able to spot the similarity of all the pixel dereferences and avoid unnecessary indexing calculations.

If the 20% improvement in speed is an issue for you, then you can implement your code that way. But as Tony Hoare (allegedly) said, "Premature optimisation is the root of all evil." For example, in my world - large (>1Gb) 3D volume images and effectively random-access lookup to the image - the bottleneck often appears to be copying the data from main memory to and from the CPU's L3 cache. Fiddling around with the loop structure often has little effect, other than at best obscuring the meaning of the code. Often it introduces hard-to-find errors.

Ian.

PS Please keep these discussions on vxl-users. See the vxl-users policy for reasons.

> many thanks again for taking the time with this!
>
> Yasmin:
>
> first attempt:
>
> vil_image_view<float> func1(vil_image_view<float> g_image, int itot) {
>
>   float BS,CS(0.),DS(0.),ES(0.);
>   float FS(0.),GS(0.),HS(0.),IS(0.);
>   float JS(0.),KS(0.),LS(0.),MS(0.);
>
>   vil_image_view<float> u0,cog;
>   vil_copy_deep(g_image,u0);
>   vil_copy_deep(g_image,cog);
>
>   for (int it=1; it<itot+1; it++)
>   {
>     clock_t tStart = clock();
>     for (unsigned int j=1; j<g_image.nj()-1; j++)
>       for (unsigned int i=1; i<g_image.ni()-1; i++)
>       {
>         BS=(u0(i+1,j)+u0(i-1,j))/2.;
>         CS=(u0(i,j+1)+u0(i,j-1))/2.;
>         DS=(u0(i+1,j-1)+u0(i-1,j+1))/2.;
>         ES=(u0(i-1,j-1)+u0(i+1,j+1))/2.;
>         //etc
>       }
>     double timef = (double)(clock() - tStart)/CLOCKS_PER_SEC;
>     vcl_cout << it << '\t' << timef << " seconds" << vcl_endl;
>   }
>
>   return u0;
> }
>
> second attempt (using the math.h way of going over an image):
>
> void func2(vil_image_view<float>& imB, vil_image_view<float>& im_sum, int itot) {
>
>   vil_image_view<float> imA;
>   vil_copy_deep(imB,imA);
>   unsigned ni = imA.ni();  // width
>   unsigned nj = imA.nj();  // height
>   unsigned np = imA.nplanes();
>   im_sum.set_size(ni,nj,np);
>   vcl_ptrdiff_t istepA=imA.istep(), jstepA=imA.jstep(), pstepA=imA.planestep();
>   vcl_ptrdiff_t istepS=im_sum.istep(), jstepS=im_sum.jstep(), pstepS=im_sum.planestep();
>   vcl_ptrdiff_t istepB=imB.istep(), jstepB=imB.jstep(), pstepB=imB.planestep();
>   const float* planeA = imA.top_left_ptr();
>   const float* u11 = planeA+1+ni;
>   const float* planeB = imB.top_left_ptr();
>   const float* b11 = planeB+1+ni;
>   float* planeS = im_sum.top_left_ptr();
>   float* cog11 = planeS+1+ni;
>
>   float* BS=imA.top_left_ptr(), *CS=imA.top_left_ptr(), *DS=imA.top_left_ptr(), *ES=imA.top_left_ptr();
>   float* FS=imA.top_left_ptr(), *GS=imA.top_left_ptr(), *HS=imA.top_left_ptr(), *IS=imA.top_left_ptr();
>   float* JS=imA.top_left_ptr(), *KS=imA.top_left_ptr(), *LS=imA.top_left_ptr(), *MS=imA.top_left_ptr();
>   // vil_image_view<float> u0,cog;
>   const float two = 2.0;
>   // vil_copy_deep(imA,u0);
>   // vil_copy_deep(imA,cog);
>
>   for (int it=1; it<itot+1; it++)
>   {
>     clock_t tStart = clock();
>     for (unsigned p=0; p<np; ++p, planeA += pstepA, planeB += pstepB, planeS += pstepS) {
>       const float* rowA = u11;
>       const float* rowB = b11;
>       float* rowS = cog11;
>       for (unsigned j=0; j<nj-2; ++j, rowA += jstepA, rowB += jstepB, rowS += jstepS) {
>         const float* pixelA = rowA;
>         const float* pixelB = rowB;
>         float* pixelS = rowS;
>         for (unsigned i=0; i<ni-2; ++i, pixelA+=istepA, pixelB+=istepB, pixelS+=istepS) {
>           float aa = float(*(pixelA+1));
>           float bb = float(*(pixelA-1));
>
>           *BS=(aa+bb)/two;
>           *CS=(float(*(pixelA+ni))+float(*(pixelA-ni)))/two;
>           *DS=(float(*(pixelA+1-ni))+float(*(pixelA-1+ni)))/2.0f;
>           // DS=(u0(i+1,j-1)+u0(i-1,j+1))/2.;
>           *ES=(float(*(pixelA-1-ni))+float(*(pixelA+1+ni)))/2.0f;
>           // ES=(u0(i-1,j-1)+u0(i+1,j+1))/2.;
>
>           //etc...
>         }
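[Editor's note: the plane/row/pixel pointer-stepping pattern in Yasmin's second attempt can be written in a small self-contained form. This is plain C++ over a raw row-major buffer, not the actual vil API; `neighbour_average` is a hypothetical name, and the 4-neighbour average stands in for the longer stencil in the thread.]

```cpp
#include <vector>
#include <cstddef>

// Hypothetical 4-neighbour average over a row-major ni x nj float image,
// writing into dst and leaving the one-pixel border untouched. The row
// pointers advance by the row stride (ni) and the pixel pointers by 1,
// mirroring the istep/jstep traversal discussed above: each pixel is
// reached by a single pointer increment, never by a j*ni+i multiply.
void neighbour_average(const std::vector<float>& src,
                       std::vector<float>& dst,
                       std::size_t ni, std::size_t nj)
{
    dst = src;                               // copy so the border is preserved
    const float* row = src.data() + ni + 1;  // first interior pixel (1,1)
    float* out = dst.data() + ni + 1;
    for (std::size_t j = 1; j + 1 < nj; ++j, row += ni, out += ni)
    {
        const float* p = row;
        float* q = out;
        for (std::size_t i = 1; i + 1 < ni; ++i, ++p, ++q)
            *q = (*(p - 1) + *(p + 1) + *(p - ni) + *(p + ni)) / 4.0f;
    }
}
```

As Ian notes, on a good optimising compiler this usually buys only a modest margin over plain (i,j) indexing, since the indexing arithmetic is cheap next to the floating-point work and the memory traffic.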