Re: [Valgrind-users] Re: Re: Re: [Help] Valgrind sometime run the program very slowly sometimes , it last at least one hour. can you show me why or some way to analyze it?

On 01/25/2018 15:37 UTC, Wuweijia wrote:

> 	Function1:
> bool CDynamicScheduling::GetProcLoop(
>          int& nBegin,
>          int& nEndPlusOne)
> {
>          int curr = __sync_fetch_and_add(&m_nCurrent, m_nStep);

How large is 'm_nStep'?  [Are you sure?]
The overhead of switching threads under valgrind would be reduced
by making m_nStep as large as possible.  It looks like the code
in Function2 would produce the same values regardless of m_nStep.
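
To make the effect of the step size concrete, here is a minimal sketch that counts how many atomic fetch-and-add calls a single thread issues to drain a range. The name countChunkClaims is hypothetical and not part of the posted code, and std::atomic stands in for __sync_fetch_and_add; it only illustrates that the number of calls shrinks as the step grows.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical helper: counts the atomic fetch-and-add calls one thread
// would issue to drain the range [begin, endInclusive] in chunks of `step`.
// The final call that finds the range exhausted is counted too.
int countChunkClaims(int begin, int endInclusive, int step) {
    assert(step > 0);
    std::atomic<int> current(begin);
    int calls = 0;
    for (;;) {
        int curr = current.fetch_add(step);  // same role as __sync_fetch_and_add
        ++calls;
        if (curr > endInclusive) return calls;
    }
}
```

Under these assumptions, covering rows 0..99 takes 101 calls with a step of 1 but only 5 with a step of 25.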

>          if (curr > m_nEnd)
>          {
>                  return false;
>          }
> 
>          nBegin = curr;

>          int limit = m_nEnd + 1;

Local variable 'limit' is unused.  By itself this is unimportant,
but it might be a clue to something that is not shown here.
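
One guess at what 'limit' may have been intended for: clamping the end of the last chunk so that it never runs past m_nEnd + 1 when the range is not a multiple of the step. The sketch below is speculation, written as a hypothetical free function (computeChunk) rather than the member function; the real code may handle the tail of the range elsewhere.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical variant of the chunk computation. `curr` is the value the
// atomic fetch-and-add returned; `end` and `step` mirror m_nEnd and m_nStep.
bool computeChunk(int curr, int end, int step, int& begin, int& endPlusOne) {
    assert(step > 0);
    if (curr > end) {
        return false;  // range already exhausted
    }
    begin = curr;
    int limit = end + 1;                        // one past the last valid index
    endPlusOne = std::min(curr + step, limit);  // assumption: clamp the last chunk
    return true;
}
```

With end = 9 and step = 4, the chunk starting at 8 would end at 10 here, whereas the unclamped 'curr + m_nStep' gives 12.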

>          nEndPlusOne = curr + m_nStep;
>          return true;
> }
> 	
> 	
> 	Function2:
> 	....
> 	int beginY, endY;
>    while (pDS->GetProcLoop(beginY, endY)){
>      for (y = beginY; y < endY; y++){
>        for(x = 0; x < dstWDiv2-7; x+=8){
>          vtmp0 = vld2q_u16(&pSrc[(y<<1)*srcStride+(x<<1)]);
>          vtmp1 = vld2q_u16(&pSrc[((y<<1)+1)*srcStride+(x<<1)]);

I hope the actual source contains a comment such as:
     Compute pDst[] as the rounded average of non-overlapping 2x2 blocks of pixels in pSrc[].

>          vst1q_u16(&pDst[y*dstStride+x], (vtmp0.val[0] + vtmp0.val[1] + vtmp1.val[0] + vtmp1.val[1] + vdupq_n_u16(2)) >> vdupq_n_u16(2));
>        }
>        for(; x < dstWDiv2; x++){
>          pDst[y*dstStride+x] = (pSrc[(y<<1)*srcStride+(x<<1)] + pSrc[(y<<1)*srcStride+(x<<1)+1] + pSrc[((y<<1)+1)*srcStride+(x<<1)] + pSrc[((y<<1)+1)*srcStride+((x<<1)+1)] + 2) >> 2;
>        }
>      }
>    }
> 
>    return;
> }	
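
For checking the NEON loop, a plain scalar version of the same operation (the rounded average of non-overlapping 2x2 blocks) can serve as a reference. This is a sketch only: the function name downsample2x2 is hypothetical, and it assumes 16-bit pixels with strides counted in elements, which is how the posted indexing reads.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Compute dst[] as the rounded average of non-overlapping 2x2 blocks of
// pixels in src[]. Strides are in elements, not bytes (an assumption).
void downsample2x2(const std::uint16_t* src, std::size_t srcStride,
                   std::uint16_t* dst, std::size_t dstStride,
                   std::size_t dstW, std::size_t dstH) {
    assert(srcStride >= 2 * dstW && dstStride >= dstW);
    for (std::size_t y = 0; y < dstH; ++y) {
        for (std::size_t x = 0; x < dstW; ++x) {
            const std::uint16_t* row0 = src + (2 * y) * srcStride + 2 * x;
            const std::uint16_t* row1 = row0 + srcStride;
            // Sum in 32 bits so four 16-bit pixels plus the rounding term cannot wrap.
            std::uint32_t sum = std::uint32_t(row0[0]) + row0[1] + row1[0] + row1[1] + 2u;
            dst[y * dstStride + x] = static_cast<std::uint16_t>(sum >> 2);
        }
    }
}
```

Note that this reference widens the sum to 32 bits, while the vector expression in the posted code adds in 16-bit lanes; the two agree only as long as the four-pixel sum plus 2 fits in 16 bits. Whether that holds depends on the pixel range, which the post does not state.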
