From: Gert V. <ger...@hc...> - 2005-03-29 17:12:01
Stephen Beahm wrote:

>Hello,
>
>Both the C-reference and altivec versions of bsumsq_sub22_mmx() implement the
>following:
>    v = ((p1f[i]+p1b[i]+1)>>1) - p2[i];
>
>The mmx version does not add the one (+1).
>
>I am rather new to the mmx instruction set, so maybe someone else can clean
>this patch up?  Thanks.
>
>
>Index: utils/mmxsse/mblock_sumsq_mmx.c
>===================================================================
>RCS file: /cvsroot/mjpeg/mjpeg_play/utils/mmxsse/mblock_sumsq_mmx.c,v
>retrieving revision 1.1
>diff -u -r1.1 mblock_sumsq_mmx.c
>--- utils/mmxsse/mblock_sumsq_mmx.c 26 Dec 2004 05:29:39 -0000 1.1
>+++ utils/mmxsse/mblock_sumsq_mmx.c 17 Mar 2005 04:54:32 -0000
>@@ -409,6 +409,14 @@
> {
>        int sum,sum1,sum2;
>
>+       /* Load 1 into one_q */
>+       int one = 1;
>+       uint64_t one_q;
>+       movd_g2r(one, mm7);
>+       punpcklwd_r2r(mm7, mm7);
>+       punpckldq_r2r(mm7, mm7);
>+       movq_r2m(mm7, one_q);
>+
>        pxor_r2r(mm5, mm5);
>
>

This could be replaced with a constant containing 4 words with the value 1.
See patch below.

Gert

diff -u -r1.1 mblock_sumsq_mmx.c
--- mblock_sumsq_mmx.c 26 Dec 2004 05:29:39 -0000 1.1
+++ mblock_sumsq_mmx.c 29 Mar 2005 17:02:14 -0000
@@ -407,6 +407,7 @@
 int bsumsq_sub22_mmx(uint8_t *blk1f, uint8_t *blk1b, uint8_t *blk2,
                      int lx, int h)
 {
+        static const uint16_t ones[4]={1,1,1,1};
         int sum,sum1,sum2;
 
         pxor_r2r(mm5, mm5);
@@ -431,10 +432,12 @@
         punpckhbw_r2r(mm7, mm3);
 
         paddw_r2r(mm4, mm0);
+        paddw_m2r(*ones, mm0);
         psrlw_i2r(1, mm0);
         psubw_r2r(mm2, mm0);
         pmaddwd_r2r(mm0, mm0);
         paddw_r2r(mm6, mm1);
+        paddw_m2r(*ones, mm1);
         psrlw_i2r(1, mm1);
         psubw_r2r(mm3, mm1);
         pmaddwd_r2r(mm1, mm1);
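
For anyone who wants to check the effect of the missing +1 without reading the MMX code, here is a minimal scalar sketch in plain C. It is not the mjpegtools source: the names bsumsq_model() and avg4_round(), the explicit width parameter w, and the round_up flag are all hypothetical and exist only for illustration. The first function applies the formula quoted above with or without the rounding term; the second models, one 16-bit lane at a time, what adding a constant of four 1-words before the shift would do.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical scalar model (not the mjpegtools code).
 * Sums v*v over a w x h block, where
 *     v = ((p1f[i] + p1b[i] + r) >> 1) - p2[i]
 * and r is 1 when round_up is nonzero, 0 otherwise.
 * lx is the stride in bytes between consecutive rows. */
static int bsumsq_model(const uint8_t *p1f, const uint8_t *p1b,
                        const uint8_t *p2, int lx, int w, int h,
                        int round_up)
{
    int sum = 0;

    assert(p1f && p1b && p2);
    for (int j = 0; j < h; j++) {
        for (int i = 0; i < w; i++) {
            int v = ((p1f[i] + p1b[i] + (round_up ? 1 : 0)) >> 1) - p2[i];
            sum += v * v;
        }
        p1f += lx;
        p1b += lx;
        p2 += lx;
    }
    return sum;
}

/* Models "add a vector of ones, then shift right by 1" on four
 * 16-bit lanes. Inputs are assumed to be unpacked 8-bit pixels
 * (0..255), so a + b + 1 is at most 511 and cannot overflow a lane. */
static void avg4_round(const uint16_t a[4], const uint16_t b[4],
                       uint16_t out[4])
{
    static const uint16_t ones[4] = {1, 1, 1, 1};

    for (int k = 0; k < 4; k++)
        out[k] = (uint16_t)((a[k] + b[k] + ones[k]) >> 1);
}
```

Calling bsumsq_model() on the same block with round_up set to 1 and then to 0 shows the discrepancy: every pixel pair whose sum is odd averages one lower without the +1, which changes the squared error unless that pixel's difference happens to land on the same magnitude.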