Re: [Libjpeg-turbo-devel] xmm6 & xmm7 corruption on win64 after calling jpeg_write_scanlines (from
SIMD-accelerated libjpeg-compatible JPEG codec library
Brought to you by:
dcommander
From: Stefanos K. M. P. <sk...@em...> - 2012-04-26 06:02:18
|
I found out after debugging corrupted image output from some processing code (range testing) that is mixed with calls to jpeg_write_scanlines. For the bug to manifest you need (sse-enabled) code (usually intrinsics) that allocates variables to registers xmm6 or xmm7 over a call to one of the problematic functions -- the bug doesn't affect the logic of libjpeg-turbo itself, but of the code that calls it (its an ABI/calling convention violation). In our case xmm7 had thresholding constants and the corruption made vertical stripes appear as a jpeg_write_scanlines call (http://code.google.com/p/intrael/source/browse/trunk/src/intrael.c#640) corrupted the local (constant) variable rmZ which was set outside of the loop (http://code.google.com/p/intrael/source/browse/trunk/src/intrael.c#627) and was allocated to xmm7 by the compiler. On Thu, Apr 26, 2012 at 7:55 AM, DRC <dco...@us...> wrote: > I'm not arguing about the patch. It's obvious that it fixes a low-level > issue, and committing it is a no-brainer. I'm asking how that issue > manifests itself at the user level so I can document the symptoms, if > necessary. This is important when drafting change log messages, etc. > > > On 4/25/12 10:25 PM, Justin Lebar wrote: >> I think he's saying, set xmm6, xmm7 to some value, then call >> jsimd_convsamp_sse2 or one of the other relevant routines, then >> observe that only the bottom 64 bits of the registers' values are >> restored. >> >> Do you have a reason to think this code should work as written? Like, >> are you taking care not to clobber the top halves of the registers? >> >> On Wed, Apr 25, 2012 at 11:21 PM, DRC <dco...@us...> wrote: >>> Can you provide more information about how to reproduce the bug? A test >>> image, for instance? >>> >>> >>> On 4/25/12 9:29 PM, Stefanos Kornilios Mitsis Piitidis wrote: >>>> Hello, >>>> >>>> jsimd_convsamp_sse2 & related assembly functions that use the macros >>>> collect_args & uncollect_args don't properly save and restore all 128 >>>> bits of registers xmm6 and xmm7 as required by the win64 ABI (source: >>>> http://msdn.microsoft.com/en-us/library/9z1stfyw.aspx -> " XMM6:XMM15, >>>> Nonvolatile, Must be preserved as needed by callee" ). While the >>>> macros collect_args & uncollect_args try to preserve xmm6 and xmm7 >>>> movlpd is used instead of movaps resulting in corruption on the high >>>> half of the registers. When libjpeg is used with simd-optimised code >>>> this can lead to wrong calculations (we actually run into this problem >>>> while developing http://code.google.com/p/intrael/ ). >>>> >>>> >>>> The bug is fixed by replacing movlpd with movaps which moves all 128 bits ! >>>> >>>> Index: simd/jsimdext.inc >>>> =================================================================== >>>> --- simd/jsimdext.inc (revision 826) >>>> +++ simd/jsimdext.inc (working copy) >>>> @@ -322,15 +322,15 @@ >>>> push rsi >>>> push rdi >>>> sub rsp, SIZEOF_XMMWORD >>>> - movlpd XMMWORD [rsp], xmm6 >>>> + movaps XMMWORD [rsp], xmm6 >>>> sub rsp, SIZEOF_XMMWORD >>>> - movlpd XMMWORD [rsp], xmm7 >>>> + movaps XMMWORD [rsp], xmm7 >>>> %endmacro >>>> >>>> %imacro uncollect_args 0 >>>> - movlpd xmm7, XMMWORD [rsp] >>>> + movaps xmm7, XMMWORD [rsp] >>>> add rsp, SIZEOF_XMMWORD >>>> - movlpd xmm6, XMMWORD [rsp] >>>> + movaps xmm6, XMMWORD [rsp] >>>> add rsp, SIZEOF_XMMWORD >>>> pop rdi >>>> pop rsi >>>> >>>> >>>> >>> >>> ------------------------------------------------------------------------------ >>> Live Security Virtual Conference >>> Exclusive live event will cover all the ways today's security and >>> threat landscape has changed and how IT managers can respond. Discussions >>> will include endpoint security, mobile security and the latest in malware >>> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ >>> _______________________________________________ >>> Libjpeg-turbo-devel mailing list >>> Lib...@li... >>> https://lists.sourceforge.net/lists/listinfo/libjpeg-turbo-devel >> >> ------------------------------------------------------------------------------ >> Live Security Virtual Conference >> Exclusive live event will cover all the ways today's security and >> threat landscape has changed and how IT managers can respond. Discussions >> will include endpoint security, mobile security and the latest in malware >> threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ >> _______________________________________________ >> Libjpeg-turbo-devel mailing list >> Lib...@li... >> https://lists.sourceforge.net/lists/listinfo/libjpeg-turbo-devel > > ------------------------------------------------------------------------------ > Live Security Virtual Conference > Exclusive live event will cover all the ways today's security and > threat landscape has changed and how IT managers can respond. Discussions > will include endpoint security, mobile security and the latest in malware > threats. http://www.accelacomm.com/jaw/sfrnl04242012/114/50122263/ > _______________________________________________ > Libjpeg-turbo-devel mailing list > Lib...@li... > https://lists.sourceforge.net/lists/listinfo/libjpeg-turbo-devel -- ~skmp |