|
From: Julian S. <js...@ac...> - 2003-04-28 22:18:36
|
Hi

I wonder if anyone can cast any light on the following. I am
completely mystified, not to mention stuck.

I'm doing stuff to add SSE/SSE2 support. This means changing various
fsave/frstor instructions, which move the FPU state back and forth
between the simulated and real cpus, into their SSE equivalents,
fxsave and fxrstor. The SSE state is larger than the FPU/MMX
state so various structures have had their size increased.
There is also a 16-byte alignment constraint on addresses in
fxsave/fxrstor, so I've ensured that too.

Now, I think I've done everything right. Nevertheless, one
of my fxrstor's, in vg_syscalls.S, is segfaulting. I have no
idea why. It's this

   fxrstor VG_(m_state_static)+64

(previously of course)

   frstor VG_(m_state_static)+64

and the address is duly 16-byte aligned, and the memory from
that address for 512 bytes (the SSE state size) appears
suitably accessible. A few lines earlier there is

   fxrstor VG_(real_sse_state_saved_over_syscall)

and that works fine.

Help! I'm stuck. Is there some other magic constraints on
fxrstor I need to know about? I read the fine print in the P4
documentation carefully, but I cannot see anything other than the
16-byte-alignment constraint.

J
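The 16-byte alignment constraint described above can be checked mechanically. What follows is a minimal sketch in C, not code from Valgrind: the names `FX_AREA_SIZE`, `FX_AREA_ALIGN`, `fx_is_aligned` and `fx_align_up` are made up for illustration, and the only facts assumed are the 512-byte image size and the 16-byte alignment stated in the message.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical names, for illustration only. */
#define FX_AREA_SIZE  512u  /* size of the fxsave/fxrstor image in bytes */
#define FX_AREA_ALIGN 16u   /* required alignment of the image address   */

/* Return nonzero if p is 16-byte aligned, as fxsave/fxrstor require. */
int fx_is_aligned(const void *p)
{
    return ((uintptr_t)p & (FX_AREA_ALIGN - 1)) == 0;
}

/* Round p up to the next 16-byte boundary; p is returned unchanged if it
   is already aligned. The caller must have reserved FX_AREA_SIZE + 15
   bytes starting at p so that the aligned image still fits. */
void *fx_align_up(void *p)
{
    uintptr_t addr = (uintptr_t)p;
    uintptr_t pad  = (FX_AREA_ALIGN - (addr & (FX_AREA_ALIGN - 1)))
                     & (FX_AREA_ALIGN - 1);
    void *q = (char *)p + pad;

    assert(fx_is_aligned(q));
    return q;
}
```

A statically allocated area such as the one at `VG_(m_state_static)+64` would only need the `fx_is_aligned` check; `fx_align_up` is for buffers whose alignment is not known in advance.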
|
From: Nicholas N. <nj...@ca...> - 2003-04-29 08:33:39
|
On Mon, 28 Apr 2003, Julian Seward wrote:

> Help! I'm stuck. Is there some other magic constraints on
> fxrstor I need to know about? I read the fine print in the P4
> documentation carefully, but I cannot see anything other than the
> 16-byte-alignment constraint.

I know nothing about fxrstor, but the docs do also say:

   Bit 6 and bits 16 through 32 of the MXCSR register are defined as
   reserved and should be set to 0. Attempting to write a 1 in any of
   these bits from the saved state image will result in a general
   protection exception (#GP) being generated.

Worth checking?

N
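One way to do that check is to inspect the saved image before handing it to fxrstor. The sketch below takes the quoted rule at face value (bit 6 plus the upper 16 bits of the 32-bit register) and assumes the MXCSR field sits at byte offset 24 of the image, which is the offset used elsewhere in this thread. The function name `mxcsr_reserved_bits_quoted` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of the MXCSR field in the 512-byte fxsave image. */
#define FX_MXCSR_OFFSET 24

/* Reserved bits according to the quoted rule: bit 6 and bits 16 and up. */
#define MXCSR_RESERVED_QUOTED ((1u << 6) | 0xFFFF0000u)

/* Return the reserved bits that are set in the image's MXCSR field.
   A result of 0 means the image passes the quoted rule. */
uint32_t mxcsr_reserved_bits_quoted(const unsigned char *image)
{
    assert(image != NULL);

    /* MXCSR is stored little-endian; assemble it byte by byte. */
    const unsigned char *p = image + FX_MXCSR_OFFSET;
    uint32_t mxcsr = (uint32_t)p[0]
                   | ((uint32_t)p[1] << 8)
                   | ((uint32_t)p[2] << 16)
                   | ((uint32_t)p[3] << 24);

    return mxcsr & MXCSR_RESERVED_QUOTED;
}
```

Calling this on the area about to be restored and printing a nonzero result would show whether the fault could be the #GP the documentation describes.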
|
From: Philippe E. <ph...@wa...> - 2003-04-29 17:17:37
|
Nicholas Nethercote wrote:
> On Mon, 28 Apr 2003, Julian Seward wrote:
>
>
>>Help! I'm stuck. Is there some other magic constraints on
>>fxrstor I need to know about? I read the fine print in the P4
>>documentation carefully, but I cannot see anything other than the
>>16-byte-alignment constraint.
>
>
> I know nothing about fxrstor, but the docs do also say:
>
> Bit 6 and bits 16 through 32 of the MXCSR register are defined as
> reserved and should be set to 0. Attempting to write a 1 in any of
> these bits from the saved state image will result in a general
> protection exception (#GP) being generated.

Probably an error in the P4 documentation; see Intel documentation
Volume 1, section 11.6.6. On a PIII or below, setting this bit
segfaults, but on a P4 it would work.
Can someone try this (very ugly) piece of code on a P4?
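#include <stdint.h>

int main(void)
{
    /* 512-byte fxsave image plus slack for 16-byte alignment */
    char save_area[512 + 15];
    /* round up to a 16-byte boundary (adds 0 if already aligned) */
    char *ptr = save_area + ((16 - ((uintptr_t)save_area & 15)) & 15);
    __asm__ __volatile__ ("fxsave (%0)" : : "r" (ptr) : "memory");
    ptr[24] |= 1 << 6;  /* set bit 6 of MXCSR, stored at offset 24 */
    __asm__ __volatile__ ("fxrstor (%0)" : : "r" (ptr) : "memory");
    return 0;
}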
regards,
Philippe Elie
|
From: Philippe E. <ph...@wa...> - 2003-04-29 16:59:16
|
Julian Seward wrote:
> Hi
>
> I wonder if anyone can cast any light on the following. I am
> completely mystified, not to mention stuck.
>
> I'm doing stuff to add SSE/SSE2 support. This means changing various
> fsave/frstor instructions, which move the FPU state back and forth
> between the simulated and real cpus, into their SSE equivalents,
> fxsave and fxrstor. The SSE state is larger than the FPU/MMX
> state so various structures have had their size increased.
> There is also a 16-byte alignment constraint on addresses in
> fxsave/fxrstor, so I've ensured that too.
>
> Now, I think I've done everything right. Nevertheless, one
> of my fxrstor's, in vg_syscalls.S, is segfaulting. I have no
> idea why. It's this
>
> fxrstor VG_(m_state_static)+64
>
> (previously of course)
>
> frstor VG_(m_state_static)+64

Are you sure the state is not corrupted between this pair of calls?

> and the address is duly 16-byte aligned, and the memory from
> that address for 512 bytes (the SSE state size) appears
> suitably accessible. A few lines earlier there is
>
> fxrstor VG_(real_sse_state_saved_over_syscall)
>
> and that works fine.
>
> Help! I'm stuck. Is there some other magic constraints on
> fxrstor I need to know about? I read the fine print in the P4
> documentation carefully, but I cannot see anything other than the
> 16-byte-alignment constraint.

There are other constraints on writing the mxcsr register; see
section 11.6.6 in Intel vol. 1. They matter only if you manipulate
the state directly (the current code doesn't do that). Note that the
fxrstor doc says that bit 6 is reserved, but the documentation is
wrong: bit 6 can be set if you take enough care. There is also an
erratum on the position of the floating-point tag word (offset 4,
not 5), but as above, if you don't touch m_state_static+64 + ...
it doesn't matter.

Since the current code doesn't touch (m_state_static+64)[], the most
probable cause is a corruption of the state. If you haven't already
fixed the problem, what about posting cvs diff -u?

regards,
Philippe Elie
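The "enough care" for bit 6 comes down to the MXCSR_MASK field that section 11.6.6 describes: fxsave stores a mask of the writable MXCSR bits at byte offset 28 of the image, and a stored value of 0 means the default mask 0x0000FFBF, which excludes bit 6. A minimal sketch of that check in C follows. It is not Valgrind code, and the names `fx_mxcsr_mask`, `fx_mxcsr_bad_bits` and `fx_mxcsr_sanitize` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FX_MXCSR_OFFSET      24           /* MXCSR field in the image      */
#define FX_MXCSR_MASK_OFFSET 28           /* MXCSR_MASK field in the image */
#define MXCSR_DEFAULT_MASK   0x0000FFBFu  /* used when the field reads 0   */

/* Read a little-endian 32-bit value from the image. */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8)
         | ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Write a little-endian 32-bit value into the image. */
static void store_le32(unsigned char *p, uint32_t v)
{
    p[0] = (unsigned char)(v & 0xFF);
    p[1] = (unsigned char)((v >> 8) & 0xFF);
    p[2] = (unsigned char)((v >> 16) & 0xFF);
    p[3] = (unsigned char)((v >> 24) & 0xFF);
}

/* Mask of MXCSR bits that may be set, taken from an image written by
   fxsave on the same CPU; a stored 0 selects the default mask. */
uint32_t fx_mxcsr_mask(const unsigned char *image)
{
    assert(image != NULL);
    uint32_t mask = load_le32(image + FX_MXCSR_MASK_OFFSET);
    return mask ? mask : MXCSR_DEFAULT_MASK;
}

/* MXCSR bits in the image that fxrstor would reject with #GP. */
uint32_t fx_mxcsr_bad_bits(const unsigned char *image)
{
    return load_le32(image + FX_MXCSR_OFFSET) & ~fx_mxcsr_mask(image);
}

/* Clear every MXCSR bit that the mask does not allow. */
void fx_mxcsr_sanitize(unsigned char *image)
{
    uint32_t mxcsr = load_le32(image + FX_MXCSR_OFFSET);
    store_le32(image + FX_MXCSR_OFFSET, mxcsr & fx_mxcsr_mask(image));
}
```

If `fx_mxcsr_bad_bits` returns 0 for the area at `VG_(m_state_static)+64` just before the failing fxrstor, the MXCSR field is not the cause and corruption elsewhere in the state becomes the more likely explanation.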