From: Peter J. <pe...@to...> - 2007-09-20 16:08:43
On Wed, 19 Sep 2007 11:42:19 -0700, H. Peter Anvin wrote:
> Peter Johnson wrote:
> > In going through the new rip relative stuff in NASM, and noticed a few
> > things:
> >
> > 1. It seems ugly to make an exception for just FS and GS to have an
> > exception to default-RIP-rel (and the current wording in the NASM manual
> > is confusing too)... wouldn't it be better to just turn off
> > default-RIP-rel for ALL segment registers? Particularly seeing as an es
> > override is ignored with a warning. I would think that if a user is using
> > *any* segment register, having default RIP-relative would be a surprise.
> >
>
> No, that really makes sense, as hideously non-orthogonal as it may be
> (there are some ugly hacks in x86-64). nasm64developer pointed out
> to me that on some CPUs, it is possible to enable limit checks for segment
> registers even in 64-bit mode, but only FS and GS have bases.

Okay, I see your point here, and nasm64developer just pointed out the limits
thing to me as well. Withdrawn. I would suggest fixing the wording in the
manual, however (it uses double-excepts, which is rather hard to follow).

> > 2. It appears handling of 64-bit movoffs values in default rel mode is
> > broken in NASM:
> >
> >     default rel
> >     mov rax, [123456789abcdef0h]
> >     mov rax, [qword 123456789abcdef0h]
> >
> > both mov's generate 32-bit RIP-relative values with no warning. This
> > feels like a bug... but see #3 for more on what's the *real* issue here.
>
> No, that's the right thing, because you don't know a priori that that
> isn't a valid address in that range. [abs 123456789abcdef0h] should
> do the movoffs form.

How is silently truncating plain [123456789abcdef0h] *ever* a good thing?
If you've defaulted to rel, NASM has no way of knowing what RIP the program
is running at, and thus has no reasonable method of finding a 32-bit relative
address that will access that large 64-bit address. Instead, it just
truncates it to garbage; silently no less!
At least give a warning, but I would argue that it's obviously a 64-bit
constant :).

> > 3. I believe that due to 64-bit movoffs forms being "special" (in allowing
> > a 64-bit displacement), the use of qword in the [] should force the
> > displacement size, not the address size (and likewise for dword, which
> > should force a 32-bit displacement size). Note displacement size can be
> > different than address size in 64-bit mode (okay, only for the movoffs
> > form, but consistency is good across the board)! Is there a current way
> > in NASM's world to force either a 32-bit or 64-bit displacement size (NOT
> > address size!) for mov rax, [...]?
>
> There isn't at the moment. See below, though.
>
> However, I'm adamant that "mov rax,[abs 123456789abcdef0h]" should use
> the 64-bit form by default; the optimizer should be allowed to reduce
> it to 32 bits if the address fits.

What about for external labels? Constants get confusing to talk about
because they're obviously 32-bit or not, so they can be optimized either way.
I feel that 32-bit should be the default for consistency reasons, as 64-bit
is a special case, and only works for rax, so it's inconsistent for 64-bit to
be the default for the rax case:

    default abs
    mov rbx, [var]          ; 32-bit displacement
    mov rcx, [var]          ; 32-bit displacement
    mov rax, [var]          ; 64-bit displacement (!?)

Maybe this isn't a problem for binary targets, but it is a lurking surprise
for targets with relocation such as ELF and Win64, as rax uses may
mysteriously break. Compare to:

    mov rbx, [var]          ; 32-bit
    mov rbx, [var]          ; 32-bit
    mov rax, [var]          ; 32-bit
    mov rax, [qword var]    ; 64-bit displacement

(note GAS follows this rule relative to mov and movabs; the latter is
required for 64-bit displacement size, while the former uses 32-bit
displacement size in 64-bit mode...
indicating you have to know to explicitly request the 64-bit size)

And we don't want to be inconsistent for constants versus labels, so that
means mov rax,[xxx] should default to 32-bit for constants as well, until
they get larger than the 32-bit displacement size will allow (so still
optimize, just in the other direction).

Also, what about consistency with mov imm? My proposal:

    mov rbx, var            ; 32-bit
    mov rax, var            ; 32-bit
    mov rbx, qword var      ; 64-bit
    mov rax, qword var      ; 64-bit

> > Note that the default in 64-bit mode should be NOT to use the movoffs
> > form, as this is 4 bytes longer than the modrm-version. Plus a number of
> > object formats much prefer 32-bit relative offsets rather than the 64-bit
> > one (win64 for example chokes heavily on 64-bit relocs in the linker
> > stage, so you'll end up with broken behavior with the current NASM
> > output).
>
> That's bogus. That's what "default rel" is for.

Okay, but see my consistency argument above when you're talking about abs
(either explicit or implicit).

> The real question is how much value it is in the 32-bit displacement
> forms as opposed to the a32 form. Remember, we're only talking absolute
> addresses here (not relative addresses nor ); the 32-bit displacement
> form can produce addresses in the range ±2 GB whereas the a32 form
> produces addresses in the range 0-4 GB. So the only consumers of the
> former would be something like an OS kernel which doesn't use
> RIP-relative addressing... a pretty rare beast.

Point taken. Since they're absolute addresses and the negative form is rare,
the A0-A3 forms should be fine to use. However, it's another inconsistency,
as it makes rax "special" (as rbx, etc. take signed values...).

> > mov rax, [dword foo] makes the displacement 4 bytes, but generates an a32
> > prefix on what is still a movoffs form instruction. Not exactly what we
> > want.
>
> No, I think that is exactly what we want.
>
> I have mentioned in the past that I'd like to use the syntax:
>
>     mov rax,[abs a64 dword bluttan]
>
> ... to produce the 32-bit displacement form. Yes, it's heavy on
> syntax, but it is such a rare corner case.
>
> This isn't implemented yet, though, nor am I really convinced it's
> the right thing.
>
> > This is why in yasm I use [dword ...] and [qword ...] to force the
> > displacement size, and a32 and a64 to force the address size, to separate
> > these two concepts.

> (pulled from above for clarity of discussion)
> In current NASM, word, dword and qword on an addressing operand applies
> to the address size whereas byte specifies the displacement size. This
> is inconsistent, yes, but it is the established behaviour and would
> definitely have to be maintained for non-64-bit code. Introducing
> new behaviour for 64-bit mode would have to be balanced against the other
> tradeoffs.
(end pull from above)

> That's fine in many ways, but it is a much bigger departure from
> historical NASM syntax. Even though we could do this in 64-bit mode
> (ONLY!), I'm a bit leery of doing so for the benefit of one single
> instruction.

Actually, it depends on how you look at it. In 32-bit and 16-bit mode, the
address size and displacement size always matched, thus to get a 16-bit
displacement size you had to have a 16-bit address size. My interpretation
has always been that [word X] or [dword X] set the displacement size, and
the fact the address size was set to 16-bit and 32-bit respectively was an
artifact of achieving that displacement size. This is confirmed by the fact
[byte X] is accepted, sets the displacement size, and doesn't set the
address size.

This distinction between address size and displacement size becomes
important in 64-bit mode, where the address size is 64-bit but most
displacements are 32-bit. I think this interpretation is clearer as it
matches how immediates are handled size-wise.
How often are people *really* wanting to override the address size to 32-bit
in 64-bit mode?

Peter
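
For reference, the range difference discussed in this thread (±2 GB for a
32-bit displacement under a 64-bit address size versus 0-4 GB for the a32
form) can be sketched roughly as follows. This is only an illustrative
Python model of the address arithmetic, not anything taken from NASM or
yasm; all function names are made up for the example.

```python
MASK64 = (1 << 64) - 1
MASK32 = (1 << 32) - 1


def disp32_address(value):
    """Address reached when only the low 32 bits are kept as a displacement
    and the address size is 64-bit: the displacement is sign-extended."""
    disp = value & MASK32
    if disp & 0x80000000:
        disp -= 1 << 32          # sign-extend bit 31
    return disp & MASK64


def a32_address(value):
    """Address reached with a 32-bit address size: the 32-bit result is
    zero-extended to 64 bits."""
    return value & MASK32


def fits_disp32(addr):
    """True if the absolute address survives a sign-extended 32-bit displacement."""
    return disp32_address(addr) == (addr & MASK64)


def fits_a32(addr):
    """True if the absolute address survives a 32-bit address size."""
    return a32_address(addr) == (addr & MASK64)


if __name__ == "__main__":
    for addr in (0x7FFFFFFF, 0x80000000, 0xFFFFFFFF80000000, 0x123456789ABCDEF0):
        print(hex(addr), "disp32 ok:", fits_disp32(addr), "a32 ok:", fits_a32(addr))
```

Under this model the constant from item 2, 123456789abcdef0h, fits neither
form, which is why truncating it without a diagnostic loses the address.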