From: Frank K. <fbk...@co...> - 2006-04-06 02:05:07
cerise-nasm@l.armory.com wrote:

> I'm using nasm 0.98.39 and I'm trying to assemble plain binary
> which will spit out a string when plopped onto a floppy disk and made to
> boot.
>
> I've done this a couple of different ways, but what I'd really like to do
> is have the string pointed to by ds:si.
>
> The string is defined as:
> message db "Hello!", 0x0D, 0x0A, 0x00
>
> I've tried:
> lds si, message (invalid combination of opcode and operands)

Right. "lds" needs a "[memory]" operand. "message" is just an "immediate"
number.

> lds si, [message] (which gives ds:si f000:ff56, not 0000:7c03)

The problem here is that, even though this loads ds, it implicitly loads it
from [ds:message] - and ds doesn't yet have a meaningful value. (It'll be
the same every time on *your* machine, of course, but will likely differ on
another machine.) "[message]" would want to be "[ptr]" here, anyway...

> The latter struck me as odd because I took [message] to be accessing the
> memory address contained in message, not the address of message -- which
> isn't what I want.

Well... "lea reg, [message]" would load the address into "reg" (as would the
shorter "mov reg, message"). "lds", "les", "lss" etc. look similar, but they
load a segreg *and* "reg" with the contents of a "far pointer".

> I also attempted defining a pointer (ptr dw message, 0) and using
> lds si, [ptr] which inexplicably gives me f000:ff53.

You're on the right track here. *Now* the problem is that ds is involved (I
spoke too soon, above) - "lds si, [ds:ptr]" is what is implied here (you
wouldn't want to write it that way, or Nasm would emit a redundant segment
prefix, but that's what it "means"). *All* addresses on x86 involve a segment
register. (e)ip uses cs, of course, (e)sp and (e)bp default to ss, the
"destination" on the "string instructions" is es:(e)di. Otherwise ds:??? is
the default.
This is true even in 32-bit pmode - segregs are used differently, and most
OSen allow us to ignore 'em entirely (which is a *great* blessing!), but
they're still involved in *every* address.

> I also tried setting it through eax (mov eax, message ; mov si, ax ;
> shr eax, 16 ; mov ds, ax ) with no luck. I tried setting ds to seg message
> and si to message, but I can't use seg in plain binary.

Not sure why you'd "shr eax, 16" - "shr eax, 4" would've been closer... but
that won't really do it... might if you're lucky... and "seg" is for linkable
object formats... "-f bin" doesn't have "segments" in the same sense. We can
say "segment .text" or "section .text" - exactly the same thing to Nasm, but
I prefer the latter. Nasm re-arranges our "sections" (.text first, .data
next, .bss last), but they aren't put in separate "segments"...

> Using the tremendous power of mathematics, I can do
> mov eax, 0x00007c03
> mov si, ax

Okay... so far you've got the same thing as "mov si, 7C03h".

> shr eax, 16

Now we make ax 0.

> mov ds, ax

And put it in ds. Sure enough, 0000:7C03 is where the message lives!
"message" just evaluates to 3, since Nasm defaults to "org 0" if you don't
specify. That would be okay, too - but we would want to load ds with 7C0h in
that case. 07C0:0003 is the *same* address!

> call putstr
>
> and I get my string AOK, but there is seemingly no symbolic form I can use.
> What is the correct form to make this work? I'd prefer to let the computer
> do the adding. 8)

Yes.

> -Phil/CERisE
>
> P.S. If you really want to see the rather amateurish code,

Yes, I do! I want to see what you've got for an "org", if any, and what you
do with segregs on startup... at least...

> here it is. I used
> to have message right after the hlt command, but when plain arithmetic became
> a necessity, I used the more conventional format. I've also tried all my
> labels with :s -- odd that it's optional, but kinda cool.
> I imagine it'll
> bite me hard some day when I mistype an identifier in a ton of code though. 8)

Yeah. Arguably, "-w+orphan-labels" should have been the default. Instead,
it's "off" by default. One guy thought he'd discovered a bug in Nasm - he
misspelled some floating-point instruction, one with no parameters, and Nasm
quietly accepted it (as a label). He observed that *anything* starting with
'f' was okay (in fact, *anything* is, unless it's an instruction that expects
parameters), and thought Nasm was assembling "fgarbage" as a floating-point
instruction! As you say, it's kinda cool to have colons optional, but for
"human clarity", I think it's best to use 'em... and maybe turn that warning
on.

Okay... no "org", so we've got "org 0". The "other way" (more common,
perhaps) is "org 7C00h"... we just need to do something different with ds.
The bios does *not* set it up for us!

> jmp start

Okay... you may have done yourself a favor here. The "canonical" bootsector
starts with either a "jmp" (near), or "jmp short ..."/"nop", and the "oem
string" starts after that. I've never seen it, but I've heard of a bios that
refuses to boot without it. What you jump over is up to you - there are
advantages to making your disk FAT12 compatible...

> message db "Hello!", 0x0D, 0x0A, 0x00
> ptr dw message, 0
>
> start

    mov ax, 7C0h
    mov ds, ax
    ...

Maybe want to load other segregs here too. In particular, you'll want to know
where your stack is! Presumably, the bios has got ss:sp pointed someplace
"sane" - interrupts, the timer and stuff, use the stack, so we *can't* have
it pointed where it'll scribble on important stuff, even if we don't use it.
Best get it under our control, ASAP. (But for this example... ignore the
issue :)

If you'd used "org 7C00h", we'd want 0 in ds here, so "xor ax, ax" instead...

> lds si, [ptr]

This would work now, but we don't need it (we'd want the segment part of the
far pointer to be 7C0h, not 0).
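
For the record, the far-pointer route might look something like the sketch
below. It assumes the default "org 0" and that the bios loaded the sector at
0000:7C00. The label "msgptr" is a made-up name standing in for "ptr" (some
later Nasm releases warn about "ptr" because it is a keyword in other
assemblers), and the "stop" loop is only there so the fragment doesn't run
off the end.

```nasm
; Sketch only: assumes the default "org 0" and a load address of 0000:7C00.
        bits 16

        jmp start

message db "Hello!", 0x0D, 0x0A, 0x00
msgptr  dw message, 0x07C0      ; far pointer: offset word first, then segment word
                                ; ("msgptr" is a hypothetical name, see above)

start:
        mov ax, 0x07C0
        mov ds, ax              ; ds must be valid *before* lds reads [ds:msgptr]
        lds si, [msgptr]        ; si = offset of message, ds = 7C0h (reloaded)

stop:
        hlt
        jmp stop                ; nothing is printed; this only shows the load
```

Since ds has to be right before "lds" can even find the pointer, the
instruction buys nothing here, which is why plain "mov si, message" is the
simpler choice.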
> ;mov eax, 0x00007c03
> ;mov si, ax
> ;shr eax, 16
> ;mov ds, ax

    mov si, message

> call putstr
> hlt

This only stays "hlt"ed until an interrupt occurs. You might want to do:

stop:
    hlt
    jmp stop

Another possibility... "int 19h" reloads the bootsector... you might want to
"wait for a key", and then "int 19h" to "see it again" for debugging purposes
(doesn't help that much...).

> putstr
> mov ah, 0x0E
> xor ebx, ebx

*Some* video biosen (I'm told) use bl as the "attribute" (color) to use. Most
use the default white-on-black (7). More commonly, bh is used as the "video
page" to write to - other biosen write to whatever page is "current" (0, ...
"always", AFAIK). My bios writes white-on-black to the current page
regardless of what's in bx. For "maximum safety", I'd go with "mov bx, 7". No
need to use ebx.

> putloop
> lodsb
> or al, al
> jz putend
>
> int 0x10
> jmp putloop
>
> putend
> ret

... and I think that should make it work...

Happy bootin',
Frank
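
Putting the pieces together, here is a rough sketch of the whole sector in
the "org 7C00h" flavour, where ds gets 0 instead of 7C0h. Both forms name the
same byte, since the address is segment * 16 + offset. The stack setup, the
"cld", the local labels, the padding and the 0AA55h signature are additions
not in the original code, and the file name in the build comment is just an
example. Treat it as one way to do it under those assumptions, not the only
one.

```nasm
; Sketch of the whole bootsector, "org 7C00h" flavour.
; Assumes the bios loaded us at 0000:7C00.
; Example build (file names are made up): nasm -f bin boot.asm -o boot.bin
        bits 16
        org 0x7C00

        jmp short start
        nop                     ; the canonical "jmp short"/"nop" opening

message db "Hello!", 0x0D, 0x0A, 0x00

start:
        cli                     ; no interrupts while ss:sp is half-changed
        xor ax, ax
        mov ds, ax              ; ds = 0, because labels already include 7C00h
        mov ss, ax
        mov sp, 0x7C00          ; assumption: stack grows down from below our code
        sti
        cld                     ; make lodsb step si upward

        mov si, message         ; ds:si now points at the string
        call putstr

stop:
        hlt                     ; wakes up on every interrupt...
        jmp stop                ; ...so go straight back to sleep

putstr:
        mov ah, 0x0E            ; bios teletype output
        mov bx, 0x0007          ; bh = page 0, bl = attribute 7
.next:
        lodsb                   ; al = [ds:si], then si = si + 1
        or al, al               ; zero terminator?
        jz .done
        int 0x10
        jmp .next
.done:
        ret

        times 510-($-$$) db 0   ; pad the sector out to 510 bytes
        dw 0xAA55               ; boot signature in the last two bytes
```

To get the "org 0" flavour instead, drop the "org" line and load ds with
7C0h; "mov si, message" stays exactly the same either way.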