Re: [Nasm-users] Complete newbie question

cerise-nasm@l.armory.com wrote:
> I'm using nasm 0.98.39 and I'm trying to assemble plain binary
> which will spit out a string when plopped onto a floppy disk and made to
> boot.
> 
> I've done this a couple of different ways, but what I'd really like to do
> is have the string pointed to by ds:si.
> 
> The string is defined as:
> message db "Hello!", 0x0D, 0x0A, 0x00
> 
> I've tried:
> lds si, message (invalid combination of opcode and operands)

Right. "lds" needs a "[memory]" operand. "message" is just an 
"immediate" number.

> lds si, [message] (which gives ds:si f000:ff56, not 0000:7c03)

The problem here is that, even though this loads ds, it implicitly loads 
it from [ds:message] - and ds doesn't yet have a meaningful value. 
(it'll be the same every time on *your* machine, of course, but will 
likely differ on another machine). Even with ds set up, "message" holds 
the characters of the string, not a far pointer, so "[message]" would 
want to be "[ptr]" here, anyway...

> The latter struck me as odd because I took [message] to be accessing the 
> memory address contained in message, not the address of message -- which
> isn't what I want.

Well... "lea reg, [message]" would load the address into "reg" (as would 
the shorter "mov reg, message"). "lds", "les", "lss" etc. look similar, 
but they load a segreg *and* "reg" with the contents of a "far pointer".
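
To put the three side by side, here's a minimal sketch. It assumes 16-bit code and that ds already addresses this data; "fptr" is just a name I made up for the far pointer.

```nasm
bits 16

    mov si, message     ; si = address of message (an immediate), 3 bytes
    lea si, [message]   ; same result, one byte longer
    lds si, [fptr]      ; si = word at [ds:fptr], ds = word at [ds:fptr+2]

; data - in a real program, keep this out of the execution path
message db "Hello!", 0x0D, 0x0A, 0x00
fptr    dw message, 0   ; far pointer: offset word first, then segment word
```

Note that "lds" reads through the *old* ds to find the pointer, and only then replaces ds.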

> I also attempted defining a pointer (ptr dw message, 0) and using
> lds si, [ptr] which inexplicably gives me f000:ff53.

You're on the right track here. *Now* the problem is that ds is involved 
(I spoke too soon, above) - "lds si, [ds:ptr]" is what is implied here 
(you wouldn't want to write it that way, or Nasm would emit a redundant 
segment prefix, but that's what it "means").

*All* addresses on x86 involve a segment register. (e)ip uses cs, of 
course, (e)sp and (e)bp default to ss, the "destination" on the "string 
instructions" is es:(e)di. Otherwise ds:??? is the default. This is true 
even in 32-bit pmode - segregs are used differently, and most OSen allow 
us to ignore 'em entirely (which is a *great* blessing!), but they're 
still involved in *every* address.
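
A few 16-bit examples of those defaults, as a rough sketch (nothing here is specific to your code):

```nasm
bits 16

    mov al, [bx]        ; reads ds:bx  (ds is the default)
    mov al, [bp]        ; reads ss:bp  (bp defaults to ss)
    mov al, [es:bx]     ; explicit override - costs a one-byte prefix
    lodsb               ; reads ds:si into al
    stosb               ; writes al to es:di (this one can't be overridden)
```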

> I also tried setting it through eax (mov eax, message ; mov si, ax ; 
> shr eax, 16 ; mov ds, ax ) with no luck.  I tried setting ds to seg message
> and si to message, but I can't use seg in plain binary.

Not sure why you'd "shr eax, 16" - "shr eax, 4" would've been closer... 
but that won't really do it... might if you're lucky... and "seg" is for 
linkable object formats... "-f bin" doesn't have "segments" in the same 
sense. We can say "segment .text" or "section .text" - exactly the same 
thing to Nasm, but I prefer the latter. Nasm re-arranges our "sections" 
(.text first, .data next, .bss last), but they aren't put in separate 
"segments"...

> Using the tremendous power of mathematics, I can do 
> mov eax, 0x00007c03
> mov si, ax

Okay... so far you've got the same thing as "mov si, 7C03h".

> shr eax, 16

Now we make ax 0.

> mov ds, ax

And put it in ds. Sure enough, 0000:7C03 is where the message lives! 
"message" just evaluates to 3, since Nasm defaults to "org 0" if you 
don't specify. That would be okay, too - but we would want to load ds 
with 7C0h in that case. 07C0:0003 is the *same* address!
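
The arithmetic, in case it helps: in real mode the physical address is segment * 16 + offset. A small sketch that lets Nasm's preprocessor do the adding (the names LINEAR_A and LINEAR_B are mine, purely for illustration):

```nasm
; 0000:7C03 -> 0x0000 * 16 + 0x7C03
; 07C0:0003 -> 0x07C0 * 16 + 0x0003
%assign LINEAR_A (0x0000 << 4) + 0x7C03
%assign LINEAR_B (0x07C0 << 4) + 0x0003

%if LINEAR_A != LINEAR_B
  %error "these two segment:offset pairs should name the same byte"
%endif
```

Both come out to 7C03h, so the "%error" never fires.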

> call putstr
> 
> and I get my string AOK, but there is seemingly no symbolic form I can use.
> What is the correct form to make this work?  I'd prefer to let the computer
> do the adding.  8)

Yes.

> -Phil/CERisE
> 
> P.S.  If you really want to see the rather amateurish code,

Yes, I do! I want to see what you've got for an "org", if any, and what 
you do with segregs on startup... at least...

> here it is.  I used
> to have message right after the hlt command, but when plain arithmetic became
> a necessity, I used the more conventional format.  I've also tried all my
> labels with :s -- odd that it's optional, but kinda cool.  I imagine it'll
> bite me hard some day when I mistype an identifier in a ton of code though.  8)

Yeah. Arguably, "-w+orphan-labels" should have been the default. 
Instead, it's "off" by default. One guy thought he'd discovered a bug in 
Nasm - he mis-spelled some floating-point instruction, one with no 
parameters, and Nasm quietly accepted it (as a label). He observed that 
*anything* starting with 'f' was okay (in fact, *anything* is, unless 
it's an instruction that expects parameters), and thought Nasm was 
assembling "fgarbage" as a floating-point instruction! As you say, it's 
kinda cool to have colons optional, but for "human clarity", I think 
it's best to use 'em... and maybe turn that warning on.

Okay... no "org", so we've got "org 0". The "other way" (more common, 
perhaps) is "org 7C00h"... we just need to do something different with 
ds. The bios does *not* set it up for us!

> jmp start

Okay... you may have done yourself a favor here. The "canonical" 
bootsector starts with either a "jmp" (near), or "jmp short ..."/"nop", 
and the "oem string" starts after that. I've never seen it, but I've 
heard of a bios that refuses to boot without it. What you jump over is 
up to you - there are advantages to making your disk FAT12 compatible...

> message db "Hello!", 0x0D, 0x0A, 0x00
> ptr dw message, 0
> 
> start 

mov ax, 7C0h
mov ds, ax
...
Maybe want to load other segregs here too. In particular, you'll want to 
know where your stack is! Presumably, the bios has got ss:sp pointed 
someplace "sane" - interrupts, the timer and stuff, use the stack, so we 
*can't* have it pointed where it'll scribble on important stuff, even if 
we don't use it. Best get it under our control, ASAP. (but for this 
example... ignore the issue :)
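
For when you do get around to it, one common arrangement (an assumption on my part, not the only sane choice) is to put the stack just below the bootsector:

```nasm
bits 16

    cli                 ; belt and braces: no interrupts while ss:sp is half-updated
    xor ax, ax
    mov ss, ax
    mov sp, 0x7C00      ; stack grows down from 0000:7C00, just below our code
    sti
```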

If you'd used "org 7C00h", we'd want 0 in ds here, so "xor ax, ax" 
instead...
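
Roughly, the "org 7C00h" flavor of the startup would look like this (a sketch only; the data is tacked on at the end just so it assembles on its own):

```nasm
bits 16
org 0x7C00              ; labels now assemble as 7C00h + position in the file

start:
    xor ax, ax
    mov ds, ax          ; ds = 0, so ds:si is 0000:message
    mov si, message

message db "Hello!", 0x0D, 0x0A, 0x00
```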

> lds si, [ptr]

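With ds set, this now reads the far pointer from the right place, but we 
don't need it. (As written, "ptr dw message, 0" would put 0 back into 
ds; with "org 0" we'd want the segment part of the far pointer to be 
7C0h, not 0.)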
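
If you did want the "lds" form, a sketch for the "org 0" case might look like the following. I've called the pointer "fptr" here (my name, not yours).

```nasm
bits 16
org 0

    jmp start

message db "Hello!", 0x0D, 0x0A, 0x00
fptr    dw message, 0x07C0  ; offset first, then the segment we really want

start:
    mov ax, 0x07C0
    mov ds, ax              ; needed so [fptr] is read from the right place
    lds si, [fptr]          ; ds:si = 07C0:message
```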

> ;mov eax, 0x00007c03
> ;mov si, ax
> ;shr eax, 16
> ;mov ds, ax

mov si, message

> call putstr
> hlt

This only stays "hlt"ed until an interrupt occurs. You might want to do:

stop:
hlt
jmp stop

Another possibility... "int 19h" reloads the bootsector... you might 
want to "wait for a key", and then "int 19h" to "see it again" for 
debugging purposes (doesn't help that much...).
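
Something like this, as a sketch, using the bios keyboard service:

```nasm
bits 16

    xor ah, ah
    int 0x16            ; int 16h, ah = 0: wait for a keystroke
    int 0x19            ; int 19h: have the bios load the bootsector again
```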

> putstr
> mov ah, 0x0E
> xor ebx, ebx

*Some* video biosen (I'm told) use bl as the "attribute" (color) to use. 
Most use the default white-on-black (7). More commonly, bh is used as 
the "video page" to write to - other biosen write to whatever page is 
"current" (0, ... "always", AFAIK). My bios writes white-on-black to the 
current page regardless of what's in bx. For "maximum safety", I'd go 
with "mov bx, 7". No need to use ebx.

> putloop
> lodsb
> or al, al
> jz putend
> 
> int 0x10
> jmp putloop
> 
> putend
> ret

... and I think that should make it work...
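
Putting the pieces together, the whole thing might look like the sketch below. It sticks with "org 0" and still ignores the stack. The "cld" and the last two lines are my additions, not something from your code: "lodsb" depends on the direction flag, and most biosen want the 0AA55h signature in the last two bytes of the sector before they'll boot it.

```nasm
; boot.asm - assemble with: nasm -f bin boot.asm -o boot.bin
bits 16
org 0

    jmp start

message db "Hello!", 0x0D, 0x0A, 0x00

start:
    mov ax, 0x07C0
    mov ds, ax          ; ds:0 is where the bios loaded us
    cld                 ; make lodsb count upward
    mov si, message
    call putstr

stop:
    hlt                 ; wakes on every interrupt, so loop
    jmp stop

putstr:
    mov ah, 0x0E        ; bios teletype output
    mov bx, 7           ; bh = page 0, bl = attribute 7
putloop:
    lodsb               ; al = [ds:si], si = si + 1
    or al, al
    jz putend           ; a zero byte ends the string
    int 0x10
    jmp putloop
putend:
    ret

times 510 - ($ - $$) db 0   ; pad out to 510 bytes
dw 0xAA55                   ; boot signature
```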

Happy bootin',
Frank