Understanding x86 bootloader assembly

I am working my way through this tutorial: Writing a Bootloader Part 1.

I don’t understand assembly at all, so here are some notes to make sense of it all.

First here are two references that I hope will be useful:

This is the code of the example:

bits 16 ; tell NASM this is 16 bit code
org 0x7c00 ; tell NASM to start outputting stuff at offset 0x7c00
boot:
    mov si,hello ; point si register to hello label memory location
    mov ah,0x0e ; 0x0e means 'Write Character in TTY mode'
.loop:
    lodsb ; load byte at ds:si into al and advance si
    or al,al ; is al == 0 ?
    jz halt  ; if (al == 0) jump to halt label
    int 0x10 ; runs BIOS interrupt 0x10 - Video Services
    jmp .loop
halt:
    cli ; clear interrupt flag
    hlt ; halt execution
hello: db "Hello world!",0

times 510 - ($-$$) db 0 ; pad remaining 510 bytes with zeroes
dw 0xaa55 ; magic bootloader magic - marks this 512 byte sector bootable!

So the first directive, bits 16, makes sense.

But org 0x7c00 already gets me struggling. It means that this is where the “code” starts.

However, if I change the number, it only changes a number in the output; it does not “move” the code, as I would have expected.

So I looked up the description of the org directive in the NASM docs:

The bin format provides an additional directive to the list given in chapter 7: ORG. The function of the ORG directive is to specify the origin address which NASM will assume the program begins at when it is loaded into memory.

Unlike the ORG directive provided by MASM-compatible assemblers, which allows you to jump around in the object file and overwrite code you have already generated, NASM’s ORG does exactly what the directive says: origin. Its sole function is to specify one offset which is added to all internal address references within the section

To be honest I did not understand any of the explanation, so I just assume it is magic and must be in there for this thing to work.
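One way to picture what the quoted docs describe is a tiny Python model (not NASM itself). It assumes the hello string sits 0x10 bytes into the file, which matches the disassembly further down; the function name label_address is made up for this sketch.

```python
# Toy model of NASM's flat-binary "org": a label's address is its
# offset inside the file plus the origin. Nothing in the file moves.
ORG = 0x7C00  # value given to the org directive in the example

def label_address(file_offset, org=ORG):
    """Address the assembler writes wherever the label is referenced."""
    return (org + file_offset) & 0xFFFF  # 16-bit wrap-around

# Assumption: hello is 0x10 bytes into the file (this example's layout).
HELLO_OFFSET = 0x10

print(hex(label_address(HELLO_OFFSET)))         # 0x7c10
print(hex(label_address(HELLO_OFFSET, org=0)))  # 0x10
```

Changing org only changes the numbers baked into instructions such as mov si,hello. The bytes stay where they are in the file, which is why the code does not appear to “move”. The BIOS loads the boot sector at address 0x7c00, so that is the origin that makes the baked-in addresses correct.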

Next is boot:. As I understand it, this is a label: it is not written to the output, but it can be referenced from other places in the code. boot: is not referenced anywhere, but the “local label” .loop: below it is used by jmp .loop. A local label (one starting with a dot) belongs to the nearest non-local label above it, here boot:.

mov si,hello means that the address of the “Hello World” string is put into register si, the 16-bit source index register.

mov ah,0x0e loads a value into ah, the high 8 bits of the 16-bit accumulator register ax.
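To see how ah and al relate to ax, here is a small Python sketch. The helper names split_ax and set_ah are invented for illustration; the only real fact used is that ah is the upper byte and al the lower byte of ax.

```python
def split_ax(ax):
    """Return (ah, al): the high and low bytes of the 16-bit ax register."""
    return (ax >> 8) & 0xFF, ax & 0xFF

def set_ah(ax, value):
    """Model of `mov ah, value`: replace the high byte, keep the low byte."""
    return ((value & 0xFF) << 8) | (ax & 0x00FF)

ax = 0x0021            # al = 0x21 ('!'), ah = 0
ax = set_ah(ax, 0x0E)  # mov ah,0x0e
print(hex(ax))         # 0xe21
print(split_ax(ax))    # (14, 33)
```

Writing to ah leaves al untouched, which is why the loop can set ah once and then keep loading new characters into al.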

What this does is explained on the linked Wikipedia page, BIOS interrupt call:

mov ah, 0x0e    ; function number = 0Eh : Display Character
mov al, '!'     ; AL = code of character to display
int 0x10        ; call INT 10h, BIOS video service

ah is the high byte of the accumulator and holds the function number; al is the low byte and holds the character to print. So this works like:

  1. set method to invoke
  2. pass parameter
  3. perform that function call
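The three steps can be mimicked in Python with a toy stand-in for the BIOS. This is only a sketch: the int_0x10 function, the regs dictionary and the screen list are invented here, and only function 0x0e is modelled.

```python
def int_0x10(regs, screen):
    """Toy stand-in for BIOS video services; only function 0x0e is modelled."""
    if regs["ah"] == 0x0E:              # teletype output
        screen.append(chr(regs["al"]))  # "print" the character held in al
    else:
        raise NotImplementedError(hex(regs["ah"]))

regs = {"ah": 0, "al": 0}
screen = []
regs["ah"] = 0x0E        # 1. set method to invoke
regs["al"] = ord("!")    # 2. pass parameter
int_0x10(regs, screen)   # 3. perform that function call
print("".join(screen))   # !
```

The registers act as the argument list, and the interrupt number picks which group of services is called.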

What lodsb does is easy to google:

The LODSB loads the byte addressed by DS:[SI] into register AL. SI is then incremented (if DF=0) or decremented (if DF=1) by 1.
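A minimal Python model of that description, assuming memory is a flat bytes object and ignoring the ds segment part of the address (the function name and signature are made up for this sketch):

```python
def lodsb(memory, si, df=0):
    """Return (al, new_si): load memory[si], then step si by one."""
    al = memory[si]
    si = (si + (-1 if df else 1)) & 0xFFFF  # DF=0 counts up, DF=1 counts down
    return al, si

memory = b"Hi\x00"
al, si = lodsb(memory, 0)
print(chr(al), si)  # H 1
```

So each lodsb both fetches the next character and moves the pointer, which is why the loop needs no separate increment.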

The last byte in a string is the zero byte, so the next part of the code

or al,al ; is al == 0 ?
jz halt  ; if (al == 0) jump to halt label

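is easy to understand: or al,al leaves al unchanged but sets the zero flag when al is 0, and jz jumps to halt when that flag is set.

Then the int 0x10 instruction performs the operation, and jmp .loop goes back to fetch the next character.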
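Putting the pieces together, the loop behaves roughly like the Python sketch below. It is a model under assumptions, not the real thing: memory is a plain bytes object, the direction flag is taken to be 0, and run_boot_loop is a name invented here.

```python
def run_boot_loop(memory, si):
    """Walk the zero-terminated string at memory[si] the way the loop does."""
    out = []
    while True:              # jmp .loop
        al = memory[si]      # lodsb: load the byte at si ...
        si += 1              # ... and advance si (DF assumed 0)
        if al == 0:          # or al,al / jz halt
            break
        out.append(chr(al))  # int 0x10 with ah=0x0e prints al
    return "".join(out), si

memory = b"Hello world!\x00"
text, si = run_boot_loop(memory, 0)
print(text)  # Hello world!
print(si)    # 13
```

The final si is one past the zero byte, because lodsb advances si before the zero check happens.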

And the last part pads the binary file with zeros up to byte 510 and adds the two-byte magic number 0xaa55, for 512 bytes in total.
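The padding arithmetic is easier to see in Python. In this sketch make_boot_sector is a made-up helper, and the two-byte code argument is just a stand-in for the real assembled code.

```python
def make_boot_sector(code):
    """Pad code to 510 bytes with zeros, then append the 0xaa55 signature."""
    if len(code) > 510:
        raise ValueError("code does not fit in a boot sector")
    padding = bytes(510 - len(code))            # times 510 - ($-$$) db 0
    signature = (0xAA55).to_bytes(2, "little")  # dw 0xaa55 -> bytes 55 aa
    return code + padding + signature

sector = make_boot_sector(b"\xfa\xf4")  # cli; hlt as stand-in code
print(len(sector))        # 512
print(sector[-2:].hex())  # 55aa
```

In the NASM line, $ is the current position and $$ is the start of the section, so $-$$ is the number of bytes emitted so far, which is what len(code) stands for here. Because x86 is little-endian, the word 0xaa55 ends up in the file as the bytes 55 aa.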

I have also run the resulting binary through an x86 disassembler:

0:  be 10 7c b4 0e          mov    esi,0xeb47c10
5:  ac                      lods   al,BYTE PTR ds:[esi]
6:  08 c0                   or     al,al
8:  74 04                   je     0xe
a:  cd 10                   int    0x10
c:  eb f7                   jmp    0x5
e:  fa                      cli
f:  f4                      hlt
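The first line of the listing looks odd because the disassembler seems to have decoded the bytes as 32-bit code, so it swallowed the next instruction into a 4-byte immediate. Read as 16-bit code, the same five bytes are mov si,0x7c10 followed by mov ah,0x0e. The Python sketch below only redoes the byte arithmetic; it is not a disassembler.

```python
raw = bytes.fromhex("be107cb40e")

# 32-bit reading (what the listing shows): opcode be + 4-byte immediate
imm32 = int.from_bytes(raw[1:5], "little")
print(hex(imm32))  # 0xeb47c10

# 16-bit reading: opcode be + 2-byte immediate, then b4 + 1-byte immediate
imm16 = int.from_bytes(raw[1:3], "little")
print(hex(imm16))                   # 0x7c10 -> mov si,0x7c10
print(raw[3] == 0xB4, hex(raw[4]))  # True 0xe -> mov ah,0x0e

# Offset of the string inside the file once the origin is subtracted
print(hex(imm16 - 0x7C00))          # 0x10
```

The remaining lines decode the same way in both modes, which is why only the first one looks wrong.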

And it appears that the 0x7c00 from the org directive is added to the address of the “Hello World” string. So I guess it is the offset that points to the beginning of the 512 bytes that this bootloader operates in. Which would only make sens