In this post, I’ll write a short bootloader. My introduction to this topic came from Operating Systems: From 0 to 1 [1], which provides a sample bootloader that does not work (on my Ubuntu 20.04 Intel system, with a current version of NASM). I’ll take the bootloader from chapter 7 and add a small fix. I’m new to this, so feel free to email me with corrections or suggestions.
This bootloader is pretty incomplete - it’s as far as the book got before the author stopped writing. It’s missing a lot of things you might want to do in a bootloader, such as setting up the stack and adding a BPB. It’s also not UEFI-compatible.
The purpose of a bootloader is to load the operating system into memory and execute it. The bootloader is the first executable code that is loaded into RAM from the storage media that is available to your computer, such as a hard disk or floppy. This post will describe the floppy approach: this is simpler as we do not need to consider disk partitions, but has the drawback that less storage is available [6].
When an Intel processor starts, it is initially in 16-bit real mode, which emulates the operation of the 8086 architecture [2]. This has a couple of implications for a bootloader program:
- Memory addressing is frequently done by combining two registers: a segment and an offset. This is a quirk of the 8086. You may see notation such as 0xFFFF:0x0000 or es:bx, for instance in documentation of BIOS interrupt service routines [3]. To obtain the actual address, the segment value is shifted left by four bits and added to the offset. For example, 0xFFFF:0x0000 becomes 0xFFFF0. For a more detailed discussion, you can refer to [4, p.43].
- One of the responsibilities of the bootloader (or early kernel code) is to switch the processor from 16-bit real mode to protected mode [4]. I will not discuss this process here, but the implication is that in real mode all facilities of the processor are open to us: nothing stops a malicious bootloader program from using the OUT instruction to write undesirable values to the output ports connected to your peripheral devices. You can refer to the Intel manual for more information on protected registers such as the IDTR and GDTR and protected instructions like LIDT, LGDT, OUT, and IN [5].
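The segment and offset arithmetic described above is easy to check in a shell. This is only a sketch: linear_address is a helper name I made up for illustration, and it ignores any address wrap-around behaviour of real hardware.

```shell
#!/bin/sh
# linear_address SEGMENT OFFSET
# Prints the real-mode address for a segment:offset pair in hex:
# the segment is shifted left by four bits and added to the offset.
# (linear_address is a hypothetical helper, not part of any tool.)
linear_address() {
    printf '0x%X\n' $(( ($1 << 4) + $2 ))
}

linear_address 0xFFFF 0x0000   # the example from the text
linear_address 0x50 0x0        # segment 0x50 with a zero offset
```

The first call prints 0xFFFF0 and the second prints 0x500, which is the address used for the placeholder kernel later in this post.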
The bootloader program below will execute a BIOS interrupt service routine to load a placeholder kernel program from elsewhere on the floppy disk. It will use interrupt 0x13 with ah = 0x02 (ah is the upper 8 bits of the ax register). It’s also worth noting that hexadecimal numbers in documentation such as the Intel manual or the interrupt list are written either with a 0x prefix or an h suffix, so ah = 0x02 and ah = 02h both refer to the same 8-bit binary number 0000 0010.
The documentation for the interrupt can be found here. Here are the relevant parts:
AH = 02h
AL = number of sectors to read (must be nonzero)
CH = low eight bits of cylinder number
CL = sector number 1-63 (bits 0-5); high two bits of cylinder (bits 6-7, hard disk only)
DH = head number
DL = drive number (bit 7 set for hard disk)
ES:BX -> data buffer
Return:
CF set on error
if AH = 11h (corrected ECC error), AL = burst length
CF clear if successful
AH = status (see #00234)
AL = number of sectors transferred (only valid if CF set for some BIOSes)
It’s worth looking at a diagram of a floppy disk to understand what the parameters mean (source):
A sector, usually 512 bytes, is the smallest quantity that can be read into memory at one time. A track is a ring of (usually) 18 sectors on the disk. Floppy disks have two sides, each with its own read/write head: head 0 reads and writes the top, while head 1 is for the bottom. After all sectors from track 0, head 0 have been read, the next sequence of data comes from track 0, head 1, on the bottom of the disk [1].
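To connect these parameters to positions in a flat disk image, here is a small sketch of the conventional cylinder/head/sector arithmetic. It assumes the usual 1.44 MB geometry (2 heads, 18 sectors per track, 512-byte sectors), and the helper names chs_to_lba and chs_to_offset are my own.

```shell
#!/bin/sh
# Geometry assumed for a 1.44 MB floppy.
HEADS=2
SECTORS_PER_TRACK=18
SECTOR_SIZE=512

# chs_to_lba CYLINDER HEAD SECTOR: zero-based index of the sector on the disk.
# Sector numbers start at 1, hence the "- 1". (Hypothetical helper name.)
chs_to_lba() {
    echo $(( ($1 * HEADS + $2) * SECTORS_PER_TRACK + ($3 - 1) ))
}

# chs_to_offset CYLINDER HEAD SECTOR: byte offset of that sector in a flat image.
chs_to_offset() {
    echo $(( $(chs_to_lba "$1" "$2" "$3") * SECTOR_SIZE ))
}

chs_to_lba 0 0 1      # first sector on the disk
chs_to_lba 0 1 1      # first sector under head 1, after the 18 under head 0
chs_to_offset 0 0 2   # second sector on the disk
```

These print 0, 18 and 512. The last value means that cylinder 0, head 0, sector 2 starts 512 bytes into the image, immediately after the first sector.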
The bootloader program can only fill a single 512-byte sector because the BIOS program, stored in ROM, reads just the first sector of the disk (track 0, head 0, sector 1; sector numbers start at 1) into memory. The program will load the “kernel” from the next sector (track 0, head 0, sector 2) and execute it:
start:
cli ; clear the interrupt flag
mov ax, 0x50
mov es, ax
xor bx, bx ; es:bx is the memory address at which the interrupt routine loads the sector(s).
; es is shifted left by 4 bits and added to bx, so es:bx = 0x500.
mov cl, 2 ; first sector to read (cl counts sectors from 1)
mov al, 1 ; number of sectors to read starting from the sector number in cl
mov ch, 0 ; lower eight bits of the cylinder number
mov dh, 0 ; head number
mov dl, 0 ; drive number
mov ah, 0x02 ; selects which routine of the interrupt we want
int 0x13 ; interrupt 0x13 with ah = 02h
mov bx, 0x0500 ; the memory address of the kernel loaded into memory
jmp bx ; jump to the kernel code
times 510-($-$$) db 0 ; $ is the current position (in bytes) at the beginning of this line,
; and $$ is the beginning of the current section. This pads the binary with zeros
; up to the boot signature, so that it is 512 bytes in total.
dw 0xAA55 ; The boot signature: a magic number the BIOS requires in the bootloader program.
; If we do not include it, we'll get a "not a bootable disk" error.
There are a couple of things worth pointing out. xor bx, bx clears the bx register to zero, so that when es is shifted left by 4 bits and added to bx, the value is 0x500. Aside from the 8086 memory addressing, the program is mostly straightforward. The one thing I changed is the final jump: the program from chapter 7 of Operating Systems: From 0 to 1 tries to perform a far jump with jmp 0x50:0x0, which induces a jump to 0x00 using the q35 chipset with QEMU 4.2.1 (Debian 1:4.2-3ubuntu6.23) and NASM 2.14.02. I’ll look into this further in another post by working backwards from the bytes generated by NASM to determine which instruction is being produced (GDB’s disassembly can be inconsistent, and was not clear here). In the meantime, placing the address in a register (bx) has solved the issue.
Notice that the value placed in al is the number of sectors to load: you can imagine how you might increase this to load a larger operating system, rather than just the second sector. Most modern operating systems will not fit on a floppy disk, but the procedure for hard disks is similar: a 512-byte sector is read from the desired partition [6, p. 116].
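If you do grow the kernel, the value for al is just the file size rounded up to whole sectors. A minimal sketch, where sectors_needed is a made-up helper name. It only computes the count and does not check any limit a particular BIOS may place on a single read.

```shell
#!/bin/sh
SECTOR_SIZE=512

# sectors_needed BYTES: how many whole sectors are needed to hold BYTES bytes.
# Rounds up, because a partly filled sector still has to be read in full.
# (sectors_needed is a hypothetical helper used only for illustration.)
sectors_needed() {
    echo $(( ($1 + SECTOR_SIZE - 1) / SECTOR_SIZE ))
}

sectors_needed 1
sectors_needed 512
sectors_needed 513
sectors_needed 4096
```

These print 1, 1, 2 and 8. On a system with wc, something like sectors_needed "$(wc -c < kernel)" would give the count for an assembled kernel file.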
All of the code for this project is hosted here with a Makefile. I’ll go through the steps for compiling and debugging the program, so we can verify the code is working. It may be useful to have NASM >= 2.14 and QEMU >= 4.2 installed to match these steps exactly. I’ve included a sample kernel program with the following code:
mov ax, 0x0101
mov ax, 0x0333
marker: db "this is where the kernel ends"
This does not accomplish much, but the instructions are unique so we can step through in the debugger and determine if the kernel has been loaded into memory and set to execute.
To compile, we need to convert the assembly instructions into bytes that are interpretable by the processor:
nasm -f bin bootloader.asm -o bootloader
nasm -f bin kernel.asm -o kernel
We need to construct a floppy disk image to provide to QEMU:
dd if=/dev/zero of=disk.img bs=512 count=2880
Lastly, we need to write the bootloader to the first sector (seek=0) and the kernel to the second (seek=1). The conv=notrunc option stops dd from truncating the image after the data it writes, so the floppy disk image remains 1.44 MB in size. This is the amount of storage available on a standard floppy (two heads * 80 tracks per head * 18 sectors per track * 512 bytes per sector = 1,474,560 bytes, which is sold as 1.44 MB). See this for additional explanation.
dd conv=notrunc if=bootloader of=disk.img bs=512 count=1 seek=0
dd conv=notrunc if=kernel of=disk.img bs=512 count=1 seek=1
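As a sanity check on the numbers above, this sketch recomputes the sector count passed to dd and the expected size of the image. The image_size helper is a name I made up; it just wraps wc.

```shell
#!/bin/sh
# Geometry of a standard 1.44 MB floppy, as described in the text.
HEADS=2
TRACKS_PER_HEAD=80
SECTORS_PER_TRACK=18
BYTES_PER_SECTOR=512

TOTAL_SECTORS=$(( HEADS * TRACKS_PER_HEAD * SECTORS_PER_TRACK ))
TOTAL_BYTES=$(( TOTAL_SECTORS * BYTES_PER_SECTOR ))

echo "sectors: $TOTAL_SECTORS"   # matches count=2880 in the first dd command
echo "bytes:   $TOTAL_BYTES"

# image_size FILE: print the size of FILE in bytes (hypothetical helper).
image_size() {
    wc -c < "$1" | tr -d ' '
}
```

This prints 2880 sectors and 1474560 bytes. After writing the bootloader and kernel, [ "$(image_size disk.img)" -eq "$TOTAL_BYTES" ] should still succeed. If it fails, the image was probably truncated by a dd call that was missing conv=notrunc.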
We can verify that all of our code up to the end of the kernel program was written to the image using hd, which displays the content of the binary and decodes any ASCII:
ooc@thinkpad:~/writeyourownos$ hd disk.img
00000000 fa b8 50 00 8e c0 31 db b1 02 b0 01 b5 00 b6 00 |..P...1.........|
00000010 b2 00 b4 02 cd 13 bb 00 05 ea 00 00 50 00 00 00 |............P...|
00000020 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
000001f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 55 aa |..............U.|
00000200 b8 01 01 b8 33 03 74 68 69 73 20 69 73 20 77 68 |....3.this is wh|
00000210 65 72 65 20 74 68 65 20 6b 65 72 6e 65 6c 20 65 |ere the kernel e|
00000220 6e 64 73 00 00 00 00 00 00 00 00 00 00 00 00 00 |nds.............|
00000230 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
*
00168000
We can see that the marker message from kernel.asm was included. We inserted the marker using the db command, which is an assembler directive, not an instruction that your processor understands: it means “define byte” (similarly, we have dw and dd for defining words and double words respectively). Conveniently, you do not need to type db before every character in a string: each character is converted to its ASCII representation by the assembler.
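The hd listing also shows the boot signature as the bytes 55 aa at offset 0x1fe. The order is reversed from dw 0xAA55 because x86 stores words little-endian. Here is a sketch of a check you could script, using only dd and od. The helper names boot_signature and is_bootable are my own, and the demo builds a throwaway 512-byte sector rather than touching disk.img.

```shell
#!/bin/sh
# boot_signature FILE: print bytes 510 and 511 of FILE as hex, e.g. "55aa".
# (boot_signature and is_bootable are hypothetical helper names.)
boot_signature() {
    sig=$(dd if="$1" bs=1 skip=510 count=2 2>/dev/null | od -An -tx1 | tr -d ' \n')
    echo "$sig"
}

# is_bootable FILE: succeed if the first sector ends with the 55 aa signature.
is_bootable() {
    [ "$(boot_signature "$1")" = "55aa" ]
}

# Demo on a scratch sector: 512 zero bytes with 0x55 0xAA written at offset 510.
sector=$(mktemp)
dd if=/dev/zero of="$sector" bs=512 count=1 2>/dev/null
printf '\125\252' | dd of="$sector" bs=1 seek=510 conv=notrunc 2>/dev/null
boot_signature "$sector"
rm -f "$sector"
```

The demo prints 55aa. Running is_bootable disk.img against the image built above should succeed for the same reason.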
We can execute our code on QEMU with the following:
qemu-system-i386 -machine q35 -fda disk.img -gdb tcp::26000 -S
The -S option ensures that the machine will not start until we have entered GDB and entered continue. QEMU will greet you with a black screen until you connect with GDB using target remote localhost:26000.
(base) ooc@thinkpad:~$ gdb
GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
Copyright (C) 2020 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
(gdb) target remote localhost:26000
target remote localhost:26000
Remote debugging using localhost:26000
0x0000fff0 in ?? ()
(gdb) b *0x7c00
Breakpoint 1 at 0x7c00
(gdb) c
Continuing.
Breakpoint 1, 0x00007c00 in ?? ()
The bootloader will be loaded at 0x7c00 in memory by the BIOS. We set a breakpoint there and type c to continue to where the bootloader has been loaded. To get more insight into the current state of the registers and what instructions are being executed, you can type layout reg at the prompt:
(gdb) layout reg
┌──Register group: general─────────────────────────────────────────────────────┐
│eax 0xaa55 43605 │
│ecx 0x0 0 │
│edx 0x0 0 │
│ebx 0x0 0 │
│esp 0x6f00 0x6f00 │
│ebp 0x0 0x0 │
└──────────────────────────────────────────────────────────────────────────────┘
│B+>0x7c00 cli │
│ 0x7c01 mov $0xc08e0050,%eax │
│ 0x7c06 xor %ebx,%ebx │
│ 0x7c08 mov $0x2,%cl │
│ 0x7c0a mov $0x1,%al │
│ 0x7c0c mov $0x0,%ch │
└──────────────────────────────────────────────────────────────────────────────┘
remote Thread 1.1 In: L?? PC: 0x7c00
(gdb)
You can see the first instruction of our bootloader, where we clear the interrupt flag, is at the current breakpoint. Step through the instructions that set up the interrupt parameters using the ni command until you reach the int instruction. You can type set disassembly-flavor intel to get the familiar assembly format from the .asm files we assembled using NASM.
┌──Register group: general─────────────────────────────────────────────────────┐
│eax 0x201 513 │
│ecx 0x2 2 │
│edx 0x0 0 │
│ebx 0x0 0 │
│esp 0x6f00 0x6f00 │
│ebp 0x0 0x0 │
│esi 0x0 0 │
│edi 0x0 0 │
│eip 0x7c14 0x7c14 │
│eflags 0x46 [ IOPL=0 ZF PF ] │
┌──────────────────────────────────────────────────────────────────────────────┐
│ 0x7c04 mov es,eax │
│ 0x7c06 xor ebx,ebx │
│ 0x7c08 mov cl,0x2 │
│ 0x7c0a mov al,0x1 │
│ 0x7c0c mov ch,0x0 │
│ 0x7c0e mov dh,0x0 │
│ 0x7c10 mov dl,0x0 │
│ 0x7c12 mov ah,0x2 │
│ >0x7c14 int 0x13 │
│ 0x7c16 mov ebx,0xea0500 │
│ 0x7c1b add BYTE PTR [eax+0x0],dl │
└──────────────────────────────────────────────────────────────────────────────┘
remote Thread 1.1 In: L?? PC: 0x7c14
(gdb) ni
0x00007c12 in ?? ()
(gdb) set disassembly-flavor intel
(gdb) ni
0x00007c14 in ?? ()
(gdb)
You can use ni to step over function calls, but this does not work for interrupt service routines. To step over one without getting stuck in a long routine, set a breakpoint at the following instruction and continue with c.
┌──Register group: general─────────────────────────────────────────────────────┐
│eax 0x1 1 │
│ecx 0x2 2 │
│edx 0x0 0 │
│ebx 0x0 0 │
│esp 0x6f00 0x6f00 │
│ebp 0x0 0x0 │
│esi 0x0 0 │
│edi 0x0 0 │
│eip 0x7c16 0x7c16 │
│eflags 0x46 [ IOPL=0 ZF PF ] │
┌──────────────────────────────────────────────────────────────────────────────┐
│B+>0x7c16 mov ebx,0xea0500 │
│ 0x7c1b add BYTE PTR [eax+0x0],dl │
│ 0x7c1e add BYTE PTR [eax],al │
│ 0x7c20 add BYTE PTR [eax],al │
│ 0x7c22 add BYTE PTR [eax],al │
│ 0x7c24 add BYTE PTR [eax],al │
│ 0x7c26 add BYTE PTR [eax],al │
│ 0x7c28 add BYTE PTR [eax],al │
│ 0x7c2a add BYTE PTR [eax],al │
│ 0x7c2c add BYTE PTR [eax],al │
│ 0x7c2e add BYTE PTR [eax],al │
└──────────────────────────────────────────────────────────────────────────────┘
remote Thread 1.1 In: L?? PC: 0x7c16
(gdb) ni
0x00007c12 in ?? ()
(gdb) ni
0x00007c14 in ?? ()
(gdb) set disassembly-flavor intel
(gdb) break *0x7c16
Breakpoint 2 at 0x7c16
(gdb) c
Continuing.
Breakpoint 2, 0x00007c16 in ?? ()
(gdb)
You may notice when stepping through the code in GDB that mov bx, 0x500 becomes mov ebx,0xea0500. I think this is because GDB does not know the processor is in a 16-bit mode, so it decodes the bytes as 32-bit code and reads a four-byte immediate that swallows part of the next instruction. It might also be helped by assembling our code in ELF format and supplying GDB with some debugging information [7]. Regardless, the above instruction sets bx (the lower 16 bits of ebx) to 0x0500 as desired. Type ni once more to hit the jmp bx instruction. You can see that the kernel has been loaded into memory at 0x500:
┌──Register group: general─────────────────────────────────────────────────────┐
│eax 0x1 1 │
│ecx 0x2 2 │
│edx 0x0 0 │
│ebx 0x500 1280 │
│esp 0x6f00 0x6f00 │
│ebp 0x0 0x0 │
│esi 0x0 0 │
│edi 0x0 0 │
│eip 0x500 0x500 │
│eflags 0x46 [ IOPL=0 ZF PF ] │
┌──────────────────────────────────────────────────────────────────────────────┐
│ >0x500 mov eax,0x33b80101 │
│ 0x503 mov eax,0x68740333 │
│ 0x505 add esi,DWORD PTR [eax+ebp*2+0x69] │
│ 0x509 jae 0x52b │
│ 0x50b imul esi,DWORD PTR [ebx+0x20],0x72656877 │
│ 0x512 and BYTE PTR gs:[eax+ebp*2+0x65],dh │
│ 0x517 and BYTE PTR [ebx+0x65],ch │
│ 0x51a jb 0x58a │
│ 0x51c gs ins BYTE PTR es:[edi],dx │
│ 0x51e and BYTE PTR [ebp+0x6e],ah │
│ 0x521 fs jae 0x524 │
│ 0x524 add BYTE PTR [eax],al │
└──────────────────────────────────────────────────────────────────────────────┘
remote Thread 1.1 In: L?? PC: 0x500
(gdb) ni
0x00007c19 in ?? ()
(gdb) ni
0x00000500 in ?? ()
(gdb)
These are the two placeholder mov instructions from our “kernel” (mov ax, 0x0101 and mov ax, 0x0333), so we have succeeded in loading it from the floppy!
If any of the general purpose assembly instructions were confusing, you can refer to [7] or [8]. The former has useful discussion of memory addressing and processor modes; the latter covers more ground and has some discussion of floating point coprocessor instructions. The Intel instruction reference [5] may also be useful for information about rarer instructions like cli.
I’m planning for future posts to describe the operating systems concepts that were most difficult for me to understand. I do not plan to cover all of the code of an operating system in my posts, so I’ll refer to more detailed resources with the specifics of what I learned from them or what code I used. Linus took a similar approach when writing Linux: many remnants of MINIX are still visible in the source code (compare MINIX ioctls and Linux ioctls, for example).
Hopefully this helped to give you a better understanding of one of the ways that your operating system interfaces with hardware, via the BIOS (or used to - today it’s probably UEFI-based). Here are a few additional teaching operating systems which might be useful resources:
- HelenOS
- TILCK
- basekernel
- eduOS and its Rust counterpart
You are welcome to email me with corrections or suggestions, as I’m still new to operating systems development.
Citations:
- [1] Operating Systems: From 0 to 1
- [2] Assembly Language Step-by-Step
- [3] BIOS interrupt service routines documentation
- [4] Game Engine Black Book: Wolfenstein 3D
- [5] Intel 64 and IA-32 Architectures Software Developer Manuals
- [6] Operating Systems: Design and Implementation, 3e
- [7] The DWARF Debugging Standard
- [8] Professional Assembly Language