I’ve spent some time learning about operating systems and I believe my programming has benefitted from it. I think I have reached a point where I can provide some help to others who want to learn the basics of how their operating system works with their processor, so I’m planning to write a series about it. I’m new to this, so feel free to email me with corrections or suggestions.
In this post, I’ll write a short bootloader. My introduction to this topic came from Operating Systems: From 0 to 1, which provides a sample bootloader that does not work (on my Ubuntu 20.04 Intel system, with a current version of NASM). I’ll do something similar to what is outlined in chapter 7.
The purpose of a bootloader is to load the operating system into memory and execute it. The bootloader is the first executable code that is loaded into RAM from the storage media that is available to your computer, such as a hard disk or floppy. This post will describe the floppy approach: this is simpler as we do not need to consider disk partitions, but has the drawback that less storage is available.
When an Intel processor starts, it is initially in 16-bit real mode, which emulates the operation of the 8086 architecture for legacy reasons. This has a couple of important implications for a bootloader program:
- Registers used will be the 16-bit versions: for example, you cannot use the 32-bit eax or 64-bit rax register and instead must use ax, which refers to the lowest 16 bits of eax and rax on 32-bit and 64-bit processors respectively.
- Memory addressing is frequently done by combining two registers: a segment and an offset. This is a quirk of the 8086. You may see notation such as es:bx, for instance in documentation of BIOS interrupt service routines. To obtain the actual address, the segment value is left shifted by four bits and added to the offset. For example, a segment of 0xFFFF with an offset of 0x0 gives (0xFFFF << 4) + 0x0 = 0xFFFF0. For a more detailed discussion, you can refer to [4, p.43].
- One of the responsibilities of the bootloader (or early kernel code) is to switch the processor from 16-bit real mode to protected mode. I will not discuss this process here, but the implication is that in real mode all facilities of the processor are open to us: nothing stops a malicious bootloader program from using the OUT instruction to write undesirable values to the output ports connected to your peripheral devices. You can refer to the Intel manual for more information on protected registers such as the IDTR and GDTR and protected instructions like LIDT, LGDT, OUT, and IN.
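The segment:offset arithmetic from the list above can be checked with a few lines of Python. This is only a sketch: the helper name linear_address is made up for illustration, and it ignores any address wraparound that real hardware may apply above 1 MiB.

```python
def linear_address(segment: int, offset: int) -> int:
    """Return the physical address for a real-mode segment:offset pair."""
    # Both parts are 16-bit register values, so reject anything wider.
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must each fit in 16 bits")
    # Shift the segment left by four bits, then add the offset.
    return (segment << 4) + offset


if __name__ == "__main__":
    for seg, off in [(0x0050, 0x0000), (0x07C0, 0x0000), (0xFFFF, 0x0000)]:
        print(f"{seg:04X}:{off:04X} -> 0x{linear_address(seg, off):05X}")
```

The pair 0x0050:0x0000 is the es:bx value used later in this post for the kernel, and 0x07C0:0x0000 is one way of writing 0x7c00, the address at which the BIOS places the bootloader.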
The bootloader program below will execute a BIOS interrupt service routine to load a placeholder kernel program from elsewhere on the floppy disk. It will use interrupt 0x13 with ah = 0x02 (ah is the upper 8 bits of the ax register). It’s also worth noting that hexadecimal values in documentation such as the Intel manual or the interrupt list are denoted with either a 0x prefix or an h suffix, so ah = 0x02 and ah = 02h refer to the same 8-bit binary number.
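To make the two notations and the ah/al split concrete, here is a small Python sketch. The function names parse_hex and split_ax are hypothetical helpers, not part of any tool mentioned in this post.

```python
from typing import Tuple


def parse_hex(text: str) -> int:
    """Parse a hexadecimal literal written either as '0x02' or as '02h'."""
    text = text.strip().lower()
    if text.endswith("h"):
        return int(text[:-1], 16)
    # int() with base 16 accepts an optional 0x prefix.
    return int(text, 16)


def split_ax(ax: int) -> Tuple[int, int]:
    """Split a 16-bit ax value into its upper half (ah) and lower half (al)."""
    if not 0 <= ax <= 0xFFFF:
        raise ValueError("ax must fit in 16 bits")
    return (ax >> 8) & 0xFF, ax & 0xFF


if __name__ == "__main__":
    print(parse_hex("0x02") == parse_hex("02h"))
    ah, al = split_ax(0x0201)
    print(f"ah = 0x{ah:02x}, al = 0x{al:02x}")
```

With ah = 0x02 and al = 0x01, the combined ax value is 0x0201, which is what the debugger shows in eax just before the interrupt later in this post.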
The documentation for the interrupt can be found here. Here are the relevant parts:
    AH = 02h
    AL = number of sectors to read (must be nonzero)
    CH = low eight bits of cylinder number
    CL = sector number 1-63 (bits 0-5)
         high two bits of cylinder (bits 6-7, hard disk only)
    DH = head number
    DL = drive number (bit 7 set for hard disk)
    ES:BX -> data buffer

    Return:
    CF set on error
        if AH = 11h (corrected ECC error), AL = burst length
    CF clear if successful
    AH = status (see #00234)
    AL = number of sectors transferred (only valid if CF set for some BIOSes)
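The CH/CL layout in the excerpt above packs a 10-bit cylinder number and a 6-bit sector number into two 8-bit registers. A rough Python sketch of that packing follows; pack_cylinder_sector is a hypothetical helper written only to illustrate the bit layout.

```python
from typing import Tuple


def pack_cylinder_sector(cylinder: int, sector: int) -> Tuple[int, int]:
    """Return (ch, cl) for the given cylinder and 1-based sector number."""
    if not 0 <= cylinder <= 0x3FF:
        raise ValueError("cylinder must fit in 10 bits")
    if not 1 <= sector <= 63:
        raise ValueError("sector must be in the range 1-63")
    ch = cylinder & 0xFF                      # low eight bits of the cylinder
    high_bits = (cylinder >> 8) & 0x03        # high two bits of the cylinder
    cl = sector | (high_bits << 6)            # sector in bits 0-5, cylinder in bits 6-7
    return ch, cl


if __name__ == "__main__":
    ch, cl = pack_cylinder_sector(cylinder=0, sector=2)
    print(f"ch = 0x{ch:02x}, cl = 0x{cl:02x}")
```

For a floppy the cylinder number is small enough that bits 6-7 of CL stay zero, so cylinder 0, sector 2 is simply ch = 0 and cl = 2.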
It’s worth looking at a diagram of a floppy disk to understand what the parameters mean (source):
A 512-byte sector is the smallest quantity that can be read into memory at one time. A track is a ring of 18 sectors on the disk. Floppy disks have two sides: head 0 refers to the read/write head for the top side, while head 1 is for the bottom. After all sectors from track 0, head 0 have been read, the next sequence of data comes from track 0, head 1, on the bottom of the disk.
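One way to relate these coordinates to a byte position in a disk image is the usual cylinder/head/sector to linear sector conversion. The sketch below assumes the 1.44 MB geometry used in this post (2 heads, 18 sectors per track, 512-byte sectors); the names chs_to_lba and image_offset are mine.

```python
HEADS = 2
SECTORS_PER_TRACK = 18
BYTES_PER_SECTOR = 512


def chs_to_lba(cylinder: int, head: int, sector: int) -> int:
    """Convert cylinder/head/sector (sector is 1-based) to a 0-based sector index."""
    if cylinder < 0:
        raise ValueError("cylinder must be non-negative")
    if not 0 <= head < HEADS:
        raise ValueError("head out of range")
    if not 1 <= sector <= SECTORS_PER_TRACK:
        raise ValueError("sector out of range")
    return (cylinder * HEADS + head) * SECTORS_PER_TRACK + (sector - 1)


def image_offset(cylinder: int, head: int, sector: int) -> int:
    """Byte offset of a sector inside a flat disk image."""
    return chs_to_lba(cylinder, head, sector) * BYTES_PER_SECTOR


if __name__ == "__main__":
    for chs in [(0, 0, 1), (0, 0, 2), (0, 1, 1), (1, 0, 1)]:
        print(chs, "-> sector", chs_to_lba(*chs), "offset", hex(image_offset(*chs)))
```

Under these assumptions the sector right after track 0, head 0, sector 18 is track 0, head 1, sector 1, which matches the ordering described above.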
The bootloader program can only fill a single 512-byte sector because the BIOS program, stored in ROM, reads exactly one sector: track 0, head 0, sector 1 (sector numbers start at 1, as the interrupt documentation above shows). The program will load the “kernel” from sector 2 of track 0 and execute it:
    start:
        cli
        mov ax, 0x50
        mov es, ax
        xor bx, bx        ; es:bx contains the memory address at which the sector(s)
                          ; will be loaded by this interrupt routine. The es:bx notation
                          ; means that the value in es is left shifted by 4 bits before
                          ; adding to bx. Therefore es:bx = 0x500.
        mov cl, 2         ; first sector to read
        mov al, 1         ; number of sectors to read starting from the sector number in cl
        mov ch, 0         ; lower eight bits of the cylinder number
        mov dh, 0         ; head number
        mov dl, 0         ; drive number
        mov ah, 0x02      ; used to indicate which routine we want
        int 0x13          ; interrupt 0x13 with ah = 02h
        mov bx, 0x0500    ; the memory address of the kernel loaded into memory
        jmp bx            ; jump to the kernel code

        ; $ is an assembler token referring to the current position (in bytes) within
        ; this file at the beginning of this line. $$ refers to the beginning of the
        ; current section within the file. This ensures that our binary is 512 bytes in
        ; total, and trailing bytes up to the boot signature are zeros.
        times 510-($-$$) db 0

        ; This is the boot signature. It is a magic number required by the BIOS to be in
        ; the bootloader program. If we do not include it, we'll get a "not a bootable
        ; disk" error.
        dw 0xAA55
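The last two lines of the listing pad the binary to 510 bytes and append the two-byte signature. Here is a hedged Python sketch of the same layout, which can be handy for checking a binary by hand. pad_boot_sector is a hypothetical helper, and the sample bytes are simply the first six bytes of the hexdump shown later in this post.

```python
SECTOR_SIZE = 512
# dw 0xAA55 is stored little-endian, so the bytes on disk are 55 aa.
BOOT_SIGNATURE = b"\x55\xaa"


def pad_boot_sector(code: bytes) -> bytes:
    """Zero-pad code to 510 bytes and append the boot signature."""
    max_code = SECTOR_SIZE - len(BOOT_SIGNATURE)
    if len(code) > max_code:
        raise ValueError(f"code is {len(code)} bytes, but only {max_code} fit")
    padding = bytes(max_code - len(code))     # same job as: times 510-($-$$) db 0
    return code + padding + BOOT_SIGNATURE


if __name__ == "__main__":
    sample = bytes.fromhex("fa b8 50 00 8e c0")   # cli; mov ax, 0x50; mov es, ax
    sector = pad_boot_sector(sample)
    print(len(sector), sector[-2:].hex())
```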
There are a couple of things worth pointing out. xor bx, bx clears the bx register to zero, so that when es is left shifted by 4 bits and added to bx, the value is 0x500. Aside from the tricky 8086 memory addressing, it is mostly straightforward. There is one difference from the source material: the program from chapter 7 of Operating Systems: From 0 to 1 tries to perform a far jump with jmp 0x50:0x0, which induces a jump to 0x00 using the q35 chipset with QEMU 4.2.1 (Debian 1:4.2-3ubuntu6.23) and NASM 2.14.02. I’ll look into this further in another post by working backwards from the bytes generated by NASM to determine which instruction is being produced (GDB’s disassembly can be inconsistent, and was not clear here). In the meantime, placing the address in a register (bx) has solved the issue.
Notice that the value placed in al is the number of sectors to load: you can imagine how you might increase this to load a larger operating system, rather than just the second sector. Most modern operating systems will not fit on a floppy disk, but the procedure for hard disks is similar: a 512-byte sector is read from the desired partition [6, p. 116].
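If the kernel grows, the value for al is just the kernel size rounded up to whole sectors. A minimal sketch, using a hypothetical helper called sectors_needed: it only checks that the count fits in the 8-bit al register and does not model any limits a particular BIOS may place on a single read.

```python
SECTOR_SIZE = 512


def sectors_needed(size_bytes: int, sector_size: int = SECTOR_SIZE) -> int:
    """Number of whole sectors required to hold size_bytes of kernel code."""
    if size_bytes <= 0:
        raise ValueError("size must be positive")
    count = (size_bytes + sector_size - 1) // sector_size   # round up
    if count > 0xFF:
        raise ValueError("count does not fit in the 8-bit al register")
    return count


if __name__ == "__main__":
    # The placeholder kernel in this post is 6 bytes of code plus a 29-byte marker.
    print(sectors_needed(6 + 29))
```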
All of the code for this project is hosted here with a Makefile. I’ll go through the steps for compiling and debugging the program, so we can verify the code is working. It may be useful to have NASM >= 2.14 and QEMU >= 4.2 installed to match these steps exactly. I’ve included a sample kernel program with the following code:
        mov ax, 0x0101
        mov ax, 0x0333
    marker:
        db "this is where the kernel ends"
This does not accomplish much, but the instructions are unique so we can step through in the debugger and determine if the kernel has been loaded into memory and set to execute.
To compile, we need to convert the assembly instructions into bytes that are interpretable by the processor:
    nasm -f bin bootloader.asm -o bootloader
    nasm -f bin kernel.asm -o kernel
We need to construct a floppy disk image to provide to QEMU:
dd if=/dev/zero of=disk.img bs=512 count=2880
Lastly, we need to write the bootloader to the first sector (seek=0) and the kernel to the second (seek=1). The conv=notrunc option ensures that the floppy disk remains 1.44 MB in size. This is the amount of storage available on a standard floppy (two heads * 80 tracks per head * 18 sectors per track * 512 bytes per sector = 1,474,560 bytes, sold as 1.44 MB). See this for additional explanation.
    dd conv=notrunc if=bootloader of=disk.img bs=512 count=1 seek=0
    dd conv=notrunc if=kernel of=disk.img bs=512 count=1 seek=1
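For readers who prefer to see the image layout spelled out, the dd steps can be mirrored in memory with Python. This is a sketch under the same assumptions as the commands above (a 2880-sector image, bootloader in the first sector, kernel in the second); build_floppy_image is a hypothetical name and nothing here is taken from the project's Makefile.

```python
SECTOR_SIZE = 512
# two heads * 80 tracks per head * 18 sectors per track
FLOPPY_SECTORS = 2 * 80 * 18


def build_floppy_image(bootloader: bytes, kernel: bytes) -> bytes:
    """Return a full-size floppy image with the two binaries in sectors 0 and 1."""
    if len(bootloader) != SECTOR_SIZE:
        raise ValueError("bootloader must be exactly one sector")
    if len(kernel) > SECTOR_SIZE:
        raise ValueError("this sketch only places a one-sector kernel")
    # Like dd if=/dev/zero count=2880: start from an all-zero image.
    image = bytearray(FLOPPY_SECTORS * SECTOR_SIZE)
    # Like seek=0: overwrite the first sector in place.
    image[0:SECTOR_SIZE] = bootloader
    # Like seek=1: overwrite the start of the second sector in place.
    image[SECTOR_SIZE:SECTOR_SIZE + len(kernel)] = kernel
    # Overwriting slices of equal length keeps the size fixed, as conv=notrunc does.
    return bytes(image)


if __name__ == "__main__":
    boot = bytes(510) + b"\x55\xaa"
    kern = bytes.fromhex("b80101b83303") + b"this is where the kernel ends"
    print(len(build_floppy_image(boot, kern)))
```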
We can verify that all of our code up to the end of the kernel program was written to the image using hd, which displays the content of the binary and decodes any ASCII:
    ooc@thinkpad:~/writeyourownos$ hd disk.img
    00000000  fa b8 50 00 8e c0 31 db  b1 02 b0 01 b5 00 b6 00  |..P...1.........|
    00000010  b2 00 b4 02 cd 13 bb 00  05 ea 00 00 50 00 00 00  |............P...|
    00000020  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
    *
    000001f0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 55 aa  |..............U.|
    00000200  b8 01 01 b8 33 03 74 68  69 73 20 69 73 20 77 68  |....3.this is wh|
    00000210  65 72 65 20 74 68 65 20  6b 65 72 6e 65 6c 20 65  |ere the kernel e|
    00000220  6e 64 73 00 00 00 00 00  00 00 00 00 00 00 00 00  |nds.............|
    00000230  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
    *
    00168000
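The same two checks (signature at the end of the first sector, marker text somewhere in the second) can also be done programmatically. The sketch below is one possible way to do it; has_boot_signature and find_marker are hypothetical helpers, and the demo builds a tiny stand-in image from bytes visible in the dump rather than reading disk.img.

```python
MARKER = b"this is where the kernel ends"


def has_boot_signature(image: bytes) -> bool:
    """True if bytes 510-511 of the image hold the 55 aa boot signature."""
    return len(image) >= 512 and image[510:512] == b"\x55\xaa"


def find_marker(image: bytes, marker: bytes = MARKER) -> int:
    """Return the byte offset of the marker text, or -1 if it is absent."""
    return image.find(marker)


if __name__ == "__main__":
    boot_sector = bytes(510) + b"\x55\xaa"
    kernel = bytes.fromhex("b8 01 01 b8 33 03") + MARKER
    image = boot_sector + kernel
    print(has_boot_signature(image), hex(find_marker(image)))
```

To run it against a real image, one could replace the stand-in with the bytes read from disk.img opened in binary mode.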
We can see that the marker message from kernel.asm was included. We inserted the marker using the db command, which is an assembler directive, not an instruction that your processor understands: it means “define byte” (similarly, we have dw and dd for defining words and double words respectively). Conveniently, you do not need to type db before every character in a string: each character is converted to its ASCII representation by the assembler.
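As a rough illustration of what these directives emit, the sketch below imitates db, dw and dd with Python's struct module. The three function names are only stand-ins for the assembler directives; the point is the little-endian byte order, which is why dw 0xAA55 shows up as 55 aa in the hexdump.

```python
import struct


def db(*items) -> bytes:
    """Emit one byte per integer, or the ASCII bytes of each string."""
    out = bytearray()
    for item in items:
        if isinstance(item, str):
            out += item.encode("ascii")
        else:
            out += struct.pack("<B", item)
    return bytes(out)


def dw(value: int) -> bytes:
    """Emit a 16-bit word, least significant byte first."""
    return struct.pack("<H", value)


def dd(value: int) -> bytes:
    """Emit a 32-bit double word, least significant byte first."""
    return struct.pack("<I", value)


if __name__ == "__main__":
    print(db("this").hex(" "))
    print(dw(0xAA55).hex(" "))
    print(dd(0x12345678).hex(" "))
```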
We can execute our code on QEMU with the following:
qemu-system-i386 -machine q35 -fda disk.img -gdb tcp::26000 -S
The -S option ensures that the machine will not start until we have connected with GDB and entered continue. QEMU will greet you with a black screen until you connect with GDB using target remote localhost:26000.
    (base) ooc@thinkpad:~$ gdb
    GNU gdb (Ubuntu 9.2-0ubuntu1~20.04.1) 9.2
    Copyright (C) 2020 Free Software Foundation, Inc.
    License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
    (gdb) target remote localhost:26000
    Remote debugging using localhost:26000
    0x0000fff0 in ?? ()
    (gdb) b *0x7c00
    Breakpoint 1 at 0x7c00
    (gdb) c
    Continuing.

    Breakpoint 1, 0x00007c00 in ?? ()
The bootloader will be loaded at 0x7c00 in memory by the BIOS. We set a breakpoint there and type c to continue to where the bootloader has been loaded. To get more insight into the current state of the registers and what instructions are being executed, you can type layout reg at the prompt:
    (gdb) layout reg
    ┌──Register group: general───────────────────────────┐
    │eax     0xaa55      43605                           │
    │ecx     0x0         0                               │
    │edx     0x0         0                               │
    │ebx     0x0         0                               │
    │esp     0x6f00      0x6f00                          │
    │ebp     0x0         0x0                             │
    └────────────────────────────────────────────────────┘
    │B+>0x7c00  cli                                      │
    │   0x7c01  mov $0xc08e0050,%eax                     │
    │   0x7c06  xor %ebx,%ebx                            │
    │   0x7c08  mov $0x2,%cl                             │
    │   0x7c0a  mov $0x1,%al                             │
    │   0x7c0c  mov $0x0,%ch                             │
    └────────────────────────────────────────────────────┘
    remote Thread 1.1 In: L?? PC: 0x7c00
    (gdb)
You can see the first instruction of our bootloader, where we clear the interrupt flag, is at the current breakpoint. Step through all of the interrupt parameters using the ni command until you reach the int instruction. You can type set disassembly-flavor intel to get the familiar assembly format from the .asm files we compiled using NASM.
    ┌──Register group: general───────────────────────────┐
    │eax     0x201       513                             │
    │ecx     0x2         2                               │
    │edx     0x0         0                               │
    │ebx     0x0         0                               │
    │esp     0x6f00      0x6f00                          │
    │ebp     0x0         0x0                             │
    │esi     0x0         0                               │
    │edi     0x0         0                               │
    │eip     0x7c14      0x7c14                          │
    │eflags  0x46        [ IOPL=0 ZF PF ]                │
    ┌────────────────────────────────────────────────────┐
    │   0x7c04  mov es,eax                               │
    │   0x7c06  xor ebx,ebx                              │
    │   0x7c08  mov cl,0x2                               │
    │   0x7c0a  mov al,0x1                               │
    │   0x7c0c  mov ch,0x0                               │
    │   0x7c0e  mov dh,0x0                               │
    │   0x7c10  mov dl,0x0                               │
    │   0x7c12  mov ah,0x2                               │
    │  >0x7c14  int 0x13                                 │
    │   0x7c16  mov ebx,0xea0500                         │
    │   0x7c1b  add BYTE PTR [eax+0x0],dl                │
    └────────────────────────────────────────────────────┘
    remote Thread 1.1 In: L?? PC: 0x7c14
    (gdb) ni
    0x00007c12 in ?? ()
    (gdb) set disassembly-flavor intel
    (gdb) ni
    0x00007c14 in ?? ()
    (gdb)
You can use ni to step over function calls, but this does not work for interrupt service routines. To step over without getting stuck in a long routine, set a breakpoint at the following instruction and continue with c:
    ┌──Register group: general───────────────────────────┐
    │eax     0x1         1                               │
    │ecx     0x2         2                               │
    │edx     0x0         0                               │
    │ebx     0x0         0                               │
    │esp     0x6f00      0x6f00                          │
    │ebp     0x0         0x0                             │
    │esi     0x0         0                               │
    │edi     0x0         0                               │
    │eip     0x7c16      0x7c16                          │
    │eflags  0x46        [ IOPL=0 ZF PF ]                │
    ┌────────────────────────────────────────────────────┐
    │B+>0x7c16  mov ebx,0xea0500                         │
    │   0x7c1b  add BYTE PTR [eax+0x0],dl                │
    │   0x7c1e  add BYTE PTR [eax],al                    │
    │   0x7c20  add BYTE PTR [eax],al                    │
    │   0x7c22  add BYTE PTR [eax],al                    │
    │   0x7c24  add BYTE PTR [eax],al                    │
    │   0x7c26  add BYTE PTR [eax],al                    │
    │   0x7c28  add BYTE PTR [eax],al                    │
    │   0x7c2a  add BYTE PTR [eax],al                    │
    │   0x7c2c  add BYTE PTR [eax],al                    │
    │   0x7c2e  add BYTE PTR [eax],al                    │
    └────────────────────────────────────────────────────┘
    remote Thread 1.1 In: L?? PC: 0x7c16
    (gdb) ni
    0x00007c12 in ?? ()
    (gdb) ni
    0x00007c14 in ?? ()
    (gdb) set disassembly-flavor intel
    (gdb) break *0x7c16
    Breakpoint 2 at 0x7c16
    (gdb) c
    Continuing.

    Breakpoint 2, 0x00007c16 in ?? ()
    (gdb)
You may notice when stepping through the code in GDB that mov bx, 0x500 becomes mov ebx,0xea0500. I think this is because GDB does not know we are simulating a 16-bit processor mode: it decodes the bytes as 32-bit code, so the immediate operand is read as four bytes and swallows the start of the next instruction. It might also be helped by compiling our assembly in ELF format and supplying GDB with some debugging information. Regardless, the above instruction sets the lower 16 bits of ebx to 0x0500 as desired. Type ni once more to hit the jmp bx command. You can see that the kernel has been loaded into memory at 0x500:
    ┌──Register group: general───────────────────────────┐
    │eax     0x1         1                               │
    │ecx     0x2         2                               │
    │edx     0x0         0                               │
    │ebx     0x500       1280                            │
    │esp     0x6f00      0x6f00                          │
    │ebp     0x0         0x0                             │
    │esi     0x0         0                               │
    │edi     0x0         0                               │
    │eip     0x500       0x500                           │
    │eflags  0x46        [ IOPL=0 ZF PF ]                │
    ┌────────────────────────────────────────────────────┐
    │  >0x500   mov eax,0x33b80101                       │
    │   0x503   mov eax,0x68740333                       │
    │   0x505   add esi,DWORD PTR [eax+ebp*2+0x69]       │
    │   0x509   jae 0x52b                                │
    │   0x50b   imul esi,DWORD PTR [ebx+0x20],0x72656877 │
    │   0x512   and BYTE PTR gs:[eax+ebp*2+0x65],dh      │
    │   0x517   and BYTE PTR [ebx+0x65],ch               │
    │   0x51a   jb 0x58a                                 │
    │   0x51c   gs ins BYTE PTR es:[edi],dx              │
    │   0x51e   and BYTE PTR [ebp+0x6e],ah               │
    │   0x521   fs jae 0x524                             │
    │   0x524   add BYTE PTR [eax],al                    │
    └────────────────────────────────────────────────────┘
    remote Thread 1.1 In: L?? PC: 0x500
    (gdb) ni
    0x00007c19 in ?? ()
    (gdb) ni
    0x00000500 in ?? ()
    (gdb)
These are the two placeholder mov commands from our “kernel” (mov ax, 0x0101 and mov ax, 0x0333), so we have succeeded in loading it from the floppy!
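The odd-looking operands in the GDB listings (mov ebx,0xea0500 in the bootloader and mov eax,0x33b80101 in the kernel) are consistent with the bytes being decoded as 32-bit code. The sketch below is a toy decoder for the single opcode family involved (0xB8 to 0xBF, move an immediate into a register); decode_mov_imm is a hypothetical helper, not GDB's actual logic, and the byte strings are copied from the hexdump earlier in this post.

```python
from typing import Tuple

REGISTERS_16 = ["ax", "cx", "dx", "bx", "sp", "bp", "si", "di"]


def decode_mov_imm(code: bytes, operand_bits: int) -> Tuple[str, int]:
    """Decode one 'mov reg, imm' instruction; return (text, length in bytes)."""
    if operand_bits not in (16, 32):
        raise ValueError("operand size must be 16 or 32 bits")
    opcode = code[0]
    if not 0xB8 <= opcode <= 0xBF:
        raise ValueError("not a mov reg, imm opcode")
    width = operand_bits // 8
    immediate_bytes = code[1:1 + width]
    if len(immediate_bytes) < width:
        raise ValueError("not enough bytes for the immediate operand")
    immediate = int.from_bytes(immediate_bytes, "little")
    register = REGISTERS_16[opcode - 0xB8]
    if operand_bits == 32:
        register = "e" + register
    return f"mov {register},{immediate:#x}", 1 + width


if __name__ == "__main__":
    bootloader_bytes = bytes.fromhex("bb 00 05 ea 00")   # from offset 0x16 of the dump
    kernel_bytes = bytes.fromhex("b8 01 01 b8 33")       # from offset 0x200 of the dump
    for bits in (16, 32):
        print(bits, decode_mov_imm(bootloader_bytes, bits), decode_mov_imm(kernel_bytes, bits))
```

Read with a 16-bit operand size, the same bytes give the three-byte instructions we wrote in the .asm files; read with a 32-bit operand size, each instruction appears to be five bytes long and absorbs part of what follows.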
If any of the general purpose assembly instructions were confusing, you can refer to [2] or [8]. The former has useful discussion of memory addressing and processor modes; the latter covers more ground and has some discussion of floating point coprocessor instructions. The Intel instruction reference [5] may also be useful for information about rarer instructions.
Several tutorials exist for writing your own operating system, but most cover few topics. Since writing an OS is a big project and my goal is to understand the fundamentals, future posts will describe the steps that were most difficult for me. I do not plan to cover the bulk of the operating system code in my posts, so I’ll refer to more detailed resources with the specifics of what I learned from them or what code I used. Linus took a similar approach when writing Linux: many remnants of MINIX are still visible in the source code (compare MINIX ioctls and Linux ioctls, for example).
Hopefully this helped to give you a better understanding of one of the ways that your operating system interfaces with hardware, via the BIOS. Here are a few additional teaching operating systems which might be useful resources:
You are welcome to email me with corrections or suggestions, as I’m still new to operating systems development.
Citations:
- [1] Operating Systems: From 0 to 1
- [2] Assembly Language Step-by-Step
- [3] BIOS interrupt service routines documentation
- [4] Game Engine Black Book: Wolfenstein 3D
- [5] Intel 64 and IA-32 Architectures Software Developer Manuals
- [6] Operating Systems: Design and Implementation, 3e
- [7] The DWARF Debugging Standard
- [8] Professional Assembly Language