PC Boot Process and Boot Block
CS 301 Lecture, Dr. Lawlor
How a PC boots
The boot sequence on an IBM PC is initiated by the BIOS (Basic
Input/Output System) stored in the computer's ROM (Read-Only
Memory).
Actually, everything in that sentence is partially a lie:
- IBM PC's aren't made by IBM anymore. They're made by Dell, HP,
Compaq, and a zillion other manufacturers. IBM agreed to sell off its
PC business to the Chinese Lenovo Group in late 2004. What we've meant
by "IBM PC" for the last twenty years is actually "PC's that are
compatible with the software that used to run on actual IBM PC's",
like MS-DOS.
- The BIOS isn't very Basic (a typical BIOS is a few hundred kilobytes
compressed), and hasn't been used for Input/Output since the DOS days
(modern OS's invariably talk directly to hardware to do I/O). I guess
it's still a System.
- The BIOS isn't stored in ROM anymore, since it's so complicated it
now has bugs. Thus it's stored on a chunk of flash memory, which you
can upgrade with a BIOS update.
Er, so back to the booting process.
- You push the power button. This grounds the green wire
on your PC power supply, causing it to turn on.
- Power flows into the CPU.
- The CPU is wired to begin executing code out of its own internal ROM
(or flash memory). This is called "microcode", and is set up at the
CPU factory. The CPU initializes itself (e.g., clears out its
registers and cache, and puts itself into a known good state), and
performs a self-test.
- Once the CPU self-test passes, the CPU jumps to a known location in
memory (the "reset vector", 16 bytes below the top of the address
space), which is hardwired on your motherboard to point to your
motherboard's ROM (er, flash memory).
- Your motherboard's ROM contains a piece of software called the
"BIOS" that has just one purpose: to load up the *real* operating
system. This is harder than it sounds, because the operating system
might be stored on:
  - A floppy disk, controlled by the floppy drive controller
  - A CD-ROM drive, controlled via ATAPI (SCSI over IDE) on the IDE
    controller
  - A SCSI disk, controlled via SCSI commands on the SCSI controller
  - A USB mass storage device, controlled via the USB controller
  - An IDE hard disk (Of course, the OS is almost always actually
    here. Except when it's not.)
- Once the BIOS finds a bootable disk, it loads up the "boot sector"
(a disk sector, or disk block, of just 512 bytes) and executes it in
16-bit mode. Why? Because the original 8088-based IBM PC back in 1981
executed the boot sector in 16-bit mode, that's why. It was good
enough for 1981, why isn't it good enough today? Huh?
In a perfect world, the boot sector would actually be the kernel (more
on that in a minute) of the operating system. Sadly:
- The boot sector is run in 16-bit addressing mode, which limits you
to 64KB at a time. Of course, by switching segments, you can access a
whopping 1MB of memory in 16-bit mode. Of that 1MB, chunks are
reserved for display memory (0xA????), the network card, the BIOS code
(0xF????), etc., so you're actually limited to 640K of real RAM in
16-bit mode. (See Memory Maps of the ancient world.)
- The boot sector is 512 *bytes*. Wow.
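The standard thing to do from inside a boot sector is to load up the
*real* OS loader ("bootloader") from disk. Traditionally the first
partition starts at sector 63, so the sectors between the boot sector
and that partition are free for this, and the real bootloader can be
about 31K. (Newer partitioning tools start the first partition at 1MB,
which leaves even more room.) Known bootloaders include: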
- NTLDR, the Windows NT bootloader. It uses "boot.ini".
- LILO, the ancient LInux LOader. Its settings come from
"/etc/lilo.conf", and get baked in when you run the "lilo" installer.
- GRUB, the old Linux bootloader. It's configurable at runtime.
- GRUB2, the new Linux bootloader.
- SYSLINUX, a Linux loader used for bootable DVDs and such.
The bootloader then loads up the real OS. So the boot sequence
is overall the following amazing cascade:
- CPU internal microcode
- BIOS code
- Boot sector
- Bootloader (possibly multi-stage!)
- OS
- Web browser
- JavaScript
- Facebook!
Writing a Boot Block
A PC Master Boot Sector is just 512 bytes of machine code that the
BIOS loads at bootup. It's basically unchanged since 1981 with the
original IBM PC, although there is a newer, different standard called
EFI (later renamed UEFI) that is just now catching on. With the
original interface:
- The boot sector is a fixed 512 bytes long. If you need
more data, the boot sector code needs to read it in manually.
- It's 16-bit x86 code, so it starts with "BITS 16". Registers are
naturally 16 bits wide here, so you use "ax" rather than "eax". (On a
386 or later the assembler can still reach "eax" with a prefix byte,
but the original 8088 has no such register.)
- It's in raw binary machine code format, so you compile with "-f
bin".
- The sector MUST end with the two bytes 0x55, 0xAA (written together
as "dw 0xAA55"), or the BIOS won't boot it. It's also got some
partition information in the 64 bytes before this.
- It's totally full raw access to the machine, so "cli" (clear the
interrupt flag, which turns off hardware interrupts) is a good way to
avoid problems at startup.
- To get anything done, you issue BIOS interrupts using the "int"
instruction.
Here's an example:
BITS 16 ; everything here is 16 bit code
; Code gets loaded by the PC BIOS into address 0x7C00 and executed.
mov al,'H'
mov ah,0x0e ; print command
int 0x10 ; talk to video card
mov al,'i'
mov ah,0x0e ; print command
int 0x10 ; talk to video card
hang:
jmp hang
; Pad data out to magic boot sector identifier
times 512-2-($-$$) db 0
db 0x55
db 0xaa
You compile your little boot block with:
nasm -f bin -o boot.bin yourcode.S
This should make a 512 byte file full of machine code. If you point a
virtual machine emulator at this as a tiny hard drive image, it should
boot and run!
qemu-system-x86_64 boot.bin
On a Linux machine, you can also copy this onto a
flash drive or floppy using "dd":
sudo dd if=boot.bin of=/dev/sdX
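This is scary (dd overwrites the first sector of whatever drive you
name, partition table included, so triple-check /dev/sdX), but it will
actually boot any machine using pre-EFI BIOS!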
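The first bullet in this section says the boot sector has to read in any extra data by itself. Here is a minimal sketch of that idea, not a definitive loader. It assumes the BIOS leaves the boot drive number in dl (the usual convention), that the extra code sits in the very next disk sector, and that address 0x7E00 (right after the boot sector) is free. The labels "stage2" and "disk_error" are my own names for illustration. It uses BIOS disk interrupt 0x13 with ah=0x02 (read sectors).

```nasm
BITS 16
ORG 0x7C00              ; BIOS loads the boot sector at address 0x7C00

start:
    cli                 ; no interrupts while segments and stack change
    xor ax, ax
    mov ds, ax
    mov es, ax
    mov ss, ax
    mov sp, 0x7C00      ; stack grows downward from just below our code
    sti                 ; BIOS disk services may need hardware interrupts

    ; int 0x13, ah=0x02: read sectors (cylinder/head/sector addressing).
    ; dl is assumed to still hold the boot drive number from the BIOS.
    mov ah, 0x02
    mov al, 1           ; number of sectors to read
    mov ch, 0           ; cylinder 0
    mov cl, 2           ; sector 2 (sectors count from 1; 1 is this sector)
    mov dh, 0           ; head 0
    mov bx, 0x7E00      ; es:bx = destination, right after the boot sector
    int 0x13
    jc disk_error       ; carry flag set means the read failed

    jmp 0x0000:0x7E00   ; run the code we just loaded

disk_error:
    mov ah, 0x0e        ; print command
    mov al, 'E'
    xor bx, bx          ; bh = display page 0 (bx held 0x7E00 before)
    int 0x10
hang:
    hlt
    jmp hang

times 510-($-$$) db 0   ; pad out to the magic boot sector identifier
dw 0xAA55

; ---- hypothetical second sector, lands at 0x7E00 after the read ----
stage2:
    mov ah, 0x0e
    mov al, '2'
    xor bx, bx          ; display page 0
    int 0x10
.hang:
    hlt
    jmp .hang

times 1024-($-$$) db 0  ; pad the second sector to a full 512 bytes
```

Built with the same nasm command as above, this should give a 1024 byte image: the boot sector followed by one more sector. If the read works you should see a "2", and an "E" if it fails.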
Interfacing with the BIOS
To get work done from a boot sector, you can either find and command
the hardware directly, or call BIOS interrupts, which work a lot like
Linux system calls. Ralf Brown's Interrupt List (indexed by interrupt
number) is the definitive reference for all BIOS (as well as MS-DOS)
interrupt functions.
For example, interrupt 0x10 with ah==0x0e is "output a character to
the screen". So above, I printed "H" using:
mov al,'H'
mov ah,0x0e ; print command
int 0x10 ; talk to video card
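Printing one letter at a time gets old fast, so here is a sketch of a loop that prints a whole zero-terminated string with the same interrupt. The labels "message" and "print_loop" are my own names, and I'm assuming the usual load address of 0x7C00 with ds set to zero to match.

```nasm
BITS 16
ORG 0x7C00              ; so the address of "message" comes out right

    xor ax, ax
    mov ds, ax          ; lodsb reads from ds:si, so ds must match ORG
    cld                 ; make lodsb step si forward
    mov si, message

print_loop:
    lodsb               ; al = byte at ds:si, then si = si + 1
    test al, al
    jz hang             ; a zero byte marks the end of the string
    mov ah, 0x0e        ; print command
    mov bx, 0x0007      ; bh = page 0, bl = color (graphics modes only)
    int 0x10            ; talk to video card
    jmp print_loop

hang:
    jmp hang

message: db 'Hello from the boot sector!', 13, 10, 0

; Pad data out to magic boot sector identifier
times 512-2-($-$$) db 0
db 0x55
db 0xaa
```

The 13, 10 at the end of the string is a carriage return and line feed, which the BIOS print command treats as cursor movement.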
- Interrupt 0x10: Video I/O, such as printing characters with ah=0x0e,
int 0x10.
- Interrupt 0x13: Disk I/O, such as ah=0x02, int 0x13 to read data
from disk.
- Interrupt 0x16: Keyboard I/O, such as reading characters with
ah=0x00, int 0x16.
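As a small sketch of combining two of these interrupts, the boot sector below waits for a key with int 0x16 and echoes it with int 0x10, forever. The label "echo_loop" is my own; everything else is the BIOS calls listed above.

```nasm
BITS 16
ORG 0x7C00

echo_loop:
    mov ah, 0x00        ; int 0x16, ah=0x00: wait for a key press
    int 0x16            ; returns ASCII code in al, scan code in ah
    mov ah, 0x0e        ; print command (al already holds the character)
    mov bh, 0           ; display page 0
    int 0x10            ; talk to video card
    jmp echo_loop

; Pad data out to magic boot sector identifier
times 510-($-$$) db 0
dw 0xAA55
```

Note that Enter only produces a carriage return (13), so the cursor jumps back to the start of the line without moving down.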
To get pretty graphics onto the screen, you first switch the
hardware into graphics mode, typically VGA mode 13h:
mov ah, 0 ; set video mode 13h - 320x200
mov al, 13h
int 10h
You can then write directly to memory in segment 0xA000 (see below)
to draw pixels onscreen, one byte per pixel.
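Here is a sketch of what that drawing might look like: switch to mode 13h, fill the whole screen with a color pattern, then plot one white pixel in the middle. It assumes the default VGA palette (where color 15 is white); the label "fill" is my own.

```nasm
BITS 16
ORG 0x7C00

    mov ax, 0x0013      ; ah=0 (set video mode), al=13h (320x200)
    int 0x10

    mov ax, 0xA000
    mov es, ax          ; es = segment of VGA graphics memory
    xor di, di          ; offset 0 is the top left pixel
    cld                 ; make stosb step di forward
    mov cx, 320*200     ; 64000 pixels, one byte each

fill:
    mov ax, di          ; al = low byte of the pixel's offset = its color
    stosb               ; store al at es:di, then di = di + 1
    loop fill           ; repeat cx times

    ; pixel (x,y) lives at offset y*320 + x
    mov byte [es:100*320+160], 15   ; white pixel at x=160, y=100

hang:
    jmp hang

; Pad data out to magic boot sector identifier
times 510-($-$$) db 0
dw 0xAA55
```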
Segmented Memory, and Memory-Mapped I/O
In 16-bit mode, pointers actually have two parts: the "segment" is the
general area of memory, and the "offset" is the location inside that
segment. Both segment and offset are 16 bit values, so it looks like
you could access 32 bits worth of memory, but the original 8086 only
had 20 address lines (1MB), so the CPU combines segment and offset
like this:
actual address = (segment<<4) + offset;
This means segment:offset 0x0000:0x1230 is the same location as
0x0123:0x0000, since both work out to actual address 0x01230.
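You can check this aliasing from a boot sector. The sketch below writes a byte through one segment:offset pair and reads it back through the other; it assumes address 0x01230 is ordinary free RAM at boot time, which it normally is.

```nasm
BITS 16
ORG 0x7C00

    xor ax, ax
    mov ds, ax
    mov byte [ds:0x1230], 'Y'   ; write via 0x0000:0x1230
                                ; actual address = (0x0000<<4)+0x1230

    mov ax, 0x0123
    mov es, ax
    mov al, [es:0x0000]         ; read via 0x0123:0x0000
                                ; actual address = (0x0123<<4)+0x0000

    mov ah, 0x0e                ; print command
    mov bh, 0                   ; display page 0
    int 0x10                    ; prints the byte we read back

hang:
    jmp hang

; Pad data out to magic boot sector identifier
times 510-($-$$) db 0
dw 0xAA55
```

Both address calculations come out to 0x01230, so the character printed should be the "Y" that was just written.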
For example, the VGA text mode display starts at segment 0xB800. I
can print "Hi!" at the top left corner of the screen by directly
modifying the data there:
mov ax,0xB800 ; this is the segment where VGA text mode data is stored
mov es,ax ; es can't take a constant directly, so load it via ax
mov BYTE [es:0x0000],'H' ; shows up at top left corner of screen
mov BYTE [es:0x0002],'i'
mov BYTE [es:0x0004],'!'
"es" is the "Extra Segment" register. You can't load a constant
straight into it, so you set it from a general register like ax.
"es:0x0002" means use segment es, and offset 2. In text mode, each
character is at an even address, and the color (foreground and
background) of that character is stored in the following odd address.
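Since the character and its color sit in adjacent bytes, one 16-bit store can set both at once. This is a sketch assuming the usual 80-column color text mode that the BIOS leaves you in at boot. The low byte of each word is the character, and the high byte is the color: its upper four bits are the background and its lower four bits are the foreground.

```nasm
BITS 16
ORG 0x7C00

    mov ax, 0xB800      ; segment where VGA text mode data is stored
    mov es, ax

    ; x86 is little-endian, so the low byte (the character) lands at
    ; the even address and the high byte (the color) at the odd one.
    mov word [es:0x0000], 0x1F00 + 'H'   ; white on blue, top left
    mov word [es:0x0002], 0x1F00 + 'i'

    ; character at (row, col) lives at offset (row*80 + col)*2
    mov word [es:(1*80 + 0)*2], 0x4E00 + '!'   ; yellow on red, row 1

hang:
    jmp hang

; Pad data out to magic boot sector identifier
times 510-($-$$) db 0
dw 0xAA55
```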