svn commit: r43645 - head/en_US.ISO8859-1/books/arch-handbook/boot
Warren Block
wblock at FreeBSD.org
Sun Jan 26 02:30:35 UTC 2014
Author: wblock
Date: Sun Jan 26 02:30:34 2014
New Revision: 43645
URL: http://svnweb.freebsd.org/changeset/doc/43645
Log:
Rewrite of portions of the Boot chapter by Sergio Andrés Gómez del Real.
Committed version is a modified version of the one submitted with the
patch. Thanks to Sergio Andrés Gómez del Real for the submission, to
John-Mark Gurney for technical review, and to both for their patience.
PR: docs/185780
Submitted by: Sergio Andrés Gómez del Real <Sergio.G.DelReal at gmail.com>
Reviewed by: jmg
Modified:
head/en_US.ISO8859-1/books/arch-handbook/boot/chapter.xml
Modified: head/en_US.ISO8859-1/books/arch-handbook/boot/chapter.xml
==============================================================================
--- head/en_US.ISO8859-1/books/arch-handbook/boot/chapter.xml Sun Jan 26 00:10:46 2014 (r43644)
+++ head/en_US.ISO8859-1/books/arch-handbook/boot/chapter.xml Sun Jan 26 02:30:34 2014 (r43645)
@@ -4,6 +4,8 @@ The FreeBSD Documentation Project
Copyright (c) 2002 Sergey Lyubka <devnull at uptsoft.com>
All rights reserved
+Copyright (c) 2014 Sergio Andrés Gómez del Real <Sergio.G.delReal at gmail.com>
+All rights reserved
$FreeBSD$
-->
@@ -25,6 +27,18 @@ $FreeBSD$
</author>
<!-- devnull at uptsoft.com 12 Jun 2002 -->
</authorgroup>
+
+ <authorgroup>
+ <author>
+ <personname>
+ <firstname>Sergio Andrés</firstname>
+ <surname> Gómez del Real</surname>
+ </personname>
+
+ <contrib>Updated and enhanced by </contrib>
+ </author>
+ <!-- Sergio.G.DelReal at gmail.com Jan 2014 -->
+ </authorgroup>
</info>
<sect1 xml:id="boot-synopsis">
@@ -37,88 +51,103 @@ $FreeBSD$
<indexterm><primary>booting</primary></indexterm>
<indexterm><primary>system initialization</primary></indexterm>
<para>This chapter is an overview of the boot and system
- initialization process, starting from the BIOS (firmware) POST,
- to the first user process creation. Since the initial steps of
- system startup are very architecture dependent, the IA-32
- architecture is used as an example.</para>
+ initialization processes, starting from the <acronym>BIOS</acronym> (firmware)
+ <acronym>POST</acronym>, to the first user process creation. Since the initial
+ steps of system startup are very architecture dependent, the
+ IA-32 architecture is used as an example.</para>
+
+ <para>The &os; boot process can be surprisingly complex. After
+ control is passed from the <acronym>BIOS</acronym>, a considerable amount of
+ low-level configuration must be done before the kernel can be
+ loaded and executed. This setup must be done in a simple and
+ flexible manner, allowing the user a great deal of customization
+ possibilities.</para>
</sect1>
<sect1 xml:id="boot-overview">
<title>Overview</title>
- <para>A computer running FreeBSD can boot by several methods,
- although the most common method, booting from a harddisk where
- the OS is installed, will be discussed here. The boot process
- is divided into several steps:</para>
-
- <itemizedlist>
- <listitem><para>BIOS POST</para></listitem>
- <listitem><para><literal>boot0</literal> stage</para></listitem>
- <listitem><para><literal>boot2</literal> stage</para></listitem>
- <listitem><para>loader stage</para></listitem>
- <listitem><para>kernel initialization</para></listitem>
- </itemizedlist>
+ <para>The boot process is an extremely machine-dependent
+ activity. Not only must code be written for every computer
+ architecture, but there may also be multiple types of booting on
+ the same architecture. For example, looking at
+ <filename class="directory">/usr/src/sys/boot</filename>
+ reveals a great amount of architecture-dependent code. There is
+ a directory for each of the various supported architectures. In
+ the x86-specific <filename class="directory">i386</filename>
+ directory, there are subdirectories for different boot standards
+ like <filename>mbr</filename> (Master Boot Record),
+ <filename>gpt</filename> (<acronym>GUID</acronym> Partition
+ Table), and <filename>efi</filename> (Extensible Firmware
+ Interface). Each boot standard has its own conventions and data
+ structures. The example that follows shows booting an x86
+ computer from an <acronym>MBR</acronym> hard drive with the &os;
+ <filename>boot0</filename> multi-boot loader stored in the very
+ first sector. That boot code starts the &os; three-stage boot
+ process.</para>
+
+ <para>The key to understanding this process is that it is a series
+ of stages of increasing complexity. These stages are
+ <filename>boot1</filename>, <filename>boot2</filename>, and
+ <filename>loader</filename> (see &man.boot.8; for more detail).
+ The boot system executes each stage in sequence. The last
+ stage, <filename>loader</filename>, is responsible for loading
+ the &os; kernel. Each stage is examined in the following
+ sections.</para>
- <indexterm><primary>BIOS POST</primary></indexterm>
- <indexterm><primary>boot0</primary></indexterm>
- <indexterm><primary>boot2</primary></indexterm>
- <indexterm><primary>loader</primary></indexterm>
- <para>The <literal>boot0</literal> and <literal>boot2</literal>
- stages are also referred to as <emphasis>bootstrap stages 1 and
- 2</emphasis> in &man.boot.8; as the first steps in FreeBSD's
- 3-stage bootstrapping procedure. Various information is printed
- on the screen at each stage, so you may visually recognize them
- using the table that follows. Please note that the actual data
+ <para>Here is an example of the output generated by the
+ different boot stages. Actual output
may differ from machine to machine:</para>
<informaltable frame="none" pgwide="0">
<tgroup cols="2">
<tbody>
<row>
- <entry><para>Output (may vary)</para></entry>
- <entry><para>BIOS (firmware) messages</para></entry>
+ <entry>&os; Component</entry>
+ <entry>Output (may vary)</entry>
</row>
<row>
- <entry><para><screen>F1 FreeBSD
+ <entry><literal>boot0</literal></entry>
+ <entry><screen>F1 FreeBSD
F2 BSD
-F5 Disk 2</screen></para></entry>
- <entry><para><literal>boot0</literal></para></entry>
+F5 Disk 2</screen></entry>
</row>
<row>
- <entry><para><screen>>>FreeBSD/i386 BOOT
-Default: 1:ad(1,a)/boot/loader
-boot:</screen></para></entry>
- <entry><para><literal>boot2</literal>
+ <entry><literal>boot2</literal>
<footnote><para>This prompt will appear if the user
presses a key just after selecting an OS to boot
at the <literal>boot0</literal>
- stage.</para></footnote></para></entry>
+ stage.</para></footnote></entry>
+ <entry><screen>>>FreeBSD/i386 BOOT
+Default: 1:ad(1,a)/boot/loader
+boot:</screen></entry>
</row>
<row>
- <entry><para><screen>BTX loader 1.0 BTX version is 1.01
-BIOS drive A: is disk0
-BIOS drive C: is disk1
-BIOS 639kB/64512kB available memory
-FreeBSD/i386 bootstrap loader, Revision 0.8
+ <entry><filename>loader</filename></entry>
+ <entry><screen>BTX loader 1.00 BTX version is 1.02
+Consoles: internal video/keyboard
+BIOS drive C: is disk0
+BIOS 639kB/2096064kB available memory
+
+FreeBSD/x86 bootstrap loader, Revision 1.1
Console internal video/keyboard
-(jkh at bento.freebsd.org, Mon Nov 20 11:41:23 GMT 2000)
-/kernel text=0x1234 data=0x2345 syms=[0x4+0x3456]
-Hit [Enter] to boot immediately, or any other key for command prompt
-Booting [kernel] in 9 seconds..._</screen></para></entry>
- <entry><para>loader</para></entry>
+(root at snap.freebsd.org, Thu Jan 16 22:18:05 UTC 2014)
+Loading /boot/defaults/loader.conf
+/boot/kernel/kernel text=0xed9008 data=0x117d28+0x176650 syms=[0x8+0x137988+0x8+0x1515f8]</screen></entry>
</row>
<row>
- <entry><para><screen>Copyright (c) 1992-2002 The FreeBSD Project.
+ <entry>kernel</entry>
+ <entry><screen>Copyright (c) 1992-2013 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
-FreeBSD 4.6-RC #0: Sat May 4 22:49:02 GMT 2002
- devnull at kukas:/usr/obj/usr/src/sys/DEVNULL
-Timecounter "i8254" frequency 1193182 Hz</screen></para></entry>
- <entry><para>kernel</para></entry>
+FreeBSD is a registered trademark of The FreeBSD Foundation.
+FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014
+ root at snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
+FreeBSD clang version 3.3 (tags/RELEASE_33/final 183502) 20130610</screen></entry>
</row>
</tbody>
</tgroup>
@@ -126,84 +155,114 @@ Timecounter "i8254" frequency 1193182 H
</sect1>
<sect1 xml:id="boot-bios">
- <title>BIOS POST</title>
+ <title>The <acronym>BIOS</acronym></title>
- <para>When the PC powers on, the processor's registers are set
- to some predefined values. One of the registers is the
+ <para>When the computer powers on, the processor's registers are
+ set to some predefined values. One of the registers is the
<emphasis>instruction pointer</emphasis> register, and its value
after a power on is well defined: it is a 32-bit value of
- 0xfffffff0. The instruction pointer register points to code to
- be executed by the processor. One of the registers is the
+ <literal>0xfffffff0</literal>. The instruction pointer register
+ (also known as the Program Counter) points to code to be
+ executed by the processor. Another important register is the
<literal>cr0</literal> 32-bit control register, and its value
- just after the reboot is 0. One of the cr0's bits, the bit PE
- (Protection Enabled) indicates whether the processor is running
- in protected or real mode. Since at boot time this bit is
- cleared, the processor boots in real mode. Real mode means,
+ just after a reboot is <literal>0</literal>. One of
+ <literal>cr0</literal>'s bits, the PE (Protection Enabled) bit,
+ indicates whether the processor is running in 32-bit protected
+ mode or 16-bit real mode. Since this bit is cleared at boot
+ time, the processor boots in 16-bit real mode. Real mode means,
among other things, that linear and physical addresses are
- identical.</para>
-
- <para>The value of 0xfffffff0 is slightly less then 4Gb, so unless
- the machine has 4Gb physical memory, it cannot point to a valid
- memory address. The computer's hardware translates this address
- so that it points to a BIOS memory block.</para>
-
- <para>BIOS stands for <emphasis>Basic Input Output
- System</emphasis>, and it is a chip on the motherboard that
- has a relatively small amount of read-only memory (ROM). This
+ identical. The processor does not start in 32-bit protected
+ mode for reasons of backwards compatibility.
+ In particular, the boot process relies on the services provided
+ by the <acronym>BIOS</acronym>, and the <acronym>BIOS</acronym>
+ itself runs legacy 16-bit code.</para>
+
+ <para>The value of <literal>0xfffffff0</literal> is slightly less
+ than 4 GB, so unless the machine has 4 GB of physical
+ memory, it cannot point to a valid memory address. The
+ computer's hardware translates this address so that it points to
+ a <acronym>BIOS</acronym> memory block.</para>
+
+ <para>The <acronym>BIOS</acronym> (Basic Input Output
+ System) is a chip on the motherboard that has a relatively small
+ amount of read-only memory (<acronym>ROM</acronym>). This
memory contains various low-level routines that are specific to
- the hardware supplied with the motherboard. So, the processor
- will first jump to the address 0xfffffff0, which really resides
- in the BIOS's memory. Usually this address contains a jump
- instruction to the BIOS's POST routines.</para>
-
- <para>POST stands for <emphasis>Power On Self Test</emphasis>.
- This is a set of routines including the memory check, system bus
- check and other low-level stuff so that the CPU can initialize
- the computer properly. The important step on this stage is
- determining the boot device. All modern BIOS's allow the boot
- device to be set manually, so you can boot from a floppy,
- CD-ROM, harddisk etc.</para>
-
- <para>The very last thing in the POST is the <literal>INT
- 0x19</literal> instruction. That instruction reads 512 bytes
- from the first sector of boot device into the memory at address
- 0x7c00. The term <emphasis>first sector</emphasis> originates
- from harddrive architecture, where the magnetic plate is divided
- to a number of cylindrical tracks. Tracks are numbered, and
- every track is divided by a number (usually 64) sectors. Track
- number 0 is the outermost on the magnetic plate, and sector 1,
- the first sector (tracks, or, cylinders, are numbered starting
- from 0, but sectors - starting from 1), has a special meaning.
- It is also called Master Boot Record, or MBR. The remaining
- sectors on the first track are never used <footnote><para>Some
- utilities such as &man.disklabel.8; may store the
- information in this area, mostly in the second
- sector.</para></footnote>.</para>
+ the hardware supplied with the motherboard. The processor will
+ first jump to the address <literal>0xfffffff0</literal>, which really resides in
+ the <acronym>BIOS</acronym>'s memory. Usually this address
+ contains a jump instruction to the <acronym>BIOS</acronym>'s
+ <acronym>POST</acronym> routines.</para>
+
+ <para>The <acronym>POST</acronym> (Power On Self Test)
+ is a set of routines including the memory check, system bus
+ check, and other low-level initialization so the
+ <acronym>CPU</acronym> can set up the computer properly. An
+ important step in this stage is determining the boot device.
+ Modern <acronym>BIOS</acronym> implementations permit the
+ selection of a boot device, allowing booting from a floppy,
+ <acronym>CD-ROM</acronym>, hard disk, or other devices.</para>
+
+ <para>The very last thing in the <acronym>POST</acronym> is the
+ <literal>INT 0x19</literal> instruction. The
+ <literal>INT 0x19</literal> handler reads 512 bytes from the
+ first sector of the boot device into the memory at address
+ <literal>0x7c00</literal>. The term
+ <emphasis>first sector</emphasis> originates from hard drive
+ architecture, where the magnetic plate is divided into a number
+ of cylindrical tracks. Tracks are numbered, and every track is
+ divided into a number (usually 64) of sectors. Track numbers
+ start at 0, but sector numbers start from 1. Track 0 is the
+ outermost on the magnetic plate, and sector 1, the first sector,
+ has a special purpose. It is also called the
+ <acronym>MBR</acronym>, or Master Boot Record. The remaining
+ sectors on the first track are never used.</para>
+
+ <para>This sector is our boot-sequence starting point. As we will
+ see, this sector contains a copy of our
+ <filename>boot0</filename> program. The
+ <acronym>BIOS</acronym> jumps to address <literal>0x7c00</literal>, and
+ <filename>boot0</filename> starts executing.</para>
</sect1>
<sect1 xml:id="boot-boot0">
- <title><literal>boot0</literal> Stage</title>
+ <title>The Master Boot Record (<literal>boot0</literal>)</title>
<indexterm><primary>MBR</primary></indexterm>
- <para>Take a look at the file <filename>/boot/boot0</filename>.
- This is a small 512-byte file, and it is exactly what FreeBSD's
- installation procedure wrote to your harddisk's MBR if you chose
- the <quote>bootmanager</quote> option at installation
- time.</para>
+
+ <para>After control is received from the <acronym>BIOS</acronym>
+ at memory address <literal>0x7c00</literal>,
+ <filename>boot0</filename> starts executing. It is the first
+ piece of code under &os; control. The task of
+ <filename>boot0</filename> is quite simple: scan the partition
+ table and let the user choose which partition to boot from. The
+ Partition Table is a special, standard data structure embedded
+ in the <acronym>MBR</acronym> (hence embedded in
+ <filename>boot0</filename>) describing the four standard PC
+ <quote>partitions</quote>
+ <footnote>
+ <para><link
+ xlink:href="http://en.wikipedia.org/wiki/Master_boot_record"></link></para></footnote>.
+ <filename>boot0</filename> resides in the filesystem as
+ <filename>/boot/boot0</filename>. It is a small 512-byte file,
+ and it is exactly what &os;'s installation procedure wrote to
+ the hard disk's <acronym>MBR</acronym> if you chose the <quote>bootmanager</quote>
+ option at installation time. Indeed,
+ <filename>boot0</filename> <emphasis>is</emphasis> the
+ <acronym>MBR</acronym>.</para>
<para>As mentioned previously, the <literal>INT 0x19</literal>
- instruction loads an MBR, i.e., the <filename>boot0</filename>
- content, into the memory at address 0x7c00. Taking a look at
- the file <filename>sys/boot/i386/boot0/boot0.S</filename> can
- give a guess at what is happening there - this is the boot
- manager, which is an awesome piece of code written by Robert
- Nordier.</para>
-
- <para>The MBR, or, <filename>boot0</filename>, has a special
- structure starting from offset 0x1be, called the
- <emphasis>partition table</emphasis>. It has 4 records of 16
- bytes each, called <emphasis>partition records</emphasis>, which
- represent how the harddisk(s) are partitioned, or, in FreeBSD's
+ instruction causes the <literal>INT 0x19</literal> handler to
+ load an <acronym>MBR</acronym> (<filename>boot0</filename>) into
+ memory at address <literal>0x7c00</literal>. The source file
+ for <filename>boot0</filename> is
+ <filename>sys/boot/i386/boot0/boot0.S</filename>, an
+ awesome piece of code written by Robert Nordier.</para>
+
+ <para>A special structure starting from offset
+ <literal>0x1be</literal> in the <acronym>MBR</acronym> is called
+ the <emphasis>partition table</emphasis>. It has four records
+ of 16 bytes each, called <emphasis>partition records</emphasis>,
+ which represent how the hard disk is partitioned, or, in &os;'s
terminology, sliced. One byte of those 16 says whether a
partition (slice) is bootable or not. Exactly one record must
have that flag set, otherwise <filename>boot0</filename>'s code
@@ -229,186 +288,1471 @@ Timecounter "i8254" frequency 1193182 H
</listitem>
</itemizedlist>
- <para>A partition record descriptor has the information about
+ <para>A partition record descriptor contains information about
where exactly the partition resides on the drive. Both
- descriptors, LBA and CHS, describe the same information, but in
- different ways: LBA (Logical Block Addressing) has the starting
- sector for the partition and the partition's length, while CHS
- (Cylinder Head Sector) has coordinates for the first and last
- sectors of the partition.</para>
-
- <para>The boot manager scans the partition table and prints the
- menu on the screen so the user can select what disk and what
- slice to boot. By pressing an appropriate key,
- <filename>boot0</filename> performs the following
- actions:</para>
+ descriptors, <acronym>LBA</acronym> and <acronym>CHS</acronym>,
+ describe the same information, but in different ways:
+ <acronym>LBA</acronym> (Logical Block Addressing) has the
+ starting sector for the partition and the partition's length,
+ while <acronym>CHS</acronym> (Cylinder Head Sector) has
+ coordinates for the first and last sectors of the partition.
+ The partition table ends with the special signature
+ <literal>0xaa55</literal>.</para>
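
The layout described above can be checked with a short script. This is a minimal sketch, not part of the FreeBSD sources: the function names and the example sector are hypothetical, and the field order assumed here (flag, CHS start, type, CHS end, LBA start, length) is the conventional PC partition record layout.

```python
import struct

PARTITION_TABLE_OFFSET = 0x1BE   # start of the four 16-byte records
SIGNATURE_OFFSET = 0x1FE         # 0x1be + 4 * 16, the last two bytes of the sector
ENTRY_FORMAT = "<B3sB3sII"       # flag, CHS first, type, CHS last, LBA start, length
ENTRY_SIZE = struct.calcsize(ENTRY_FORMAT)  # 16 bytes


def parse_partition_table(sector: bytes):
    """Return the four partition records of a 512-byte MBR as dictionaries."""
    if len(sector) != 512:
        raise ValueError("an MBR is exactly 512 bytes")
    # The signature is stored little-endian: bytes 0x55 0xaa read back as 0xaa55.
    (signature,) = struct.unpack_from("<H", sector, SIGNATURE_OFFSET)
    if signature != 0xAA55:
        raise ValueError("missing 0xaa55 signature")
    entries = []
    for index in range(4):
        offset = PARTITION_TABLE_OFFSET + index * ENTRY_SIZE
        flag, chs_first, ptype, chs_last, lba_start, lba_count = struct.unpack_from(
            ENTRY_FORMAT, sector, offset
        )
        entries.append({
            "bootable": flag == 0x80,   # the one-byte active flag
            "type": ptype,              # e.g. 0xa5 for FreeBSD
            "chs_first": chs_first,
            "chs_last": chs_last,
            "lba_start": lba_start,
            "lba_count": lba_count,
        })
    return entries


def make_example_mbr() -> bytes:
    """Build a made-up sector with one active FreeBSD slice (illustrative values)."""
    sector = bytearray(512)
    struct.pack_into(ENTRY_FORMAT, sector, PARTITION_TABLE_OFFSET,
                     0x80, b"\x01\x01\x00", 0xA5, b"\xfe\xff\xff", 63, 1000000)
    struct.pack_into("<H", sector, SIGNATURE_OFFSET, 0xAA55)
    return bytes(sector)


entries = parse_partition_table(make_example_mbr())
```

Running the parser on a real sector only requires reading the first 512 bytes of the disk into `sector`; the example sector here exists solely to make the sketch self-contained.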
+
+ <para>The <acronym>MBR</acronym> must fit into 512 bytes, a single
+ disk sector. This program uses low-level <quote>tricks</quote>
+ like taking advantage of the side effects of certain
+ instructions and reusing register values from previous
+ operations to make the most out of the fewest possible
+ instructions. Care must also be taken when handling the
+ partition table, which is embedded in the <acronym>MBR</acronym>
+ itself. For these reasons, be very careful when modifying
+ <filename>boot0.S</filename>.</para>
+
+ <para>Note that the <filename>boot0.S</filename> source file
+ is assembled <quote>as is</quote>: instructions are translated
+ one by one to binary, with no additional information (no
+ <acronym>ELF</acronym> file format, for example). This kind of
+ low-level control is achieved at link time through special
+ control flags passed to the linker. For example, the text
+ section of the program is set to be located at address
+ <literal>0x600</literal>. In practice this means that
+ <filename>boot0</filename> must be loaded to memory address
+ <literal>0x600</literal> in order to function properly.</para>
+
+ <para>It is worth looking at the <filename>Makefile</filename> for
+ <filename>boot0</filename>
+ (<filename>sys/boot/i386/boot0/Makefile</filename>), as it
+ defines some of the run-time behavior of
+ <filename>boot0</filename>. For instance, if a terminal
+ connected to the serial port (COM1) is used for I/O, the macro
+ <literal>SIO</literal> must be defined
+ (<literal>-DSIO</literal>). <literal>-DPXE</literal> enables
+ boot through <acronym>PXE</acronym> by pressing
+ <keycap>F6</keycap>. Additionally, the program defines a set of
+ <emphasis>flags</emphasis> that allow further modification of
+ its behavior. All of this is illustrated in the
+ <filename>Makefile</filename>. For example, look at the
+ linker directives which command the linker to start the text
+ section at address <literal>0x600</literal>, and to build the
+ output file <quote>as is</quote> (strip out any file
+ formatting):</para>
+
+ <figure xml:id="boot-boot0-makefile-as-is">
+ <title><filename>sys/boot/i386/boot0/Makefile</filename></title>
+
+ <programlisting> BOOT_BOOT0_ORG?=0x600
+ LDFLAGS=-e start -Ttext ${BOOT_BOOT0_ORG} \
+ -Wl,-N,-S,--oformat,binary</programlisting>
+ </figure>
+
+ <para>Let us now start our study of the <acronym>MBR</acronym>, or
+ <filename>boot0</filename>, starting where execution
+ begins.</para>
+
+ <note>
+ <para>Some modifications have been made to some instructions in
+ favor of better exposition. For example, some macros are
+ expanded, and some macro tests are omitted when the result of
+ the test is known. This applies to all of the code examples
+ shown.</para>
+ </note>
+
+ <figure xml:id="boot-boot0-entrypoint">
+ <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+ <programlisting>start:
+ cld # String ops inc
+ xorw %ax,%ax # Zero
+ movw %ax,%es # Address
+ movw %ax,%ds # data
+ movw %ax,%ss # Set up
+ movw $0x7c00,%sp # stack
+ </figure>
+
+ <para>This first block of code is the entry point of the program.
+ It is where the <acronym>BIOS</acronym> transfers control.
+ First, it makes sure that the string operations autoincrement
+ their pointer operands (the <literal>cld</literal> instruction)
+ <footnote>
+ <para>When in doubt, we refer the reader to the official Intel
+ manuals, which describe the exact semantics for each
+ instruction: <link
+ xlink:href="http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html"></link>.</para></footnote>.
+ Then, as it makes no assumption about the state of the segment
+ registers, it initializes them. Finally, it sets the stack
+ pointer register (<literal>%sp</literal>) to address
+ <literal>0x7c00</literal>, so we have a working stack.</para>
+
+ <para>The next block is responsible for the relocation and
+ subsequent jump to the relocated code.</para>
+
+ <figure xml:id="boot-boot0-relocation">
+ <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+ <programlisting> movw $0x7c00,%si # Source
+ movw $0x600,%di # Destination
+ movw $512,%cx # Byte count
+ rep # Relocate
+ movsb # code
+ movw %di,%bp # Address variables
+ movb $16,%cl # Bytes to clear
+ rep # Zero
+ stosb # them
+ incb -0xe(%di) # Set the S field to 1
+ jmp main-0x7c00+0x600 # Jump to relocated code</programlisting>
+ </figure>
+
+ <para>Because <filename>boot0</filename> is loaded by the
+ <acronym>BIOS</acronym> to address <literal>0x7C00</literal>, it
+ copies itself to address <literal>0x600</literal> and then
+ transfers control there (recall that it was linked to execute at
+ address <literal>0x600</literal>). The source address,
+ <literal>0x7c00</literal>, is copied to register
+ <literal>%si</literal>. The destination address,
+ <literal>0x600</literal>, is copied to register <literal>%di</literal>.
+ The number of bytes to copy, <literal>512</literal> (the
+ program's size), is copied to register <literal>%cx</literal>.
+ Next, the <literal>rep</literal> instruction repeats the
+ instruction that follows, that is, <literal>movsb</literal>, the
+ number of times dictated by the <literal>%cx</literal> register.
+ The <literal>movsb</literal> instruction copies the byte pointed
+ to by <literal>%si</literal> to the address pointed to by
+ <literal>%di</literal>. This is repeated another 511 times. On
+ each repetition, both the source and destination registers,
+ <literal>%si</literal> and <literal>%di</literal>, are
+ incremented by one. Thus, upon completion of the 512-byte copy,
+ <literal>%di</literal> has the value
+ <literal>0x600</literal>+<literal>512</literal>=
+ <literal>0x800</literal>, and <literal>%si</literal> has the
+ value <literal>0x7c00</literal>+<literal>512</literal>=
+ <literal>0x7e00</literal>; we have thus completed the code
+ <emphasis>relocation</emphasis>.</para>
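
The register arithmetic of the relocation can be replayed in a few lines. The following is a rough model only: `rep_movsb` is a hypothetical helper that imitates the instruction pair with the direction flag clear, and the byte pattern standing in for the loaded sector is made up.

```python
def rep_movsb(memory: bytearray, si: int, di: int, cx: int):
    """Model `rep movsb` after `cld`: copy one byte, advance both pointers."""
    while cx != 0:
        memory[di] = memory[si]
        si += 1
        di += 1
        cx -= 1
    return si, di, cx


memory = bytearray(0x10000)                    # one 64 KB real-mode segment
memory[0x7C00:0x7E00] = bytes(range(256)) * 2  # stand-in for the sector the BIOS loaded

si, di, cx = rep_movsb(memory, si=0x7C00, di=0x600, cx=512)

print(hex(si), hex(di), cx)
```

After the loop, `si` and `di` hold the end addresses quoted in the text, and `cx` is zero, which is why the code that follows only needs to load `%cl` to set up the next count.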
+
+ <para>Next, the destination register
+ <literal>%di</literal> is copied to <literal>%bp</literal>.
+ <literal>%bp</literal> gets the value <literal>0x800</literal>.
+ The value <literal>16</literal> is copied to
+ <literal>%cl</literal> in preparation for a new string operation
+ (like our previous <literal>movsb</literal>). Now,
+ <literal>stosb</literal> is executed 16 times. This instruction
+ copies a <literal>0</literal> value to the address pointed to by
+ the destination register (<literal>%di</literal>, which is
+ <literal>0x800</literal>), and increments it. This is repeated
+ another 15 times, so <literal>%di</literal> ends up with value
+ <literal>0x810</literal>. Effectively, this clears the address
+ range <literal>0x800</literal>-<literal>0x80f</literal>. This
+ range is used as a (fake) partition table for writing the
+ <acronym>MBR</acronym> back to disk. Finally, the sector field
+ for the <acronym>CHS</acronym> addressing of this fake partition
+ is given the value 1 and a jump is made to the main function
+ from the relocated code. Note that until this jump to the
+ relocated code, any reference to an absolute address was
+ avoided.</para>
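
To see where the <literal>incb</literal> lands, the clearing step can be modeled the same way. This is a sketch under stated assumptions: `rep_stosb` is a hypothetical helper, the memory is pre-filled with `0xff` only so the cleared range is visible, and the sector byte is taken to sit at offset 2 of a partition record, as in the conventional record layout.

```python
def rep_stosb(memory: bytearray, di: int, cx: int, al: int = 0):
    """Model `rep stosb` after `cld`: store %al, advance %di, count down %cx."""
    while cx != 0:
        memory[di] = al
        di += 1
        cx -= 1
    return di, cx


memory = bytearray(b"\xff" * 0x10000)   # pretend the area holds stale data
bp = di = 0x800                         # %di after the relocation, saved in %bp
di, cx = rep_stosb(memory, di, cx=16)   # %al is 0 from the earlier xorw %ax,%ax

sector_field = di - 0xE                 # operand of incb -0xe(%di)
memory[sector_field] = (memory[sector_field] + 1) & 0xFF

print(hex(di), hex(sector_field), memory[0x800:0x810].hex())
```

The computed address is `0x810 - 0xe = 0x802`, that is, byte 2 of the fake record that starts at `0x800`.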
+
+ <para>The following code block tests whether the drive number
+ provided by the <acronym>BIOS</acronym> should be used, or
+ the one stored in <filename>boot0</filename>.</para>
+
+ <figure xml:id="boot-boot0-drivenumber">
+ <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+ <programlisting>main:
+ testb $SETDRV,-69(%bp) # Set drive number?
+ jnz disable_update # Yes
+ testb %dl,%dl # Drive number valid?
+ js save_curdrive # Possibly (0x80 set)</programlisting>
+ </figure>
+
+ <para>This code tests the <literal>SETDRV</literal> bit
+ (<literal>0x20</literal>) in the <emphasis>flags</emphasis>
+ variable. Recall that register <literal>%bp</literal> points to
+ address location <literal>0x800</literal>, so the test is done
+ to the <emphasis>flags</emphasis> variable at address
+ <literal>0x800</literal>-<literal>69</literal>=
+ <literal>0x7bb</literal>. This is an example of the type of
+ modifications that can be done to <filename>boot0</filename>.
+ The <literal>SETDRV</literal> flag is not set by default, but it
+ can be set in the <filename>Makefile</filename>. When set, the
+ drive number stored in the <acronym>MBR</acronym> is used
+ instead of the one provided by the <acronym>BIOS</acronym>. We
+ assume the defaults, and that the <acronym>BIOS</acronym>
+ provided a valid drive number, so we jump to
+ <literal>save_curdrive</literal>.</para>
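
The two tests can be paraphrased as follows. This is an illustrative sketch, not the real control flow in full: `choose_path` is a hypothetical name, and the `"fall_through"` result simply stands for the code after the <literal>js</literal> instruction, which is not shown in this excerpt.

```python
SETDRV = 0x20        # flag bit tested by `testb $SETDRV,-69(%bp)`
FLAGS_OFFSET = -69   # displacement of the flags variable from %bp


def choose_path(memory: bytearray, bp: int, dl: int) -> str:
    """Mirror the branch decisions made at `main`."""
    flags = memory[bp + FLAGS_OFFSET]
    if flags & SETDRV:       # jnz disable_update
        return "disable_update"
    if dl & 0x80:            # testb %dl,%dl sets the sign flag when bit 7 is set
        return "save_curdrive"
    return "fall_through"    # code past the excerpt shown above


memory = bytearray(0x10000)
bp = 0x800
flags_address = bp + FLAGS_OFFSET        # 0x800 - 69 = 0x7bb

default_path = choose_path(memory, bp, dl=0x80)   # defaults, first hard disk
memory[flags_address] |= SETDRV
forced_path = choose_path(memory, bp, dl=0x80)    # SETDRV set at build time
```

With the default flags and a BIOS drive number of `0x80`, the sketch takes the same branch the text assumes.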
+
+ <para>The next block saves the drive number provided by the
+ <acronym>BIOS</acronym>, and calls <literal>putn</literal> to
+ print a new line on the screen.</para>
+
+ <figure xml:id="boot-boot0-savedrivenumber">
+ <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+ <programlisting>save_curdrive:
+ movb %dl, (%bp) # Save drive number
+ pushw %dx # Also in the stack
+#ifdef TEST /* test code, print internal bios drive */
+ rolb $1, %dl
+ movw $drive, %si
+ call putkey
+#endif
+ callw putn # Print a newline</programlisting>
+ </figure>
+
+ <para>Note that we assume <varname>TEST</varname> is not defined,
+ so the conditional code in it is not assembled and will not
+ appear in our executable <filename>boot0</filename>.</para>
+
+ <para>Our next block implements the actual scanning of the
+ partition table. It prints to the screen the partition type for
+ each of the four entries in the partition table. It compares
+ each type with a list of well-known operating system file
+ systems. Examples of recognized partition types are
+ <acronym>NTFS</acronym> (&windows;, ID 0x7),
+ <literal>ext2fs</literal> (&linux;, ID 0x83), and, of course,
+ <literal>ffs</literal>/<literal>ufs2</literal> (&os;, ID 0xa5).
+ The implementation is fairly simple.</para>
+
+ <figure xml:id="boot-boot0-partition-scan">
+ <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+ <programlisting> movw $(partbl+0x4),%bx # Partition table (+4)
+ xorw %dx,%dx # Item number
+
+read_entry:
+ movb %ch,-0x4(%bx) # Zero active flag (ch == 0)
+ btw %dx,_FLAGS(%bp) # Entry enabled?
+ jnc next_entry # No
+ movb (%bx),%al # Load type
+ test %al, %al # skip empty partition
+ jz next_entry
+ movw $bootable_ids,%di # Lookup tables
+ movb $(TLEN+1),%cl # Number of entries
+ repne # Locate
+ scasb # type
+ addw $(TLEN-1), %di # Adjust
+ movb (%di),%cl # Partition
+ addw %cx,%di # description
+ callw putx # Display it
+
+next_entry:
+ incw %dx # Next item
+ addb $0x10,%bl # Next entry
+ jnc read_entry # Till done</programlisting>
+ </figure>
+
+ <para>It is important to note that the active flag for each entry
+ is cleared, so after the scanning, <emphasis>no</emphasis>
+ partition entry is active in our memory copy of
+ <filename>boot0</filename>. Later, the active flag will be set
+ for the selected partition. This ensures that only one active
+ partition exists if the user chooses to write the changes back
+ to disk.</para>
+
+ <para>The next block tests for other drives. At startup,
+ the <acronym>BIOS</acronym> writes the number of drives present
+ in the computer to address <literal>0x475</literal>. If there
+ are any other drives present, <filename>boot0</filename> prints
+ the current drive to screen. The user may command
+ <filename>boot0</filename> to scan partitions on another drive
+ later.</para>
+
+ <figure xml:id="boot-boot0-test-drives">
+ <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+ <programlisting> popw %ax # Drive number
+ subb $0x7f,%al # Does next
+ cmpb 0x475,%al # drive exist? (from BIOS?)
+ jb print_drive # Yes
+ decw %ax # Already drive 0?
+ jz print_prompt # Yes</programlisting>
+ </figure>
+
+ <para>We make the assumption that a single drive is present, so
+ the jump to <literal>print_drive</literal> is not performed. We
+ also assume nothing strange happened, so we jump to
+ <literal>print_prompt</literal>.</para>
+
+ <para>This next block just prints out a prompt followed by the
+ default option:</para>
+
+ <figure xml:id="boot-boot0-prompt">
+ <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+ <programlisting>print_prompt:
+ movw $prompt,%si # Display
+ callw putstr # prompt
+ movb _OPT(%bp),%dl # Display
+ decw %si # default
+ callw putkey # key
+ jmp start_input # Skip beep</programlisting>
+ </figure>
+
+ <para>Finally, a jump is performed to
+ <literal>start_input</literal>, where the
+ <acronym>BIOS</acronym> services are used to start a timer and
+ for reading user input from the keyboard; if the timer expires,
+ the default option will be selected:</para>
+
+ <figure xml:id="boot-boot0-start-input">
+ <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+ <programlisting>start_input:
+ xorb %ah,%ah # BIOS: Get
+ int $0x1a # system time
+ movw %dx,%di # Ticks when
+ addw _TICKS(%bp),%di # timeout
+read_key:
+ movb $0x1,%ah # BIOS: Check
+ int $0x16 # for keypress
+ jnz got_key # Have input
+ xorb %ah,%ah # BIOS: int 0x1a, 00
+ int $0x1a # get system time
+ cmpw %di,%dx # Timeout?
+ jb read_key # No</programlisting>
+ </figure>
+
+ <para>An interrupt is requested with number
+ <literal>0x1a</literal> and argument <literal>0</literal> in
+ register <literal>%ah</literal>. The <acronym>BIOS</acronym>
+ has a predefined set of services, requested by applications as
+ software-generated interrupts through the <literal>int</literal>
+ instruction and receiving arguments in registers (in this case,
+ <literal>%ah</literal>). Here, particularly, we are requesting
+ the number of clock ticks since last midnight; this value is
+ computed by the <acronym>BIOS</acronym> through the
+ <acronym>RTC</acronym> (Real Time Clock). This clock can be
+ programmed to work at frequencies ranging from 2 Hz to
+ 8192 Hz. The <acronym>BIOS</acronym> sets it to
+ 18.2 Hz at startup. When the request is satisfied, a
+ 32-bit result is returned by the <acronym>BIOS</acronym> in
+ registers <literal>%cx</literal> and <literal>%dx</literal>
+ (lower 16 bits in <literal>%dx</literal>). This result (the
+ <literal>%dx</literal> part) is copied to register
+ <literal>%di</literal>, and the value of the
+ <varname>TICKS</varname> variable is added to
+ <literal>%di</literal>. This variable resides in
+ <filename>boot0</filename> at offset <literal>_TICKS</literal>
+ (a negative value) from register <literal>%bp</literal> (which,
+ recall, points to <literal>0x800</literal>). The default value
+ of this variable is <literal>0xb6</literal> (182 in decimal).
+ Now, the idea is that <filename>boot0</filename> constantly
+ requests the time from the <acronym>BIOS</acronym>, and when the
+ value returned in register <literal>%dx</literal> is no longer
+ below the value stored in <literal>%di</literal>, the time is up
+ and the default selection will be made. Since the
+ <acronym>RTC</acronym> ticks 18.2 times per second, this
+ condition will be met after 10 seconds (182 / 18.2; this
+ default behaviour can be changed in the
+ <filename>Makefile</filename>). Until this time has passed,
+ <filename>boot0</filename> continually asks the
+ <acronym>BIOS</acronym> for any user input; this is done through
+ <literal>int 0x16</literal>, argument <literal>1</literal> in
+ <literal>%ah</literal>.</para>
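+
+ <para>The timeout logic can be modelled outside the boot
+ environment. The following Python sketch is illustrative only
+ and is not code from <filename>boot0</filename>: the names
+ <literal>wait_for_key</literal>, <literal>tick_source</literal>
+ and <literal>key_ready</literal> are hypothetical stand-ins for
+ the two <acronym>BIOS</acronym> services. It assumes the
+ default <varname>TICKS</varname> value and the tick rate quoted
+ above, and it does not model the counter wrapping around at
+ midnight:</para>
+
+ <programlisting>
```python
# Illustrative model of the boot0 timeout loop; all names are hypothetical.
TICKS = 0xb6          # default timeout in BIOS ticks (182)
TICK_RATE_HZ = 18.2   # tick rate assumed in the text


def wait_for_key(tick_source, key_ready):
    """Poll key_ready() until it is true or TICKS ticks have elapsed.

    tick_source yields the low 16 bits of the BIOS tick counter (%dx).
    Returns True when a key arrived, False when the timeout expired.
    """
    ticks = iter(tick_source)
    deadline = (next(ticks) + TICKS) % 0x10000   # addw _TICKS(%bp),%di
    while True:
        if key_ready():        # int 0x16 with %ah = 1
            return True
        now = next(ticks)      # int 0x1a with %ah = 0
        if now >= deadline:    # "jb read_key" is not taken
            return False


timeout_seconds = TICKS / TICK_RATE_HZ   # roughly 10 seconds
```
+ </programlisting>
+
+ <para>Calling <literal>wait_for_key(range(1000, 2000), lambda:
+ False)</literal> simulates a keyboard that never produces
+ input, so the function returns once the simulated counter
+ reaches the deadline.</para>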
+
+ <para>Whether a key was pressed or the time expired, subsequent
+ code validates the selection. Based on the selection, the
+ register <literal>%si</literal> is set to point to the
+ appropriate partition entry in the partition table. This new
+ selection overrides the previous default one. Indeed, it
+ becomes the new default. Finally, the ACTIVE flag of the
+ selected partition is set. If it was enabled at compile time,
+ the in-memory version of <filename>boot0</filename> with these
+ modified values is written back to the <acronym>MBR</acronym> on
+ disk. We leave the details of this implementation to the
+ reader.</para>
+
+ <para>We now end our study with the last code block from the
+ <filename>boot0</filename> program:</para>
+
+ <figure xml:id="boot-boot0-check-bootable">
+ <title><filename>sys/boot/i386/boot0/boot0.S</filename></title>
+
+ <programlisting> movw $0x7c00,%bx # Address for read
+ movb $0x2,%ah # Read sector
+ callw intx13 # from disk
+ jc beep # If error
+ cmpw $0xaa55,0x1fe(%bx) # Bootable?
+ jne beep # No
+ pushw %si # Save ptr to selected part.
+ callw putn # Leave some space
+ popw %si # Restore, next stage uses it
+ jmp *%bx # Invoke bootstrap</programlisting>
+ </figure>
+
+ <para>Recall that <literal>%si</literal> points to the selected
+ partition entry. This entry tells us where the partition begins
+ on disk. We assume, of course, that the partition selected is
+ actually a &os; slice.</para>
+
+ <note>
+ <para>From now on, we will favor the use of the technically
+ more accurate term <quote>slice</quote> rather than
+ <quote>partition</quote>.</para>
+ </note>
+
+ <para>The transfer buffer is set to <literal>0x7c00</literal>
+ (register <literal>%bx</literal>), and a read for the first
+ sector of the &os; slice is requested by calling
+ <literal>intx13</literal>. We assume that everything went okay,
+ so a jump to <literal>beep</literal> is not performed. In
+ particular, the new sector read must end with the magic sequence
+ <literal>0xaa55</literal>. Finally, the value in
+ <literal>%si</literal> (the pointer to the selected partition
+ table entry) is preserved for use by the next stage, and a jump
+ is performed to address <literal>0x7c00</literal>, where
+ execution of our next stage (the just-read block) is started.</para>
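+
+ <para>The bootability test is simple enough to reproduce on any
+ sector image. The Python sketch below is an illustration under
+ the assumption of 512-byte sectors; the names
+ <literal>is_bootable</literal>, <literal>good</literal> and
+ <literal>bad</literal> are hypothetical and do not appear in
+ <filename>boot0</filename>:</para>
+
+ <programlisting>
```python
# Illustrative check of the boot signature; all names are hypothetical.
SECTOR_SIZE = 512
SIGNATURE_OFFSET = 0x1fe
BOOT_SIGNATURE = 0xaa55


def is_bootable(sector):
    """Return True if a 512-byte sector ends with the 0xaa55 signature."""
    if len(sector) != SECTOR_SIZE:
        return False
    # x86 is little-endian, so the word 0xaa55 is stored as 0x55, 0xaa.
    word = int.from_bytes(sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2], "little")
    return word == BOOT_SIGNATURE


good = bytes(510) + b"\x55\xaa"   # zero-filled sector with a valid signature
bad = bytes(512)                  # zero-filled sector without one
```
+ </programlisting>
+
+ <para>Here <literal>is_bootable(good)</literal> is true and
+ <literal>is_bootable(bad)</literal> is false, which corresponds
+ to falling through to the final jump or branching to
+ <literal>beep</literal>, respectively.</para>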
+ </sect1>
+
+ <sect1 xml:id="boot-boot1">
+ <title><literal>boot1</literal> Stage</title>
+
+ <para>So far we have gone through the following sequence:</para>
<itemizedlist>
<listitem>
- <para>modifies the bootable flag for the selected partition to
- make it bootable, and clears the previous</para>
+ <para>The <acronym>BIOS</acronym> did some early hardware
+ initialization, including the <acronym>POST</acronym>. The
+ <acronym>MBR</acronym> (<filename>boot0</filename>) was
+ loaded from absolute disk sector one to address
+ <literal>0x7c00</literal>. Execution control was passed to
+ that location.</para>
</listitem>
<listitem>
- <para>saves itself to disk to remember what partition (slice)
- has been selected so to use it as the default on the next
- boot</para>
+ <para><filename>boot0</filename> relocated itself to the
+ location it was linked to execute
+ (<literal>0x600</literal>), followed by a jump to continue
+ execution at the appropriate place. Finally,
+ <filename>boot0</filename> loaded the first disk sector from
+ the &os; slice to address <literal>0x7c00</literal>.
+ Execution control was passed to that location.</para>
</listitem>
+ </itemizedlist>
+
+ <para><filename>boot1</filename> is the next step in the
+ boot-loading sequence. It is the first of three boot stages.
+ Note that we have been dealing exclusively
+ with disk sectors. Indeed, the <acronym>BIOS</acronym> loads
+ the absolute first sector, while <filename>boot0</filename>
+ loads the first sector of the &os; slice. Both loads are to
+ address <literal>0x7c00</literal>. We can conceptually think of
+ these disk sectors as containing the files
+ <filename>boot0</filename> and <filename>boot1</filename>,
+ respectively, but in reality this is not entirely true for
+ <filename>boot1</filename>. Strictly speaking, unlike
+ <filename>boot0</filename>, <filename>boot1</filename> is not
+ part of the boot blocks
+ <footnote>
+ <para>There is a file <filename>/boot/boot1</filename>, but it
+ is not written to the beginning of the &os; slice.
+ Instead, it is concatenated with <filename>boot2</filename>
+ to form <filename>boot</filename>, which
+ <emphasis>is</emphasis> written to the beginning of the &os;
+ slice and read at boot time.</para></footnote>.
+ Instead, a single, full-blown file, <filename>boot</filename>
+ (<filename>/boot/boot</filename>), is what ultimately is
+ written to disk. This file is a combination of
+ <filename>boot1</filename>, <filename>boot2</filename> and the
+ <literal>Boot Extender</literal> (or <acronym>BTX</acronym>).
+ This single file is greater in size than a single sector
+ (greater than 512 bytes). Fortunately,
+ <filename>boot1</filename> occupies <emphasis>exactly</emphasis>
+ the first 512 bytes of this single file, so when
+ <filename>boot0</filename> loads the first sector of the &os;
+ slice (512 bytes), it is actually loading
+ <filename>boot1</filename> and transferring control to
+ it.</para>
+
+ <para>The main task of <filename>boot1</filename> is to load the
+ next boot stage. This next stage is somewhat more complex. It
+ is composed of a server called the <quote>Boot Extender</quote>,
+ or <acronym>BTX</acronym>, and a client, called
+ <filename>boot2</filename>. As we will see, the last boot
+ stage, <filename>loader</filename>, is also a client of the
+ <acronym>BTX</acronym> server.</para>
+
+ <para>Let us now look in detail at what exactly is done by
+ <filename>boot1</filename>, starting, as we did for
+ <filename>boot0</filename>, at its entry point:</para>
+
+ <figure xml:id="boot-boot1-entry">
+ <title><filename>sys/boot/i386/boot2/boot1.S</filename></title>
+
+ <programlisting>start:
+ jmp main</programlisting>
+ </figure>
+
+ <para>The entry point at <literal>start</literal> simply jumps
+ past a special data area to the label <literal>main</literal>,
+ which in turn looks like this:</para>
+
+ <figure xml:id="boot-boot1-main">
+ <title><filename>sys/boot/i386/boot2/boot1.S</filename></title>
+
+ <programlisting>main:
+ cld # String ops inc
+ xor %cx,%cx # Zero
+ mov %cx,%es # Address
+ mov %cx,%ds # data
+ mov %cx,%ss # Set up
+ mov $start,%sp # stack
+ mov %sp,%si # Source
+ mov $0x700,%di # Destination
+ incb %ch # Word count
+ rep # Copy
+ movsw # code</programlisting>
+ </figure>
+
+ <para>Just like <filename>boot0</filename>, this
+ code relocates <filename>boot1</filename>,
+ this time to memory address <literal>0x700</literal>. However,
+ unlike <filename>boot0</filename>, it does not jump there.
+ <filename>boot1</filename> is linked to execute at
+ address <literal>0x7c00</literal>, effectively where it was
+ loaded in the first place. The reason for this relocation will
+ be discussed shortly.</para>
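+
+ <para>The effect of the copy can be checked with a small model.
+ This Python sketch is an illustration, not part of
+ <filename>boot1</filename>: real-mode memory is represented by
+ a hypothetical 64 KB <literal>bytearray</literal>, and the
+ function name <literal>relocate</literal> is invented for the
+ example. It assumes <literal>%cx</literal> was zero before
+ <literal>incb %ch</literal>, as in the listing above:</para>
+
+ <programlisting>
```python
# Illustrative model of the boot1 relocation; all names are hypothetical.
LOAD_ADDR = 0x7c00    # where boot0 loaded boot1 ($start)
RELOC_ADDR = 0x700    # destination placed in %di
WORD_COUNT = 0x100    # %cx after "incb %ch" on a zeroed register


def relocate(memory):
    """Copy WORD_COUNT 16-bit words, as "rep movsw" does after cld."""
    nbytes = WORD_COUNT * 2
    # Source and destination do not overlap, so one slice copy is equivalent.
    memory[RELOC_ADDR:RELOC_ADDR + nbytes] = memory[LOAD_ADDR:LOAD_ADDR + nbytes]
    return nbytes


memory = bytearray(0x10000)                          # 64 KB of memory
memory[LOAD_ADDR:LOAD_ADDR + 512] = bytes(range(256)) * 2
copied = relocate(memory)
```
+ </programlisting>
+
+ <para>The count of 256 words is what makes the copy cover one
+ full 512-byte sector, leaving an identical image at
+ <literal>0x700</literal> through <literal>0x8ff</literal>.</para>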
+
+ <para>Next comes a loop that looks for the &os; slice. Although
+ <filename>boot0</filename> loaded <filename>boot1</filename>
+ from the &os; slice, no information was passed to it about this
+ <footnote>
+ <para>Actually we did pass a pointer to the slice entry in
+ register <literal>%si</literal>. However,
+ <filename>boot1</filename> does not assume that it was
+ loaded by <filename>boot0</filename> (perhaps some other
+ <acronym>MBR</acronym> loaded it, and did not pass this
+ information), so it assumes nothing.</para></footnote>,
+ so <filename>boot1</filename> must rescan the
+ partition table to find where the &os; slice starts. Therefore
+ it rereads the <acronym>MBR</acronym>:</para>
+
+ <figure xml:id="boot-boot1-find-freebsd">
+ <title><filename>sys/boot/i386/boot2/boot1.S</filename></title>
+
+ <programlisting> mov $part4,%si # Partition
+ cmpb $0x80,%dl # Hard drive?
+ jb main.4 # No
+ movb $0x1,%dh # Block count
+ callw nread # Read MBR</programlisting>
+ </figure>
+
+ <para>In the code above, register <literal>%dl</literal>
+ maintains information about the boot device. This is passed on
+ by the <acronym>BIOS</acronym> and preserved by the
+ <acronym>MBR</acronym>. Numbers <literal>0x80</literal> and
+ greater tell us that we are dealing with a hard drive, so a
+ call is made to <literal>nread</literal>, where the
+ <acronym>MBR</acronym> is read. Arguments to
+ <literal>nread</literal> are passed through
+ <literal>%si</literal> and <literal>%dh</literal>. The memory
+ address at label <literal>part4</literal> is copied to
+ <literal>%si</literal>. This memory address holds a
+ <quote>fake partition</quote> to be used by
+ <literal>nread</literal>. The following is the data in the fake
+ partition:</para>
+
+ <figure xml:id="boot-boot2-make-fake-partition">
+ <title><filename>sys/boot/i386/boot2/boot1.S</filename></title>
+
+ <programlisting> part4:
+ .byte 0x80, 0x00, 0x01, 0x00
+ .byte 0xa5, 0xfe, 0xff, 0xff
+ .byte 0x00, 0x00, 0x00, 0x00
+ .byte 0x50, 0xc3, 0x00, 0x00</programlisting>
+ </figure>
+
+ <para>In particular, the <acronym>LBA</acronym> for this fake
+ partition is hardcoded to zero. This is used as an argument to
+ the <acronym>BIOS</acronym> for reading absolute sector one from
+ the hard drive. Alternatively, CHS addressing could be used.
+ In this case, the fake partition holds cylinder 0, head 0 and
+ sector 1, which is equivalent to absolute sector one.</para>
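+
+ <para>The sixteen bytes can be decoded by hand or with a few
+ lines of code. The Python sketch below is illustrative and
+ assumes the conventional <acronym>MBR</acronym> partition entry
+ layout (flag, starting CHS, type, ending CHS, starting
+ <acronym>LBA</acronym>, sector count); the names
+ <literal>PART4</literal>, <literal>decode_entry</literal> and
+ <literal>fields</literal> are hypothetical:</para>
+
+ <programlisting>
```python
# Illustrative decoder for a 16-byte MBR partition entry; names are hypothetical.
PART4 = bytes([
    0x80, 0x00, 0x01, 0x00,
    0xa5, 0xfe, 0xff, 0xff,
    0x00, 0x00, 0x00, 0x00,
    0x50, 0xc3, 0x00, 0x00,
])


def decode_entry(entry):
    """Split a partition entry into its conventional MBR fields."""
    head, sec_cyl, cyl_low = entry[1], entry[2], entry[3]
    return {
        "flag": entry[0],
        "head": head,
        "sector": sec_cyl % 64,                       # low six bits
        "cylinder": (sec_cyl // 64) * 256 + cyl_low,  # top two bits + low byte
        "type": entry[4],
        "lba_start": int.from_bytes(entry[8:12], "little"),
        "sectors": int.from_bytes(entry[12:16], "little"),
    }


fields = decode_entry(PART4)
```
+ </programlisting>
+
+ <para>Decoding the entry this way yields a starting
+ <acronym>LBA</acronym> of zero and a starting CHS of cylinder
+ 0, head 0, sector 1, matching the description above.</para>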
+
+ <para>Let us now proceed to take a look at
+ <literal>nread</literal>:</para>
+
+ <figure xml:id="boot-boot1-nread">
+ <title><filename>sys/boot/i386/boot2/boot1.S</filename></title>
+
+ <programlisting>nread:
+ mov $0x8c00,%bx # Transfer buffer
+ mov 0x8(%si),%ax # Get
+ mov 0xa(%si),%cx # LBA
+ push %cs # Read from
+ callw xread.1 # disk
+ jnc return # If success, return</programlisting>
+ </figure>
+
+ <para>Recall that <literal>%si</literal> points to the fake
+ partition. The word
+ <footnote>
+ <para>In the context of 16-bit real mode, a word is 2
+ bytes.</para></footnote>
+ at offset <literal>0x8</literal> is copied to register
+ <literal>%ax</literal> and the word at offset
+ <literal>0xa</literal> to <literal>%cx</literal>. Together
+ they are interpreted by the <acronym>BIOS</acronym> as the lower
+ four bytes of the LBA to be read (the upper four bytes are
+ assumed to be zero).
+ Register <literal>%bx</literal> holds the memory address where
+ the <acronym>MBR</acronym> will be loaded. The instruction
+ pushing <literal>%cs</literal> onto the stack is very
+ interesting. In this context, it accomplishes nothing. However, as
+ we will see shortly, <filename>boot2</filename>, in conjunction
+ with the <acronym>BTX</acronym> server, also uses
+ <literal>xread.1</literal>. This mechanism will be discussed in
+ the next section.</para>
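+
+ <para>How the two registers combine into one sector number can
+ be sketched as follows. This Python fragment is an
+ illustration only; the function names
+ <literal>lba_registers</literal> and
+ <literal>lba_from_registers</literal> and the sample value
+ <literal>0x12345678</literal> are hypothetical and chosen to
+ make the byte order visible:</para>
+
+ <programlisting>
```python
# Illustrative model of how nread picks up the LBA; names are hypothetical.
def lba_registers(entry):
    """Return (ax, cx): the 16-bit words at offsets 0x8 and 0xa."""
    ax = int.from_bytes(entry[0x8:0xa], "little")   # mov 0x8(%si),%ax
    cx = int.from_bytes(entry[0xa:0xc], "little")   # mov 0xa(%si),%cx
    return ax, cx


def lba_from_registers(ax, cx):
    """Combine the two words into the low 32 bits of the LBA."""
    return cx * 0x10000 + ax


# Sample entry whose starting LBA field holds 0x12345678.
entry = bytes(8) + (0x12345678).to_bytes(4, "little") + bytes(4)
ax, cx = lba_registers(entry)
```
+ </programlisting>
+
+ <para>For the fake partition, both words are zero, so the
+ combined value is LBA 0, the sector holding the
+ <acronym>MBR</acronym>.</para>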
+
+ <para>The code at <literal>xread.1</literal> further calls
*** DIFF OUTPUT TRUNCATED AT 1000 LINES ***