Overview
Booting FreeBSD on an ARM SoC chip is a multi-stage process involving several distinct software components. This article describes the boot process using the Atmel AT91RM9200 SoC as an example.
FreeBSD includes all the source code required to boot an AT91RM9200 chip without requiring any "binary blob" from a vendor or source code downloaded from other locations. This makes it an ideal example for explaining the very early parts of the boot process, which are often hidden in such vendor-supplied components when you buy an evaluation board or one of the numerous small ARM systems available.
While the early part of the boot process described here is specific to the Atmel chip, the general principles apply to all systems, and once control is handed to the kernel entry point the process is the same for all ARM chips.
Within The Chip
SoC chips typically contain a small amount of SRAM and a small amount of boot firmware in an internal ROM. The SRAM is typically on the order of 16-64K. At power-on the ROM code does very minimal hardware setup, enabling the processor and peripherals just enough to locate a boot image. Because resources are limited, the ROM program is generally very small and limited to checking simple buses and devices such as SPI, I2C, NOR flash, and NAND flash, and often as a last resort an RS-232 serial debug port. The datasheet for the SoC usually describes this built-in boot program in detail.
The built-in ROM boot code often doesn't know anything about the board it's attached to. In particular, it doesn't know how much and what kind of memory is attached, or which of its multifunction pins are actually connected to a bus or device as opposed to being used for a status LED. It is the responsibility of the first-level bootloader code to initialize the chip for the attached hardware, at least enough to proceed to the next boot stage.
The AT91RM9200
The Atmel chip has 16K of internal static ram. It will attempt to locate a file on an attached SPI dataflash chip or I2C eeprom by matching a signature pattern in the first few bytes read from the device. If it matches, the code is downloaded into the SRAM, at physical address 0x00000000. The ROM loader uses 4K at the top of the SRAM as buffer and work space while downloading, so the first-level bootloader code for this chip must fit within 12K.
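The size budget above is simple arithmetic, and it can be worth encoding as a build-time check. The following is a minimal sketch only; the macro and function names are made up for illustration and do not come from the FreeBSD source.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Figures taken from the text; the names are hypothetical. */
#define SRAM_SIZE       (16u * 1024u)  /* internal SRAM on the AT91RM9200 */
#define ROM_WORKSPACE   (4u * 1024u)   /* top of SRAM used by the ROM loader */
#define BOOT2_MAX_SIZE  (SRAM_SIZE - ROM_WORKSPACE)

/* Fail the build if the budget is ever computed as something other than 12K. */
static_assert(BOOT2_MAX_SIZE == 12u * 1024u, "first-level bootloader budget is 12K");

/* Returns true if an image of this size can be downloaded into SRAM. */
static bool
boot_image_fits(size_t image_size)
{
	return image_size > 0 && image_size <= BOOT2_MAX_SIZE;
}
```

A build script could call something like boot_image_fits() on the size of the linked image and refuse to produce a bootloader the ROM cannot load.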
The First-Level Bootloader
In the FreeBSD source code, the first-level bootloader for the AT91RM9200 is in directory src/sys/boot/arm/at91/boot2 with supporting code in nearby directories. This bootloader sets up the hardware enough to load the kernel (or a next-stage bootloader) from an SD card. The following section describes the general steps it takes to do this, with a particular emphasis on what lives where in memory and how control transfers from one step to the next.
Early Boot Steps
- The Atmel chip loads the boot2 code from SPI flash, I2C EEPROM, or the serial port into the 16K on-chip SRAM at physical address 0x00000000. It then jumps to the entry point address stored in the vector / signature block at the start of the bootloader image.
- boot2 is linked with the text segment at 0x00000000, and the data segment immediately following it.
- boot2's linker config script places the bss segment at an arbitrary address located in sdram, and the boot2 code must be careful not to access any variable in bss until after the sdram setup code runs.
- The boot2 entry code in sys/boot/arm/at91/libat91/arm_init.S sets the stack pointer to the top of sram, then calls _init() in libat91/at91rm9200_lowlevel.c.
- The code in _init() sets up the processor clocks and sdram controller. After sdram is configured it zeroes out boot2's bss data area. It also initializes the DBGU serial port, so printf() in boot2 works after _init().
- When _init() returns to arm_init.S, the stack is relocated into sdram at the address specified in the SVC_STACK_USE constant near the top of the source file. It then calls main() in boot2.c.
- At this point the boot2 code is running in a memory environment somewhat similar to normal: the stack will grow downwards towards bss and the stack size is the difference between the stack pointer and the end of bss (at this writing it's about 4MB of stack space).
- The boot2 code loads the next stage, which could be another bootloader stage such as ubldr, or it could be the kernel itself, from the first partition on the SD card into physical ram.
- boot2 is an elf loader. That is, it obtains information about the image it's loading from the elf headers of the file.
- The elf headers contain information on one or more segments of type PT_LOAD to be loaded.
- For each segment the header contains a load address and size.
- The elf file header (not the per-segment program headers) contains the entry point address.
- Because a FreeBSD kernel is typically linked to run at virtual address 0xC0000000, and the elf headers have historically had this virtual address in the physical address (p_paddr) field (oops), the loader masks out the high order part of the addresses and plugs in the high order part of the address where ram is physically located.
- For sdram on the AT91RM9200 that means changing 0xCxxxxxxx to 0x2xxxxxxx.
The virtual and physical addresses for the kernel linking are found in src/sys/arm/at91/std.tsc4370, or the corresponding file for other SoCs.
- Once the kernel or next stage bootloader is loaded into ram, boot2 launches it by jumping to the entry point address specified in the elf header (after again translating virtual to physical address).
- It passes four parameters (in registers r0-r3). There was no standard for what information to pass to the ARM kernel entry when boot2 was written; the code has been one-off customized for several different boards by different parties.
- For the AT91RM9200 boot2, the MMU is turned off. Other bootloaders may invoke the next stage at this point with the MMU on, and perhaps even with caching enabled.
- Entry to the next stage is made with a physical address in the PC and the stack pointer still pointing to the top of sdram. Physical addressing, or va=pa mapping, should be in effect at this point.
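The address translation done while loading PT_LOAD segments can be sketched as follows. This is not the boot2 code: the struct is a cut-down stand-in for a real ELF32 program header, all names are hypothetical, and the mask width (the top four bits) is an assumption inferred from the 0xCxxxxxxx to 0x2xxxxxxx example above.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Minimal stand-in for the program header fields that matter here.
 * A real loader would use Elf32_Phdr from the system ELF headers.
 */
struct sketch_phdr {
	uint32_t type;    /* segment type; PT_LOAD is 1 in the ELF spec */
	uint32_t paddr;   /* load address as recorded in the file */
	uint32_t filesz;  /* bytes present in the file */
	uint32_t memsz;   /* bytes occupied in memory (>= filesz) */
};

#define SKETCH_PT_LOAD   1u
#define ADDR_HIGH_MASK   0xF0000000u  /* assumed: part of the address replaced */
#define SDRAM_PHYS_BASE  0x20000000u  /* where sdram sits on the AT91RM9200 */

/* Replace the high-order bits so 0xCxxxxxxx becomes 0x2xxxxxxx. */
static uint32_t
fixup_load_addr(uint32_t addr)
{
	return (addr & ~ADDR_HIGH_MASK) | SDRAM_PHYS_BASE;
}

/*
 * Walk the program headers and return the end (exclusive) of the highest
 * PT_LOAD segment in physical memory, or 0 if there is no such segment.
 */
static uint32_t
loaded_image_end(const struct sketch_phdr *ph, unsigned int count)
{
	uint32_t end = 0;

	for (unsigned int i = 0; i < count; i++) {
		if (ph[i].type != SKETCH_PT_LOAD)
			continue;
		assert(ph[i].memsz >= ph[i].filesz);
		uint32_t seg_end = fixup_load_addr(ph[i].paddr) + ph[i].memsz;
		if (seg_end > end)
			end = seg_end;
	}
	return end;
}
```

The same fixup would be applied to the entry point address before jumping to it, since that address is also a kernel virtual address.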
Memory Layout
Managing the memory layout during the early boot process is largely a tedious manual process. The absolute location in physical memory of several important items is specified in source code or source config files. To customize boot2 for a particular memory configuration you need to locate these constants and carefully avoid overlaps.
The values shown in this table are appropriate for a 64MB system. They are not the values currently checked in to FreeBSD.
Memory Location Constants
File | Constant | Value
sys/boot/arm/at91/libat91/arm_init.S | SVC_STACK_USE (boot2 stack) | 0x23F00000
sys/boot/arm/at91/linker.cfg | bss | 0x23B00000
sys/arm/at91/std.tsc4370 | STARTUP_PAGETABLE_ADDR | 0x23FF0000
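Because these constants live in three different files, it is easy to change one and create an overlap. The sketch below collects the table's values in one place and checks them against each other. It assumes 64MB of sdram starting at 0x20000000, an image loaded at the very start of sdram, and a 16K startup page table (the size of a full ARM first-level table); the helper names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SDRAM_BASE              0x20000000u
#define SDRAM_SIZE              (64u * 1024u * 1024u)  /* 64MB system */
#define BSS_START               0x23B00000u            /* linker.cfg */
#define SVC_STACK_USE           0x23F00000u            /* arm_init.S */
#define STARTUP_PAGETABLE_ADDR  0x23FF0000u            /* std.tsc4370 */
#define L1_TABLE_SIZE           (16u * 1024u)          /* 4096 entries * 4 bytes */

/* The stack grows down from SVC_STACK_USE towards bss. */
static_assert(SVC_STACK_USE > BSS_START, "stack must sit above bss");
static_assert(SVC_STACK_USE <= STARTUP_PAGETABLE_ADDR, "stack top overlaps page table");
static_assert(STARTUP_PAGETABLE_ADDR + L1_TABLE_SIZE <= SDRAM_BASE + SDRAM_SIZE,
    "page table runs past the end of sdram");

/* True if [a_start, a_end) and [b_start, b_end) share any address. */
static bool
ranges_overlap(uint32_t a_start, uint32_t a_end, uint32_t b_start, uint32_t b_end)
{
	return a_start < b_end && b_start < a_end;
}

/* True if an image loaded at SDRAM_BASE ends at or below boot2's bss. */
static bool
image_clears_boot2(uint32_t image_size)
{
	return image_size <= BSS_START - SDRAM_BASE;
}
```

With these values the stack and bss together get 4MB (SVC_STACK_USE minus the bss start), and an image of up to 59MB can be loaded below them.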
This article started out as a readme file containing just the information in this table. The path to locating it began with a kernel image with a 25MB filesystem embedded in it, and the fact that SVC_STACK_USE at that time located the stack 16MB into physical ram.
Other Bootloaders
U-Boot
Many modern boards and ARM-based systems come with U-Boot. Something like the first-level bootloader described above often runs first, and gets the low-level hardware (clocks and ram) running enough to load and launch u-boot. U-boot can be customized with support for a wide variety of storage systems, network interfaces, filesystems, and even utilities to test and format hardware devices and copy data to them.
Generally you need a copy of u-boot that has been customized and compiled for your particular chip and board.
- [ need more info here. ]
ubldr
- [ need more here too ]
The Kernel
- The entry point on the kernel side depends on which flavor of kernel is being used.
Kernel Flavors
File | Description
kernel | Kernel with elf headers
kernel.bin | Kernel without elf headers
kernel.debug | Kernel with elf headers and full debugging info
kernel.symbols | Just symbols for debugging a crash dump with gdb
kernel.gz.tramp | Gzipped kernel+unzipper trampoline with elf headers
kernel.gz.tramp.bin | Gzipped kernel+unzipper trampoline w/o elf headers
kernel.tramp | Uncompressed kernel+trampoline with elf headers
kernel.tramp.bin | Uncompressed kernel+trampoline w/o elf headers
The .bin versions have the elf headers stripped off, and a bootloader would launch such a kernel by jumping to offset zero (the load address).
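From the bootloader's point of view, the difference between the two families of images comes down to which address it jumps to. A minimal sketch of that decision, using made-up names and assuming the elf entry point has already been translated to a physical address where necessary:

```c
#include <assert.h>
#include <stdint.h>

enum image_kind {
	IMAGE_ELF,      /* kernel, kernel.tramp, kernel.gz.tramp */
	IMAGE_RAW_BIN   /* kernel.bin, kernel.tramp.bin, kernel.gz.tramp.bin */
};

/*
 * Pick the address to jump to. A raw .bin image starts executing at
 * offset zero, i.e. at its load address; an elf image starts at the
 * entry point recorded in its elf header.
 */
static uint32_t
entry_address(enum image_kind kind, uint32_t load_addr, uint32_t elf_entry)
{
	if (kind == IMAGE_RAW_BIN)
		return load_addr;
	assert(kind == IMAGE_ELF);
	return elf_entry;
}
```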
For a kernel without trampoline code, the entry point is in locore.S. This is described below.
The trampoline entry point is _startC() in sys/arm/elf_trampoline.c. This is specified in sys/conf/Makefile.arm which actually writes a little asm source file named tmphack.S which jumps to the _startC() routine. Tricky.
The stack on entry to _startC() is still the one from boot2. The first thing _startC() does is move the stack to be about 1MB above the end of the loaded kernel. At this point physical addressing is still in effect; the trampoline code is linked to run at the physical address specified in std.tsc4370, not at the virtual address the kernel is linked for.
The trampoline code decompresses the kernel to the ram that immediately follows the end of the loaded kernel, then copies the uncompressed kernel to the physical address specified in the decompressed elf headers. It sets up rudimentary page tables immediately following the kernel (I think this is unnecessary; the kernel will do its own mmu setup), and then sets up the initial kernel entry stack to be right after these page tables.
Custom code we've added (not in stock FreeBSD) copies the tsc_bootinfo struct to the new stack so that it will remain accessible to the kernel during the handoff. (It turns out this may be a bit dodgy -- the struct passed by boot2 lives in boot2's bss, so it's essentially sitting in the middle of sdram somewhere, and could have been wiped out by uncompressing the kernel.)
The trampoline jumps to the entry point listed in the kernel's elf header.
The kernel entry point listed in the elf header is the label _start in the file sys/arm/locore.S. When entered directly from a bootloader the MMU is typically disabled and the pc is the physical address of the entry point. When entered from the trampoline the MMU is enabled and the pc is the virtual address. The code in locore.S is designed to work right either way.
The entry code crafts simple startup page tables, enables the MMU (but not the caches; machine-dependent code does that later), and transitions the pc from physical to virtual addresses. The page tables are placed at the location specified by STARTUP_PAGETABLE_ADDR, which comes from sys/arm/at91/std.tsc4370. These have a short lifetime: the machine-dependent initarm() routine is expected to create and install more complete page tables.
The stack pointer is set to a small (2K) stack in bss, and the initarm() function is called, passing the same values in r0-r3 that were present on entry to _start. Each type of arm SoC has its own initarm() routine which does a lot of machine-specific and board-specific setup. It must preserve any incoming parameters from the bootloader. Some early parts of kernel init are called from within initarm(), including cninit(), which is the point at which printf() begins to work. The initarm() routine returns a pointer to what will become the supervisor mode stack for the kernel.
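To give a feel for what "simple startup page tables" means, here is a sketch in C of filling a first-level table with 1MB section entries. The real code is assembly in locore.S and is not reproduced here; the names below are invented, and which ranges the real code maps is an assumption (this example maps 64MB of kernel virtual space at 0xC0000000 onto sdram at 0x20000000). The descriptor bit layout is the classic ARMv4/v5 section format, left uncached to match the text.

```c
#include <assert.h>
#include <stdint.h>

#define L1_ENTRIES     4096u         /* 4GB of address space / 1MB sections */
#define L1_SECTION     0x2u          /* descriptor type bits [1:0]: section */
#define L1_BIT4        (1u << 4)     /* must be set on ARMv4/v5 cores */
#define L1_AP_RW       (0x3u << 10)  /* access permission: read/write */
#define SECTION_SHIFT  20            /* log2 of the 1MB section size */

/* Build an uncached, unbuffered section descriptor in domain 0. */
static uint32_t
section_desc(uint32_t pa)
{
	return (pa & 0xFFF00000u) | L1_AP_RW | L1_BIT4 | L1_SECTION;
}

/* Map nsections consecutive 1MB sections starting at va onto pa. */
static void
map_sections(uint32_t *l1, uint32_t va, uint32_t pa, unsigned int nsections)
{
	for (unsigned int i = 0; i < nsections; i++) {
		uint32_t idx = (va >> SECTION_SHIFT) + i;

		assert(idx < L1_ENTRIES);
		l1[idx] = section_desc(pa + ((uint32_t)i << SECTION_SHIFT));
	}
}
```

A call such as map_sections(table, 0xC0000000, 0x20000000, 64) fills entries 0xC00 through 0xC3F. A table built this way is only useful once the translation table base and domain access registers are programmed, which is left out of this sketch.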
When initarm() returns, the code in locore.S installs the stack pointer it returned. At this point there is no longer any dependence on magical fixed addresses in memory that were compiled-in or came from a config file.
Next the code in locore.S calls mi_startup(), which is essentially like main() for the kernel. It should not return.