Colin Percival is coordinating an effort to speed up the FreeBSD boot process and would like to see FreeBSD become far more competitive in 14.0. Some improvements have already been made since 13.0.
An Amazon EC2 c5.xlarge instance is used as the reference platform. The metric is the time between the instance entering the "running" state and the point at which it is possible to SSH into the instance.
Boot time consists of four consecutive stages:
- System: BIOS/UEFI, before running any FreeBSD code -- we can't do anything about this.
- FreeBSD: boot loader.
- FreeBSD: kernel, before init starts running.
- FreeBSD: userland boot process.
The boot loader and kernel can be profiled using the TSLOG framework to generate a flamechart. To do this:
- Build a kernel with 'options TSLOG'.
- Check out the freebsd-boot-profiling repository.
- Run sh mkflame.sh > tslog.svg
Some statistics collected in August 2021:
- FreeBSD 13.0 takes 19.00 seconds to get from "running" to "port TCP/22 closed" and another 5.13 seconds to get to "port TCP/22 open", for a total of 24.13 seconds.
- Debian 10 takes a total of 10.35 seconds.
- Clear Linux 34640 takes a total of 1.23 seconds.
Performance is measured using the ec2-boot-bench utility.
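The "port TCP/22 closed" and "port TCP/22 open" milestones in the statistics above can be told apart from outside the instance by how a connection attempt fails. The following is a minimal Python sketch of that idea. It is not the ec2-boot-bench implementation; the function names probe and wait_for_states, the polling interval, and the deadline are hypothetical choices made for illustration.

```python
import socket
import time


def probe(host, port, timeout=0.5):
    """Classify one TCP connection attempt.

    Returns "open" if the connection is accepted, "closed" if it is
    actively refused (network stack up, nothing listening yet), and
    "unreachable" for timeouts and any other failure.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        return "closed"
    except OSError:
        return "unreachable"


def wait_for_states(host, port=22, interval=0.01, deadline=120.0):
    """Poll until the port is open; record when each state was first seen.

    Returns a dict mapping state name to seconds elapsed since the call.
    The interval and deadline are illustrative defaults, not measured values.
    """
    start = time.monotonic()
    first_seen = {}
    while time.monotonic() - start < deadline:
        state = probe(host, port)
        first_seen.setdefault(state, time.monotonic() - start)
        if state == "open":
            break
        time.sleep(interval)
    return first_seen
```

Calling wait_for_states with an instance's address as soon as EC2 reports "running" would yield first-seen times for "closed" and "open", which correspond roughly to the two intervals quoted for FreeBSD 13.0.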
[Figure: "How it started"]
[Figure: "How it's going" (as of late March 2022)]
Known Performance Issues
Loader (~260 ms, more on first boot):
- Reading the kernel from disk takes ~150 ms. This would be faster with a smaller and/or compressed kernel, but tooling work would be needed for kernel updates.
- The first time an EC2 instance boots, this I/O takes longer: about 500 ms is spent on loader I/O when the "disk cache" is cold, possibly because the underlying disk is "lazily" initialized from the backing EC2 snapshot.
- The loader spends ~45 ms printing status information to the console.
- The loader spends ~65 ms on other stuff (reading in the lua scripts, loading fonts, etc).
Kernel (~400 ms):
- SYSINIT vm_mem takes ~25 ms on the benchmark system (8 GB of RAM), increasing proportionally to the amount of RAM; potentially a concern on larger systems.
- SYSINIT cpu_mp takes ~30 ms on the benchmark system (4 vCPUs), increasing proportionally to the number of APs; potentially a concern on larger systems.
- Mounting the root filesystem spends ~40 ms waiting for g_event quiescence. Disk/partition tasting perhaps? Needs investigation.
- Probing and attaching devices takes ~150 ms.
- The kernel spends ~95 ms printing information to the console.
- The kernel spends ~60 ms on other stuff.
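To see why the two proportional items in the list above could matter on larger systems, the following Python sketch extrapolates them linearly. It assumes strict proportionality with no fixed cost, which the measurements above do not confirm, and it assumes that every vCPU other than the boot processor counts as an AP. The target system size is hypothetical.

```python
def scale_linearly(measured_ms, measured_units, target_units):
    """Extrapolate a cost assumed to be strictly proportional to a resource count."""
    return measured_ms * target_units / measured_units


# Reference figures from the list above.
VM_MEM_MS, REF_RAM_GB = 25, 8     # SYSINIT vm_mem on 8 GB of RAM
CPU_MP_MS, REF_APS = 30, 4 - 1    # SYSINIT cpu_mp on 4 vCPUs, assumed to be 3 APs

# Hypothetical larger system, chosen only for illustration.
target_ram_gb, target_vcpus = 768, 96

vm_mem_est = scale_linearly(VM_MEM_MS, REF_RAM_GB, target_ram_gb)
cpu_mp_est = scale_linearly(CPU_MP_MS, REF_APS, target_vcpus - 1)
print(f"vm_mem: ~{vm_mem_est:.0f} ms, cpu_mp: ~{cpu_mp_est:.0f} ms")
# prints "vm_mem: ~2400 ms, cpu_mp: ~950 ms"
```

Under these assumptions, both items would grow from tens of milliseconds to a second or more, which is the concern raised above.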
Userland (~3500 ms, more on first boot):
- /etc/rc.d/netif takes ~1300 ms, including a 1-second sleep(1).
- The first time an EC2 instance boots, dhclient takes ~2200 ms for unknown reasons.
- /etc/rc.d/rtsold takes ~1050 ms, including a 1-second sleep(1).
- /etc/rc.d/devd takes ~240 ms, mostly running many devmatches.
- Everything else combined takes ~910 ms.
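The per-item figures in the three lists above can be tallied to check that they add up to the stage totals and to see where the time goes. The Python sketch below does only that arithmetic. The first-boot extras (cold loader I/O and the slow dhclient run) are left out, the BIOS/UEFI stage is not included, and the grand total is a sum of approximate figures, not a separately measured number.

```python
# Per-item times in milliseconds, copied from the lists above.
stages = {
    "loader": {"kernel read": 150, "console output": 45, "other": 65},
    "kernel": {
        "vm_mem": 25,
        "cpu_mp": 30,
        "g_event wait": 40,
        "device probe/attach": 150,
        "console output": 95,
        "other": 60,
    },
    "userland": {"netif": 1300, "rtsold": 1050, "devd": 240, "everything else": 910},
}


def stage_totals(stages):
    """Sum the items of each stage."""
    return {name: sum(parts.values()) for name, parts in stages.items()}


def largest_items(stages, n=3):
    """Return the n most expensive (ms, stage, item) entries across all stages."""
    flat = [(ms, stage, item)
            for stage, parts in stages.items()
            for item, ms in parts.items()]
    return sorted(flat, reverse=True)[:n]


totals = stage_totals(stages)
print(totals)                # {'loader': 260, 'kernel': 400, 'userland': 3500}
print(sum(totals.values()))  # 4160
print(largest_items(stages))
```

The three largest entries are all in userland (netif, rtsold, and the "everything else" bucket), and the two 1-second sleeps alone account for about 2000 ms of the tallied time.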
Performance Issues not affecting EC2
In addition to the above, there are some performance issues which don't affect the benchmark system but should be addressed to help other platforms.
On Colin's 13.0-RELEASE laptop:
- hammer_time spends ~650 ms in DELAYs resetting the PS/2 keyboard controller ("early console").
- DEVICE_ATTACH acpi_ec has ~800 ms of DELAYs.
- DEVICE_ATTACH em has ~120 ms of DELAYs.
- DEVICE_ATTACH atkbd spends ~450 ms in DELAYs resetting the PS/2 keyboard controller (again).
Past Performance Improvements
Boot performance improvements which have been prompted by this effort:
- 2017-09-07 (r323290, by markj): Speed up vm_page_array initialization. Reduces SYSINIT vm_mem time by ~100 ms on the benchmark system. (Scales with RAM size.)
- 2017-11-04 (r325383, by delphij): hpt* driver fixes. Reduces DEVICE_PROBE hpt* time by ~330 ms on the benchmark system.
- 2018-05-07 (r333335, by imp): Put the CPU starting on one line. Reduces SYSINIT start_aps time by ~120 ms on the benchmark system. (Scales with number of APs.)
- 2018-08-07 (r337411, by cperciva): Replace a pair of 8-bit writes to VGA memory with a 16-bit write. Reduces time spent in vt(4) by ~1300 ms on the benchmark system.
- 2018-08-25 (r338316, by cperciva): Cache the most recently drawn characters on terminal and avoid redrawing. Reduces time spent in vt(4) by ~500 ms on the benchmark system.
- 2018-08-26 (r338321, by cperciva): Disable atkbd0 and atkbdc0 in EC2 AMIs. This speeds up the boot process by ~2500 ms in EC2.
- 2021-06-21 (524260db7683, by cperciva): Tell gptboot to skip its 3 second wait. This speeds up the boot process by 3000 ms in EC2.
- 2021-09-08 (a8b89dff6ac0, by cperciva): Disable acpi_timer_test by default. Reduces time spent in DEVICE_PROBE acpi_timer by ~140 ms on the benchmark system.
- 2021-09-08 (b4cb3fe0e39a, by tsoome): Keep root file system open to preserve bcache segment between file accesses. Reduces time spent in loader by ~1600 ms on the benchmark system.
- 2021-09-16 (b43d7aa09b3c, by cperciva): Switch from BIOS to UEFI boot by default when creating AMIs. This speeds up the boot process by ~5000 ms in EC2.
- 2021-09-28 (7457840230c5, by cperciva): Set twiddle globaldiv to 16 by default. This speeds up the boot process by ~50 ms in EC2.
- 2021-09-30 (ce73f768b764, by cperciva): Don't free bcache for DEVT_DISK devs. This speeds up the boot process by ~40 ms in EC2.
- 2021-10-01 (several patches, by imp): Remove unnecessary DELAYs, a pause, and an extra reset in the nvme driver, and speed up polling. This speeds up the boot process by ~330 ms in EC2.
- 2021-10-03 (04b9b7c507c5, by cperciva): Track unconsumed readahead in loader. This speeds up the boot process by ~120 ms in EC2.
- 2021-10-03 (248682a58915, by cperciva): Allow readahead up to 256 kB I/Os in loader. This speeds up the boot process by ~80 ms in EC2.
- 2021-11-15 (1580afcd6eaf, by cperciva): Remove 100 ms sleep from randomdev write routine. This speeds up the boot process by up to 1000 ms but has no effect on newly created systems without cached entropy.
- 2021-11-25 (e29711da2352, by cperciva): Turn off random delay before IPv6 Router Solicitation by default. This speeds up the boot process with IPv6 enabled by an average of 500 ms.
- 2021-11-25 (81075203a057, by cperciva): Turn off IPv6 Duplicate Address Detection in EC2. This speeds up the boot process by 2000 ms in EC2.
- 2021-12-20 (c1381f07f61a, by andrew): Skip unnecessary arm64 cache invalidation when caches are coherent. This speeds up the boot process on arm64 EC2 instances (but not the reference c5.xlarge platform) by 300 ms.
- 2021-12-28 (33812d60b960, by cperciva): Check for root device before waiting in vfs_mountroot. This has no effect on the boot process on EC2 c5.xlarge instances but saves 100 ms on some other systems.
- 2022-01-12 (c2705ceaeb09, by cperciva): Calibrate clock frequencies statistically. This speeds up the boot process by about 1997 ms on x86 systems.
- 2022-01-14 (4a432614f68c, by cperciva): Use 0x40000010 CPUID leaf for all VM types. This speeds up the boot process by 100 ms in EC2.