rcorder

An experimental enhancement to the rcorder utility, with associated changes to /etc/rc that use it to reduce startup times (and possibly suspend / resume times, if /etc/rc.resume is included in scope).

Origins

Luke Mewburn's original proposal for /etc/rc in 1999.

https://groups.google.com/g/mailing.netbsd.tech.userlevel/c/O4GxRawRPAw/m/eOuO3C1Hi7EJ?pli=1

http://www.mewburn.net/luke/papers/rc.d.pdf

Other Attempts

https://github.com/kil/rcorder

https://reviews.freebsd.org/D2102

https://github.com/buganini/rcexecr

https://reviews.freebsd.org/D3715

https://github.com/ultijlam/rcorder.sh

https://github.com/ngie-eign/rcorder3

This last one, authored by Boris Lytochkin, has been committed:

https://reviews.freebsd.org/D25389

In short, D25389 uses "-p" to enable concurrency, is based on the original rcorder, and does not have any unit tests. The rc script edits below could, with a minor change, easily work with the D25389 rcorder.

Summary

The enhancement is a new option, "-p". When it is set, rcorder can return more than one RC script path per line. If /etc/rc is modified to pass this option, its loop must also be modified to accept multiple script paths per line; the standard /etc/rc script does not support this.

Three or four rc scripts contain code similar to the following:

files=`rcorder /etc/rc.d/* ${local_rc} 2>/dev/null`
for _rc_elem in $files; do
    run_rc_script ${_rc_elem} ${_boot}
done

${_rc_elem} is the full path to the daemon script (usually found under /etc/rc.d/* or /usr/local/etc/rc.d/*). The script path is passed to run_rc_script which is described in rc.subr.

The for-loop is modified like so:

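# ${rc_parallel} is assumed to hold "-p" (or nothing); options go before the file operands.
files=`rcorder ${rc_parallel} /etc/rc.d/* ${local_rc} 2>/dev/null`
_rc_oifs="$IFS"                 # remember the caller's IFS
IFS=$'\n'                       # outer loop: one group of scripts per line
for _rc_group in $files; do
    IFS=$' '                    # inner loop: paths within a group are space-separated
    for _rc_elem in $_rc_group; do
        run_rc_script ${_rc_elem} ${_boot} &
    done
    wait                        # let the whole group finish before starting the next
    IFS=$'\n'
done
IFS="$_rc_oifs"                 # do not leak the modified IFS into the rest of /etc/rc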

This also works with the D25389 rcorder (see link above).

The run_rc_script line is now run as a background task. The wait command waits for the group of tasks to complete before continuing.
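The group-then-wait pattern can be tried outside of /etc/rc. What follows is a minimal sketch, not the patched script: run_groups and fake_service are hypothetical names, and the sample input only imitates the one-group-per-line layout described above.

```shell
#!/bin/sh
# Hypothetical stand-in for run_rc_script: pretend to start a service.
fake_service() {
    sleep 1
    echo "started $1"
}

# Read groups from stdin, one group per line, names separated by spaces.
# Members of a group run in the background; wait blocks until all of
# them have finished before the next group is read.
run_groups() {
    while read -r _group; do
        for _elem in $_group; do
            fake_service "$_elem" &
        done
        wait
        echo "group done"
    done
}

# Example (takes about three seconds, one per group):
#   printf 'a\nb c\nd\n' | run_groups
```

Within the second group, "started b" and "started c" may appear in either order, which is the same reordering one should expect in the boot log.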

If rc_parallel_start is not specified in /etc/rc.conf, the script defaults to running RC scripts one-at-a-time. Example:

rc_parallel_start="YES"
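How the rc.conf setting turns into the rcorder option is not spelled out here. One plausible mapping is sketched below. parallel_flag is a hypothetical helper, and the accepted spellings are modeled loosely on what checkyesno in rc.subr treats as "yes"; the real patch may do this differently.

```shell
#!/bin/sh
# Hypothetical helper: print "-p" when the setting is enabled,
# and nothing otherwise (unset, empty, NO, or anything unrecognized).
parallel_flag() {
    case "${1:-NO}" in
    [Yy][Ee][Ss]|[Tt][Rr][Uu][Ee]|[Oo][Nn]|1)
        echo "-p" ;;
    *)
        echo "" ;;
    esac
}

# Serial startup stays the default when rc_parallel_start is not set.
rc_parallel=$(parallel_flag "${rc_parallel_start:-NO}")
```

Leaving the variable empty in the default case means the rcorder command line is unchanged for anyone who has not opted in.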

Test Run

The changes were tested on modest hardware (Core 2 Duo with spinning disk). A system booting from an SSD or diskless may give wildly different results.

The table below shows that enabling parallelism saves time only once two or more extra services are started.

/etc/rc.conf                    rc_parallel_start=NO    rc_parallel_start=YES    Savings
                                (seconds)               (seconds)                (seconds)
system with minimal services    8                       8                        0
system with 1 extra service     13                      13                       0
system with 2 extra services    17                      13                       5
system with 3 extra services    23                      13                       10
system with 4 extra services    28                      13                       15
system with 5 extra services    33                      13                       20

The "extra service" is a placebo script that does nothing but sleep for five seconds - a way to simulate a busy service startup.

Savings reported here ranged from 0 to 20 seconds. Notice there were no time savings until more than one service was added. Your system's benefit will vary.

Not bad for a relatively small change.
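The reported savings fit a simple model, offered here as an assumption rather than a measurement: a base boot of 8 seconds, 5 more seconds per placebo service when they run one at a time, and a single 5 second wait when all placebo services start together. The measured 17 seconds for two services with rc_parallel_start=NO is one second under what this model predicts.

```shell
#!/bin/sh
base=8      # seconds, boot time of the minimal system (from the table)
delay=5     # seconds each placebo service sleeps

# Print: services, modeled serial time, modeled parallel time, savings.
model_row() {
    n=$1
    serial=$((base + n * delay))
    if [ "$n" -gt 0 ]; then
        parallel=$((base + delay))   # all placebos sleep at the same time
    else
        parallel=$base
    fi
    echo "$n $serial $parallel $((serial - parallel))"
}

for n in 0 1 2 3 4 5; do
    model_row "$n"
done
```

Under this model the savings grow by one sleep interval for every service after the first, which is why a single extra service shows no benefit.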

Test Plan

The above tests validate the feature as a proof of concept. What follows is a plan for a real-world test on two physical x64 systems running 12.2-RELEASE with a patched /etc/rc and the rcorder from 13.0-BETAx (soon to be -RELEASE).

asterisk

mongodb

nginx

openldap

postgres

salt-minion

sendmail (base)

A salt-master monitors both machines to validate success.

Acceptance criteria:

1. Compare /var/log/messages from the two machines for equivalent output with no errors. Expect the log output to be in a different order. What's important is that each daemon starts successfully.

2. Each daemon answers to a client. A client API error is actually a success for us (daemon works as designed).
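The first criterion compares logs whose lines may arrive in a different order. One rough way to do that is sketched below. normalize_log and logs_equivalent are hypothetical helpers; the sketch assumes the traditional syslog layout (month, day, time, hostname, then the message) and drops the process ID from the program tag. Anything else that legitimately differs between the two machines would still show up as a mismatch.

```shell
#!/bin/sh
# Strip the four leading fields (month, day, time, host) and the
# "[pid]" part of the tag, then sort so that line order no longer matters.
normalize_log() {
    sed -e 's/^[^ ]* *[^ ]* *[^ ]* *[^ ]* *//' \
        -e 's/\[[0-9][0-9]*]:/:/' "$1" | sort
}

# Return 0 when both logs contain the same messages in any order.
logs_equivalent() {
    _a=$(mktemp) && _b=$(mktemp) || return 2
    normalize_log "$1" > "$_a"
    normalize_log "$2" > "$_b"
    if cmp -s "$_a" "$_b"; then _rc=0; else _rc=1; fi
    rm -f "$_a" "$_b"
    return $_rc
}

# Example, with copies of each machine's log fetched to the current directory:
#   logs_equivalent messages.serial messages.parallel && echo "equivalent"
```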

I'm interested in the boot phase so configuration on each of these services will be minimal.

Code

Patch file

Note: this patch is now in CURRENT along with a critical bug fix. Thanks to Cy Schubert for fixing this so quickly.

Edit: the patch was later backed out after more problems were reported. This little project is now on hold until I can understand how and why. Bourne shell scripting looks easy - until it isn't.

Build

The new rcorder is available in CURRENT. For 12.X-RELEASE, you can copy the rcorder source files to your /usr/src tree and build only rcorder.

To deploy rcorder on your system, copy the rcorder binary to /sbin/, but only AFTER you have made a backup of the standard rcorder.
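The backup-then-copy step can be wrapped so that the backup is never skipped or overwritten. This is only a sketch: install_with_backup is a hypothetical helper, the ".original" suffix is borrowed from the /etc/rc backup suggested in the Deploy section, and the location of the freshly built binary is left to the reader.

```shell
#!/bin/sh
# Hypothetical helper: install_with_backup NEW DEST
# Copies DEST to DEST.original once, then replaces DEST with NEW.
install_with_backup() {
    _new=$1
    _dst=$2
    # Keep the first backup; a repeat run must not overwrite it
    # with an already-patched binary.
    [ -e "${_dst}.original" ] || cp -p "$_dst" "${_dst}.original" || return 1
    cp -p "$_new" "$_dst"
}

# Example (as root; the first argument is wherever your build put rcorder):
#   install_with_backup ./rcorder /sbin/rcorder
```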

Deploy

To test rcorder, try it with and without the "-p" option:

$ rcorder /etc/rc.d/*
$ rcorder -p /etc/rc.d/*
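Both invocations should name the same scripts; only the grouping should differ. A quick sanity check, assuming the "-p" output separates the paths on a line with spaces as described in the Summary: script_set is a hypothetical helper that flattens either output to one sorted path per line so the two can be compared.

```shell
#!/bin/sh
# stdin: rcorder output (with or without -p)
# stdout: one script path per line, sorted
script_set() {
    tr -s ' \t' '\n\n' | sed '/^$/d' | sort
}

# On a system with the new rcorder one might run:
#   rcorder /etc/rc.d/* 2>/dev/null | script_set > /tmp/serial.txt
#   rcorder -p /etc/rc.d/* 2>/dev/null | script_set > /tmp/parallel.txt
#   cmp /tmp/serial.txt /tmp/parallel.txt && echo "same scripts"
```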

To deploy the RC scripts, make a backup of /etc/rc (for example, copy it as /etc/rc.original) and apply the patch to /etc/rc.

To verify, reboot. Your system should start up as before - no faster or slower.

To turn on concurrent tasking, add an entry to /etc/rc.conf:

rc_parallel_start=YES

Reboot again. This time startup should be a bit faster than before.

Limitations

This won't make jails start concurrently. You can use rc.conf(5) variable jail_parallel_start="YES" to enable concurrent jail startup.

Reference

https://en.m.wikipedia.org/wiki/Topological_sorting




unitrunker/rcorder (last edited 2021-05-23T13:54:05+0000 by unitrunker)