Jails can now have the amount of memory available to their processes' resident sets (RSS) limited with arguments to the jail and jtune commands.
A per-jail kernel thread, jpager_td, periodically traverses the processes in the jail and sums the memory they consume. If the security.jail.limit_jail_memory sysctl is set and the jail's memory use exceeds the preset limit, jpager_td asks the virtual memory system to reclaim some of the memory used by the jail's processes.
Currently, every process in the jail is asked for 6.25% (1/16) of its resident set size, so a 30MB Apache process and a 300KB shell are each asked to return 1/16 of their RSS. Because memory is reclaimed a fraction at a time, short-term overcommits are possible, and usage returns to the established limit over time; the security.jail.jail_pager_timeout sysctl sets how frequently (in seconds) the pager thread checks memory usage.
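The flat 1/16 rule can be illustrated with a small userland sketch. This is not the code from kern_jail.c: struct proc_sketch, jail_rss_total and jail_pager_pass are hypothetical names invented for this example, and the sketch assumes the VM system hands back exactly what it is asked for, which the real pager does not guarantee.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a jailed process; not a kernel structure. */
struct proc_sketch {
    uint64_t rss_bytes;   /* resident set size in bytes */
};

/* Sum the RSS of every process in the (simulated) jail. */
uint64_t jail_rss_total(const struct proc_sketch *procs, size_t n)
{
    uint64_t total = 0;
    for (size_t i = 0; i < n; i++)
        total += procs[i].rss_bytes;
    return total;
}

/*
 * One simulated pager pass. If the jail is over limit_bytes, ask each
 * process for 1/16 of its RSS and (by assumption) get all of it back.
 * Returns the number of bytes requested, or 0 if the jail is within
 * its limit.
 */
uint64_t jail_pager_pass(struct proc_sketch *procs, size_t n,
                         uint64_t limit_bytes)
{
    if (jail_rss_total(procs, n) <= limit_bytes)
        return 0;

    uint64_t requested = 0;
    for (size_t i = 0; i < n; i++) {
        uint64_t ask = procs[i].rss_bytes / 16;   /* 6.25% of RSS */
        procs[i].rss_bytes -= ask;
        requested += ask;
    }
    return requested;
}
```

With a 30MB process and a 300KB process in a jail limited to 16MB, a single pass requests 1,966,080 bytes from the first and 19,200 bytes from the second. The jail is still over its limit after that pass, so it takes several timeout intervals to converge.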
Things to think about for the future: progressive taxation of processes (e.g. charging the 30MB Apache process a higher rate).
CPU limiting is implemented by giving each jail a number of CPU shares and tracking the estimated CPU usage of the tasks that run in that jail. If the ratio of the jail's estimated CPU usage to total CPU usage exceeds the ratio of the jail's CPU usage shares to total CPU usage shares outstanding, the jailed processes have their priorities decreased until the ratio of actual usage (estimated CPU) drops below permitted usage (shares). In short, the more shares a jail has, the more often its processes will run. Unjailed processes are not subject to this regime.
This system does not prevent a jailed process from monopolizing the CPU when there are no other runnable processes; rather, it only prevents a jail from using more CPU time than its share if there are other jailed processes (in which case they will tend to share CPU time in proportion to their respective CPU share allocations).
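The ratio test described above can be sketched as a single comparison. This is an illustration only, not the scheduler code: jail_over_cpu_share is a hypothetical name, and the sketch assumes the counters are small enough that the products do not overflow 64 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true when
 *     jail_estcpu / total_estcpu  >  jail_shares / total_shares,
 * i.e. when the jail is using more than its permitted fraction of CPU.
 * The ratios are compared by cross-multiplication to avoid division and
 * floating point. Assumes the products fit in 64 bits.
 */
bool jail_over_cpu_share(uint64_t jail_estcpu, uint64_t total_estcpu,
                         uint64_t jail_shares, uint64_t total_shares)
{
    /* With no recorded usage or no shares, there is nothing to compare. */
    if (total_estcpu == 0 || total_shares == 0)
        return false;

    return jail_estcpu * total_shares > jail_shares * total_estcpu;
}
```

For example, a jail holding 1 of 4 shares that accounts for 60 of 100 units of estimated CPU is over its share (60/100 > 1/4), while the same jail at 20 of 100 units is not. A jail that is alone on the system also tests as over its share, but lowering its priority has no practical effect when nothing else is runnable.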
The kern.sched.limit_jail_cpu sysctl enables jail CPU usage limits, and the kern.sched.system_cpu_shares sysctl determines how many CPU usage shares are attributed to unjailed processes. While these system shares count towards the total number of CPU usage shares outstanding on a system (and so decrease the priority of jails), they do not affect the priority of unjailed processes.
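To see how the system shares dilute each jail's allocation, the arithmetic can be sketched as follows. The function names are hypothetical and the share values in the usage note are made up for illustration; the sketch only assumes, as the text says, that the system shares are added to the total outstanding.

```c
#include <stddef.h>
#include <stdint.h>

/* Total shares outstanding: every jail's shares plus the system shares. */
uint64_t total_cpu_shares(const uint64_t *jail_shares, size_t n,
                          uint64_t system_shares)
{
    uint64_t total = system_shares;
    for (size_t i = 0; i < n; i++)
        total += jail_shares[i];
    return total;
}

/*
 * Permitted CPU fraction for one jail, as a whole percentage (rounded
 * down). Returns 0 if there are no shares outstanding.
 */
unsigned jail_permitted_percent(uint64_t jail_shares, uint64_t total_shares)
{
    if (total_shares == 0)
        return 0;
    return (unsigned)(jail_shares * 100 / total_shares);
}
```

With two jails holding 10 and 30 shares and 60 system shares, the total is 100 and the first jail is permitted 10% of the CPU. With the system shares set to 0, the total drops to 40 and the same jail is permitted 25%.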
Future work could look at implementing a full-blown fair-share scheduler (see Kay & Lauder or Waldspurger).
Where To Get It?
You can find the source in the FreeBSD Perforce repository under //depot/projects/soc2006/cdjones_jail, at http://www.ualberta.ca/~cdjones/cdjones_jail_soc2006.tgz (as a tarball), or at http://www.ualberta.ca/~cdjones/cdjones_jail_soc2006.patch (as a patch against RELENG_6).
- Implement memory limits in kern_jail.c [done]
- Implement CPU share limiting in sched_hier.c [done]
- jtune program to modify CPU and memory limits on running jails [done]
- Write / update man pages for jtune, jail [done]
- Port from RELENG_6 to -CURRENT [post-SoC]