This work was committed to FreeBSD 10 on March 4th, 2013.
Contacts: <davide AT FreeBSD DOT org> , <mav AT FreeBSD DOT org>
This summer I'll work on a complete rewrite of the callout(9) subsystem in order to take advantage of the better precision offered by the new timer drivers (eventtimers(9)).
Description of the problem
In all the BSD kernels, timers are provided by the callout(9) facility, which allows a function to be registered so that it is called at a future time. Right now, FreeBSD can't handle timeouts shorter than 2/HZ or precision finer than 1/HZ. According to some recent tests, other OSes can do much better. Some consumers need better resolution, and this matters in many applications, e.g. faster TCP recovery after errors or packet loss, or real-time applications.
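To make the 2/HZ limit concrete, here is a rough sketch of how a tick-based timer quantizes a requested timeout. It is userspace C, not kernel code. The helper names are hypothetical, and the rounding rule (round up to whole ticks, then add one tick for the partially elapsed current tick) is a simplified model assumed for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of ticks a tick-based timer waits for a
 * nonzero timeout given in microseconds. Assumes HZ divides 1000000.
 */
uint64_t
timeout_us_to_ticks(uint64_t timeout_us, uint64_t hz)
{
	uint64_t tick_us;

	assert(hz > 0 && hz <= 1000000);
	tick_us = 1000000 / hz;		/* length of one tick in microseconds */

	/*
	 * Round up to whole ticks, then add one tick because the current
	 * tick has already partially elapsed.
	 */
	return ((timeout_us + tick_us - 1) / tick_us + 1);
}

/* Hypothetical helper: upper bound, in microseconds, on the resulting delay. */
uint64_t
upper_bound_delay_us(uint64_t timeout_us, uint64_t hz)
{
	return (timeout_us_to_ticks(timeout_us, hz) * (1000000 / hz));
}
```

Under this model, with HZ=1000 a request for 100 microseconds becomes 2 ticks, so the caller may wait up to 2 milliseconds, twenty times longer than requested.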
- Switch the callout(9) facility from a tick-oriented approach to a bintime-based one (i.e., change completely from periodic to one-shot operation). The main achievement here will be to become tickless. The current implementation is simply unable to handle bintime, so we need to modify it to allow integration with eventtimers(9).
- Write a test framework, or use an existing one, to evaluate how well the project works. Rewrite or adapt the documentation (in particular the callout(9) manpage).
- Start experimenting with the new backend on existing services in order to achieve better precision. Services that may be affected include nanosleep(2), usleep(3), select/poll, etc.
- Make the new callout subsystem able to execute some specified events in hardware interrupt context (currently this is always done in a SWI), and make all these XXXXclocks() functions use a unified callout API. The existing callout(9) is not suitable for this.
- Investigate the possibility of event aggregation. If two events are scheduled to run one minute from now, 1 millisecond apart, they could be grouped together to avoid waking the CPU or switching context an extra time. This would apply not only to the new high-precision events that will be implemented, but also to the old ones. There are some open questions to address here, e.g. how to pass the precision value to callout_reset() or its new equivalent.
- Investigate the possibility of improving CPU affinity for callouts, which right now looks rudimentary.
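As a rough illustration of the event aggregation item above, the following userspace C sketch counts how many wakeups are needed when every event carries a precision value. Everything here is an assumption for illustration, not the callout(9) API: the struct, the function names, and the reading of "precision" as the amount by which an event may fire late. Events are sorted by their latest allowed time, and one wakeup at that time serves every event that is already due by then.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical event description; times are in nanoseconds. */
struct timer_event {
	int64_t when_ns;	/* earliest time the event may fire */
	int64_t prec_ns;	/* how much later than when_ns it may fire */
};

/* Order events by the latest time at which they are allowed to fire. */
static int
cmp_latest(const void *a, const void *b)
{
	const struct timer_event *x = a;
	const struct timer_event *y = b;
	int64_t lx = x->when_ns + x->prec_ns;
	int64_t ly = y->when_ns + y->prec_ns;

	return ((lx > ly) - (lx < ly));
}

/*
 * Return the number of wakeups needed to serve all n events (n > 0).
 * Sorts the array in place.
 */
size_t
count_wakeups(struct timer_event *ev, size_t n)
{
	size_t i, wakeups = 0;
	int64_t wake_ns = 0;

	qsort(ev, n, sizeof(*ev), cmp_latest);
	for (i = 0; i < n; i++) {
		assert(ev[i].prec_ns >= 0);
		/* Already due at the last chosen wakeup: it rides along. */
		if (wakeups > 0 && ev[i].when_ns <= wake_ns)
			continue;
		/* Otherwise wake as late as this event tolerates. */
		wake_ns = ev[i].when_ns + ev[i].prec_ns;
		wakeups++;
	}
	return (wakeups);
}
```

For the example in the list, two events one minute away and 1 millisecond apart need a single wakeup if each tolerates 5 milliseconds of delay, but two wakeups if the precision is zero.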