Attilio Rao

Email: <attilio AT FreeBSD DOT org>

For the Summer of Code 2007 I will work on rewriting the lockmgr synchronization primitive, since recent efforts (in particular the sun4v port) showed that it is a serious bottleneck for filesystem workloads (due to its widespread use in VFS land). One of the main goals of the rewrite is to offer a more tailored interface, trimming all unused (and possibly buggy) features of lockmgr, and a more intelligent interface (which would help a lot with debugging and lock assertions).

- Open problems with lockmgr

Currently, a lockmgr lock is implemented through a mutex, which protects some flags used to handle concurrent accesses to the lockmgr structure itself, and a single wait channel (managed through calls to msleep()) that ensures the atomicity of operations performed inside the path protected by the primitive. The implementation has various performance problems which need to be addressed:

* Two different levels of contention are created. The first is represented by the front-end mutex protecting the structure, while the second is represented by the msleep() handling. Also note that, since msleep_spin() is much more recent than lockmgr, lockmgr is forced to use a sleepable mutex, which is certainly not ideal here.

* Using just one wait channel makes it impossible to separate shared waiters from write waiters. This forces too many spurious wakeups and makes it impossible to apply any optimization to the wakeup algorithms.

* Using a mutex to protect the lockmgr structure itself does not allow differentiating between the 'easy' (mostly uncontested) case and the 'tougher' case.

* Using msleep() instead of calling the sleepqueue(9) function set directly just adds overhead, since a lot of msleep()'s internal flags can be reused for lockmgr.

* Using a "flat" sleepable mutex for accessing the lockmgr structure serializes even shared accesses to the structure (which is far from ideal).

* An extra problem is that lockmgr's clumsy interface makes it rather hard to equip the primitive properly with sanity checks and debugging stubs. The huge interface is also very complex, and some flags are actually unused or redundant.

In order to address these problems, the idea is to use the same algorithm already used for mutexes, rwlocks and sx locks: a cookie (containing waiter flags and other state) manipulated through atomic operations (and memory barriers), plus direct calls to sleepqueue(9) functions. Using sleepqueue(9) directly also addresses the problem of spurious wakeups, since its multi-queue system can be used to correctly separate the different classes of waiters. The interface is intended to be split into self-explanatory functions (very similar to what we already have for sx locks), even if at first a global lockmgr() macro is expected to be used for compatibility.
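To make the cookie idea more concrete, here is a minimal userland sketch in C11 of the uncontested fast path only. Everything in it is an assumption made for illustration: the bit layout, the LKS_* names and the cookie_* functions are hypothetical and are not taken from the FreeBSD sources. Where a kernel implementation would put the thread to sleep on the appropriate sleepqueue(9) queue, this sketch simply reports failure.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical cookie layout, chosen only for this sketch. */
#define	LKS_SHARED_WAITERS	((uintptr_t)0x1)	/* readers are sleeping */
#define	LKS_EXCL_WAITERS	((uintptr_t)0x2)	/* writers are sleeping */
#define	LKS_EXCL_HELD		((uintptr_t)0x4)	/* held exclusively */
#define	LKS_ONE_SHARER		((uintptr_t)0x8)	/* sharer count unit */
#define	LKS_UNLOCKED		((uintptr_t)0)

struct cookie_lock {
	_Atomic uintptr_t lk_cookie;
};

static inline void
cookie_init(struct cookie_lock *lk)
{
	atomic_init(&lk->lk_cookie, LKS_UNLOCKED);
}

/* Exclusive fast path: one compare-and-swap from unlocked to held. */
static inline bool
cookie_try_xlock(struct cookie_lock *lk)
{
	uintptr_t expected = LKS_UNLOCKED;

	return (atomic_compare_exchange_strong_explicit(&lk->lk_cookie,
	    &expected, LKS_EXCL_HELD, memory_order_acquire,
	    memory_order_relaxed));
}

/* Shared fast path: bump the sharer count unless a writer holds or waits. */
static inline bool
cookie_try_slock(struct cookie_lock *lk)
{
	uintptr_t old;

	old = atomic_load_explicit(&lk->lk_cookie, memory_order_relaxed);
	for (;;) {
		if ((old & (LKS_EXCL_HELD | LKS_EXCL_WAITERS)) != 0)
			return (false);	/* a kernel version would sleep here */
		/* On failure, 'old' is refreshed and the check is redone. */
		if (atomic_compare_exchange_weak_explicit(&lk->lk_cookie,
		    &old, old + LKS_ONE_SHARER, memory_order_acquire,
		    memory_order_relaxed))
			return (true);
	}
}

/* Release either kind of hold; waiter flags are left untouched. */
static inline void
cookie_unlock(struct cookie_lock *lk)
{
	uintptr_t old;

	old = atomic_load_explicit(&lk->lk_cookie, memory_order_relaxed);
	/* Lock assertion: the lock must actually be held. */
	assert((old & LKS_EXCL_HELD) != 0 || old >= LKS_ONE_SHARER);
	if ((old & LKS_EXCL_HELD) != 0)
		atomic_fetch_and_explicit(&lk->lk_cookie, ~LKS_EXCL_HELD,
		    memory_order_release);
	else
		atomic_fetch_sub_explicit(&lk->lk_cookie, LKS_ONE_SHARER,
		    memory_order_release);
}
```

In this sketch shared acquisitions never serialize on a mutex: several readers can succeed back to back, each with a single atomic operation. The two separate waiter bits hint at how shared and exclusive sleepers could be kept on different queues, so that a release wakes only the class of waiters that can make progress.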


- Detailed plan

The lockmgr struct should be smaller than the current one, including only a cookie, a lock object for keeping track of sleeping operations and sanity checks, a recursion counter, a name pointer and possibly variables for timo and priority (for these last two a preliminary analysis is needed to see whether they are both actually used).
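As a rough illustration of the layout described above, the following userland sketch lists one field per item. All names here (struct lock_sketch, struct lock_object_sketch, the lk_* fields and lock_sketch_init()) are hypothetical stand-ins, not the final kernel definitions, and the lock object is reduced to a placeholder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder for the kernel lock object; contents are illustrative. */
struct lock_object_sketch {
	unsigned int	 lo_flags;	/* debugging / sanity-check flags */
};

/* Hypothetical slimmed-down lockmgr structure. */
struct lock_sketch {
	struct lock_object_sketch lk_object;	/* sleep tracking, checks */
	volatile uintptr_t	 lk_cookie;	/* lock state, waiter flags */
	unsigned int		 lk_recurse;	/* recursion counter */
	const char		*lk_wmesg;	/* name pointer */
	int			 lk_timo;	/* may go away after analysis */
	int			 lk_pri;	/* may go away after analysis */
};

/* Set every field to its idle value; assumed semantics, for illustration. */
static inline void
lock_sketch_init(struct lock_sketch *lk, const char *wmesg, int timo, int pri)
{
	assert(lk != NULL && wmesg != NULL);
	lk->lk_object.lo_flags = 0;
	lk->lk_cookie = 0;
	lk->lk_recurse = 0;
	lk->lk_wmesg = wmesg;
	lk->lk_timo = timo;
	lk->lk_pri = pri;
}
```

If the preliminary analysis shows that timo and priority are not both needed, the last two fields would simply be dropped from such a layout.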

As previously said, the API is going to be restructured in a cleaner way, but initially a 'lockmgr-oldstyle' wrapper will be provided in order to port lockmgr consumers to the new interface. The new interface envisages the introduction of the following functions:

* void lockmgr_init(struct lock *lck, const char *wmesg)

* void lockmgr_init_flags(struct lock *lck, const char *wmesg, int opts, int timo, int pri)

* void lockmgr_destroy(struct lock *lck)

* int lockmgr_xlock(struct lock *lck, struct mtx *ilock, int opts)

* int lockmgr_slock(struct lock *lck, struct mtx *ilock, int opts)

* int lockmgr_unlock(struct lock *lck)

* void lockmgr_downgrade(struct lock *lck)

* int lockmgr_try_upgrade(struct lock *lck, int opts)

* int lockmgr_drain(struct lock *lck, struct mtx *ilock, int opts)

* void lockmgr_undrain(struct lock *lck)

* void lockmgr_set_timopri(struct lock *lck, int pri, int timo)

Note: this scheme may change slightly depending on the analysis of some edge cases.

The opts passed to lock operations can be:

LCK_INTERLOCK

LCK_NOWAIT

LCK_SLEEPFAIL

LCK_TIMELOCK

The options LCK_CANRECURSE and LCK_KERNPROC, in contrast, are only valid for lockmgr_xlock(). Among these, the new one is LCK_KERNPROC, which sets the owner to the old LK_KERNPROC (otherwise curthread is always chosen). LCK_KERNPROC is the only option allowed by lockmgr_try_upgrade().
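The per-operation rules above lend themselves to a simple sanity check. The sketch below is one possible way to express them. The numeric bit values, enum lck_op and lck_opts_valid() are assumptions invented for illustration, and lockmgr_drain() is left out because the text does not say which options it accepts.

```c
#include <stdbool.h>

/* Hypothetical bit values; only the names come from the proposal. */
#define	LCK_INTERLOCK	0x01
#define	LCK_NOWAIT	0x02
#define	LCK_SLEEPFAIL	0x04
#define	LCK_TIMELOCK	0x08
#define	LCK_CANRECURSE	0x10
#define	LCK_KERNPROC	0x20

/* Options accepted by the lock operations in general. */
#define	LCK_COMMON_OPTS							\
	(LCK_INTERLOCK | LCK_NOWAIT | LCK_SLEEPFAIL | LCK_TIMELOCK)
/* lockmgr_xlock() additionally accepts the two exclusive-only options. */
#define	LCK_XLOCK_OPTS							\
	(LCK_COMMON_OPTS | LCK_CANRECURSE | LCK_KERNPROC)
/* lockmgr_try_upgrade() accepts LCK_KERNPROC only. */
#define	LCK_UPGRADE_OPTS	LCK_KERNPROC

/* Hypothetical operation selector used only by this sketch. */
enum lck_op { LCK_OP_XLOCK, LCK_OP_SLOCK, LCK_OP_TRY_UPGRADE };

/* Return true when 'opts' contains only bits allowed for 'op'. */
static inline bool
lck_opts_valid(enum lck_op op, int opts)
{
	int allowed;

	switch (op) {
	case LCK_OP_XLOCK:
		allowed = LCK_XLOCK_OPTS;
		break;
	case LCK_OP_SLOCK:
		allowed = LCK_COMMON_OPTS;
		break;
	case LCK_OP_TRY_UPGRADE:
		allowed = LCK_UPGRADE_OPTS;
		break;
	default:
		return (false);
	}
	return ((opts & ~allowed) == 0);
}
```

With dedicated functions per operation, a check of this kind could sit behind an assertion at the top of each function, which is much harder to do with a single lockmgr() entry point that multiplexes every operation through its flags.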

Flags for lockmgr_init_flags() are the same as the usual ones for mutex, rw and sx locks:

LCK_DUPOK

LCK_NOPROFILE

LCK_NOWITNESS

LCK_QUIET


AttilioRao (last edited 2008-06-17 21:38:23 by localhost)