EPOCH
EPOCH is a concurrency primitive used in the FreeBSD base system. It is provided by Concurrency Kit, a library offering safe memory reclamation mechanisms and non-blocking data structures for the research, design and implementation of high performance concurrent systems.
Contents
Getting started with EPOCH
EPOCH(9) may seem like a black box, but when you are finished reading this document, hopefully you will have a greater understanding.
The Basics
There are different kinds of EPOCH implementations:
#
Using shared memory
Preemptive
1
YES
NO
2
NO
NO
3
YES
YES
4
NO
YES
FreeBSD currently supports preemptive and non-preemtive EPOCH's using shared memory.
One can view an epoch like a resource manager with two states:
- Resource is blocked
- Resource is not blocked
First Try
An epoch supports recursion by keeping track of the number of times a resource was blocked.
Let’s summarize this with an example EPOCH(9) structure:
struct epoch { uint32_t state; uint32_t recursed; };
When we enter an EPOCH(9) resource, the following steps are taken:
if (epoch->recursed++ != 0) return; epoch->state = 1;
When we exit an EPOCH(9) resource, the following steps are taken:
if (epoch->recursed-- == 1) epoch->state = 0;
When we synchronize an EPOCH(9) resource, the following steps are taken:
while (epoch->state != 0) ;
This is pretty much what a mutex does and the simplest EPOCH(9) implementation can use just a single mutex:
Second Try
When we enter an EPOCH(9) resource, the following steps are taken:
lock();
When we exit an EPOCH(9) resource, the following steps are taken:
unlock();
When we synchronize an EPOCH(9) resource, the following steps are taken:
lock(); unlock();
If EPOCH(9) was that simple, there wouldn't be more to discuss.
EPOCH(9) allows multiple threads to enter the lock at the same time.
Then we can no longer use a regular mutex, but we need a read/write lock.
Third Try
When we enter an EPOCH(9) resource, the following steps are taken:
readlock();
When we exit an EPOCH(9) resource, the following steps are taken:
readunlock();
When we synchronize an EPOCH(9) resource, the following steps are taken:
writelock(); writeunlock();
The issue with this approach is that multiple threads may acquire the read lock simultaneously, meaning the synchronize operation can be blocked forever in the worst case.
Can this be avoided? Let's start from the beginning. What if the state variable was global?
Fourth Try
uint32_t global_state; struct epoch { uint32_t copy_state; uint32_t recursed; };
When we enter an EPOCH(9) resource, the following steps are taken:
if (epoch->recursed++ != 0) return; epoch->state = global_state + 1; global_state += 2;
When we exit an EPOCH(9) resource, the following steps are taken:
if (epoch->recursed-- == 1) epoch->state++;
When we synchronize an EPOCH(9) resource, the following steps are taken:
old_state = global_state; while (1) { uint32_t copy = epoch->state; int32_t diff = old_state - copy; /* * Check that EPOCH is not busy or if the read-section was * already exited at least one time, since this function was * called: */ if ((copy % 2) == 0 || (diff <= 0)) break; }
The issue with this implementation is still that the recurse count must reach zero, which is called grace time, before the global state wraps around 32-bits, but it catches heavy use of read-sections.
If the EPOCH(9) structure would be per-thread, then the grace time would not be a problem. However, then the synchronize function would need to scan all system threads. A compromise is to have per-CPU EPOCH(9) structures, which is what FreeBSD does.
Sometimes you want to free memory inside a read-only EPOCH(9) section. This is most conveniently done using epoch_call(), which is basically a callback function executed on a taskqueue, inserting an EPOCH(9) wait call before calling the callback function.
static void epoch_callback() { epoch_wait(); call_callback_function(); }
Epoch FAQ
- Can I lock a mutex inside a preemptable read only EPOCH(9) section?
Yes you can, if the EPOCH is of the pre-emptable kind. FreeBSD integrates EPOCH(9) with regular mutexes so that the priority propagates when an epoch_wait() call is executing, because a mutex is blocked inside an EPOCH(9) read section.
epoch_enter_preempt(); mtx_lock(); /* XXX this mutex may be owned by another thread for a short time blocking epoch_wait() */ mtx_unlock(); epoch_exit_preempt();
- Can I lock a mutex inside a non-preemptable read only EPOCH section? No, this is not allowed.
Further Information
Presentation: Introduction to Concurrency Kit (PDF, Samy Al Bahra, concurrencykit.org)
This talk is an introduction to the various motivations and features of Concurrency Kit, specifically with-in the context of applications in the FreeBSD kernel.
if_bridge(4): Network Bridge Device: epoch-ification
5x if_bridge Performance Improvement (freebsdfoundation.org)