SysUnit

SysUnit is a project that aims to bring support for writing kernel unit tests to FreeBSD.

Motivation and Overview of Unit Testing

Unit testing is a test methodology in which individual "units" of code are tested in isolation. In the case of a C project like the kernel, the "units" are typically individual source files. In the case of SysUnit, the goal is to be able to write a test suite that comprehensively tests a single code unit from the kernel, compile the code unit under test for userland, and then link the two together to produce a userland binary that will test the code unit when run. In order to compile and link the code unit for userland, SysUnit must provide userland libraries that mimic the behaviour of any KPIs that the code unit depends on.

Unit Test Benefits

Unit testing brings a number of benefits that other testing methodologies are unable to provide.

1. Fast Test Cycles

Building and running a unit test is a simple matter of compiling the code unit and the test suite, linking them against the userland KPI libraries and then running the resulting binary. The entire test cycle can typically complete in seconds. This allows for much faster test iterations compared to a full system test of the kernel (where it will usually take minutes just to boot into a new kernel).

2. Comprehensive Tests

Because the test unit is being run in an isolated test environment, the writer of the test has full control over how the code unit under test is run. This makes it much easier to test how the code will react to exceptional circumstances like KPIs returning errors (e.g. malloc(9) failing and returning NULL), as the writer of the test can instruct particular calls to the KPI to succeed or fail (e.g. the test writer could tell malloc(9) to succeed 3 times but fail on the fourth invocation).
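As a rough illustration of the malloc(9) example above, the sketch below hand-rolls a test-controlled allocator that can be told to fail on a chosen call. Every name in it (AllocControl, test_alloc(), make_buffers()) is hypothetical and invented for this page; none of them are SysUnit APIs, and make_buffers() is only a toy stand-in for a real code unit.

```cpp
#include <cstddef>
#include <cstdlib>

// Hypothetical test-controlled allocator standing in for malloc(9).
// These names are illustrative only; they are not SysUnit APIs.
struct AllocControl {
    int calls = 0;         // allocations attempted so far
    int fail_on_call = 0;  // 1-based call that returns NULL; 0 means never fail
};

static AllocControl g_alloc;

static void *test_alloc(std::size_t size)
{
    ++g_alloc.calls;
    if (g_alloc.calls == g_alloc.fail_on_call)
        return nullptr;    // injected failure
    return std::malloc(size);
}

// Toy "code under test": allocates `count` buffers and must release
// everything it already holds if any allocation fails.
// Returns `count` on success or -1 on failure.
static int make_buffers(void **bufs, int count, std::size_t size)
{
    for (int i = 0; i < count; ++i) {
        bufs[i] = test_alloc(size);
        if (bufs[i] == nullptr) {
            // Unwind the allocations made before the failure.
            while (--i >= 0) {
                std::free(bufs[i]);
                bufs[i] = nullptr;
            }
            return -1;
        }
    }
    return count;
}
```

A test case would set g_alloc.fail_on_call = 4, call make_buffers() with a count of 4, and then check that it returned -1 and left no buffer allocated. Reaching that cleanup path on a live kernel would require an actual memory shortage at exactly the right moment.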

3. Eliminating Timing Dependencies

A number of bugs only arise if the code is called in a particular sequence of function calls, or if messages arrive in a precise order. In a system test it can be difficult to arrange for the timing to be just right, as interrupts, system daemons and other code running on the system can run in between the calls made by the test, potentially upsetting the timing. Additionally, the timing might well be dependent on the precise hardware in use. Timing issues can cause system tests to fail intermittently, which makes failures much harder to diagnose.

In a unit test, arranging for a particular series of function calls is trivial as the test case can just make the function calls in whatever order it chooses (similarly messages can be injected into the code unit in whatever order).

This property means that properly written unit tests will be entirely deterministic, and so make diagnosing bugs much easier.

4. More Precise Bug Localization

When a system test fails, it is typically not clear which part of the kernel actually caused the bug. A failure in a test that calls "read(2)" could plausibly be related to the syscall layer, the filesystem, geom, or the disk driver. Also, the symptoms reported by the test are frequently only tangentially related to the actual mechanics of the bug. If a geom bug causes an inode to be corrupted, that could appear as an EPERM (if the file mode is corrupted) or the file being shorter or longer than expected (if the length in the inode is corrupted).

Because a unit test tests a single unit of code, the bug can be located in that unit immediately. Additionally, most unit test cases call only a small subset of the functions exported in a unit, so the bug can be further narrowed to only those functions. This significantly reduces the scope of the investigation.

5. Simpler Debugging

Attaching a debugger to the kernel and single-stepping through it can be quite painful. Recompiling the kernel to add additional logging can be tricky as logging in a heavily used piece of code runs the risk of livelocking the system or rolling over a KTR buffer too quickly to see the problem. DTrace is very helpful for debugging the kernel, but it can only probe function entry and exit points or explicit probes in the code. As a unit test is a standalone program, you have a much wider range of debugging techniques available.

Unit Testing Limitations

Unit testing, while very useful, cannot be and is not intended to be a full replacement for system tests. There are large, important categories of bugs that cannot be caught with unit tests. For example, integration bugs, multithreading bugs (e.g. race conditions) and performance issues are all very unlikely to be found via a unit test. Unit testing is a complement to system testing that aims to make it easier to find certain classes of bugs that are very difficult to catch with a system test, as well as decrease the length of the test cycle during initial development and testing. It is still very important to test the system as a whole once unit tests are passing.

Userland KPI Libraries

Main Article: SysUnit/StubsFakesAndMocks

In unit testing, there are a number of mechanisms that can be used to implement an API. Some of the most common techniques are stubbing, faking and mocking the API. The definitions of these terms are not universal across all unit test frameworks, but in SysUnit they are defined as follows:

1. Stubs

A stub is a function implementation whose behaviour never varies. Frequently a stub will always return a single hard-coded value. Stubs are most frequently used when the code unit needs to link against a particular symbol, but the subset of the code unit that is actually being tested never calls that symbol. For example, if a kernel source file defines some sysctl handlers that call sysctl_handle_int() but your test suite does not invoke the sysctl handlers, a stub for sysctl_handle_int() would be an appropriate mechanism. In the specific case where the tester believes that a stub will never be called and it is only being provided to make the unit test link, it can be a good idea to have the stub unconditionally call abort(). That way, if the code under test unexpectedly calls the stub the tester gets immediate feedback that their understanding of how the code under test is being invoked was wrong.
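The two flavours of stub described above might look like the sketch below. The priv_check() stub is an illustrative choice not taken from this page, and both prototypes are written from memory of the kernel headers (sys/priv.h and sys/sysctl.h), so they should be checked against the tree being tested. The structures are only forward-declared because a stub never looks inside its arguments.

```cpp
#include <cstdint>
#include <cstdio>
#include <cstdlib>

// Opaque forward declarations: the stubs never dereference these.
struct thread;
struct sysctl_oid;
struct sysctl_req;

// Stub with a single hard-coded result: every privilege check succeeds.
// (Prototype assumed from priv_check(9).)
extern "C" int priv_check(struct thread *, int)
{
    return 0;
}

// Link-only stub: the tests are not expected to reach the sysctl handlers,
// so any call means the tester's assumptions were wrong. Fail loudly.
// (Argument list assumed to match the kernel's SYSCTL_HANDLER_ARGS.)
extern "C" int sysctl_handle_int(struct sysctl_oid *, void *, std::intmax_t,
    struct sysctl_req *)
{
    std::fprintf(stderr, "unexpected call to sysctl_handle_int()\n");
    std::abort();
}
```

Both definitions exist only to satisfy the linker. The difference is that the first is safe to call from the code under test, while the second turns an unexpected call into an immediate, obvious test failure.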

2. Fakes

A fake is a function implementation whose behaviour varies based on the parameters passed in. A fake may only provide a subset of the functionality of the real API if the code under test doesn't require the full range of functionality. The distinction between a fake and a stub is fairly blurry, but typically if the function has any logic at all it would be termed a fake rather than a stub. A plausible example of a fake of malloc(9) would be a function that calls the real libc malloc() and then conditionally zeroes the memory (based on the M_ZERO flag).
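A minimal sketch of the malloc(9) fake described above follows. It is named fake_malloc() here only so that this standalone example does not clash with libc's malloc(); how SysUnit actually names or links its fake is not shown on this page. The flag values are copied in for illustration, and a real fake would take them from the kernel's sys/malloc.h.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

struct malloc_type;  // opaque; this fake ignores the malloc type

// Flag values assumed for illustration; a real fake would use sys/malloc.h.
constexpr int M_NOWAIT = 0x0001;
constexpr int M_ZERO   = 0x0100;

// Fake of the kernel's malloc(size, type, flags): allocate with libc and
// honour M_ZERO. Other flags and the malloc type are deliberately ignored,
// since the code under test is assumed not to depend on them.
static void *fake_malloc(std::size_t size, struct malloc_type *, int flags)
{
    void *p = std::malloc(size);
    if (p != nullptr && (flags & M_ZERO) != 0)
        std::memset(p, 0, size);
    return p;
}

// Matching fake of the kernel's free(addr, type).
static void fake_free(void *p, struct malloc_type *)
{
    std::free(p);
}
```

Unlike a stub, this function has logic that depends on its arguments, but it still implements only the subset of malloc(9) behaviour that a typical code unit relies on.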

3. Mocks

A mock is a function implementation whose behaviour is configurable by the test case. Mocks can, for example, verify that function parameters meet requirements set by the test case, or return canned values based on programmable criteria (e.g. return failure on the third call or return failure if an argument is a certain value).

Mock APIs are typically not coded by hand, as providing a full breadth of programmable behaviour would be prohibitively time consuming. Instead, a mocking framework library is used to implement the mock. The mocking framework can provide a generic set of tools for customizing mock behaviour, and then the implementor of the mock API merely has to provide glue code to map the API to the mock.

SysUnit uses GoogleMock as its mocking framework.
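To make the split between mock and glue code concrete without depending on the framework, the sketch below hand-rolls a very small mock. It is not GoogleMock and not how SysUnit implements its mocks; it only illustrates the idea that the symbol the code unit links against forwards to an object the test case has programmed. kpi_reserve(), ReserveMock and setup_rings() are all hypothetical names invented for this example.

```cpp
#include <cerrno>
#include <cstddef>
#include <deque>

// One programmed call: the argument the test requires and the value to return.
struct Expectation {
    std::size_t expected_bytes;
    int result;
};

// Hand-rolled mock for a hypothetical KPI: int kpi_reserve(size_t bytes).
class ReserveMock {
public:
    void expect(std::size_t bytes, int result) { queue_.push_back({bytes, result}); }

    int call(std::size_t bytes)
    {
        // An unexpected call or a wrong argument is recorded as a violation.
        if (queue_.empty() || queue_.front().expected_bytes != bytes) {
            ++violations_;
            return -1;
        }
        int result = queue_.front().result;
        queue_.pop_front();
        return result;
    }

    // True when every programmed call happened and nothing unexpected did.
    bool satisfied() const { return queue_.empty() && violations_ == 0; }

private:
    std::deque<Expectation> queue_;
    int violations_ = 0;
};

static ReserveMock *g_reserve_mock = nullptr;

// Glue code: the symbol the code under test links against forwards to the mock.
extern "C" int kpi_reserve(std::size_t bytes)
{
    return g_reserve_mock->call(bytes);
}

// Toy code under test: reserves two regions and stops at the first error.
static int setup_rings(std::size_t rx, std::size_t tx)
{
    int error = kpi_reserve(rx);
    if (error != 0)
        return error;
    return kpi_reserve(tx);
}
```

A test case would point g_reserve_mock at a fresh ReserveMock, program it with expect(128, 0) and expect(256, ENOMEM), and then check that setup_rings(128, 256) returns ENOMEM and that satisfied() is true. A mocking framework supplies the expectation queue, argument matching and reporting generically, leaving only the forwarding function to be written per API.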

Writing a New Test Suite

SysUnit tests are written in the GoogleTest framework. You can follow the GoogleTest documentation to write your test, with the following minor modifications:

- All SysUnit tests must use a test fixture and define test cases with the TEST_F() or TYPED_TEST() macro. Test cases defined with the TEST() macro are not supported.
- The test fixture must derive from SysUnit::TestSuite (defined in "sysunit/TestSuite.h") rather than testing::Test.
  - SysUnit::TestSuite is a subclass of testing::Test.
  - SysUnit::TestSuite defines its own versions of SetUp() and TearDown(), and these cannot be overridden. Their role is taken by TestCaseSetUp() and TestCaseTearDown(), which your test fixture can override.

Example code can be found in this sample test case: https://github.com/rysto32/sysunit/blob/master/netinet/tcp_lro_sample.gtest.cpp

SysUnit offers some APIs for writing unit tests. These include generic fake, mock and stub implementations suitable for use in most test cases, as well as other APIs for easily generating useful test data. These APIs are documented at SysUnit/APIs.

Current Status

A proof-of-concept is being developed on GitHub: https://github.com/rysto32/sysunit

In order to build and run the tests, you need the gmake, googletest, googlemock and git ports. Before you can build the tests, you must initialize the submodule with "git submodule init; git submodule update". You can build and run the tests with "gmake test" from the top-level directory.

Near-Term Goals

- Integrate SysUnit (along with gmock/gtest) into the FreeBSD tree and build infrastructure
- Integrate SysUnit tests with Kyua