Kernel Bug Report Triaging Guide
Kernel bug reports fall into several classes:
- Not a bug
- Kernel panic
- Diagnostic warnings
- Kernel hang
- Broken functionality
- Performance regression/problem
- Feature request
Not a bug
Sometimes kernel bug reports stem from a user or documentation failure: perhaps the feature doesn't in fact work the way the user thinks, perhaps the documentation is wrong, or perhaps the user is confused. When looking at a kernel bug report, it makes sense to keep an open mind about the source of the problem, and to confirm that the feature is intended to work the way the user expects.
Kernel panic
A kernel panic occurs when the kernel detects that a run-time invariant has failed -- that is, a check added to the kernel source to look for unexpected situations reflecting serious bugs has fired, and the kernel has halted.
A panic report can be a very productive type of bug report, as it includes information about what a problem is, and where it was detected. Some other types of bug reports, such as hangs or broken functionality reports, can be converted to panic reports by asking the user to enable INVARIANTS or other run-time diagnostics that test for more types of errors, for example leading to a panic instead of a hang.
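A typical way to ask a user to enable those extra run-time diagnostics is to have them rebuild their kernel with checking options turned on; a kernel configuration fragment along these lines (INVARIANTS is normally paired with INVARIANT_SUPPORT):

```
options INVARIANTS
options INVARIANT_SUPPORT
options WITNESS
```

WITNESS adds lock-order checking at some run-time cost, which is why these options are not enabled by default on release kernels.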
Common sources of panics are:
- Page fault, in which an invalid pointer is accessed by the kernel (NULL pointer, freed memory, corrupted pointer).
- Double fault, when a page fault is improperly handled.
- A failed kernel assertion, typically as a result of an "if (x) panic(message)" or "KASSERT(!x, (message))" check.
- Memory scrubbing detecting a use-after-free.
When diagnosing a panic, it is useful to have:
- Machine architecture and general configuration
- General information about what the machine is being used for, and what led to the panic (if known)
- The panic message
- If this is a page fault, there will be a trap frame including addresses
- Kernel stack trace of the running thread at the time of panic, ideally with source code locations
- Other useful information such as a list of threads, locks held, etc.
- Values of local variables in the stack frames around the panic
This information is often extracted using some combination of DDB, GDB, live debugging, and kernel dumps. A minimal dump report should include at least the panic message and stack trace, which may be sufficient to determine what general subsystems are involved, and possibly entirely diagnose the bug. Once it's possible to point the finger at a particular subsystem, it's easier to find a developer who might be able to help diagnose it.
Diagnostic warnings
As with kernel panics, these bugs are reported as a result of a developer anticipating a possible problem and generating debugging output when that situation occurs, without halting the machine (unlike a panic). A common source of non-panicking diagnostics is WITNESS, which detects possible deadlock situations and reports them on the console. Sometimes diagnostic warnings can be converted into panics, or into a drop into the debugger, by reconfiguring the source of the warnings; while more disruptive for the user, this can produce more debugging output. For example, WITNESS can be made to drop into the debugger by setting the debug.witness.kdb sysctl to 1, which can then be used to gather stack and lock trace information before either panicking or continuing.
Anyone running -CURRENT is expected to be able to identify and deal with diagnostic warnings, but we shouldn't necessarily expect this from users of -STABLE branches.
Kernel hang
Hangs occur when the kernel gets into a state where it is unable to continue processing, but no kernel invariant has detected the condition, so the system has not panicked. Hangs of various types are possible, and each requires different consideration:
- The completely hard hang. This might occur if the kernel gets into a loop on one CPU, with other CPUs then waiting on it, possibly with interrupts disabled. Check for this by hitting the caps lock key to see whether the light toggles, or by pinging the machine. A serial or console break won't get you into the debugger, but if the box has a hardware watchdog or NMI button, that may be able to.
- Deadlock. This occurs when there is a cycle in lock acquisition, or some other locking or resource bug, and while it is similar to a completely hard hang, there's a good chance of getting into the debugger using a console/serial break. Deadlocks happen when all threads that could be doing work end up waiting indefinitely for something that will never happen.
- Livelock. Like a deadlock, but rather than the system halting because no work can be done, the system is busy doing work without making any progress. Likewise, a break will often get you into the debugger.
- Other kinds of hangs that remind users of kernel hangs, but aren't. For example, if X11 locks up, the system itself may operate fine (you can SSH in), but the video console makes no progress. It could be a kernel bug ... but it could be a bug in an application. Similar problems involve network outages, where a system appears to stop responding but it's actually the path to the system that has failed.
When the kernel hangs, the goal is to get into the kernel debugger and figure out what it's doing. As with panics, this involves stack traces, lock information, kernel thread information, etc. From this one can identify a potentially responsible subsystem and find a developer to help.
Broken functionality
This is a broad class of failures where the system doesn't behave as documented -- perhaps an unexpected error, a feature that doesn't work, etc. There are no general rules here, except perhaps to first step through the user's configuration and make sure it's not a user error. The good news is that it's often easy to identify the responsible subsystem, or at least a subsystem involved in the problem, and then find a developer who might be able to help.
Performance regression/problem
This class of reports has to do with something either performing worse over time (MySQL got twice as many transactions/sec yesterday as it does today), or performing worse than expected (Linux gets twice as many transactions/sec). These reports can be hard to deal with because understanding performance problems is very difficult. There are a few questions you should ask yourself (and possibly also the bug reporter) before beginning work on a performance problem report:
- If a comparison between two systems is being made, is it an apples-to-apples comparison?
Are you comparing FreeBSD 6 and FreeBSD 7 on the same hardware, or different hardware?
Are you using the same version of MySQL on Linux and FreeBSD? If so, are you using the same database storage format?
Could you downgrade your version of Firefox on FreeBSD 7 to match what's on FreeBSD 6 and see if it behaves the same way?
- Can we narrow down what has "changed" between "before" and "after" in a before/after scenario?
Sometimes obvious changes in a configuration will have no effect on performance, and sometimes very subtle variables will have a big impact. For example, suppose the reporter upgraded MySQL versions and also replaced a failed RAID controller with a seemingly identical one from the same vendor. Could they try downgrading the MySQL version and see if they still see the performance drop?
With a bit of Q&A, you might discover that they changed the controller, and that it has a different firmware minor revision that fixed a bug that led to improved performance but greater risk of data loss...
On a similar note, people often upgrade their OS and applications at the same time, and then blame either the OS or the applications for a performance loss. Perhaps it was the other one...
- Can the performance change be captured in a simple test scenario without reproducing the entire setup?
Often, performance problems are experienced in real-world installs involving complex configurations or confidential business data. If the reporter can reproduce the problem using a simpler benchmark setup and help us reproduce it the same way, we're much more likely to be able to help them with performance. The FreeBSD Project has centralized testing resources that can then be used to analyze and optimize performance using the benchmark, which is a very effective way to make progress.
- Is the benchmark a bad or misapplied benchmark?
Often we receive reports of performance problems based on a microbenchmark that doesn't actually correspond to real-world performance. A few classic examples to illustrate the point:
- bsdtar reading data from a hard disk and writing to /dev/null more slowly than gnutar. It turns out gnutar optimizes writing tarballs to /dev/null by not calling the write() system call on the data, undoubtedly to help gnutar developers measure the internal performance of gnutar rather than the performance of /dev/null. As a result, bsdtar "benchmarks" much more slowly than gnutar, but might well perform fine in practice with a real workload -- for example, writing to a tarball in a file system.
- There is significant variation in the performance of a disk based on where on the disk you are writing. The outer edge of a disk moves at a higher linear speed, and more data can be fit on it contiguously, so it performs better. If someone installs two versions of FreeBSD on the same hard disk using two different disk partitions, one will inevitably be closer to the edge of the disk than the other. If swap or a file system is part of the performance measurement, then the version closer to the edge may perform better because the hardware it's running on is simply faster, even though it appears to the end user as though there's no configuration difference.
- Is the user pointing the finger in the wrong place?
Some more examples:
- The user says "the network stack is too slow!" but really they mean "FreeBSD's if_em driver is negotiating 100Mbps instead of 1000Mbps to my switch".
- The user says "the file system is too slow!" but really they mean "FreeBSD's ATA driver is negotiating a slower UDMA mode than Linux".
- The user says "ftp is really slow!" but they actually mean "The resolver library isn't handling a case right, so name lookups take one extra timeout".
Feature request
Often reports of broken functionality are actually feature requests, whether the reporter means that or not. There are no general rules here, except that our bug reporting system isn't a great place for feature requests, as they tend to be forgotten about. Usually the way to get a feature implemented is to snag a developer who wants to implement it, or to implement it yourself. There are various strategies for the former. "Submit patches" is an unhelpful reply that may be taken poorly, so a more polite response is suggested.
(We should probably have a PR classification for 'feature request').