Instruction-level dynamic tracing

Student: ChristosMargiolis (christos@)
Mentor: MarkJohnston (markj@)

Project description

The objective of this project is to implement a new DTrace provider that can dynamically trace any instruction in a given kernel function. FBT (Function Boundary Tracing) is an existing DTrace provider that exposes probes at the entry and return points of a kernel function. The new provider, which we'll call kinst, will reuse parts of the FBT mechanism but extend it so that probes can be placed at arbitrary instructions within a kernel function. This will be especially useful for tracing long kernel functions, and it will also lay the groundwork for tracing inline functions.

Deliverables

Required

Implement support in dtrace(1). The new provider will be invoked as dtrace -n 'kinst::<function>:<offset>', where function is the name of the kernel function to trace and offset is the offset of the target instruction within that function. The offset can be obtained from the function's disassembly.
Implement the new provider in the kernel. The provider will trace an arbitrary instruction using the following algorithm:
- Disassemble the function.
- Verify that offset is an instruction boundary. If it is not, dtrace(1) exits with an error; otherwise, fetch the instruction at offset.
- Allocate a "trampoline" (executable memory) that stores a copy of the instruction together with its address, so that execution can jmp back when done.
- Overwrite the instruction with int3, the instruction that raises the breakpoint exception (vector 3) on x86.
- When the breakpoint handler runs, check whether it was triggered by a breakpoint installed by DTrace and, if so, send the trace data.
- iret to the trampoline.
- Execute the saved instruction from the trampoline, then jmp back into the function, to the instruction following the probed one, so that execution continues without hitting the breakpoint again.
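The boundary check and the save-and-patch steps above can be sketched in userland C. This is only an illustration under stated assumptions, not the provider's actual code: it operates on a plain byte buffer instead of kernel text, it takes the instruction lengths as input (as if a disassembler had already produced them), and the names struct tramp, insn_len_at, probe_install and probe_remove are hypothetical. The sketch also assumes that execution resumes at the instruction following the probed one.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define KINST_INT3	0xCC	/* x86 one-byte breakpoint opcode */
#define KINST_MAXINSN	15	/* maximum length of an x86 instruction */

/* Hypothetical record of a displaced instruction; not the real layout. */
struct tramp {
	uint8_t	insn[KINST_MAXINSN];	/* copy of the original instruction */
	size_t	len;			/* its length in bytes */
	size_t	resume;			/* offset at which to resume */
};

/*
 * Walk the instruction lengths produced by a disassembler and return the
 * length of the instruction starting at `off`, or 0 if `off` is not an
 * instruction boundary.
 */
static size_t
insn_len_at(const size_t *lens, size_t ninsn, size_t off)
{
	size_t cur = 0;

	for (size_t i = 0; i < ninsn; i++) {
		if (cur == off)
			return (lens[i]);
		if (cur > off)
			break;
		cur += lens[i];
	}
	return (0);
}

/*
 * Save the instruction at `off` into `tp`, then overwrite its first byte
 * with int3. Returns 0 on success, -1 if `off` is not a boundary.
 */
static int
probe_install(uint8_t *text, const size_t *lens, size_t ninsn, size_t off,
    struct tramp *tp)
{
	size_t len;

	assert(text != NULL && tp != NULL);
	len = insn_len_at(lens, ninsn, off);
	if (len == 0 || len > KINST_MAXINSN)
		return (-1);
	memcpy(tp->insn, text + off, len);
	tp->len = len;
	tp->resume = off + len;	/* assumed: continue after the probed insn */
	text[off] = KINST_INT3;
	return (0);
}

/* Undo probe_install() by restoring the saved first byte. */
static void
probe_remove(uint8_t *text, size_t off, const struct tramp *tp)
{
	text[off] = tp->insn[0];
}
```

For example, given the bytes 55 48 89 e5 5d c3 (push %rbp; mov %rsp,%rbp; pop %rbp; ret) with lengths 1, 3, 1, 1, installing a probe at offset 1 saves the three-byte mov and patches its first byte, while offset 2 is rejected because it falls in the middle of that instruction. A real provider would additionally have to make the trampoline executable and patch the kernel text safely with respect to other CPUs, which this sketch does not attempt.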

Optional

Implement the provider for other architectures as well (arm64, riscv, ...).
Write a standalone C program that takes a function as input and prints the address ranges of all inline calls in that function, using DWARF via dwarf(3).
Write a standalone program that takes a C file and a line number and resolves <file>:<line> tuples to <function>:<offset> ones (to be used as arguments to kinst), provided that <line> falls within a kernel function. This avoids the hassle of manually debugging and scanning ELF symbol tables. The program could use DWARF as well.