DTrace Examples
This page contains a quick-start guide to DTrace and some useful examples of how to use DTrace to instrument your kernel for performance or functional analysis.
Contents
-
DTrace Examples
- Histogram of time spent in each callout
- Measure total time spent in each callout handler over 1-second intervals
- Measure total time spent in non-MPSAFE callouts over 10-second intervals
- Collect stack traces leading to failed priv_check_cred() calls
- Histogram of time spent waiting on NFSv3 RPCs by system call
- Trace of NFS RPCs and attribute/access cache interactions by system call
- Distribution of name cache entry string lengths
Histogram of time spent in each callout
This script simply samples time spent in callout handlers and builds a histogram with the results so that you can quickly see if there are long-running callouts to investigate.
callout_execute:::callout-start { self->cstart = timestamp; } callout_execute:::callout-end { @length = quantize(timestamp - self->cstart); }
Submitted by RobertWatson.
cinnamon-freebsd# dtrace -s d.d dtrace: script 'd.d' matched 2 probes ^C value ------------- Distribution ------------- count 1 | 0 2 | 7 4 | 9 8 | 14 16 | 20 32 | 34 64 | 44 128 |@ 69 256 |@ 99 512 |@ 112 1024 |@ 133 2048 |@ 131 4096 |@@@@ 377 8192 |@@@@@@@@@@@@ 1322 16384 |@@@@@@ 689 32768 |@@@@ 416 65536 |@@@ 362 131072 |@@@ 304 262144 |@ 124 524288 | 22 1048576 | 7 2097152 | 2 4194304 | 1 8388608 | 3 16777216 | 1 33554432 | 0
Measure total time spent in each callout handler over 1-second intervals
This script is loosely based on a similar one from OpenSolaris's sdt Provider page, but due to slight differences in types and probe naming, we can't use theirs verbatim:
#pragma D option quiet callout_execute:::callout-start { self->cstart = timestamp; } callout_execute:::callout-end { @callouts[((struct callout *)arg0)->c_func] = sum(timestamp - self->cstart); } tick-1sec { printa("%40a %10@d\n", @callouts); clear(@callouts); printf("\n"); } BEGIN { printf("%40s | %s\n", "function", "nanoseconds per second"); }
Submitted by RobertWatson.
Note: Probes were named callout_start and callout_end in older versions.
This script works by measuring the time at the start of a callout, then again on termination and billing that time incrementally to the callout function that ran. Once a second, we print the results and reset the counters.
cinnamon-freebsd# dtrace -s d.d function | nanoseconds per second kernel`ipport_tick 19215 kernel`nd6_timer 21521 kernel`lance_watchdog 31848 kernel`kbdmux_kbd_intr_timo 34418 kernel`logtimeout 106167 kernel`pffasttimo 149003 kernel`scrn_timer 157988 kernel`tcp_isn_tick 161720 kernel`pfslowtimo 201723 kernel`dcons_timeout 309218 0xc42deeb0 443589 kernel`lim_cb 584293 kernel`atkbd_timeout 789599 0xc42d34f0 807851 kernel`sleepq_timeout 4198977 ^C
Measure total time spent in non-MPSAFE callouts over 10-second intervals
The same as the previous script, but only include callouts that use Giant as their lock:
#pragma D option quiet #pragma D option dynvarsize=64m callout_execute:::callout_start /(((struct callout *)arg0)->c_lock == &`Giant.lock_object)/ { self->start = timestamp; } callout_execute:::callout_end /self->start/ { @callouts[((struct callout *)arg0)->c_func] = sum(timestamp - self->start); self->start = 0; } tick-10sec { normalize(@callouts, 10); printa("%40a %10@d\n", @callouts); printf("\n"); clear(@callouts); } BEGIN { printf("%40s | %s\n", "function", "nanoseconds per second"); }
Submitted by RobertWatson
As before, except that only callouts using Giant are measured. Note the "`" before Giant in order to name a kernel global variable.
cinnamon-freebsd# dtrace -s d.d dtrace: dynamic variable size lowered to 16m function | nanoseconds per second kernel`nd6_timer 19388 kernel`kbdmux_kbd_intr_timo 141079 kernel`scrn_timer 345507 kernel`logtimeout 418742 kernel`atkbd_timeout 574425 ^C
Collect stack traces leading to failed priv_check_cred() calls
This script captures an array of counters, one for each stack trace leading to a failed kernel privilege check:
priv:::priv-err { @traces[stack()] = count(); }
Submitted by RobertWatson
dtrace: script 'priv.d' matched 1 probe dtrace: aggregation size lowered to 6m ^C kernel`priv_check_cred+0xc3 kernel`vaccess+0x104 kernel`ufs_access+0x1e6 kernel`VOP_ACCESS_APV+0xa5 kernel`nfsrv_access+0xfa kernel`nfsrv3_access+0x2fc kernel`nfssvc_program+0x397 kernel`svc_run_internal+0x942 kernel`svc_run+0x7f kernel`nfssvc_nfsd+0xad kernel`nfssvc+0x235 kernel`syscall+0x2d3 kernel`0xc0b43790 32 kernel`priv_check_cred+0xc3 kernel`vaccess+0x179 kernel`ufs_access+0x1e6 kernel`VOP_ACCESS_APV+0xa5 kernel`nfsrv_access+0xfa kernel`nfsrv3_access+0x23f kernel`nfssvc_program+0x397 kernel`svc_run_internal+0x942 kernel`svc_run+0x7f kernel`nfssvc_nfsd+0xad kernel`nfssvc+0x235 kernel`syscall+0x2d3 kernel`0xc0b43790 32 ...
Histogram of time spent waiting on NFSv3 RPCs by system call
This script measures all time spent waiting on RPC replies for each system call, and then generates a histogram of those times sorted by system call name.
Exists in FreeBSD as /usr/src/share/dtrace/nfsclienttime.
syscall:::entry { self->count = 0; } nfsclient:nfs3::start { self->timestamp = timestamp; } nfsclient:nfs3::done { self->count += (timestamp - self->timestamp); } syscall:::return /self->count != 0/ { @syscalls[probefunc] = quantize(self->count); }
Submitted by RobertWatson
dtrace: script 'd.d' matched 1040 probes dtrace: aggregation size lowered to 6m ^C fstatfs value ------------- Distribution ------------- count 131072 | 0 262144 |@@@@@@@@@@@@@ 2 524288 |@@@@@@@@@@@@@ 2 1048576 |@@@@@@@@@@@@@ 2 2097152 | 0 open value ------------- Distribution ------------- count 4 | 0 8 |@ 3 16 | 1 32 |@ 2 64 | 1 128 | 0 256 | 0 512 | 0 1024 | 0 2048 | 1 4096 | 1 8192 | 0 16384 | 0 32768 | 0 65536 | 0 131072 | 0 262144 |@@@@@ 11 524288 |@@@@@@@@@@@@@@@@ 35 1048576 |@@@@@@@@@@@@ 26 2097152 |@ 3 4194304 |@ 2 8388608 | 0 lstat value ------------- Distribution ------------- count 64 | 0 128 | 1 256 | 1 512 | 0 1024 | 1 2048 | 1 4096 | 0 8192 | 0 16384 | 1 32768 | 0 65536 | 0 131072 | 1 262144 |@@@@@ 15 524288 |@@@@@@@@@@@@@@@@@@@@@@@@@@@ 86 1048576 |@@@@@ 17 2097152 | 1 4194304 | 0 8388608 | 1 16777216 | 0
Trace of NFS RPCs and attribute/access cache interactions by system call
This script creates a trace of NFS RPCs, NFS attribute cache activity, and NFS access cache activity, along with the system call that instigated the activity. Notice that NFS events may happen outside of the context of a system call, most likely due to the VM system paging from NFS, in which case the system call name is reported as "-".
Exists in FreeBSD as /usr/src/share/dtrace/nfsattrstats.
syscall:::entry { self->syscallname = probefunc; } syscall:::return { self->syscallname = ""; } nfsclient::: /self->syscallname != 0 && self->syscallname != ""/ { trace(probemod); trace(arg0); trace(execname); trace(self->syscallname); } nfsclient::: /self->syscallname == 0 || self->syscallname == ""/ { trace(probemod); trace(arg0); trace(execname); trace("-"); }
Submitted by RobertWatson
1 531 get:hit attrcache 3281906832 ls lstat 1 527 get:hit accesscache 3281906832 ls open 1 531 get:hit attrcache 3281906832 ls open 1 527 get:hit accesscache 3281906832 ls open 1 530 flush:done attrcache 3281906832 ls open 1 532 get:miss attrcache 3281906832 ls open 1 568 getattr:start nfs3 3281906832 ls open 1 569 getattr:done nfs3 3281906832 ls open 1 533 load:done attrcache 3281906832 ls open 1 527 get:hit accesscache 3281906832 ls fchdir 1 531 get:hit attrcache 3281906832 ls stat 1 527 get:hit accesscache 3281906832 ls open 1 530 flush:done attrcache 3281906832 ls open 1 532 get:miss attrcache 3281906832 ls open 1 568 getattr:start nfs3 3281906832 ls open 1 569 getattr:done nfs3 3281906832 ls open 1 533 load:done attrcache 3281906832 ls open 1 531 get:hit attrcache 3281906832 ls fstat 1 602 fsstat:start nfs3 3281906832 ls fstatfs 1 603 fsstat:done nfs3 3281906832 ls fstatfs 1 533 load:done attrcache 3281906832 ls fstatfs 1 531 get:hit attrcache 3281906832 ls fstat 1 527 get:hit accesscache 3281906832 ls fchdir 1 531 get:hit attrcache 3281906832 ls getdirentries 1 527 get:hit accesscache 3281906832 ls lstat 1 531 get:hit attrcache 3376191584 ls lstat 1 531 get:hit attrcache 3376191584 ls lstat 1 531 get:hit attrcache 3281906832 ls getdirentries 1 527 get:hit accesscache 3281906832 ls fchdir 1 527 get:hit accesscache 3281906832 ls pathconf 1 531 get:hit attrcache 3376191584 ls pathconf
Distribution of name cache entry string lengths
This example script captures a distribution of the length of strings added to the VFS name cache.
vfs:namecache:enter:done { @distribution = quantize(strlen((string)arg1)); }
Submitted by RobertWatson
value ------------- Distribution ------------- count 0 | 0 1 | 2 2 |@@ 296 4 |@@@@@@@@@@@ 1879 8 |@@@@@@@@@@@@@@@@ 2719 16 |@@@@@@@@@@@ 1974 32 | 69 64 | 2 128 | 0
A similar script to look at the distribution of lengths in hits could be generated using the probe vfs:namecache:lookup:hit. Notice that vfs:namecache:lookup:miss includes additional pathname components, due to an implementation detail of the probe, and lends itself to this analysis less.