Multipage allocations
- Main consumer: ZFS ARC
- Handled naively: requests go directly to the VM layer
- First-fit placement (nothing prevents fragmentation)
- Pages are mapped and unmapped all the time
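To illustrate why first-fit without any fragmentation control hurts multipage requests, here is a toy model in Python. It is only a sketch: the 64-page arena, the bitmap representation and the function names are made up for illustration and do not mirror the kernel's VM map code.

```python
def first_fit(free_map, npages):
    """Return the index of the first run of `npages` free pages, or None."""
    run_start, run_len = None, 0
    for i, is_free in enumerate(free_map):
        if is_free:
            if run_len == 0:
                run_start = i
            run_len += 1
            if run_len == npages:
                return run_start
        else:
            run_len = 0
    return None


def allocate(free_map, npages):
    """Mark the first fitting run as used and return its start, or None."""
    start = first_fit(free_map, npages)
    if start is not None:
        for i in range(start, start + npages):
            free_map[i] = False
    return start


def release(free_map, start, npages):
    """Mark a previously allocated run as free again."""
    for i in range(start, start + npages):
        free_map[i] = True


# Hypothetical 64-page address range, entirely free at the start.
arena = [True] * 64

# Interleave small (2-page) and large (6-page) allocations until full.
small, large = [], []
for _ in range(8):
    small.append(allocate(arena, 2))
    large.append(allocate(arena, 6))

# Free only the small allocations: the free space is now 2-page holes.
for start in small:
    release(arena, start, 2)

free_pages = sum(arena)          # total number of free pages
result = allocate(arena, 4)      # a 4-page request needs a contiguous run
print(free_pages, result)
```

In this model a quarter of the arena is free, yet the 4-page request fails because no hole is larger than 2 pages.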
Benchmark used
iozone -r 32 -s 8192m -l 16 -u 16 -i 0
Information about the benchmark
Record Size 32 KB
File size set to 33554432 KB
Command line used: iozone -r 32 -s 32768m -l 16 -u 16 -i 0
Output is in Kbytes/sec
Time Resolution = 0.000001 seconds.
Processor cache size set to 1024 Kbytes.
Processor cache line size set to 32 bytes.
File stride size set to 17 * record size.
Min process = 16
Max process = 16
Throughput test with 16 processes
Each process writes a 33554432 Kbyte file in 32 Kbyte records
Information about the system:
CPU: Intel(R) Xeon(R) CPU X5670 @ 2.93GHz (2933.40-MHz K8-class CPU)
FreeBSD/SMP: Multiprocessor System Detected: 12 CPUs
hw.physmem: 25739268096
vfs.zfs.arc_min: 2984637440
vfs.zfs.arc_max: 23877099520
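As a rough sanity check on the scale of the workload, the figures from the iozone output and the sysctl values above can be combined as follows. This is plain arithmetic on the numbers already listed; the variable names are only for illustration.

```python
KIB = 1024
GIB = 1024 ** 3

processes = 16
file_size_kb = 33554432        # per-process file size from the iozone output
record_kb = 32                 # record size from the iozone output
arc_max = 23877099520          # vfs.zfs.arc_max, in bytes
physmem = 25739268096          # hw.physmem, in bytes

total_bytes = processes * file_size_kb * KIB
records_per_file = file_size_kb // record_kb

total_gib = total_bytes / GIB
arc_ratio = total_bytes / arc_max

print(f"total written: {total_gib:.0f} GiB")
print(f"records per file: {records_per_file}")
print(f"physmem: {physmem / GIB:.2f} GiB, arc_max: {arc_max / GIB:.2f} GiB")
print(f"data written / arc_max: {arc_ratio:.1f}x")
```

The total amount written is many times larger than the ARC limit, so the ARC has to keep allocating and freeing buffers for the whole run.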
Allocation distribution
The following script captures the 'size' argument passed to uma_large_malloc() and builds a distribution of request sizes, counted in 4 KB pages.
DTrace script
fbt:kernel:uma_large_malloc:entry
{
        /* arg0 is the requested size in bytes; bucket it by 4 KB pages */
        @counts[arg0 / 4096] = count();
}
Results
- sorted with 'sort -nrk 2'
- First column: number of pages requested
- Second column: total number of requests
32  5115028
 4   155907
 2    10386
 3     4282
 5     1455
 8     1247
 6     1033
 7      721
12      620
 9      481
10      464
11      394
16      269
13      263
14      228
15      189
20      143
17      143
18      114
26      104
19       83
21       81
22       52
24       48
23       32
27       25
28       23
31       22
30       16
29       15
25       11
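The distribution is easier to read as shares of the total. The short script below is a sketch that only re-processes the counts listed above; it computes how many requests ask for exactly 32 pages and how many ask for a power-of-two number of pages.

```python
# (pages requested: number of requests), copied from the results above.
counts = {
    32: 5115028, 4: 155907, 2: 10386, 3: 4282, 5: 1455, 8: 1247,
    6: 1033, 7: 721, 12: 620, 9: 481, 10: 464, 11: 394, 16: 269,
    13: 263, 14: 228, 15: 189, 20: 143, 17: 143, 18: 114, 26: 104,
    19: 83, 21: 81, 22: 52, 24: 48, 23: 32, 27: 25, 28: 23, 31: 22,
    30: 16, 29: 15, 25: 11,
}


def is_power_of_two(n):
    """True for 1, 2, 4, 8, ... (a single bit set)."""
    return n > 0 and n & (n - 1) == 0


total = sum(counts.values())
pow2_requests = sum(c for pages, c in counts.items() if is_power_of_two(pages))

share_32 = counts[32] / total
share_pow2 = pow2_requests / total

print(f"total requests:       {total}")
print(f"32-page requests:     {share_32:.2%}")
print(f"power-of-two (pages): {share_pow2:.2%}")
```

Nearly all requests are for 32 pages, and almost all of the rest are for 2, 4, 8 or 16 pages.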
Profiling with hwpmc(4)
- pmcstat -S instructions
- ZFS checksum disabled
%SAMP IMAGE      FUNCTION             CALLERS
 57.6 kernel     __mtx_lock_sleep     _vm_map_lock
 14.4 kernel     pmap_enter           kmem_back
  2.3 kernel     cpu_search_highest   cpu_search_highest
  1.4 kernel     _sx_xlock
  1.2 kernel     _sx_xunlock
  0.8 libc.so.7  bsearch
  0.7 kernel     vm_page_splay
  0.6 kernel     _mtx_lock_spin_cooki pmclog_reserve
  0.6 zfs.ko     lzjb_compress        zio_compress_data
Cost of allocations
Average cost per allocation and per free, in CPU cycles, measured by wrapping the allocation and free code between two calls to rdtsc():
Head
debug.calls_free: 4394689
debug.cycles_free: 2397454367445
debug.calls_alloc: 4396755
debug.cycles_alloc: 2853281095838
Avg per alloc: 648951
Avg per free: 545534
UMA
Make allocations go through UMA rather than the VM layer by creating zones for 2, 4, 8, 16 and 32 pages (in sys/kern/kern_malloc.c).
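How a request could be matched to one of these zones can be sketched as below. This is a Python model, not the code in kern_malloc.c: rounding a request up to the next zone size, the pick_zone name and the fallback for requests above 32 pages are all assumptions made for illustration. The sketch also assumes the request is larger than one page.

```python
PAGE_SIZE = 4096
ZONE_PAGES = (2, 4, 8, 16, 32)   # zone sizes listed above, in pages


def pick_zone(size_bytes):
    """Return the zone size (in pages) serving a request, or None.

    Hypothetical helper: rounds the request up to whole pages and picks
    the smallest zone that can hold it.
    """
    pages = -(-size_bytes // PAGE_SIZE)   # ceiling division
    for zone in ZONE_PAGES:
        if pages <= zone:
            return zone
    return None   # assumption: bigger requests stay on the VM path


for pages in (2, 3, 12, 32, 33):
    print(pages, "pages ->", pick_zone(pages * PAGE_SIZE))
```

With the measured distribution, almost every request already matches a zone size exactly, so rounding up would waste little memory.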
debug.calls_free: 5010701
debug.cycles_free: 2976686362
debug.calls_alloc: 5012661
debug.cycles_alloc: 18496362281
Avg per alloc: 3689
Avg per free: 594
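The averages follow directly from the counters (cycles divided by calls), and the two runs can be compared with a few lines of Python. This is a sketch over the numbers above; converting cycles to microseconds assumes the TSC ticks at the 2933.40 MHz reported for the CPU, which these notes do not confirm.

```python
TSC_HZ = 2933.40e6   # assumption: TSC rate equals the reported CPU frequency

head = {"calls_alloc": 4396755, "cycles_alloc": 2853281095838,
        "calls_free": 4394689, "cycles_free": 2397454367445}
uma = {"calls_alloc": 5012661, "cycles_alloc": 18496362281,
       "calls_free": 5010701, "cycles_free": 2976686362}


def averages(counters):
    """Return (avg cycles per alloc, avg cycles per free), truncated."""
    return (counters["cycles_alloc"] // counters["calls_alloc"],
            counters["cycles_free"] // counters["calls_free"])


def to_usec(cycles):
    """Convert a cycle count to microseconds under the TSC_HZ assumption."""
    return cycles / TSC_HZ * 1e6


head_alloc, head_free = averages(head)
uma_alloc, uma_free = averages(uma)

alloc_speedup = head_alloc / uma_alloc
free_speedup = head_free / uma_free

print(f"Head: alloc {head_alloc} cycles ({to_usec(head_alloc):.1f} us), "
      f"free {head_free} cycles ({to_usec(head_free):.1f} us)")
print(f"UMA:  alloc {uma_alloc} cycles ({to_usec(uma_alloc):.2f} us), "
      f"free {uma_free} cycles ({to_usec(uma_free):.2f} us)")
print(f"speedup: alloc {alloc_speedup:.0f}x, free {free_speedup:.0f}x")
```

On these counters the UMA path is roughly 176 times cheaper per allocation and roughly 918 times cheaper per free than Head.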