ZFS Tuning Guide

(Work in Progress)

It has been suggested that vfs.zfs.prefetch_disable=1 belongs in loader.conf at this time. See http://marc.info/?l=freebsd-current&m=119970004403608&w=2

To use ZFS, at least 1GB of memory is recommended (for all architectures), but more is helpful as ZFS needs *lots* of memory. Depending on your workload, it may be possible to use ZFS on systems with less memory, but this requires careful tuning to avoid panics from memory exhaustion in the kernel. An amd64 system is preferred due to its larger address space and better performance on 64-bit variables, which ZFS uses heavily. It is not known how well ZFS works with PAE in 7.0-RELEASE.

By default, kmem address space (the one used by in-kernel malloc(9)) is configured to a maximum of ~300MB, which is *way* too low for ZFS. If you leave the default, you will eventually see a kernel panic resembling the following in /var/log/messages:

Apr  7 21:09:07 nas savecore: reboot after panic: kmem_malloc(114688): kmem_map too small: 324825088 total allocated 
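To put the byte count in such a panic message into perspective, it can be converted to megabytes. The following is only a minimal sketch in POSIX sh; the function name kmem_panic_mb is made up for this illustration and is not part of FreeBSD:

```shell
# Hypothetical helper: extract the "total allocated" byte count from a
# kmem_map panic line and print it in whole megabytes (1 MB = 1048576 bytes).
kmem_panic_mb() {
    bytes=$(printf '%s\n' "$1" |
        sed -n 's/.*kmem_map too small: \([0-9][0-9]*\) total allocated.*/\1/p')
    # Fail if the line does not look like a kmem_map panic.
    [ -n "$bytes" ] || return 1
    echo $(( bytes / 1048576 ))
}

# The panic line quoted above works out to 309 MB, i.e. the ~300MB default.
kmem_panic_mb 'Apr  7 21:09:07 nas savecore: reboot after panic: kmem_malloc(114688): kmem_map too small: 324825088 total allocated'
```

On a live system the same function could be fed lines found with grep 'kmem_map too small' /var/log/messages.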

For every architecture you should increase it to at least 512MB. You can do it by adding:

vm.kmem_size="512M"

to your /boot/loader.conf file. Even so, I was able to generate the following kernel panic in less than a minute by copying files from a Linux server connected via a gigabit crossover cable:

Apr  8 06:46:08 nas savecore: reboot after panic: kmem_malloc(131072): kmem_map too small: 528273408 total allocated 

As you can see, this is *with* vm.kmem_size="512M"! Conclusion: as of 7.0-RELEASE, you will need to increase it even further to avoid panics from memory exhaustion. See also: ZFSKnownProblems.
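The second panic can be compared against the 512M limit with a little arithmetic. This is just a sketch in POSIX sh; size_to_bytes and kmem_used_pct are hypothetical helper names invented for this example:

```shell
# Hypothetical helper: convert a loader.conf-style size such as "512M"
# to bytes. Only the K, M and G suffixes are handled.
size_to_bytes() {
    num=${1%[KMG]}
    case "$1" in
        *K) echo $(( num * 1024 )) ;;
        *M) echo $(( num * 1024 * 1024 )) ;;
        *G) echo $(( num * 1024 * 1024 * 1024 )) ;;
        *)  echo "$1" ;;
    esac
}

# Hypothetical helper: whole-number percentage of the configured limit
# ($2, e.g. "512M") that had been allocated ($1, bytes) at panic time.
kmem_used_pct() {
    limit=$(size_to_bytes "$2")
    echo $(( $1 * 100 / limit ))
}

# 528273408 bytes allocated against a 512M limit prints 98.
kmem_used_pct 528273408 512M
```

In other words, the map was essentially full when the allocation failed, which is why raising the limit further is the suggested remedy.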

To achieve 24 hours' worth of uptime while being pounded with I/O operations over NFS, the system in question above (4 GB of physical RAM) required settings well beyond these, plus a kernel recompile (see i386 below).

i386

On i386 systems you will also need to recompile your kernel with an increased KVA_PAGES option, to enlarge the kernel address space before vm.kmem_size can be increased beyond 512M. Add the following line to your kernel configuration file to increase the space available for vm.kmem_size to at least 1 GB:

options KVA_PAGES=512

By default the kernel receives 1GB of the 4GB of address space available on the i386 architecture, and this is used for all of the kernel address space needs, not just the kmem map. By increasing KVA_PAGES you can allocate a larger proportion of the 4GB address space to the kernel (2 GB in the above example), allowing more room to increase vm.kmem_size. The trade-off is that user applications have less address space available, and some programs (e.g. those that rely on mapping data at a fixed address that is now in the kernel address space, or which require close to the full 3GB of address space themselves) may no longer run.
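The address space split can be worked out by hand. The sketch below assumes a non-PAE i386 kernel, where each KVA_PAGES unit maps 4 MB and the default value is 256; that matches the 1GB default and the 2 GB example above. The function names are made up for this illustration:

```shell
# Assumption: non-PAE i386, where one KVA_PAGES unit maps 4 MB of
# kernel virtual address space (PAE kernels use a different unit).
kva_mb() {
    echo $(( $1 * 4 ))
}

# Address space left for user processes out of the 4096 MB total.
user_mb() {
    echo $(( 4096 - $1 * 4 ))
}

kva_mb 256    # assumed default: 1024 MB for the kernel
user_mb 256   # 3072 MB left for applications
kva_mb 512    # options KVA_PAGES=512: 2048 MB for the kernel
user_mb 512   # 2048 MB left for applications
```

Keep in mind that vm.kmem_size has to fit inside the kernel figure together with everything else the kernel maps, so the usable value is noticeably smaller than the total.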

It is also highly recommended to strip as many unused drivers and options from the kernel as possible. Memory used by unused drivers is not freed (according to ColemanKane), effectively wasting kernel memory that would otherwise be available for a larger vm.kmem_size. Using an otherwise unmodified 7.0-RELEASE kernel with only the relatively sparse set of drivers required for my hardware and options KVA_PAGES=512, I was able to use vm.kmem_size="1536M" in loader.conf and achieve decent system stability under load.

Note: Perhaps there is a way to calculate or measure how large a vm.kmem_size setting can be used with a particular kernel, but I do not know it. Experimentation does work. :) However, if you set vm.kmem_size too high in loader.conf, the kernel will panic on boot. You can fix this by dropping to the boot loader prompt and typing set vm.kmem_size="512M" (or a similar smaller value known to work).

amd64

If you can afford it, you may also want to raise kernel memory usage (vm.kmem_size) to around 1 GB in /boot/loader.conf.

This might help if the machine is also loaded with other tasks, such as network activity (a file server), etc. Tuning KVA_PAGES is not required on amd64.

To increase performance, you may increase kern.maxvnodes (/etc/sysctl.conf) way up if you have the RAM for it (e.g. 400000 for a 2GB system). Keep an eye on vfs.numvnodes during production to see where it stabilizes. For vnodes amd64 uses direct mapping, so you don't have to worry about address space for vnodes on this architecture (as opposed to i386).
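One way to keep an eye on vnode usage is to express vfs.numvnodes as a percentage of kern.maxvnodes. This is a minimal sketch in POSIX sh; vnode_pct is a hypothetical helper name, and the numbers in the example call are illustrative rather than measured:

```shell
# Hypothetical helper: whole-number percentage of the vnode limit in use.
#   $1 = current vnode count (vfs.numvnodes)
#   $2 = configured limit    (kern.maxvnodes)
vnode_pct() {
    echo $(( $1 * 100 / $2 ))
}

# On a FreeBSD system the live values can be passed in like this:
#   vnode_pct "$(sysctl -n vfs.numvnodes)" "$(sysctl -n kern.maxvnodes)"
#
# The limit itself is raised with a line such as the following in
# /etc/sysctl.conf (value taken from the 2GB example above):
#   kern.maxvnodes=400000

# Illustrative numbers only: 100000 vnodes in use out of 400000 prints 25.
vnode_pct 100000 400000
```

If the percentage stays close to 100 under production load, the limit is probably still too low for the workload.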

ZFSTuningGuide (last edited 2008-04-25 16:33:43 by IvanVoras)