Adrian's Xen Hackery

Overview

This page is intended as a set of informal notes about my foray into FreeBSD/Xen. Just to clarify:

If you'd like to talk to someone about commercial FreeBSD/Xen improvements, the man to talk to is Kip Macy. He's done the bulk of the FreeBSD/Xen port. I'm just trying to coax it to work.

Background

I've been attempting to make FreeBSD/Xen (under FreeBSD-current) work. My test environment is a CentOS 5.3 i386 server, single CPU AMD XP 2000+ with 2GB of RAM.

My commercial hosting services use Xen and Linux virtual domains. I'd like to eventually make all of this work well enough to offer stable FreeBSD Xen virtual domains on the same platform, but as I said above, this is by and large personal interest at the moment.

I've been attempting to mostly mirror the same deployment for FreeBSD/Xen as I have with Linux - using pygrub to boot a bootloader; supplying separate LVM exported logical volumes as filesystems rather than a whole disk image (so extending the filesystems is made much easier.)

Block Device Naming

Xen hijacks the Linux block device major/minor numbering scheme and stuffs it into the block devices available for the DomUs. So if you use "sda1" for your Xen jail, the Linux Xen DomU block device driver would hijack the major number from the SCSI device driver and use it.

FreeBSD/Xen used to simply create an "xbd" device with the raw device number after it. This changed recently (thanks to dfr?) to somewhat emulate what Linux was doing. From what I can tell, using hdX and partitions (hdXa, hdXb, etc.) will result in the relevant ATA devices being created in the kernel (ad0s1, ad1s1, etc.) Similarly for SCSI - sdX becomes daX in FreeBSD.

To get "xbd" (which I was using to make sure I had separately named devices that absolutely didn't look like they should deserve a normal DOS label) I need to use the major "202" (0xCA) with unit numbers being minor >> 4. So 0xCA00 is xbd0, 0xCA10 is xbd1, 0xCA20 is xbd2, etc. This gives me xbd0 -> xbd15 before I run out of "xbd" slices. (Actually that isn't true but I won't assume I'll get any more. Those who are interested should read sys/dev/xen/blkfront/blkfront.c:blkfront_vdevice_to_unit() to see.)
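
The major/minor arithmetic above can be sketched in a few lines of Python. This is only an illustration of the 0xCAnn case described here, not a port of blkfront_vdevice_to_unit(); the function names are made up for this example, and the assumption that the major number sits in the bits above the low byte is read off the 0xCA00/0xCA10/0xCA20 examples.

```python
XBD_MAJOR = 202  # 0xCA, the major number used for "xbd" in these notes


def vdevice_to_xbd_unit(vdevice):
    """Return the xbd unit for a 0xCAnn-style device id (hypothetical helper).

    Assumes the major number is everything above the low byte and the
    unit is minor >> 4, as in the examples above.
    """
    major = vdevice >> 8
    minor = vdevice & 0xFF
    if major != XBD_MAJOR:
        raise ValueError("not an xbd-style device id: %#x" % vdevice)
    return minor >> 4


def xbd_unit_to_vdevice(unit):
    """Inverse mapping, limited to the xbd0..xbd15 range mentioned above."""
    if not 0 <= unit <= 15:
        raise ValueError("unit out of range: %d" % unit)
    return (XBD_MAJOR << 8) | (unit << 4)


for vdev in (0xCA00, 0xCA10, 0xCA20):
    print("%#06x -> xbd%d" % (vdev, vdevice_to_xbd_unit(vdev)))
```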

How it all boots

Xen has two main boot methods - either loading a kernel image from the Dom0 filesystem or via "pygrub". Pygrub will fondle the disk image to find a /boot/grub/menu.lst file, load the kernel named there into the Dom0 filesystem, and execute it. It also grabs some parameters from the kernel configuration line and appends them as arguments.

I first tried the former. This is an example configuration file:

memory = 256
name = "freebsd"
vif = [ 'mac=00:bd:c4:12:00:ef,bridge=xenbr0' ]
disk = [ 'phy:/dev/hosting2_data2/XEN_freebsd,hda,w' ]
on_crash  = 'preserve'
extra = "boot_verbose=1"
extra += ",vfs.root.mountfrom=ufs:/dev/ad0s1a"
extra += ",kern.hz=100"

A Linux boot will take "ramdisk" and "root" parameters to define the ramdisk and root filesystem/options. FreeBSD/Xen doesn't seem to understand these; note how normal kernel environment hints are passed via "extra" instead.
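
The "extra" value is just a comma-separated list of name=value hints, so it can also be assembled from a list rather than by repeated string concatenation. A minimal sketch, assuming nothing beyond the format shown in the config above; kenv_to_extra is a hypothetical helper name, not something Xen or FreeBSD provides.

```python
def kenv_to_extra(hints):
    """Join (name, value) pairs into the comma-separated "extra" format.

    Hypothetical helper: Xen only ever sees the resulting string.
    """
    return ",".join("%s=%s" % (name, value) for name, value in hints)


extra = kenv_to_extra([
    ("boot_verbose", 1),
    ("vfs.root.mountfrom", "ufs:/dev/ad0s1a"),
    ("kern.hz", 100),
])
print(extra)
```

The result is the same string that the three "extra" lines in the example config build up.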

Also note that the virtual disk is just that - a completely virtual DOS disk, complete with DOS slices and a FreeBSD disklabel on ad0s1. Building this was relatively easy:

# truncate -s 10G disk.img
# mdconfig -f disk.img
(this outputs the attached unit; assume it's md0 here)
# fdisk -i md0
(follow the prompts to create one FreeBSD slice covering the entire disk)
# disklabel -w md0s1
(this writes a standard label with a single partition "a" covering the whole FreeBSD slice.)

Making the Xen Filesystem

I just used a normal world/kernel build and install plus a distribution install to set up the basic filesystem. There are a couple of changes which are needed once you've done this.

# cd /path/to/world && make buildworld && make buildkernel KERNCONF=XEN

Assuming that you're using the above "md" file method to populate a filesystem:

# mount /dev/md0s1a /mnt
# cd /path/to/world && make DESTDIR=/mnt installworld && make DESTDIR=/mnt installkernel KERNCONF=XEN && make DESTDIR=/mnt distribution

Then, you need to edit a couple of files - /mnt/etc/fstab and /mnt/etc/ttys. Add a normal root drive entry to /mnt/etc/fstab and then add the following to /mnt/etc/ttys:

xc0     "/usr/libexec/getty Pc"         vt100   on  secure

Then, unmount, un-md, and copy over:

# umount /mnt
# mdconfig -d -u <unit number>

Using separate slices

It is possible to use separate slices for each filesystem (and swap.) I'm sure you can do it with mdconfig - just don't put a DOS/FreeBSD label on it, and newfs/mount the raw md device directly (e.g. /dev/md0.) I haven't tried this though; I ended up grabbing "sysutils/makefs" to create FFS filesystems for me.

# makefs -M512m root.fs /path/to/install/root

My configuration file for this has a slightly different disk device line:

disk = [ 'file:/home/adrian/xen/root.fs,0xCA00,w' ]

Note the entry uses a device id of 0xCA00 rather than the strangely over-loaded Linux-y device names (hda, etc.) This becomes "/dev/xbd0" inside the Xen domain, so your extra line would then become:

extra += ",vfs.root.mountfrom=ufs:xbd0"
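
With one image per filesystem, each disk entry needs its own 0xCAn0 device id. The following sketch formats such entries using the major-202 numbering from the Block Device Naming section. xbd_disk_entry is a hypothetical helper, and swap.fs is a made-up second image used only for illustration; I have only actually booted with the single root.fs entry shown above.

```python
def xbd_disk_entry(image_path, unit, mode="w"):
    """Format one 'file:' disk entry that should appear as xbd<unit>.

    Hypothetical helper; the id is 0xCA00 with the unit in minor >> 4.
    """
    if not 0 <= unit <= 15:
        raise ValueError("unit out of the xbd0..xbd15 range: %d" % unit)
    vdevice = 0xCA00 | (unit << 4)
    return "file:%s,0x%04X,%s" % (image_path, vdevice, mode)


# root.fs is the image from these notes; swap.fs is an assumed example.
disk = [
    xbd_disk_entry("/home/adrian/xen/root.fs", 0),
    xbd_disk_entry("/home/adrian/xen/swap.fs", 1),
]
for entry in disk:
    print(entry)
```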

Using pygrub

Xen installs these days use "pygrub", a GRUB style bootloader mostly written in Python. Xen's "bootloader" support allows an external program to determine the relevant Xen configuration sections.

I've had no luck trying to make pygrub work with a DOS disk image and FreeBSD disklabel - it needs to be taught to read FreeBSD slices (it has support for Solaris slices; so someone with some care could probably copy that and make it work.)

Instead, it -does- have support for raw UFS partitions and the UFS code seems to handle a UFS1 partition from "makefs" just fine. I've not yet tried UFS2.

"pygrub" does the following:

To boot a FreeBSD install on /dev/xbd0, I have the following Xen config file:

bootloader = "/usr/bin/pygrub"
memory = 256
name = "freebsd"
vif = [ 'mac=00:bd:c4:12:00:ef,bridge=xenbr0' ]
disk = [ 'file:/home/adrian/xen/root.fs,0xCA00,w' ]
on_crash  = 'preserve'

Inside the FreeBSD install I have one file, /boot/grub/menu.lst, with the following:

title FreeBSD
root (hd0,0)
kernel /boot/kernel/kernel vfs.root.mountfrom=ufs:xbd0,kern.hz=100,boot_verbose=1

Finally, to test, you can run pygrub from the command line:

# pygrub /path/to/root.fs

It will then run interactively, pop up a menu, and then spit out a configuration snippet. The above menu.lst generates this config snippet:

linux (kernel /var/lib/xen/boot_kernel.XU_kel)(args "vfs.root.mountfrom=ufs:xbd0,kern.hz=100,boot_verbose=1")

Finally, the running kernel environment, printed via kenv:

# kenv
vfs.root.mountfrom="ufs:xbd0"
kern.hz="100"
boot_verbose="1"
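
To make the relationship between the pygrub snippet and the kenv output explicit, here is a rough Python sketch that pulls the kernel path and argument string out of the snippet and splits the arguments into name/value pairs. It is not how pygrub or the FreeBSD kernel parse anything; the regular expressions are only fitted to the one snippet shown above, and both function names are invented for this example.

```python
import re

snippet = ('linux (kernel /var/lib/xen/boot_kernel.XU_kel)'
           '(args "vfs.root.mountfrom=ufs:xbd0,kern.hz=100,boot_verbose=1")')


def parse_pygrub_snippet(text):
    """Extract (kernel_path, args_string) from a snippet like the one above."""
    kernel = re.search(r'\(kernel ([^)]+)\)', text)
    args = re.search(r'\(args "([^"]*)"\)', text)
    if kernel is None or args is None:
        raise ValueError("unrecognised pygrub output")
    return kernel.group(1), args.group(1)


def args_to_kenv(args):
    """Split comma-separated name=value hints into a dict of strings."""
    kenv = {}
    for item in args.split(","):
        # Split on the first "=" only, so values may contain ":" or "=".
        name, _, value = item.partition("=")
        kenv[name] = value
    return kenv


kernel_path, args = parse_pygrub_snippet(snippet)
print(kernel_path)
for name, value in args_to_kenv(args).items():
    print('%s="%s"' % (name, value))
```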

Now. A few things to note:

Timekeeping

TODO - this needs looking at..

Console

The only support at the moment is the serial-like "xencons". This doesn't hijack the normal cons25 device (ttyv0), so you must add an entry to /etc/ttys or no getty will be started:

xc0     "/usr/libexec/getty Pc"         vt100   on  secure

Remote GDB

The kernel remote GDB won't function. Kip has some basic instructions somewhere on bootstrapping Xen from source and then using the "xen-gdbserver" stuff to provide remote GDB for a domain. I haven't yet tried this so I can't document how to set it up or use it. Kip does say it works though.

Growing partitions online

TODO.. (haven't tried yet.)

Adding/Changing/Removing network devices online

TODO.. (haven't tried yet.)

Adding/changing/removing block devices online

TODO.. (haven't tried yet.)

I've grown the block device (lvresize), rebooted into single user mode and used growfs. Ugly but it works.

Memory Balloon Driver

The Xen code -has- a balloon driver to grow/shrink the DomU memory but I haven't yet tried it to ensure it works.

There's a bunch of stuff in netfront which I should play with. Specifically tuning mbufs, the way it implements TCP segment offloading/checksum stuff (which I -believe- requires the Dom0 hardware to export it somehow to the DomU..) and other bits and pieces.

Rebooting issues

I think "reboot" doesn't force the Xen control to re-run pygrub and suck in a new kernel; I need to investigate what is going on there. I've noticed that my DomU crashes if I replace the kernel image and then restart. Normal restarting (without replacing the kernel) works fine.

AdrianChadd/XenHackery (last edited 2009-05-18T19:34:18+0000 by AdrianChadd)