FreeBSD Developer Summit: Hardening UFS Group

Thursday June 8th 15:30 to 16:30 in DMS 1130

Overview

The goal of this session is to come up with strategies for hardening UFS/FFS.

If you would like to participate, contact the working group chairs below and CC <devsummit AT FreeBSD DOT org>. You will then be added to this page. Please include a list of things you want to talk about or the areas you are interested in. This helps us plan the session and bring together people with common interests.

It is possible to bring in people who cannot attend in person via video conference or chat tools. Notes taken during the session will be published afterwards so that the whole community can see what we discussed.

Goals

Some possible ideas:

In the event of unrecoverable write errors, downgrade the filesystem to read-only rather than panicking.

In the event of loss of the underlying media, forcibly unmount the filesystem rather than panicking.

Review all the panics in the filesystem and replace them with one of the two approaches above.

This is not an exhaustive list. If you feel there is something missing that you want to talk about, contact one of the session chairs and we will include your topic here. Note that the numbering of the topics does not indicate ordering or importance; it serves only as a reference for the "Topics of Interest" column in the second table.
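The first two ideas above amount to a small policy that maps a class of failure to a response other than a panic. A minimal sketch of that mapping follows, written in Python only for brevity. The names Failure, Response, and respond_to are hypothetical and do not correspond to any FreeBSD kernel interface.

```python
from enum import Enum, auto


class Failure(Enum):
    """Failure classes named in the goals (hypothetical names)."""
    UNRECOVERABLE_WRITE_ERROR = auto()  # media still present, writes fail
    MEDIA_LOST = auto()                 # underlying device went away


class Response(Enum):
    """Responses proposed as alternatives to panicking."""
    DOWNGRADE_TO_READ_ONLY = auto()
    FORCED_UNMOUNT = auto()


# Policy table: each failure class maps to one non-panic response.
_POLICY = {
    Failure.UNRECOVERABLE_WRITE_ERROR: Response.DOWNGRADE_TO_READ_ONLY,
    Failure.MEDIA_LOST: Response.FORCED_UNMOUNT,
}


def respond_to(failure: Failure) -> Response:
    """Return the proposed response for a failure class."""
    return _POLICY[failure]
```

Under this framing, the third idea (reviewing the existing panics) becomes a matter of deciding which of the two failure classes each panic site belongs to.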

Topics

#   Topic Description

Note: General presentations about work you have done that does not require further discussion should be submitted for the FreeBSD Developer Summit track at BSDCan (see the general developer summit page).

Attending

In order to attend, you need to register for the developer summit, register by email for this session, and be confirmed by the working group organizer. Follow the guidelines described on the main page or in the email you received. If you have questions or are in doubt, ask the session chairs.

Please do NOT add yourself here. Your name will appear automatically once you have received the confirmation email. You do, however, need to put your name on the general developer summit attendees list.

#   Name              Username / Affiliation              Topics of Interest  Notes
1   Kirk McKusick     <mckusick AT mckusick DOT com>      -                   Organizer
2   Bryan Drewery     <bdrewery AT FreeBSD DOT org>       -                   -
3   Warner Losh       <imp AT bsdimp DOT com>             -                   -
4   Scott Long        <scottl AT netflix DOT com>         -                   -
5   Reid Linnemann    <linnemannr AT gmail DOT com>       -                   guest of ken@
6   Eric van Gyzen    <vangyzen AT FreeBSD DOT org>       -                   -
7   Terence Telkamp   <Terence_Telkamp AT Dell DOT com>   -                   guest of vangyzen@
8   Steve Wills       <swills AT FreeBSD DOT org>         -                   -
9   Jason Eggleston   <eggnet AT gmail DOT com>           -                   guest of swills@
10  Justin Hibbits    <jhibbits AT FreeBSD DOT org>       -                   -
11  Mike Karels       <karels AT FreeBSD DOT org>         -                   -
12  Kevin Bowling     <kbowling AT llnw DOT com>          -                   guest of swills@

Results

Prior to the meeting, several people who were not able to attend submitted the following comments for consideration by the group:

Conrad Meyer <cem AT FreeBSD DOT org> comments:

I suspect we should consider, if we haven't already, actively hostile filesystems. What happens when a user picks a USB stick off the ground and plugs it in? There are similar attack vectors in cloud/jail space — what happens if root, in a jail, creates an md device with malicious contents and mounts it? In any event, we don't want mounting such filesystems to lead to code execution in kernel context.

Konstantin Belousov <kib AT freebsd DOT org> comments:

My immediate thought, after reviewing the list above, is that there are fairly common issues that lead to crashes and that cannot be handled in the way you proposed. I saw them myself several times at my previous $JOB, with hardware RAID controllers (LSI brand).

The problem is that the failure mode, unfortunately, is not a binary choice between "the drive is working" and "the drive has failed". Several times I saw a controller start returning the wrong block (i.e., a cg block was requested but the buffer was filled with user data), and several times I saw a controller return a zeroed block. The latter occurs more often with USB-attached disks under load.

I never saw a situation where a controller returned complete garbage instead of _some_ real block content or zeroes, but of course this cannot be excluded either.

The result of any such malfunction is a kernel crash, because UFS trusts the content of the metadata blocks. I am not sure that it is possible to code around these problems to make the kernel survive, and I believe even limited sanity checks on each metadata element obtained from the disk would cause a significant slowdown.

Kirk responds: As at least a partial solution, we could calculate checksums for superblocks, cylinder groups, and inodes (stored in those structures) which we could check each time they are read and updated each time they are written. For block pointers, we could check that they are in range for the filesystem.
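A checksum stored inside the structure it protects is usually computed with the checksum field itself treated as zero. The sketch below illustrates that scheme and the block pointer range check in Python. It makes several assumptions that are not in the discussion above: CRC32 as the checksum, a 4-byte little-endian checksum field at offset 0, and the hypothetical helper names seal, verify, and block_pointer_in_range.

```python
import struct
import zlib

CKSUM_SIZE = 4  # assumed: 4-byte checksum field at the start of the structure


def compute_cksum(buf: bytes) -> int:
    """CRC32 over the structure with its checksum field treated as zero."""
    zeroed = bytes(CKSUM_SIZE) + buf[CKSUM_SIZE:]
    return zlib.crc32(zeroed) & 0xFFFFFFFF


def seal(buf: bytes) -> bytes:
    """Update the stored checksum; done each time the structure is written."""
    return struct.pack("<I", compute_cksum(buf)) + buf[CKSUM_SIZE:]


def verify(buf: bytes) -> bool:
    """Compare stored and computed checksums; done each time it is read."""
    (stored,) = struct.unpack_from("<I", buf, 0)
    return stored == compute_cksum(buf)


def block_pointer_in_range(blkno: int, fs_size_blocks: int) -> bool:
    """Reject pointers outside the filesystem. A real check would also
    need to handle any special pointer values separately."""
    return 0 <= blkno < fs_size_blocks
```

A caller would run seal() on the write path and verify() on the read path, and refuse to use a structure that fails verification rather than trusting its contents.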

Kostik responds: Indeed, this is only a partial solution. E.g., substitution of one cg block for another would go unnoticed, with catastrophic consequences. The same applies to inode blocks.

It might be more feasible to run UFS over some cryptographically strong authentication layer, e.g., geli, to verify that blocks are pristine.
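To illustrate why an authentication layer catches the substitution case, the sketch below computes a keyed MAC over both the block's address and its contents, so a valid block returned at the wrong address fails verification. This is not geli's on-disk format or algorithm. The choice of HMAC-SHA256 and the names block_tag and block_is_pristine are assumptions made for this example.

```python
import hashlib
import hmac


def block_tag(key: bytes, blkno: int, data: bytes) -> bytes:
    """MAC over the block number and the block contents."""
    msg = blkno.to_bytes(8, "little") + data
    return hmac.new(key, msg, hashlib.sha256).digest()


def block_is_pristine(key: bytes, blkno: int, data: bytes, tag: bytes) -> bool:
    """True only if data is what was written at blkno under this key."""
    return hmac.compare_digest(block_tag(key, blkno, data), tag)
```

Because the block number is part of the MAC input, the tag stored for block 5 does not validate the same data when it is returned in response to a read of block 6.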

Kirk responds: Cylinder group blocks contain their number (cg_x) which could also be checked along with its checksum to ensure we got the one that we expected. As you note, inode blocks have the same substitution problem (and with higher probability than cylinder group blocks). We could solve that in the same way if we added the inode number to each inode. To save space, we could encode the inode number in the checksum (e.g., calculate the checksum, then XOR in the inode number when writing and do the reverse when doing read verification).
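The XOR encoding described above can be sketched as follows. The example assumes CRC32 as the checksum and an inode number that fits in 32 bits. The function names are hypothetical, and the cylinder group number is passed in as a plain integer rather than read from a real on-disk structure.

```python
import zlib

MASK32 = 0xFFFFFFFF


def stored_inode_cksum(inode_bytes: bytes, ino: int) -> int:
    """Value written to disk: the checksum with the inode number XORed in."""
    return (zlib.crc32(inode_bytes) ^ ino) & MASK32


def inode_matches(inode_bytes: bytes, stored: int, expected_ino: int) -> bool:
    """Read verification: XOR the expected inode number back out and
    compare with a freshly computed checksum."""
    return ((stored ^ expected_ino) & MASK32) == zlib.crc32(inode_bytes)


def cg_matches(stored_cg_number: int, expected_cg_number: int,
               cksum_ok: bool) -> bool:
    """A cylinder group is accepted only if its checksum is valid and it
    carries the number of the group that was requested."""
    return cksum_ok and stored_cg_number == expected_cg_number
```

With this encoding, an intact inode returned in place of a different inode fails verification because the expected inode number no longer cancels out, and no extra space is needed to store the inode number.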

Edward Tomasz Napierala <trasz AT freebsd DOT org> comments:

I'm not sure I'll be able to attend, but the patch below might be relevant. It does three things: it adds a method for GEOM to notify the filesystem that the device went away; it adds a filesystem-independent vfs_orphan() function to prevent all future access to the vnodes (without unmounting the filesystem, since unmounting would let future writes succeed but silently go to the wrong filesystem); and it makes UFS and msdosfs use it.
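The behaviour described here, where the mount stays in place but every later access fails, can be modelled with a toy example. The sketch below is not taken from the patch. The Mount and Vnode classes are hypothetical, and the use of ENXIO as the returned error is an assumption.

```python
import errno


class Vnode:
    """Toy stand-in for a vnode backed by a device."""

    def __init__(self, name: str):
        self.name = name
        self.dead = False

    def write(self, data: bytes) -> int:
        if self.dead:
            # Assumed error code; the access fails instead of landing
            # on whatever filesystem now occupies the mount point.
            raise OSError(errno.ENXIO, "device went away")
        return len(data)


class Mount:
    """Toy stand-in for a mounted filesystem."""

    def __init__(self):
        self.vnodes = []
        self.orphaned = False

    def open(self, name: str) -> Vnode:
        if self.orphaned:
            raise OSError(errno.ENXIO, "device went away")
        vnode = Vnode(name)
        self.vnodes.append(vnode)
        return vnode

    def orphan(self) -> None:
        """Called when the device disappears: keep the mount in place,
        but make all current and future vnode access fail."""
        self.orphaned = True
        for vnode in self.vnodes:
            vnode.dead = True
```

The point of keeping the mount is visible in the model: a writer holding a path under the mount point gets an error, rather than having its writes fall through to the underlying directory.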

https://people.freebsd.org/~trasz/old/callback-6.diff

Notes from session

DevSummit/201706/HardeningUFS (last edited 2021-04-26T02:57:55+0000 by JethroNederhof)