Ext2/3/4 Filesystem

The Extended Filesystem was designed, drawing heavily on UFS, to replace the minix filesystem used in early versions of Linux. The filesystem is known to be very fast and continues to be very popular despite the availability of other interesting filesystems (JFS, reiserfs, XFS, etc.).

In general Ext2fs is very similar to UFS but lacks support for fragments. As a consequence, using bigger block sizes wastes more space and also makes fragmentation a problem. The Linux developers have also opted to support the faster async mode and to compensate for the resulting risks with more robust filesystem checking tools (fsck). The ext2 developers have tried to counter the weaknesses of the filesystem by adding new features in ext3 (journalling) and ext4 (extents), but the basic design seems to be exhausted and its designers are apparently considering more recent designs like btrfs. For FreeBSD, supporting ext2fs and its variants is still important for interoperability and as a general experimentation tool.
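
The space cost of missing fragments can be illustrated with a little arithmetic. The following Python sketch is a simplified model, not code from either implementation: the helper names, the 16 KiB block size and the 2 KiB fragment size are assumptions chosen for illustration, and the model ignores metadata and other allocation rules. It only compares how much space the tail of a file occupies when it must fill a whole block versus when it can be stored in fragments.

```python
def allocated_bytes(file_size, block_size, frag_size=None):
    """Bytes of disk space used by file data under a simplified model.

    With frag_size=None the tail of the file occupies a whole block
    (ext2-like). With a frag_size the tail is rounded up to a multiple
    of the fragment size instead (UFS-like).
    """
    if file_size == 0:
        return 0
    full_blocks, tail = divmod(file_size, block_size)
    if tail == 0:
        return full_blocks * block_size
    if frag_size is None:
        return (full_blocks + 1) * block_size
    frags = -(-tail // frag_size)  # ceiling division
    return full_blocks * block_size + frags * frag_size


def wasted_bytes(file_size, block_size, frag_size=None):
    """Slack space: allocated bytes that hold no file data."""
    return allocated_bytes(file_size, block_size, frag_size) - file_size


if __name__ == "__main__":
    size = 1000  # a small file, in bytes (illustrative value)
    for label, block, frag in [
        ("4 KiB blocks, no fragments", 4096, None),
        ("16 KiB blocks, no fragments", 16384, None),
        ("16 KiB blocks, 2 KiB fragments", 16384, 2048),
    ]:
        print(label, wasted_bytes(size, block, frag))
```

In this model a 1000-byte file wastes 15384 bytes with 16 KiB blocks and no fragments, but only 1048 bytes when 2 KiB fragments are available, which is why small files make large block sizes expensive on ext2.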

The initial FreeBSD ext2fs implementation was based on the BSD Lites version by Godmar Back. The initial approach was to reuse the BSD interfaces, exploit the similarities between UFS and ext2, and merge the specific block management routines from Linux. NetBSD did a complete reimplementation from UFS sources.

While on NetBSD the ext2fs implementation shares code with UFS, in FreeBSD the code is independent and has its own sys/fs/ext2fs area in the kernel. OpenGrok is a good way to study the code.

Main developments

For a long time the NetBSD and FreeBSD implementations were maintained independently and, while they basically worked, development was stuck. The main differences between the two were the cleaner license of the NetBSD implementation and the relatively good performance of the FreeBSD implementation. In both cases, running the BSD implementations in async mode (the default in Linux) is considerably slower than the Linux implementation.

In 2009 there was a Google Summer of Code (SoC) project:

Improving Second Extended File system (ext2fs) and making it GPL free

This merged the block allocation code from NetBSD and started a process of merging bugfixes and enhancements from FreeBSD's UFS1. Notably, the filesystem was made MPsafe and a feature called the "Orlov allocator" (also known as the dirpref changes in UFS circles) was brought in.

Ext2fs development on FreeBSD is now much easier thanks to the similarities with the traditional UFS, and merging features and fixes from UFS is an ongoing process.

In 2010 a second Google SoC project took place:

Enhance ext2fs to support preallocation and read ext4 file systems

Preallocation was implemented, but after extensive testing it was determined that block reallocation was a better alternative, and this was implemented based on UFS code. Read-only support for (extents-based) ext4fs was also developed but was not brought into the tree until 2013. In parallel with this project, the merging of fixes from UFS1 was finished. This brought basic O_DIRECT support for async mode mounting and several adjustments.

In 2012 NetBSD had a separate Google Summer of Code project:

HTree directory indexing for Ext3

ZhengLiu did a port of Vyacheslav Matyushin's HTree implementation. The HTree code in Linux requires a lot of workarounds for the possibility of a hash collision, which were not considered for the FreeBSD port. The code basically worked but was unstable, and it was finally fixed by DamjanJovanovic in SVN r294504.

Recent additions include support for nanosecond/birthtime timestamps (2012), the benefit of a generic SEEK_DATA/SEEK_HOLE lseek() extension (2013) and the support for sparse files from ext4 (2016).
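
As a sketch of what the SEEK_DATA/SEEK_HOLE extension offers to applications, the following Python example walks a file and lists the byte ranges the filesystem reports as data. It is an illustration only, not part of the FreeBSD driver: the helper names data_regions and make_sparse_file are hypothetical, and the example assumes a platform where Python exposes os.SEEK_DATA and os.SEEK_HOLE. On a filesystem without real hole tracking, the whole file may simply be reported as a single data region.

```python
import errno
import os
import tempfile


def data_regions(fd):
    """Return the (start, end) byte ranges that lseek() reports as data."""
    size = os.fstat(fd).st_size
    regions = []
    offset = 0
    while offset < size:
        try:
            # Jump to the next byte at or after offset that holds data.
            start = os.lseek(fd, offset, os.SEEK_DATA)
        except OSError as exc:
            if exc.errno == errno.ENXIO:
                break  # nothing but a hole up to end of file
            raise
        # Find where this data run ends (end of file counts as a hole).
        end = os.lseek(fd, start, os.SEEK_HOLE)
        regions.append((start, end))
        offset = end
    return regions


def make_sparse_file(path, hole_bytes, payload):
    """Create a file with a leading hole followed by payload."""
    with open(path, "wb") as f:
        f.seek(hole_bytes)  # skipping ahead leaves an unwritten gap
        f.write(payload)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "sparse.bin")
        make_sparse_file(path, 1 << 20, b"x" * 4096)
        fd = os.open(path, os.O_RDONLY)
        try:
            print(data_regions(fd))
        finally:
            os.close(fd)
```

Tools that copy or archive files can use the same loop to skip holes instead of reading long runs of zeros.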

Features

In FreeBSD, ext2, ext3 and ext4 are not different filesystems: ext2 is the base filesystem and some features from ext3 and ext4 are supported. All features in FreeBSD's implementation follow UFS semantics, and this can sometimes lead to important differences.

- FreeBSD 9.x+ fully supports the async mode, which is the default on Linux. Unlike Linux, FreeBSD defaults to sync mode, which makes your filesystem more reliable at the cost of some performance.

- FreeBSD 10.1+ uses reallocblk by default. This comes from UFS and helps prevent fragmentation issues. Linux doesn't have an exact equivalent, but Ext4 does delayed allocation, which is similar in concept.

- FreeBSD also supports most ext4 features in read-only mode. If your ext4 filesystem was originally converted from ext3, it is likely that you will be able to mount it and read most of the metadata, including complete timestamps and huge files. If the filesystem was freshly formatted as ext4, it is still likely that FreeBSD will be able to read the information.

NOTE: If you are trying to mount Ext4 partitions remember to specifically mount the filesystem read-only.

Known issues

- We lack support for journalling, which is inconvenient, but this is perhaps not really a problem if you are OK with running in sync mode, which is usually safe and the default on FreeBSD. Furthermore, on Linux, journalling is known to reduce performance.

- We lack support for Extended Attributes and ACLs: these are not very commonly used.

- We don't support TRIM: the code to merge this feature from UFS is relatively simple, but while on Linux this is a mount option, in FreeBSD's UFS it is set with tunefs. For consistency we always choose to keep the same semantics as in UFS, so it's unsupported (for now).

Coding guidelines

Like the rest of the FreeBSD code, we follow style(9). We try to keep the code in sync with UFS/FFS where possible to ensure new UFS fixes and features can be applied to Ext2. When implementing new features it is important to give some thought to the layout: the preference should always be to avoid invasive changes in the files that are shared to some extent with UFS and to add new functionality in new files.

The Linux driver is, of course, copyleft, so code generally cannot be copied from one implementation to the other. Luckily our implementation is completely different and much more similar to UFS, but developers are encouraged to read the existing (public) documentation before starting to work on new features.

Future Projects

Depending on developer interest, there are some possibilities for future development:

- Ext4fs RW support. Some Windows drivers support basic writing in ext4: there may be some features that we can't implement but adding write support and implementing more ext4 features remains possible.

- It would be good to support Extended Attributes and ACLs. EAs are different from the ones in FreeBSD's UFS and, while it is not clear whether it would be possible to translate either to the BSD equivalents, NetBSD has some support for them that we should check out.

- On Ext4, Linux has recently widened the dirindex hash space from 31 to 63 bits. We have had an experimental patch for this, but it is unclear how useful it is since they didn't update the on-disk format.

- Ext4 has support for encryption.

- Journalling is a characteristic feature of ext3+. Unfortunately FreeBSD's gjournal, which could have been a clean option, is incompatible with the format used by Linux. Haiku did an adaptation of their own journalling layer. A while ago Jeff Read started implementing Ext3-compatible journalling for NetBSD.

- Performance testing. Benchmarks against UFS and the original Linux ext2 (or the one in FreeBSD 8.x) are always interesting. For research purposes we would like to keep the performance as similar as possible to the Linux version.

- Some options never attracted much interest in Linux but may be interesting for hacking pleasure. One such option is e2compr, which brings basic compression to ext2fs.

- Filesystem virtualization: running ext2fs on ZFS ZVOLs is an interesting possibility for Linux applications running in jails or bhyve. While it is not clear whether such an approach brings any specific advantage, it may be useful for some applications to use a real Linux filesystem format.

Documentation

The most up-to-date information for the Ext4 implementation is available through the kernel project itself:

Ext4 (and Ext2/Ext3) Wiki

Some classic documentation for the Linux implementation:

The new ext4 filesystem: current status and future plans

State of the Art: Where we are with the Ext3 filesystem

State of the Art: Where we are with the Ext3 filesystem (presentation)

A Directory Index for Ext2

For FreeBSD's own ext2fs driver, the general documentation for UFS is useful.

The BSDCan talk in May 2014 gave an overview of the FreeBSD implementation.

Ext2fs (last edited 2016-08-13 14:46:29 by PedroGiffuni)