Ext2 filesystem

The Extended Filesystem was designed, drawing heavily on UFS, to replace the Minix filesystem used in early versions of Linux. The filesystem is known to be very fast and remains very popular despite the availability of other interesting filesystems (JFS, ReiserFS, XFS, etc.).

In general, ext2fs is very similar to UFS but lacks support for fragments. As a consequence, using bigger block sizes wastes more space and also makes fragmentation a problem. The Linux developers have also opted to support the faster async mode and to compensate for the resulting risks with more robust filesystem checking tools (fsck). The ext2 developers have tried to counter the weaknesses of the filesystem by adding new features in ext3 (journalling) and ext4 (extents), but the basic design seems to be exhausted and its designers are apparently considering more recent designs like btrfs. For FreeBSD, supporting ext2fs and its variants is still important for interoperability and as a general experimentation tool.

The initial FreeBSD ext2fs implementation was based on the BSD Lites version by Godmar Back. The initial approach was to reuse the BSD interfaces, exploit the similarities between UFS and ext2, and merge the specific block management routines from Linux. NetBSD did a complete reimplementation from UFS sources.

Main developments

For a long time the NetBSD and FreeBSD implementations were maintained independently and, while they basically worked, new development was abandoned. The main differences between the two were the cleaner license of the NetBSD implementation and the relatively good performance of the FreeBSD implementation. In both cases, running in async mode (the default in Linux) is considerably slower than in the Linux implementation.

In 2009 there was a Google Summer of Code (SoC) project:

Improving Second Extended File system (ext2fs) and making it GPL free

This merged the block allocation code from NetBSD and started a process of merging bugfixes and enhancements from FreeBSD's UFS1. Notably, the filesystem was made MPsafe and a feature called the "Orlov allocator" (also known as the dirpref changes in UFS circles) was brought in.

In 2010 a second Google SoC project took place:

Enhance ext2fs to support preallocation and read ext4 file systems

Preallocation was implemented, but after extensive testing it was determined that block reallocation was a better alternative, so that was implemented instead, based on UFS code. Read-only support for (extents-based) ext4fs was also developed but was not brought into the tree until 2013. In parallel with this project, the merging of fixes from UFS1 was finished. This brought basic O_DIRECT support, async mode mounting and several adjustments.

In 2012 NetBSD had a separate Google Summer of Code project:

HTree directory indexing for Ext3

ZhengLiu ported Vyacheslav Matyushin's HTree implementation. To use it, the directory index option has to be enabled with tune2fs. This code is also useful for the ext4 implementation, where it is used by default.

Ext2 development on FreeBSD is now much easier thanks to the similarities with traditional UFS, and merging features and fixes from UFS is an ongoing process. Recent additions include support for nanosecond/birthtime timestamps (2012) and the benefit of a generic SEEK_DATA/SEEK_HOLE lseek() extension (2013).
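As an illustration of what the SEEK_DATA/SEEK_HOLE extension offers to applications, here is a minimal Python sketch that lists the byte ranges of a file reported as data. It only shows the calling side through the standard os.lseek() interface and says nothing about how the ext2fs driver answers the request. The helper name data_extents is made up for this example, and it assumes a platform where Python exposes os.SEEK_DATA and os.SEEK_HOLE.

```python
import errno
import os


def data_extents(path):
    """Return the (start, end) byte ranges the filesystem reports as data.

    Hypothetical helper for illustration; needs os.SEEK_DATA and
    os.SEEK_HOLE, which Python only provides on some platforms.
    """
    extents = []
    fd = os.open(path, os.O_RDONLY)
    try:
        size = os.fstat(fd).st_size
        offset = 0
        while offset < size:
            try:
                # Jump to the next byte that is backed by data.
                start = os.lseek(fd, offset, os.SEEK_DATA)
            except OSError as exc:
                if exc.errno == errno.ENXIO:
                    break  # nothing but a hole up to end of file
                raise
            # Find where this data run ends (the next hole or end of file).
            end = os.lseek(fd, start, os.SEEK_HOLE)
            extents.append((start, end))
            offset = end
    finally:
        os.close(fd)
    return extents
```

A filesystem that does not track holes may simply report the whole file as a single data range, so the result depends on the underlying implementation.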

Features

In FreeBSD, ext2, ext3 and ext4 are not different filesystems: ext2 is the base filesystem and some features from ext3 and ext4 are supported. All features in FreeBSD's implementation follow UFS semantics and this can sometimes impose important differences. In the case of trim support, for example, the code to merge this feature from UFS is relatively simple, but while on Linux this is a mount option, in FreeBSD's UFS it has to be set with tunefs, so it's basically unsupported in the BSD ext2fs.

- FreeBSD 9.x+ fully supports async mode, which is the default on Linux. Unlike Linux, FreeBSD defaults to sync mode, which makes your filesystem more reliable at the cost of some performance.

- FreeBSD 11 (and soon 10.1 and 9.3) enables the dir_index extension by default. Linux distributions are also starting to enable the feature by default on ext3 and ext4. Unlike Linux, we also accept using this feature on ext2.

- FreeBSD 11 also uses reallocblk by default. This comes from UFS and helps prevent fragmentation issues. Linux doesn't have an exact equivalent, but ext4 does delayed allocation, which is similar in concept.

- FreeBSD also supports most ext4 features in read-only mode. If your ext4 filesystem was originally converted from ext3, it is likely that you will be able to mount it and read most of the metadata, including complete timestamps and huge files.

NOTE: If you are trying to mount Ext4 partitions remember to specifically mount the filesystem read-only.
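The mount choices described above can be sketched as a small helper that builds a mount(8) command line. This is only an illustration: the function name ext2fs_mount_argv and the device and mount point in the usage note are hypothetical, and the sketch assumes the usual "-t ext2fs" filesystem type together with the standard "ro" and "async" mount options.

```python
def ext2fs_mount_argv(device, mountpoint, read_only=True, async_mode=False):
    """Build an argv list for mounting an ext2/3/4 filesystem on FreeBSD.

    Hypothetical helper for illustration. Read-only is the default here
    because that is the safe choice for ext4 filesystems.
    """
    options = []
    if read_only:
        options.append("ro")
    if async_mode:
        options.append("async")  # faster, but less safe than the default sync mode

    argv = ["mount", "-t", "ext2fs"]
    if options:
        argv += ["-o", ",".join(options)]
    argv += [device, mountpoint]
    return argv
```

For example, ext2fs_mount_argv("/dev/ada0s1", "/mnt") produces the arguments for a read-only mount, which could then be handed to subprocess.run() by a script running with sufficient privileges.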

Known issues

- dir_index: the HTree code requires a lot of workarounds for the possibility of a hash collision. Linux works around the problems caused by returning entries in hash order from readdir(); we have not implemented such workarounds yet. If you find inaccessible files, you might have to disable the dir_index feature on the filesystem to access them.

- We lack support for Extended Attributes and ACLs: EAs are different from the ones in FreeBSD's UFS. Fortunately EAs are not very common.

- We lack support for journalling, but this is not really a problem because we run in sync mode, which is usually safe, and the sysutils/e2fsprogs port works well. Furthermore, on Linux, journalling is known to reduce performance.
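For the dir_index issue above, the feature can be toggled with tune2fs from the sysutils/e2fsprogs port. The following is a minimal sketch that builds the command line; the helper name dir_index_argv is hypothetical, and the sketch assumes the filesystem is unmounted when the command is actually run.

```python
def dir_index_argv(device, enable):
    """Build a tune2fs argv that sets or clears the dir_index feature.

    Hypothetical helper for illustration. tune2fs takes a feature name
    with -O, and a leading '^' clears the feature instead of setting it.
    """
    feature = "dir_index" if enable else "^dir_index"
    return ["tune2fs", "-O", feature, device]
```

Calling dir_index_argv(device, enable=False) gives the arguments for disabling the directory index when files turn out to be inaccessible.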

Coding guidelines

The Linux driver is, of course, copyleft, so code cannot be copied from one implementation to the other. Luckily our implementation is completely different and much more similar to UFS, but developers are encouraged to read the existing (public) documentation before starting to work on new features.

We try to keep the code in sync with UFS/FFS where possible to ensure new UFS fixes and features can be applied to ext2. When implementing new features it is important to give some thought to the layout: the preference should always be to avoid invasive changes in the files that are shared to some extent with UFS and to add new functionality in new files.

Future Projects

Depending on developer interest, there are some possibilities for future development:

- On ext4, Linux has recently widened the dir_index hash space from 31 to 63 bits. The change was backported to ext3, and some ext3 implementations are also enabling dir_index by default now. We have an experimental patch for this.

- Ext4fs RW support. Some Windows drivers support basic writing to ext4 (without extents for now), so adding write support and implementing more ext4 features (flex_bg seems important) is perfectly possible.

- For some reason journalling is a frequently requested feature: Jeff Read has started working on ext3-compatible journalling for NetBSD. Haiku had success with a similar approach. Unfortunately gjournal, which would be easier to use, is incompatible with the format used by Linux.

- Filesystem virtualization: running ext2fs on ZFS ZVOLs could be an interesting possibility for Linux applications running in jails, but it still requires some research.

- Performance testing. Benchmarks against UFS and the original Linux ext2 (or the one in FreeBSD 8.x) would be interesting.

- Some options never attracted much interest in Linux but may be interesting for hacking pleasure. One such option is e2compr, which brings basic compression to ext2fs.
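To give a feel for why widening the dir_index hash space from 31 to 63 bits matters, the standard birthday approximation can be used to estimate the chance that at least two entries in a directory share a hash value. This is a rough sketch that assumes uniformly distributed hashes; it does not model the actual HTree hash functions, and the function name collision_probability is made up for this example.

```python
import math


def collision_probability(entries, hash_bits):
    """Approximate the probability that at least two of `entries` names
    collide in a hash space of 2**hash_bits values.

    Uses the birthday approximation 1 - exp(-n(n-1) / (2N)) and assumes
    uniformly distributed hashes (an idealisation).
    """
    space = 2.0 ** hash_bits
    exponent = entries * (entries - 1) / (2.0 * space)
    # -expm1(-x) equals 1 - exp(-x) but stays accurate for very small x.
    return -math.expm1(-exponent)
```

Comparing collision_probability(n, 31) with collision_probability(n, 63) for a large directory shows how much rarer collisions become in the wider space.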

Documentation

The general documentation for UFS is useful for understanding FreeBSD's ext2fs driver.

For the Linux implementation, there are several articles of varying usefulness:

The new ext4 filesystem: current status and future plans

State of the Art: Where we are with the Ext3 filesystem

State of the Art: Where we are with the Ext3 filesystem (presentation)

A Directory Index for Ext2

Ext3 Inode Versioning

Ext2fs (last edited 2014-03-04 03:27:43 by PedroGiffuni)