Why FreeBSD

FreeBSD is an open-source operating system with advanced features, a large community, and a small footprint. It runs on multiple platforms and offers over 33,000 packages.

In this document, we will go over the use cases of FreeBSD for scientific purposes.

Infrastructure provided by FreeBSD

FreeBSD has multiple subsystems that make it well suited to scientific computing; here’s a basic list of what it offers:

ZFS

ZFS is an advanced filesystem with volume management capabilities. If there’s a filesystem feature that you want, then ZFS has it!

Some of ZFS’ features make it perfect for scientific computing, such as:

| Feature | Description | Example use |
| --- | --- | --- |
| Quotas | Set quotas on projects, so they won’t fill the disk by accident | zfs set quota=10T zsci/proj/dna_data |
| Reservations | Minimum amount of space guaranteed to a dataset | zfs set reservation=15T zsci/proj/future_data |
| Snapshots | Almost-instant snapshots | zfs snap zsci/proj/src@major_change |
| Clones | Snapshot-based clones that store only the differences | zfs clone zsci/proj/src@major_change zsci/proj/srcV2 |
| Replication | Stream-based, optionally incremental replication (backup-host is a placeholder for the receiving machine) | zfs send zsci/proj@today \| ssh backup-host zfs recv backups |
| Compression | Native and transparent data compression | zfs set compression=on zsci/proj/large_proj |
| Encryption | Native encryption, to keep your data secure at rest | See zfs-load-key(8) |
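Several of these properties are usually set together when a new project starts. The following is a minimal sketch of that, not a definitive provisioning script. The `run` helper, the `DRY_RUN` variable, the `provision_project` function, and the `@initial` snapshot name are assumptions made for illustration; the pool and dataset names reuse the examples from the table.

```shell
#!/bin/sh
# Sketch: provision one project dataset with a quota, a reservation,
# compression, and a first snapshot.
# Assumption: DRY_RUN defaults to 1, so commands are printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# run CMD...: print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# provision_project POOL NAME QUOTA RESERVATION (hypothetical helper)
provision_project() {
    dataset="$1/proj/$2"
    run zfs create -p "$dataset" || return 1
    run zfs set quota="$3" "$dataset" || return 1
    run zfs set reservation="$4" "$dataset" || return 1
    run zfs set compression=on "$dataset" || return 1
    run zfs snapshot "$dataset@initial"
}

provision_project zsci dna_data 10T 2T
```

With the default `DRY_RUN=1` the script only prints the five `zfs` commands, which makes it easy to review them first; setting `DRY_RUN=0` and running as root would execute them.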

Jails

FreeBSD Jails are a mechanism to create systems isolated from the host, that have their own root directory, process tree and (optionally) network stack.

Jails are an amazing utility if you want to run:

They are known as “the old-school containers”, if you’ve heard that buzzword. FreeBSD Jails have been around since March 14, 2000, so they are mature and well tested.

You can learn about the basics of FreeBSD Jails in the FreeBSD Handbook: Jails and Containers

To set up a Jail right now, you can either:

  1. Set up a Jail manually
  2. Use Jail Managers
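For the manual route, most of the work is one entry in jail.conf(5). The sketch below prints such an entry; it is an illustration under stated assumptions rather than a complete setup. The `jail_conf_entry` function and the `/usr/local/jails/containers` directory are assumptions, and a FreeBSD userland is assumed to be already extracted under that path.

```shell
#!/bin/sh
# Sketch: print a minimal jail.conf(5) entry for one jail.
# Assumption: a FreeBSD userland is already extracted under ROOT/NAME.

# jail_conf_entry NAME ROOT (hypothetical helper)
jail_conf_entry() {
    cat <<EOF
$1 {
    path = "$2/$1";
    host.hostname = "$1";
    exec.start = "/bin/sh /etc/rc";
    exec.stop = "/bin/sh /etc/rc.shutdown jail";
    exec.clean;
    mount.devfs;
}
EOF
}

jail_conf_entry proj0 /usr/local/jails/containers
```

The printed entry can be appended to /etc/jail.conf, after which `service jail onestart proj0` should start the jail.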

DTrace

The DTrace facility allows users to locate performance bottlenecks in production systems. In addition to diagnosing performance problems, DTrace can be used to help investigate and debug unexpected behavior in both the FreeBSD kernel and userland programs.

More importantly, the DTrace facility traces programs without the need to restart them.

Let’s say a scientist informs you that their program is “not working”. As a system operator, you understand that the software is doing something wrong, but need to dig deeper to understand what the root cause is.

Assuming the misbehaving program’s name is do_science, we can do the following:

# dtrace -n 'pid$target:::entry{ustack()}' -p `pgrep do_science`
[…]

  0  85517                      fopen:entry
              libc.so.7`fopen+0x1
              do_science`initialize+0x2b
              do_science`main+0x14
              do_science`_start+0x100
              `0x821408008
[…]

  1  83667                      sleep:entry 
              libc.so.7`sleep+0x1
              do_science`initialize+0x12
              do_science`main+0x14
              do_science`_start+0x100
              `0x821408008

# dtrace -n '::fopen:entry{trace(copyinstr(arg0))}' -p `pgrep do_science`
dtrace: description '::fopen:entry' matched 1 probe
CPU     ID                    FUNCTION:NAME
  0  85517                      fopen:entry   foo.txt                          
  0  85517                      fopen:entry   foo.txt
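When a program opens files many times, a per-call trace becomes noisy, and a DTrace aggregation that counts calls per file name may be easier to read. The following is a minimal sketch under stated assumptions: the `count_fopen` wrapper and the `DRY_RUN` variable are conventions of this example, not part of DTrace, and PID 4242 is hypothetical.

```shell
#!/bin/sh
# Sketch: count fopen(3) calls per file name in a running process.
# Assumption: DRY_RUN defaults to 1, so the command is printed, not executed.
DRY_RUN="${DRY_RUN:-1}"

# D program: aggregate on the first argument of fopen (the path).
D_PROG='pid$target::fopen:entry { @[copyinstr(arg0)] = count(); }'

# count_fopen PID (hypothetical helper)
count_fopen() {
    # Reject anything that is not a plain decimal PID.
    case "$1" in
        ''|*[!0-9]*) echo "usage: count_fopen PID" >&2; return 2 ;;
    esac
    if [ "$DRY_RUN" = "1" ]; then
        printf "dtrace -n '%s' -p %s\n" "$D_PROG" "$1"
    else
        dtrace -n "$D_PROG" -p "$1"
    fi
}

count_fopen 4242
```

When run for real (`DRY_RUN=0`, as root), dtrace keeps counting until it is interrupted with Ctrl-C and then prints one line per file name with its call count.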

To learn more about DTrace, check the Handbook and the wiki (One-Liners).

Ports

The FreeBSD Ports tree is a collection of software ported to FreeBSD. There are over 30,000 ports.

All ports have categories, including scientific ones!

| Category | Count |
| --- | --- |
| Astro | 155 |
| Biology | 262 |
| Geography | 119 |
| Math | 1312 |
| Parallel | 33 |
| Science | 571 |

Many ports have multiple categories.

More importantly, you can port software to FreeBSD and submit your work for addition to the ports tree, to make it available on every FreeBSD machine out there, including yours!

If you have your own software, you can set up a FreeBSD Ports tree for yourself and distribute it as a custom FreeBSD Package Repository, so all your scientists can run pkg install followed by your package’s name to install the latest and greatest version of your program.
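On each client machine, a custom repository is announced to pkg(8) through a small configuration file. The sketch below prints one; treat it as an illustration only. The `repo_conf` function, the repository name `lab`, and the URL `https://pkg.example.org` are placeholders, and the `${ABI}` path component assumes the server lays out packages per ABI.

```shell
#!/bin/sh
# Sketch: print a pkg(8) repository definition for a site-local repository.
# Assumptions: the name "lab" and the URL are placeholders.

# repo_conf NAME BASE_URL (hypothetical helper)
repo_conf() {
    # \${ABI} is escaped so the shell leaves it for pkg to expand.
    cat <<EOF
$1: {
  url: "$2/\${ABI}",
  enabled: yes
}
EOF
}

repo_conf lab https://pkg.example.org
```

Saving the output as a `.conf` file under /usr/local/etc/pkg/repos/ and running `pkg update` should make the repository's packages installable. A production repository would normally also be signed, which this sketch leaves out.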

For porting to FreeBSD, check the Porters’ Handbook and Poudriere.

Resource Control

In a scientific environment, resources are always scarce and limiting their use is hard. FreeBSD provides a set of facilities to manage resources, such as the old-school Login Classes as well as the modern Resource Limits Database.

If you’ve already configured jail proj0, you can limit its memory (RAM) resource using the rctl command:

# rctl -a jail:proj0:memoryuse:deny=20G/jail
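A jail usually needs more than one limit, and rules added with `rctl -a` do not survive a reboot unless they are also stored in /etc/rctl.conf. The sketch below prints a small rule set in the same syntax as the rule above. The `jail_rules` function and the process limit of 1000 are assumptions chosen for illustration.

```shell
#!/bin/sh
# Sketch: print rctl(8) rules that cap a jail's memory and process count.

# jail_rules JAIL MEMORY MAXPROC (hypothetical helper)
jail_rules() {
    printf 'jail:%s:memoryuse:deny=%s/jail\n' "$1" "$2"
    printf 'jail:%s:maxproc:deny=%s/jail\n' "$1" "$3"
}

jail_rules proj0 20G 1000
```

Each printed line can be passed to `rctl -a`, or appended to /etc/rctl.conf to persist across reboots; `rctl -hu jail:proj0` then shows the jail's current usage. Resource accounting has to be enabled first with `kern.racct.enable=1` in /boot/loader.conf.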

Another useful facility is CPU Sets, which bind a Jail to specific CPUs so that its processes don’t disturb others.

Print Jails: their JIDs, names and CPUSet IDs

# jls jid name cpuset.id
1 proj0 5

Jail proj0 has JID 1 and CPUSet ID 5

Print the current CPUSet policy of Jail with ID 1:

# cpuset -g -j 1
jail 1 mask: 0, 1, 2, 3, 4, 5, 6, 7
jail 1 domain policy: first-touch mask: 0

Restrict CPUSet 5 to CPUs 6 and 7:

# cpuset -l 6-7 -s 5

Done!

Scheduling Priority

FreeBSD comes with tools to manage the realtime/idletime priority of processes. This allows administrators to run tools in the background without disturbing other processes.

Priority is an integer between 0 and 31, where 0 is highest.

Example: run build_database.sh at the lowest idle priority, so it only gets CPU time when nothing else needs it:

# idprio 31 ./build_database.sh

Run reconnect_nodes.sh with the highest realtime priority:

# rtprio 0 ./reconnect_nodes.sh
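Because 0 is the highest priority and 31 the lowest, it is easy to pass the wrong end of the range. A small wrapper that validates the number before calling idprio(1) can guard against that. This is a sketch only: `valid_priority` and `run_idle` are hypothetical helper names, not FreeBSD tools.

```shell
#!/bin/sh
# Sketch: validate a priority before handing a command to idprio(1).

# valid_priority N: succeed only if N is an integer in 0..31.
valid_priority() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -le 31 ]
}

# run_idle PRIO CMD...: run CMD at idle priority PRIO (31 = lowest).
run_idle() {
    prio="$1"
    shift
    if ! valid_priority "$prio"; then
        echo "priority must be an integer from 0 to 31" >&2
        return 2
    fi
    idprio "$prio" "$@"
}

# Example (FreeBSD only): run_idle 31 ./build_database.sh
```

The same check would apply to rtprio(1), since both tools take a priority in the same 0 to 31 range.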

Advanced Networking

Here's a list of networking features that come out-of-the-box with FreeBSD:

Challenges with FreeBSD in Science

As you have guessed, FreeBSD has awesome infrastructure to help administrators run their operations.

However, most scientific applications are not written with FreeBSD in mind. Most of the time they are written for Linux. In some cases, they are written for runtimes that work only on Linux, such as Docker.

Common Solutions

There are two common solutions for these problems:

Linux Jails!

FreeBSD can run Linux binaries using Linuxulator, which emulates the Linux ABI. If you want to run Linux Jails, check the following guide: FreeBSD Jail booting & running Devuan GNU+Linux with OpenRC and LinuxJails.

Please keep in mind that you might still run into issues, such as crashes while debugging (for example, when invoking jstack to debug Java applications).

Porting!

A more time-consuming but stable approach is to port the software to FreeBSD.

Case Studies

Data Science on FreeBSD ARM64 by Maciej Czekaj

Scientific (last edited 2023-09-28T01:09:00+0000 by GrahamPerrin)