FreeBSD Google Summer of Code Ideas

Below are ideas for FreeBSD GSoC, which the FreeBSD community has suggested. They fit roughly into the GSoC timelines, but Summer of Code applicants should contact idea owners or suitable mentors early so that they can get feedback during the application process. We want to ensure that projects can be completed within one of the GSoC timeframes.

These are only suggestions; copying and pasting the text into your proposal doesn't guarantee we will accept it. You can propose ANY FreeBSD-related idea or project you want, but we want to see that you have researched the project and that you have a clear plan. We like original ideas, since we know you'll be most interested in working on something you came up with and are passionate about. Starting in 2022, there is more flexibility with project timelines, but a clear schedule of project milestones is still essential for success.

For projects where no mentor is listed, or for project proposals created by applicants, we recommend that you try to contact committers working in the area and discuss the project with them. If you are unsure who to contact, reach out to us by one of these methods.

  • Join the #freebsd-soc IRC channel on the Libera.Chat network. We use this channel for communication during both the application period and the coding period. Due to timezone differences, there may be delays before questions are answered, so please be patient. You can join the channel either using an IRC client like irssi, or using Libera.Chat's web client. You can read about how to connect to the Libera.Chat network.

  • Email the <freebsd-hackers AT FreeBSD DOT org> mailing list.

  • If you do not get a response on IRC or on the mailing list, you can email <soc-admin AT FreeBSD DOT org>.

We also have a generic FreeBSD ideas page at IdeasPage and WantedPorts. Note that many of these may not be suitable Summer of Code projects as-is, but may provide inspiration for ideas of your own. The projects listed on this page are not restricted to GSoC participants, of course.

For a feel for projects from previous years, visit: SummerOfCode2022, SummerOfCode2021, SummerOfCode2020, SummerOfCode2019, SummerOfCode2018, SummerOfCode2017, SummerOfCode2016, SummerOfCode2015, SummerOfCode2014, SummerOfCode2013, SummerOfCode2012, SummerOfCode2011, SummerOfCode2010.

How to use this idea list

For applicants

Skill levels

In 2022, GSoC projects can be 175- or 350-hour projects, with the ability to extend the program from the standard 12 weeks to 22 weeks. Choose a project that you feel you can realistically complete within one of those timelines. If you feel unsure, please contact the listed mentor(s) to get their opinion on how much effort and experience a given project may require.


Kernel Projects


Unified kernel tracing interface

Mentor

Mark Johnston <markj AT FreeBSD DOT org>

Skills

C (advanced), Kernel

Mid-term deliverable

Implementation available for amd64

Duration

350 hours

Difficulty

Hard

Expected Outcome

Existing consumers converted to use the new interface

Description

The FreeBSD kernel contains several subsystems which add hooks to core pieces of the kernel. For example, the context switch code in the scheduler contains this snippet:

        if (td != newtd) {
#ifdef  HWPMC_HOOKS
                if (PMC_PROC_IS_USING_PMCS(td->td_proc))
                        PMC_SWITCH_CONTEXT(td, PMC_FN_CSW_OUT);
#endif
                SDT_PROBE2(sched, , , off__cpu, newtd, newtd->td_proc);

#ifdef KDTRACE_HOOKS
                /*
                 * If DTrace has set the active vtime enum to anything
                 * other than INACTIVE (0), then it should have set the
                 * function to call.
                 */
                if (dtrace_vtime_active)
                        (*dtrace_vtime_switch_func)(newtd);
#endif
                td->td_oncpu = NOCPU;
                cpu_switch(td, newtd, mtx);
                cpuid = td->td_oncpu = PCPU_GET(cpuid);

                SDT_PROBE0(sched, , , on__cpu);
#ifdef  HWPMC_HOOKS
                if (PMC_PROC_IS_USING_PMCS(td->td_proc))
                        PMC_SWITCH_CONTEXT(td, PMC_FN_CSW_IN);
#endif

There are three hooks that may be called before switching into the new thread, and two hooks that may be called after the switch. They are used by DTrace and HWPMC to collect information about context switch events. These hooks are disabled most of the time, but each hook introduces overhead even when disabled since we must check whether it is enabled each time the code is executed.

The goal of the project is to identify useful kernel tracepoints, and design and implement a unified interface that can be used by different consumers to collect information about the event corresponding to a particular tracepoint. This would make it easier for new subsystems to collect information from existing tracepoints, rather than modifying the core kernel to add more hooks. An additional aim would be to ensure that such tracepoints have minimal overhead, probably by using hot-patching to enable and disable a particular tracepoint. FreeBSD-CURRENT now uses clang 10.0, which implements the "asm goto" construct that could be useful here.
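To make the idea of a shared tracepoint more concrete, here is a minimal userland sketch of one possible shape for such an interface. Every name in it (struct tracepoint, tp_register, tp_fire, count_event) is hypothetical and is not an existing FreeBSD KPI; it also ignores locking and hot-patching, which a real design would have to address.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical names throughout; none of this is an existing FreeBSD KPI. */
#define TP_MAX_CONSUMERS 4

typedef void (*tp_handler_t)(void *arg, const void *event);

struct tracepoint {
	const char	*tp_name;
	int		 tp_nconsumers;	/* zero means the tracepoint is disabled */
	struct {
		tp_handler_t	 func;
		void		*arg;
	} tp_consumers[TP_MAX_CONSUMERS];
};

/* Attach a consumer; returns 0 on success, -1 if the table is full. */
int
tp_register(struct tracepoint *tp, tp_handler_t func, void *arg)
{
	assert(tp != NULL && func != NULL);
	if (tp->tp_nconsumers == TP_MAX_CONSUMERS)
		return (-1);
	tp->tp_consumers[tp->tp_nconsumers].func = func;
	tp->tp_consumers[tp->tp_nconsumers].arg = arg;
	tp->tp_nconsumers++;
	return (0);
}

/* Fire the tracepoint: a single test when disabled, else call each consumer. */
void
tp_fire(struct tracepoint *tp, const void *event)
{
	/* This is the per-call check that hot-patching would aim to remove. */
	if (tp->tp_nconsumers == 0)
		return;
	for (int i = 0; i < tp->tp_nconsumers; i++)
		tp->tp_consumers[i].func(tp->tp_consumers[i].arg, event);
}

/* Example consumer: counts events in the int that arg points at. */
void
count_event(void *arg, const void *event)
{
	(void)event;
	(*(int *)arg)++;
}
```

With this shape, the scheduler would contain one tp_fire() call per event, and consumers such as DTrace or HWPMC would attach with tp_register() instead of adding their own #ifdef blocks. The sketch is not safe against concurrent registration.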


Add Hypervirtualized clocks - Host side for both kvm and hyperv

Mentor

John Baldwin <jhb AT FreeBSD DOT org>

Skills

C (intermediate)

Duration

175 hours

Difficulty

Medium

Expected Outcome

Description

To come.


Add audit(4) support to NFS

Mentor

Rick Macklem <rmacklem AT FreeBSD DOT org>

Skills

C (intermediate), Kernel (intermediate), Security (intermediate)

Mid-term deliverable

Ability to generate an audit trail for any NFSv3 RPC

Duration

350 hours

Difficulty

Medium

Expected Outcome

Ability to generate an audit trail for any NFS RPC

Description

Security Event Audit is a facility that provides fine-grained, configurable logging of security-relevant events, and is intended to meet the requirements of the Common Criteria (CC) Controlled Access Protection Profile (CAPP) evaluation. It can generate an audit record for various events, such as process creation, network activity, or file system activity. However, it mostly works at the syscall level. Since the NFS server is implemented within the kernel, NFS RPCs don't generate any audit records. audit(4) can still be used within NFS networks, but auditd(8) must be run on every NFS client. It would be a great addition if each NFS RPC could be audited. That would allow auditd(8) to run just on the NFS server and still audit all NFS activity within the network.


Port squashfuse to the FreeBSD kernel

Mentor

Ed Maste <emaste AT FreeBSD DOT org>

Skills

C (expert), fusefs (expert), kernel

Mid-term deliverable

Basic squashfs.ko functionality available with mount(8)

Duration

350 hours

Difficulty

Hard

Expected Outcome

MFSBSD support

Description

squashfs is a read-only filesystem targeted at small embedded environments, where memory and disk space are constrained. squashfuse is a BSD-licensed FUSE implementation of squashfs. The project would be to port this implementation to the FreeBSD kernel, with the aim of being able to boot FreeBSD from an in-memory squashfs filesystem.


Implement MPLS support for FreeBSD

Mentor

Alexander Chernikov <melifaro AT FreeBSD DOT org>

Skills

C (intermediate), networking (intermediate)

Mid-term deliverable

MPLS forwarding works on static labels

Duration

350 hours

Difficulty

Medium

Expected Outcome

MPLS forwarding, encap/decap works, MPLS labels can be programmed via Netlink

Description

Multiprotocol Label Switching (MPLS) is an overlay network technology based on labels instead of IP addresses. The project would be to bring basic MPLS support to FreeBSD. Roughly, the implementation can be split into the following chunks:

Overviews of other *BSD implementations, such as OpenBSD's MPLS stack or NetBSD's MPLS stack, can provide more data points.
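As background for the forwarding and encap/decap work, the following sketch packs and unpacks a single MPLS label stack entry using the 32-bit layout from RFC 3032 (20-bit label, 3-bit traffic class, 1-bit bottom-of-stack flag, 8-bit TTL). The struct and function names are hypothetical and only illustrate the bit layout; they do not come from any existing *BSD MPLS stack.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>

/* Hypothetical in-memory view of one label stack entry (RFC 3032 layout). */
struct mpls_lse {
	uint32_t label;	/* 20 bits */
	uint8_t	 tc;	/* 3 bits, traffic class */
	uint8_t	 bos;	/* 1 bit, set on the last entry of the stack */
	uint8_t	 ttl;	/* 8 bits */
};

/* Pack an entry into its 32-bit wire form (network byte order). */
uint32_t
mpls_lse_encode(const struct mpls_lse *e)
{
	uint32_t host;

	assert(e->label <= 0xFFFFF && e->tc <= 7 && e->bos <= 1);
	host = (e->label << 12) | ((uint32_t)e->tc << 9) |
	    ((uint32_t)e->bos << 8) | e->ttl;
	return (htonl(host));
}

/* Unpack a 32-bit wire-format entry into its fields. */
struct mpls_lse
mpls_lse_decode(uint32_t wire)
{
	uint32_t host = ntohl(wire);
	struct mpls_lse e;

	e.label = host >> 12;
	e.tc = (uint8_t)((host >> 9) & 0x7);
	e.bos = (uint8_t)((host >> 8) & 0x1);
	e.ttl = (uint8_t)(host & 0xFF);
	return (e);
}
```

Encapsulation prepends one or more such entries in front of the payload, and forwarding on static labels amounts to looking up the decoded label and swapping or popping the entry.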


Improve netgraph concurrency

Mentor

Alexander Chernikov <melifaro AT FreeBSD DOT org>

Skills

C (intermediate), networking (intermediate)

Mid-term deliverable

Traffic is able to pass in a lockless fashion in 2-node netgraph topology

Duration

350 hours

Difficulty

Medium

Expected Outcome

Traffic is able to pass in a lockless fashion between ng_<ppp|car|tee|iface>, compatibility retained with non-converted nodes

Description

Netgraph is a traffic-processing subsystem based on a dynamically configured graph of nodes and directed edges. Each node applies a single specific manipulation to the packet. The core idea is similar to VPP. Currently netgraph uses a topology lock and node/hook atomic refcounts to ensure safe passing of packets between the nodes. The goal of the project is to make passing data between the nodes lockless. The necessary primitives, like epoch-based object reclamation and lockless data structures, are available in the base system. The rough implementation sketch may look like the following:


Add support for building Linux-only network drivers

Mentor

Mahdi Mokhtari <mmokhi AT FreeBSD DOT org>

Skills

C (intermediate), Kernel (intermediate), Network

Mid-term deliverable

Basic netlink functionality working

Duration

350 hours

Difficulty

Hard

Expected Outcome

Be able to build a sample Linux-only network driver (Freifunk?) for FreeBSD

Description

Currently we have a Linux KPI for FreeBSD, which mostly helps the drm-next graphics stack to work.

The big-picture idea is to have network functionality implemented in the same way, so that we can build "Linux-only" network drivers for FreeBSD too. To accomplish this, the first step is to implement the API (KPI) most Linux drivers rely upon. Netlink is one example.

The steps (only the first, or at most the second, is in scope for the current/2022 GSoC)


Userland projects


Kernel LLDB debugger improvements, module support and integration

Mentor

Ed Maste <emaste AT FreeBSD DOT org>

Skills

C++ (intermediate)

Mid-term deliverable

Parse and report the kernel module list in a coredump

Duration

175 or 350 hours

Difficulty

Moderate

Expected Outcome

crashinfo functional with LLDB rather than kgdb

Description

LLDB is the debugger in the LLVM family of projects. It has supported userland debugging on FreeBSD for a long time, but kernel debugging support was added recently under FreeBSD Foundation sponsorship. One outstanding item remaining with kernel debug support is to handle kernel modules - that is, to parse the loaded module data provided by the kernel (live or core dump) and load the kernel module objects. This project will require at least intermediate skill with C++ (the language in which LLDB is written), as well as familiarity with C (the language used by the FreeBSD kernel).

This project may be short or long. The minimum requirement for a short project is support for one type of kernel module ELF object and one supported FreeBSD architecture. A long project would include both kernel module types and multiple FreeBSD architectures, as well as integration with the rest of the system.


UFS4fuse: support FreeBSD's UFS2 with fusefs

Mentor

?

Skills

C++ or Rust (expert), fusefs (expert)

Mid-term deliverable

Read-only support for userland

Duration

350 hours

Difficulty

Hard

Expected Outcome

Fully functional UFS with basic read and write support

Description

This project is challenging: any proposal must include a coding plan and should include credible evidence of userland filesystem development.


syzkaller improvements

Mentors

Andrew Turner <andrew AT FreeBSD DOT org>, Mark Johnston <markj AT FreeBSD DOT org>

Skills

golang (intermediate), kernel (intermediate)

Duration

350 hours

Difficulty

Hard

Expected Outcome

Complete FreeBSD syzkaller extensions described below

syzkaller is a suite of tools that performs coverage-guided system call fuzzing. Originally targeting Linux, it can now fuzz many different operating system kernels and has been extremely effective at finding bugs, many with security implications. It creates, monitors and fuzzes a set of virtual machines running the target kernel. More information can be found in its documentation, and in these slides. Google kindly runs a public syzkaller instance targeting FreeBSD.

For a while it has been possible to run syzkaller on FreeBSD; that is, fuzz FreeBSD on FreeBSD. syzkaller makes use of ZFS and bhyve (or QEMU) to do so. This makes development and testing of FreeBSD-specific syzkaller features much easier.

Though syzkaller has found quite a few bugs in FreeBSD, it does not cover as much as it does on Linux, so it is virtually guaranteed that there are plenty of bugs waiting to be found. This project consists of several subtasks that would improve FreeBSD's coverage:

Note: contributing to syzkaller requires signing the Google CLA. Please make sure that this is acceptable before attempting this project. Note also that working with syzkaller is probably easiest on a dedicated hardware system with a reasonably large amount of disk space (several hundred GB should be sufficient), ideally running FreeBSD on ZFS. syzkaller can instantiate VMs on Google Compute Engine and fuzz them, so this may be an option as well. Please contact the project mentors for details.


Capsicumization of the base system

Mentor

Mariusz Zaborski <oshogbo AT FreeBSD DOT org>

Co-Mentor

Mark Johnston <markj AT FreeBSD DOT org>

Skills

C (intermediate), familiarity with the UNIX programming environment

Mid-term deliverable

Sandbox a few of the target applications

Duration

350 hours

Difficulty

Medium

Expected Outcome

Sandbox the complete list of target applications

Description

Capsicum is a sandboxing technology used in FreeBSD. It is complemented by Casper, a framework for defining application-specific capabilities. A large number of utilities in the FreeBSD base system have been converted to run under Capsicum and Casper, but many programs have yet to be converted. The project would consist of identifying several high-profile daemons and utilities in the base system or ports, and modifying them to run in capability mode. One good candidate is syslogd, the system logging daemon.

As part of this work it may be necessary or useful to add additional Casper services to the base system. For example, we do not yet have a Casper service which allows an application to make outbound network connections.

Note: Converting applications to run under Capsicum/Casper can take a lot of effort, especially when they are large or when they are not designed with privilege separation in mind. Some applications, like a shell, can't really be run in capability mode at all. Before attempting to sandbox a given application, take care to study the ways in which it interacts with the system. For example, does the application need to open any files? If so, are the file names statically defined or are they derived from user input? This will provide insight into the difficulties that will arise when sandboxing the application.
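The usual pattern behind the questions above is to open every statically known file first, restrict what each descriptor may do, and only then enter capability mode. The sketch below shows that order using cap_rights_init(3), cap_rights_limit(2) and cap_enter(2) from <sys/capsicum.h>. The helper names limit_fd_to_read and enter_sandbox are hypothetical, and the non-FreeBSD branches are no-op stubs added only so the example compiles elsewhere.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>
#ifdef __FreeBSD__
#include <sys/capsicum.h>
#endif

/*
 * Restrict an already-open descriptor to read(2) and fstat(2).
 * Returns 0 on success, -1 on error. Hypothetical helper name.
 */
int
limit_fd_to_read(int fd)
{
	assert(fd >= 0);
#ifdef __FreeBSD__
	cap_rights_t rights;

	cap_rights_init(&rights, CAP_READ, CAP_FSTAT);
	/* ENOSYS means the kernel was built without Capsicum; tolerate it. */
	if (cap_rights_limit(fd, &rights) < 0 && errno != ENOSYS)
		return (-1);
#endif
	return (0);
}

/*
 * Enter capability mode. Afterwards the process can no longer open
 * files by absolute path, so all such opens must happen before this call.
 * Returns 0 on success, -1 on error. Hypothetical helper name.
 */
int
enter_sandbox(void)
{
#ifdef __FreeBSD__
	if (cap_enter() < 0 && errno != ENOSYS)
		return (-1);
#endif
	return (0);
}
```

A utility with a fixed input file would call open(2), then limit_fd_to_read(), then enter_sandbox(), and do all further work through the descriptor. File names that arrive later from user input are where this simple order stops working and a Casper service or a pre-opened directory descriptor becomes necessary.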


Network Configuration Libraries

Mentor

?

Skills

C (intermediate), knowledge of networking and NAT

Mid-term deliverable

Library to configure IPFW NAT to allow a bhyve VM guest to reach the Internet

Duration

350 hours

Difficulty

Medium

Expected Outcome

A library to manage NAT configuration for VMs and Jails

Deliverables

A simple tool to configure the network on a laptop to allow a bhyve VM to access the internet.

Use Cases:

Stretch goal: Extend the tool to support configuring network access for standard and VIMAGE jails

Description

Build a libipfw to enable programmatic configuration of the firewall, implementing basic functionality to add rules and configure NAT instances.

Optional: relocate most functionality available in the command line interface into the library and replace the command line interface with a thin wrapper around the new library.

Build a libbsdnat that can be used by bhyve VM managers and Jail management tools to configure NAT on the host to allow the guest access to the internet via the host's network. This library will then be extended to handle creating 'port forwarding' rules to expose services in the guest to the public network via the host. Network mappings will be ephemeral and will need to be recreated by the management tools when the VM or jail is restarted.

Possible design for final tool:


Audit Base for Read-Only and NFS Mount Compatibility

Mentor

Emmanuel Vadot <manu AT FreeBSD DOT org>

Skills

/bin/sh (moderate), System and Network configuration, SQLite, Berkeley DB and similar (moderate), hier(7) and general FreeBSD base awareness

Mid-term deliverable

Identification of in-base components that are r/o or NFS incompatible

Duration

175 hours

Difficulty

Medium

Expected Outcome

Report on the nature of each incompatibility, ideally with proposed solutions. Kyua test suite modules for each issue identified.

Description

FreeBSD has supported read-only and "root on NFS" configurations for decades, but it has not been consistently tested for compatibility under these circumstances. Issues can include the failure of utilities to create temporary files and locking failures with "databases" such as /etc/pwd.db, /var/db/pkg/local.sqlite and /var/db/xenstore/dbf. Temporary workarounds include tmpfs(5) mounts for directories such as /var/db/<utility>, MD/MFS mounts, iSCSI/FC/ggate devices, or network-based authentication. Long-term solutions can follow the model of pkg(8)'s NFS_WITH_PROPER_LOCKING=yes option.
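The two failure classes named above, temporary file creation and locking, can be probed directly. The sketch below is one possible check, assuming that an fcntl(2) advisory write lock is representative of what the affected utilities need; the function name probe_tmpfile_and_lock and its return codes are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Probe whether dir allows creating a temporary file and taking an
 * advisory write lock on it. Hypothetical return codes:
 *   0  both operations worked
 *   1  the file could not be created (for example, a read-only mount)
 *   2  the file was created but could not be locked
 */
int
probe_tmpfile_and_lock(const char *dir)
{
	char path[1024];
	/* l_start and l_len of zero lock the whole file. */
	struct flock fl = { .l_type = F_WRLCK, .l_whence = SEEK_SET };
	int fd, n, rc = 0;

	assert(dir != NULL);
	n = snprintf(path, sizeof(path), "%s/probe.XXXXXX", dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return (1);
	if ((fd = mkstemp(path)) < 0)
		return (1);
	if (fcntl(fd, F_SETLK, &fl) < 0)
		rc = 2;
	close(fd);
	unlink(path);
	return (rc);
}
```

Running such a probe against directories like /var/db or /tmp on a read-only or NFS root gives a quick first classification of a failure before looking at the individual utility.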

Search terms to help understand the problem: "NFS flock", "sqlite on NFS". Motivation: "Cloud" and "Container" environments often achieve statelessness via simple mechanisms like read-only and network file system mounts.


Build infrastructure


Integrate MFSBSD into the release building tools

Mentor

? (Email hackers@FreeBSD.org list for direction)

Skills

C (beginner), make (intermediate), shell (intermediate)

Mid-term deliverable

'make release' will build MFSBSD

Duration

175 hours

Difficulty

Easy

Expected Outcome

Completed integration so all new releases of FreeBSD will include MFSBSD media

Description

MFSBSD is a version of FreeBSD designed to run from memory. It is often used as a rescue system, or the basis for automated installers.

It is currently maintained by a FreeBSD developer, Martin Matuška.

The issue is that only images for the release versions are usually produced, and MFSBSD tends to get out of sync with the tools in base. There is a desire to have MFSBSD images of the weekly snapshots of -current and -stable that are created by the release engineering team. This requires that the process be automated as an additional target in the makefile in src/release. It would be similar to the process used now to generate VM images. Adding flexibility to the MFSBSD build system to control additional features would be a bonus.


Finish upstreaming bsd-user enhancements

Mentor

Warner Losh <imp AT FreeBSD DOT org>

Skills

C (intermediate)

Mid-term deliverable

All system calls broken into a patch stream for upstream; packages can be built with poudriere

Duration

175 hours

Difficulty

Easy

Expected Outcome

All bsd-user changes from our fork are upstream

Description

bsd-user is a set of modifications to the QEMU project. It is similar to the linux-user feature that's integrated into the QEMU project. Years ago, we forked our own tree, and had difficulty getting the changes at the time upstream. Since then, upstream attitudes have changed and they are very eager to get the bsd-user changes committed.

The current bsd-user fork is up-to-date with the latest qemu-project repo (as of the 7.2 release). The task here would be to break down the differences between our tree and upstream.

There are three groups of changes: (1) Changes to the binary loading. These are mostly the same upstream, but were originally based on linux-user. linux-user has evolved, but some of the relevant changes have not been merged over from linux-user. Refactoring the common items, such as ELF loading, is also desirable, since it could lead to more code reuse and let bsd-user keep up faster than it can today. (2) bsd-user has a number of architectures (powerpc, riscv and aarch64) that aren't in upstream today. These changes need to be broken down into 50-100 line chunks, reviewed by upstream, and submitted to the maintainer for inclusion upstream. In addition, any adjustments or enhancements to the ABIs that have happened since the code was originally written may require adjustment to the current code. (3) Only about half of the system calls implemented in the fork have been upstreamed. These also need to be chunked into small patches that can be reviewed upstream. There are some system calls that are poorly placed as well and may need changes to bsd-user's fork to correct.

A successful project will result in all or most of the changes being upstreamed. The successful candidate will also need to understand where to find FreeBSD's implementation of the architecture-specific bits, as well as how to set up a poudriere build world, use poudriere jails, and effectively test subsets of all the packages to ensure problems are corrected. A/B testing with the current really old version of qemu bsd-user based on 3.2 will help control regressions.


CI test harness for boot loader

Mentor

Warner Losh <imp AT FreeBSD DOT org>

Skills

shell script or Python (intermediate)

Mid-term deliverable

An end-to-end test for at least 2 of the FreeBSD boot environments

Duration

175 hours or 350 hours

Difficulty

Medium

Expected Outcome

An end-to-end test for all FreeBSD boot environments that can be added to our CI infrastructure

Description

FreeBSD can boot in about 40 different boot environments when you count all the combinations of architecture, disk partitioning schemes, root filesystem,

A successful application will include the above matrix (send email to the Mentor before the deadline to ensure its correctness, or to specifically call out any environments that won't be done in the scope of the project). It will include plans for how to accomplish this task. Bonus points for a good plan for how to integrate into the FreeBSD CI system and/or the GitHub / Cirrus CI system.



SummerOfCodeIdeas (last edited 2023-07-07T16:23:58+0000 by JosephMingrone)