FreeBSD Google Summer of Code Ideas
Below are ideas for FreeBSD GSoC, which the FreeBSD community have suggested. They fit roughly into the GSoC timelines, but Summer of Code applicants should contact idea owners or suitable mentors early so that they can get feedback during the application process. We want to ensure that projects can be completed within one of the GSoC timeframes.
These are only suggestions; copy/pasting the text in your proposal doesn't guarantee we will accept it. You can propose ANY FreeBSD-related idea/project you want but we want to see that you have researched the project and that you have a clear plan. We like original ideas, since we know you'll be most interested in working on something you came up with and are passionate about. Starting in 2022, there is more flexibility with project timelines, but a clear schedule of project milestones is still essential for success.
For projects where no mentor is listed, or for project proposals created by applicants, we recommend that you try to contact committers working in the area and discuss the project with them. If you are unsure who to contact, reach out to us by one of these methods:
- Join the #freebsd-soc IRC channel on the Libera.Chat network. We use this channel for communication during both the application period and the coding period. Due to timezone differences, there may be delays before questions are answered, so please be patient. You can join the channel either using an IRC client like irssi, or using Libera.Chat's web client. You can read about how to connect to the Libera.Chat network.
- Email the <freebsd-hackers AT FreeBSD DOT org> mailing list.
- If you do not get a response on IRC or on the mailing list, you can email <soc-admin AT FreeBSD DOT org>.
We also have a generic FreeBSD ideas page at IdeasPage and WantedPorts. Note that many of these may not be suitable Summer of Code projects as-is, but may provide inspiration for ideas of your own. The projects listed on this page are not restricted to GSoC participants, of course.
For a feel for projects from previous years, visit: SummerOfCode2022, SummerOfCode2021, SummerOfCode2020, SummerOfCode2019, SummerOfCode2018, SummerOfCode2017, SummerOfCode2016, SummerOfCode2015, SummerOfCode2014, SummerOfCode2013, SummerOfCode2012, SummerOfCode2011, SummerOfCode2010.
How to use this idea list
For applicants
- The list of projects here should be regarded as inspiration, not as the exclusive list of all projects that will be considered. We are happy to try to find mentors for projects that will benefit the FreeBSD project. If you're unsure if your idea fits this description, try asking in #freebsd-soc on Libera.Chat or #bsddev on EFNet.
- You should propose the design of your idea first, and only start coding once your project is accepted. Producing a decent design will save you lots of effort.
- Most mentors are working professionals. Please use their time effectively, with up-front study of issues and problems, and try to provide as much data as possible.
We have a page at SummerOfCodeDevEnvironments where we detail some possible ideas for how to set up your developer environment.
We also have several pieces of advice on applying at SummerOfCodeStudentChecklist.
Skill levels
In 2022, GSoC projects can be 175 or 350-hour projects with the ability to extend the program from the standard 12 weeks to 22 weeks. Choose a project that you feel you can realistically complete within one of those timelines. If you feel unsure, please contact the listed mentor(s) to get their opinion on how much effort and experience a given project may require.
C: For C-related ideas, at least some experience coding in C is necessary. Generally, it is easier to program/debug userland software than it is when working on the kernel. Kernel programming/debugging is harder, and can involve skills ranging from understanding internals of operating systems construction to understanding the computer architecture itself.
Kernel: The FreeBSD kernel is written in C, so at least a basic proficiency in C is required to choose a kernel project. Various levels of other skills may be necessary, but you can certainly assume that knowing how to do "Hello world" in the kernel and knowing how to recompile the kernel should be an essential starting point.
Network: These projects require an understanding of TCP/IP. Familiarity with the sockets API in at least one programming language is also helpful.
Security: For these tasks you must have a general understanding of security issues, the UNIX permission model, and the basics of cryptography and its applications.
Scripting: Unless otherwise stated, we won't dictate which scripting language you choose, but several factors bear on the choice. If you're willing to work with /bin/sh (plus awk, sed, etc.), there is a higher chance that whatever you produce could be committed to the base system. If you pick something like bash, which we don't include in the base system, it may still be possible to make a convincing case for the tool to get into the tree, given some commitment. Tools written in languages that are too large for the base system and not present there (e.g. perl, python, ruby) can still be offered as useful tools to others, but these would be bad language choices for things that need to be usable with just the base system (e.g. the installer). Tools are as important as any other part of the OS. If you create a tool for debugging, analyzing and visualizing complex data, it might help dozens of FreeBSD hackers around the world to be 10x more productive.
Kernel Projects
Unified kernel tracing interface
Mentor | Mark Johnston <markj AT FreeBSD DOT org>
Skills | C (advanced), Kernel
Mid-term deliverable | Implementation available for amd64
Duration | 350 hours
Difficulty | Hard
Expected Outcome | Existing consumers converted to use the new interface
Description
The FreeBSD kernel contains several subsystems which add hooks to core pieces of the kernel. For example, the context switch code in the scheduler contains this snippet:
    if (td != newtd) {
    #ifdef HWPMC_HOOKS
        if (PMC_PROC_IS_USING_PMCS(td->td_proc))
            PMC_SWITCH_CONTEXT(td, PMC_FN_CSW_OUT);
    #endif
        SDT_PROBE2(sched, , , off__cpu, newtd, newtd->td_proc);
    #ifdef KDTRACE_HOOKS
        /*
         * If DTrace has set the active vtime enum to anything
         * other than INACTIVE (0), then it should have set the
         * function to call.
         */
        if (dtrace_vtime_active)
            (*dtrace_vtime_switch_func)(newtd);
    #endif
        td->td_oncpu = NOCPU;
        cpu_switch(td, newtd, mtx);
        cpuid = td->td_oncpu = PCPU_GET(cpuid);
        SDT_PROBE0(sched, , , on__cpu);
    #ifdef HWPMC_HOOKS
        if (PMC_PROC_IS_USING_PMCS(td->td_proc))
            PMC_SWITCH_CONTEXT(td, PMC_FN_CSW_IN);
    #endif
There are three hooks that may be called before switching into the new thread, and two hooks that may be called after the switch. They are used by DTrace and HWPMC to collect information about context switch events. These hooks are disabled most of the time, but each hook introduces overhead even when disabled since we must check whether it is enabled each time the code is executed.
The goal of the project is to identify useful kernel tracepoints, and design and implement a unified interface that can be used by different consumers to collect information about the event corresponding to a particular tracepoint. This would make it easier for new subsystems to collect information from existing tracepoints, rather than modifying the core kernel to add more hooks. An additional aim would be to ensure that such tracepoints have minimal overhead, probably by using hot-patching to enable and disable a particular tracepoint. FreeBSD-CURRENT now uses clang 10.0, which implements the "asm goto" construct that could be useful here.
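As a rough illustration of what a unified interface could look like, the following userland C sketch models a tracepoint as a named object with an enabled flag and a small table of consumer callbacks. All names here (struct tracepoint, tp_attach(), TP_FIRE(), count_events()) are hypothetical and are not an existing FreeBSD API. The plain boolean check only stands in for the hot-patched branch that a real implementation would presumably use, and the sketch ignores locking and detaching entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names throughout: this is not an existing FreeBSD API. */
#define TP_MAX_CONSUMERS 4

typedef void (*tp_handler_t)(void *cookie, const void *arg0, const void *arg1);

struct tracepoint {
	const char	*tp_name;
	bool		 tp_enabled;	/* stand-in for a hot-patched branch */
	size_t		 tp_nconsumers;
	struct {
		tp_handler_t	 fn;
		void		*cookie;
	} tp_consumers[TP_MAX_CONSUMERS];
};

/* Attach a consumer; returns 0 on success, -1 if the table is full. */
static int
tp_attach(struct tracepoint *tp, tp_handler_t fn, void *cookie)
{
	assert(tp != NULL && fn != NULL);
	if (tp->tp_nconsumers == TP_MAX_CONSUMERS)
		return (-1);
	tp->tp_consumers[tp->tp_nconsumers].fn = fn;
	tp->tp_consumers[tp->tp_nconsumers].cookie = cookie;
	tp->tp_nconsumers++;
	tp->tp_enabled = true;
	return (0);
}

/* Slow path: deliver the event to every attached consumer. */
static void
tp_dispatch(const struct tracepoint *tp, const void *arg0, const void *arg1)
{
	for (size_t i = 0; i < tp->tp_nconsumers; i++)
		tp->tp_consumers[i].fn(tp->tp_consumers[i].cookie, arg0, arg1);
}

/* Fast path: one predictable branch when no consumer is attached. */
#define TP_FIRE(tp, a0, a1) do {					\
	if (__builtin_expect((tp)->tp_enabled, 0))			\
		tp_dispatch((tp), (a0), (a1));				\
} while (0)

/* Example consumer: counts how many times the tracepoint fired. */
static void
count_events(void *cookie, const void *arg0, const void *arg1)
{
	(void)arg0;
	(void)arg1;
	(*(unsigned *)cookie)++;
}
```

With this shape, a consumer would call tp_attach() once instead of adding its own #ifdef block to the scheduler, and the call site would contain a single TP_FIRE() no matter how many consumers exist.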
Add Hypervirtualized clocks - Host side for both kvm and hyperv
Mentor | John Baldwin <jhb AT FreeBSD DOT org>
Skills | C (intermediate)
Duration | 175 hours
Difficulty | Medium
Expected Outcome |
Description
To come.
Add audit(4) support to NFS
Mentor | Rick Macklem <rmacklem@FreeBSD.org>
Skills | C (intermediate), Kernel (intermediate), Security (intermediate)
Mid-term deliverable | Ability to generate an audit trail for any NFSv3 RPC
Duration | 350 hours
Difficulty | Medium
Expected Outcome | Ability to generate an audit trail for any NFS RPC
Description
Security Event Audit is a facility to provide fine-grained, configurable logging of security-relevant events, and is intended to meet the requirements of the Common Criteria (CC) Controlled Access Protection Profile (CAPP) evaluation. It can generate an audit record for various events, such as process creation, network activity, or file system activity. However, it mostly works at the syscall level. Since the NFS server is implemented within the kernel, NFS RPCs don't generate any audit records. audit(4) can still be used within NFS networks, but auditd(8) must be run on every NFS client. It would be a great addition if each NFS RPC could be audited. That would allow auditd(8) to run just on the NFS server and still audit all NFS activity within the network.
Port squashfuse to the FreeBSD kernel
Mentor | Ed Maste <emaste AT FreeBSD DOT org>
Skills | C (expert), fusefs (expert), kernel
Mid-term deliverable | Basic squashfs.ko functionality available with mount(8)
Duration | 350 hours
Difficulty | Hard
Expected Outcome | MFSBSD support
Description
squashfs is a read-only filesystem targeted at small embedded environments, where memory and disk space are constrained. squashfuse is a BSD-licensed FUSE implementation of squashfs. The project would be to port this implementation to the FreeBSD kernel, with the aim of being able to boot FreeBSD from an in-memory squashfs filesystem.
Implement MPLS support for FreeBSD
Mentor | Alexander Chernikov <melifaro AT FreeBSD DOT org>
Skills | C (intermediate), networking (intermediate)
Mid-term deliverable | MPLS forwarding works on static labels
Duration | 350 hours
Difficulty | Medium
Expected Outcome | MPLS forwarding, encap/decap works, MPLS labels can be programmed via Netlink
Description
Multiprotocol Label Switching (MPLS) is an overlay network technology based on labels instead of IP addresses. The project would be to bring basic MPLS support to FreeBSD. Roughly, the implementation can be split into the following chunks:
- create MPLS routing tables, analogous to AF_INET[6] routing tables
- add the logic to handle MPLS packets (mpls_input(), mpls_forward(), mpls_output()) at various stages
- add the code to perform MPLS encap for IPv4/IPv6 routes, extending nexthops functionality
- add Netlink support for working with IP-MPLS, MPLS-MPLS and MPLS-IP routes
- add userland support for working with MPLS routes to route(8) and netstat(8)
- [stretch] add fast fib lookup module to enable high-performance concurrent label lookups
Overviews of other *BSD implementations, such as the OpenBSD MPLS stack or the NetBSD MPLS stack, can provide more data points.
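For orientation, every MPLS packet-handling step above operates on 32-bit label stack entries, laid out per RFC 3032 as a 20-bit label, a 3-bit traffic class, a 1-bit bottom-of-stack flag, and an 8-bit TTL. The following C sketch packs and unpacks one entry and performs a label swap. The struct and function names (struct mpls_lse, mpls_lse_encode(), and so on) are invented for illustration and do not come from any existing FreeBSD code.

```c
#include <assert.h>
#include <stdint.h>
#include <arpa/inet.h>	/* htonl(), ntohl() */

/* Hypothetical in-memory form of one label stack entry. */
struct mpls_lse {
	uint32_t label;	/* 20-bit label value */
	uint8_t	 tc;	/* 3-bit traffic class */
	uint8_t	 bos;	/* 1 if this is the bottom of the stack */
	uint8_t	 ttl;	/* 8-bit time to live */
};

/* Pack an entry into its 32-bit wire form (network byte order). */
static uint32_t
mpls_lse_encode(const struct mpls_lse *e)
{
	assert(e->label <= 0xfffff && e->tc <= 7 && e->bos <= 1);
	return (htonl((e->label << 12) | ((uint32_t)e->tc << 9) |
	    ((uint32_t)e->bos << 8) | e->ttl));
}

/* Unpack a wire-format entry. */
static struct mpls_lse
mpls_lse_decode(uint32_t wire)
{
	uint32_t v = ntohl(wire);
	struct mpls_lse e = {
		.label = v >> 12,
		.tc = (v >> 9) & 0x7,
		.bos = (v >> 8) & 0x1,
		.ttl = v & 0xff,
	};

	return (e);
}

/*
 * Label swap as a transit router might do it: replace the label and
 * decrement the TTL.  Assumes the caller has already dropped packets
 * whose TTL has expired.
 */
static uint32_t
mpls_lse_swap(uint32_t wire, uint32_t new_label)
{
	struct mpls_lse e = mpls_lse_decode(wire);

	assert(e.ttl > 0);
	e.label = new_label;
	e.ttl--;
	return (mpls_lse_encode(&e));
}
```

A real forwarding path would look the incoming label up in an MPLS routing table to find new_label and the outgoing interface; that lookup is exactly the part the project would need to design.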
Improve netgraph concurrency
Mentor | Alexander Chernikov <melifaro AT FreeBSD DOT org>
Skills | C (intermediate), networking (intermediate)
Mid-term deliverable | Traffic is able to pass in a lockless fashion in 2-node netgraph topology
Duration | 350 hours
Difficulty | Medium
Expected Outcome | Traffic is able to pass in a lockless fashion between ng_<ppp|car|tee|iface>, compatibility retained with non-converted nodes
Description
Netgraph is a traffic-processing subsystem based on a dynamically configured graph of nodes and directed edges. Each node applies a single specific manipulation to the packet. The core idea is similar to VPP. Currently netgraph uses a topology lock and node/hook atomic refcounts to ensure safe passing of packets between the nodes. The goal of the project is to make passing data between the nodes lockless. The necessary primitives, like epoch-based object reclamation and lockless data structures, are available in the base system. A rough implementation sketch may look like the following:
- Enable delayed reclamation of netgraph hooks and nodes under existing NET_EPOCH
- Make core API functions like ng_address_hook() leverage the existing NET_EPOCH and avoid taking topology locks / refcounts
- Test the implementation with a number of stateless nodes like ng_patch or ng_tee and ng_ipfw
- Evaluate each node's reliance on the topology lock and convert nodes on a per-node basis
Add support for building Linux-only network drivers
Mentor | Mahdi Mokhtari <mmokhi AT FreeBSD DOT org>
Skills | C (intermediate), Kernel (intermediate), Network
Mid-term deliverable | Have netlink base functionalities working
Duration | 350 hours
Difficulty | Hard
Expected Outcome | Be able to build a sample Linux-only network driver (Freifunk?) for FreeBSD
Description
Currently we have a Linux KPI for FreeBSD, which mostly helps the drm-next graphical stack to work.
The big-picture idea is to have network functionalities implemented in the same way, so that we can build "Linux-only" network drivers for FreeBSD too. To accomplish this, the first step is to implement the API (KPI) most Linux drivers rely upon. Netlink is one example.
The steps (only the first, or at most the first two, are in scope for the current/2022 GSoC):
- Implement netlink
- Implement base functionalities (ioctl behavior)
- Implement Netlink protocols (NETLINK_ROUTE, FIREWALL, etc)
- Build a sample driver using new API
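To give a sense of the wire format involved in the first step, Netlink messages share a simple framing: a fixed 16-byte header followed by a payload, with each message padded to a 4-byte boundary. The C sketch below builds one such message in a caller-supplied buffer. It mirrors the layout of Linux's struct nlmsghdr with a local struct so that it compiles without Linux headers; nl_hdr, NL_ALIGN, NL_HDRLEN, and nl_build() are names made up for this illustration and are not part of any existing KPI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of the Linux struct nlmsghdr layout (illustrative names). */
struct nl_hdr {
	uint32_t len;	/* header plus payload, before padding */
	uint16_t type;	/* message type, e.g. a route request */
	uint16_t flags;	/* request flags */
	uint32_t seq;	/* sequence number chosen by the sender */
	uint32_t pid;	/* sender port ID; 0 here */
};

/* Netlink pads every message to a 4-byte boundary. */
#define NL_ALIGN(n)	(((size_t)(n) + 3) & ~(size_t)3)
#define NL_HDRLEN	NL_ALIGN(sizeof(struct nl_hdr))

/*
 * Write one message into buf.  Returns the padded size consumed,
 * or 0 if the buffer is too small.
 */
static size_t
nl_build(void *buf, size_t buflen, uint16_t type, uint16_t flags,
    uint32_t seq, const void *payload, size_t paylen)
{
	struct nl_hdr h = { .type = type, .flags = flags, .seq = seq };
	size_t len = NL_HDRLEN + paylen;
	size_t total = NL_ALIGN(len);

	assert(buf != NULL);
	if (total > buflen)
		return (0);
	memset(buf, 0, total);		/* zero the padding bytes too */
	h.len = (uint32_t)len;
	memcpy(buf, &h, sizeof(h));
	if (paylen > 0)
		memcpy((char *)buf + NL_HDRLEN, payload, paylen);
	return (total);
}
```

Several messages can be packed back to back by advancing the buffer pointer by the returned size, which is how multi-part replies are laid out.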
Userland projects
Kernel LLDB debugger improvements, module support and integration
Mentor | Ed Maste <emaste AT FreeBSD DOT org>
Skills | C++ (intermediate)
Mid-term deliverable | Parse and report the kernel module list in a coredump
Duration | 175 or 350 hours
Difficulty | Moderate
Expected Outcome | crashinfo functional with LLDB rather than kgdb
Description
LLDB is the debugger in the LLVM family of projects. It has supported userland debugging on FreeBSD for a long time, but kernel debugging support was added recently under FreeBSD Foundation sponsorship. One outstanding item remaining with kernel debug support is to handle kernel modules: that is, to parse the loaded module data provided by the kernel (live or core dump) and load the kernel module objects. This project will require at least intermediate skill with C++ (the language in which LLDB is written), as well as familiarity with C (the language used by the FreeBSD kernel).
This project may be short or long. Minimum requirement for a short project is support for one type of kernel module ELF object and one supported FreeBSD architecture. A long project would include both kernel module types and multiple FreeBSD architectures, as well as integration with the rest of the system.
UFS4fuse: support FreeBSD's UFS2 with fusefs
Mentor | ?
Skills | C++ or Rust (expert), fusefs (expert)
Mid-term deliverable | Read-only support for userland
Duration | 350 hours
Difficulty | Hard
Expected Outcome | Fully functional UFS with basic read and write support
Description
This project is challenging: any proposal must include a coding plan and should include credible evidence of userland filesystem development.
syzkaller improvements
Mentors | Andrew Turner <andrew AT FreeBSD DOT org>, Mark Johnston <markj AT FreeBSD DOT org>
Skills | golang (intermediate), kernel (intermediate)
Duration | 350 hours
Difficulty | Hard
Expected Outcome | Complete FreeBSD syzkaller extensions described below
syzkaller is a suite of tools that performs coverage-guided system call fuzzing. Originally targeting Linux, it can now fuzz many different operating system kernels and has been extremely effective at finding bugs, many with security implications. It creates, monitors and fuzzes a set of virtual machines running the target kernel. More information can be found in its documentation, and in these slides. Google kindly runs a public syzkaller instance targeting FreeBSD.
For a while it has been possible to run syzkaller on FreeBSD; that is, fuzz FreeBSD on FreeBSD. syzkaller makes use of ZFS and bhyve (or QEMU) to do so. This makes development and testing of FreeBSD-specific syzkaller features much easier.
Though syzkaller has found quite a few bugs in FreeBSD, it does not cover as much as it does on Linux, so it is virtually guaranteed that there are plenty of bugs waiting to be found. This project consists of several subtasks that would improve FreeBSD's coverage:
- Extend syzkaller's FreeBSD system call descriptions. For each system call, syzkaller requires a set of annotations that describe the system call's arguments. It is missing many of FreeBSD's system calls. syzkaller similarly needs to be taught about device-specific ioctls.
- Support fuzzing of FreeBSD's Linux system call compatibility layer. Since syzkaller can of course fuzz Linux natively, it should be straightforward to run a Linux fuzzer on FreeBSD.
- Support external injection of USB traffic.
- Support running the fuzzer in a jail, optionally with various resource limits in place.
- Test syzkaller with a ZFS root filesystem instead of UFS. Work with the syzkaller developers to get a FreeBSD+ZFS target running in syzbot.
- Port support for automated patch testing and crash bisection to FreeBSD. Some details are listed here.
- Work with the project mentor to triage and possibly fix any kernel bugs found in the process.
Note: contributing to syzkaller requires signing the Google CLA. Please make sure that this is acceptable before attempting this project. Note also that working with syzkaller is probably easiest on a dedicated hardware system with a reasonably large amount of disk space (several hundred GB should be sufficient), ideally running FreeBSD on ZFS. syzkaller can instantiate VMs on Google Compute Engine and fuzz them, so this may be an option as well. Please contact the project mentors for details.
Capsicumization of the base system
Mentor | Mariusz Zaborski <oshogbo AT FreeBSD DOT org>
Co-Mentor | Mark Johnston <markj AT FreeBSD DOT org>
Skills | C (intermediate), familiarity with the UNIX programming environment
Mid-term deliverable | Sandbox a few of the target applications
Duration | 350 hours
Difficulty | Medium
Expected Outcome | Sandbox the complete list of target applications
Description
Capsicum is a sandboxing technology used in FreeBSD. It is complemented by Casper, a framework for defining application-specific capabilities. A large number of utilities in the FreeBSD base system have been converted to run under Capsicum and Casper, but many programs have yet to be converted. The project would consist of identifying several high-profile daemons and utilities in the base system or ports, and modifying them to run in capability mode. One good candidate is syslogd, the system logging daemon.
As part of this work it may be necessary or useful to add additional Casper services to the base system. For example, we do not yet have a Casper service which allows an application to make outbound network connections.
Note: Converting applications to run under Capsicum/Casper can take a lot of effort, especially when they are large or when they are not designed with privilege separation in mind. Some applications, like a shell, can't really be run in capability mode at all. Before attempting to sandbox a given application, take care to study the ways in which it interacts with the system. For example, does the application need to open any files? If so, are the file names statically defined or are they derived from user input? This will provide insight into the difficulties that will arise when sandboxing the application.
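The usual pattern that follows from this kind of analysis is to acquire every resource the program needs first, then restrict each descriptor and enter capability mode. The C sketch below assumes a hypothetical helper, count_lines_sandboxed(), that receives a descriptor the caller has already opened. cap_rights_init(), cap_rights_limit(), and cap_enter() are the Capsicum calls from <sys/capsicum.h>; they are guarded here so the example still builds on systems without Capsicum. Treating ENOSYS as non-fatal, so that the program keeps working on kernels built without Capsicum, is a common convention rather than a requirement.

```c
#include <assert.h>
#include <errno.h>
#include <unistd.h>
#ifdef __FreeBSD__
#include <sys/capsicum.h>
#endif

/*
 * Count newlines on an already-open descriptor after giving up ambient
 * authority.  Returns the count, or -1 on error.  On systems without
 * Capsicum the sandboxing step is compiled out and this is a plain
 * read loop.
 */
static long
count_lines_sandboxed(int fd)
{
	char buf[4096];
	ssize_t n;
	long lines = 0;

	assert(fd >= 0);
#ifdef __FreeBSD__
	cap_rights_t rights;

	/* Restrict the descriptor to reading only. */
	cap_rights_init(&rights, CAP_READ);
	if (cap_rights_limit(fd, &rights) < 0 && errno != ENOSYS)
		return (-1);
	/* After this call the process cannot open files by path name. */
	if (cap_enter() < 0 && errno != ENOSYS)
		return (-1);
#endif
	while ((n = read(fd, buf, sizeof(buf))) > 0) {
		for (ssize_t i = 0; i < n; i++) {
			if (buf[i] == '\n')
				lines++;
		}
	}
	return (n < 0 ? -1 : lines);
}
```

Note that cap_enter() is irreversible and applies to the whole process, so anything the program needs later (log files, sockets, name resolution) has to be opened beforehand or delegated to a Casper service.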
Network Configuration Libraries
Mentor | ?
Skills | C (intermediate), knowledge of networking and NAT
Mid-term deliverable | Library to configure IPFW NAT to allow a bhyve VM guest to reach the Internet
Duration | 350 hours
Difficulty | Medium
Expected Outcome | A library to manage NAT configuration for VMs and Jails
Deliverables
A simple tool to configure the network on a laptop to allow a bhyve VM to access the internet.
Use Cases:
- A bhyve VM running on a laptop, NAT'd out through the laptop's Wi-Fi connection
- A bhyve VM bridged to the host's Ethernet network
Stretch goal: Extend the tool to support configuring network access for standard and VIMAGE jails
Description
Build a libipfw to enable programmatic configuration of the firewall, implement basic functionality to add rules and configure NAT instances.
Optional: relocate most functionality available in the command line interface into the library and replace the command line interface with a thin wrapper around the new library.
Build a libbsdnat that can be used by bhyve VM managers and Jail management tools to configure NAT on the host to allow the guest access to the internet via the host's network. This library will then be extended to handle creating 'port forwarding' rules to expose services in the guest to the public network via the host. Network mappings will be ephemeral and will need to be recreated by the management tools when the VM or jail is restarted.
Possible design for final tool:
- libifconfig - used to create and manage bridge interface for bhyve, epairs for jails
- libbsdnat - Configure NAT for outbound, and port forward rules for inbound.
- libipfw - Used to insert rules routing traffic via above NAT instances.
Audit Base for Read-Only and NFS Mount Compatibility
Mentor | Emmanuel Vadot <manu@FreeBSD.org>
Skills | /bin/sh (moderate), System and Network configuration, sqlite, Berkeley DB and similar (moderate), hier(7) and general FreeBSD base awareness
Mid-term deliverable | Identification of in-base components that are r/o or NFS incompatible
Duration | 175 hours
Difficulty | Medium
Expected Outcome | Report on the nature of each incompatibility, ideally with proposed solutions. Kyua test suite modules for each issue identified
Description
FreeBSD has supported read-only and "root on NFS" operation for decades, but these configurations have not been consistently tested for compatibility. Issues can include the failure of utilities to create temporary files and locking failures with "databases" such as /etc/pwd.db, /var/db/pkg/local.sqlite and /var/db/xenstore/dbf. Temporary workarounds include tmpfs(5) mounts for directories such as /var/db/<utility>, MD/MFS mounts, iSCSI/FC/ggate devices, or network-based authentication. Long-term solutions can follow the model of pkg(8)'s NFS_WITH_PROPER_LOCKING=yes option.
Search terms to help understand the problem: "NFS flock", "sqlite on NFS". Motivation: "Cloud" and "Container" environments often achieve statelessness via simple mechanisms like read-only and network file system mounts.
Build infrastructure
Integrate MFSBSD into the release building tools
Mentor | ? (Email hackers@FreeBSD.org list for direction)
Skills | C (beginner), make (intermediate), shell (intermediate)
Mid-term deliverable | 'make release' will build MFSBSD
Duration | 175 hours
Difficulty | Easy
Expected Outcome | Completed integration so all new releases of FreeBSD will include mfsbsd media
Description
MFSBSD is a version of FreeBSD designed to run from memory. It is often used as a rescue system, or the basis for automated installers.
It is currently maintained by a FreeBSD developer, Martin Matuška.
The issue is that usually only images for the release versions are produced, and MFSBSD tends to get out of sync with the tools in base. There is a desire to have MFSBSD images of the weekly snapshots of -current and -stable that are created by the release engineering team. This requires that the process be automated as an additional target in the makefile in src/release. It would be similar to the process used now to generate VM images. Adding flexibility to the MFSBSD build system to control additional features would be a bonus.
Finish upstreaming bsd-user enhancements
Mentor | Warner Losh <imp AT FreeBSD DOT org>
Skills | C (intermediate)
Mid-term deliverable | All system calls broken into a patch stream for upstream; packages can be built with poudriere
Duration | 175 hours
Difficulty | Easy
Expected Outcome | All bsd-user changes from our fork are upstream
Description
bsd-user is a set of modifications to the QEMU project. It is similar to the linux-user feature that's integrated into the QEMU project. Years ago, we forked our own tree, and had difficulty getting the changes at the time upstream. Since then, upstream attitudes have changed and they are very eager to get the bsd-user changes committed.
The current bsd-user fork is up-to-date with the latest qemu-project repo (as of the 7.2 release). The task here would be to break down the differences between our tree and upstream.
There are three groups of changes:
(1) Changes to the binary loading. These are mostly the same upstream, but were originally based on linux-user. linux-user has evolved, but some of the relevant changes have not been merged over from linux-user. Refactoring the common items, like ELF loading, is also desirable, since it could result in more code reuse and let bsd-user keep up faster than it can today.
(2) bsd-user has a number of architectures (powerpc, riscv and aarch64) that aren't in upstream today. These changes need to be broken down into 50-100 line chunks, reviewed upstream, and contributed to the maintainer to get into upstream. In addition, any adjustments or enhancements to the ABIs that have happened since the code was originally written may require adjustment to the current code.
(3) Only about half of the system calls implemented in the fork have been upstreamed. These also need to be chunked into small patches that can be reviewed upstream. There are some system calls that are poorly placed as well and may need changes to bsd-user's fork to correct.
A successful project will result in all or most of the changes being upstreamed. The successful candidate will also need to understand where to find FreeBSD's implementation for the architecture-specific bits, and to understand how to set up a poudriere build world, use poudriere jails, and effectively test subsets of all the packages to ensure problems are corrected. A/B testing with the current really old version of qemu bsd-user based on 3.2 will help control regressions.
CI test harness for boot loader
Mentor | Warner Losh <imp AT FreeBSD DOT org>
Skills | shell-script or python (intermediate)
Mid-term deliverable | An end-to-end test for at least 2 of the FreeBSD boot environments
Duration | 175 hours or 350 hours
Difficulty | Medium
Expected Outcome | An end-to-end test for all FreeBSD boot environments that can be added to our CI infrastructure
Description
FreeBSD can boot in about 40 different boot environments when you count all the combinations of architecture, disk partitioning scheme, root filesystem, and encryption. Currently, there's a hodge-podge of shell scripts that are able to build some of these environments and test others. But there's no unified system to do the building, testing and reporting of the results.
A successful application will include the above matrix (send email to the Mentor before the deadline to ensure its correctness, or to specifically call out any environments that won't be done in the scope of the project). It will include plans for how to accomplish this task. Bonus points for a good plan for how to integrate into the FreeBSD CI system and/or github / cirrus-ci system.
Linuxulator on powerpc64
Mentor | Justin Hibbits <jhibbits AT FreeBSD DOT org>, Brandon Bergren <bdragon AT FreeBSD DOT org>
Skills | C (intermediate), kernel (intermediate)
Duration | 175 hours
Difficulty | Medium
Expected Outcome | Simple Linux binary support on FreeBSD/powerpc64
Description
FreeBSD provides optional binary compatibility with Linux, allowing users to install and run unmodified Linux binaries. The Linux binaries can be started in the same way native FreeBSD binaries can. They behave almost exactly like native processes and can be traced and debugged the usual way. The "linuxulator" is available for the i386, amd64, and arm64 architectures. The aim of this project is to add preliminary linuxulator support for FreeBSD/powerpc64.