FreeBSD 8.x TCP Parallelism Project

This page describes the ongoing project by RobertWatson, in collaboration with KrisKennaway and others, to improve parallelism in the TCP/IP and UDP/IP stacks in FreeBSD 8.x.

History

FreeBSD 5.3 introduced an MPSAFE network stack.

FreeBSD 6.0 introduced significant performance optimizations and stability improvements.

FreeBSD 7.0 introduced further performance optimizations, lock granularity improvements, and strengthened invariants.

The Problem

By FreeBSD 7.0, all transmit paths allowed fully parallel execution for independent TCP connections on different processors, although bind() and connect() still acquired a global lock in order to manage TCP connections. The input path, however, relies heavily on the global tcbinfo lock, both to protect global connection lists and to guard against state changes that might manipulate those lists. In practice, this means a brief exclusive acquisition of the global lock for regular data/ACK segments, but holding the global lock for all processing of SYN/FIN/RST segments. In addition, all timers and connection-oriented paths (such as sendto() with an address to connect to) acquire the tcbinfo lock. With a single input thread for network processing, this isn't horrific, although a high rate of connections being opened and closed without much data sent may lead to high contention. With a parallel input path, such as with multi-queue network cards, multiple input interfaces, or parallel loopback processing, this presents a serious problem for scalability.
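The input-path locking pattern described above can be sketched in user space. This is an illustration only, not the kernel code: it uses pthread mutexes in place of the kernel's locks, and every name in it (struct conn, struct conn_table, input_sketch, the seg_kind values) is hypothetical. It shows the global lock being dropped early for data/ACK segments and held for the whole of processing for SYN/FIN/RST segments.

```c
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins; these are not the kernel's structures. */
enum seg_kind { SEG_DATA_ACK, SEG_SYN, SEG_FIN, SEG_RST };

struct conn {
	pthread_mutex_t lock;		/* stand-in for a per-connection lock */
	unsigned port;
	unsigned long segments;		/* segments processed so far */
	bool closed;
};

struct conn_table {
	pthread_mutex_t global_lock;	/* stand-in for the tcbinfo lock */
	struct conn *conns;
	size_t nconns;
};

/* Caller must hold t->global_lock: the list is globally protected. */
static struct conn *
lookup_locked(struct conn_table *t, unsigned port)
{
	for (size_t i = 0; i < t->nconns; i++)
		if (t->conns[i].port == port)
			return (&t->conns[i]);
	return (NULL);
}

/*
 * Returns 0 if the global lock was released before processing,
 * 1 if it was held throughout, and -1 if no connection matched.
 */
static int
input_sketch(struct conn_table *t, unsigned port, enum seg_kind kind)
{
	struct conn *c;
	int held_throughout = (kind != SEG_DATA_ACK);

	pthread_mutex_lock(&t->global_lock);
	c = lookup_locked(t, port);
	if (c == NULL) {
		pthread_mutex_unlock(&t->global_lock);
		return (-1);
	}
	pthread_mutex_lock(&c->lock);

	/* Data/ACK: the global lock is only needed for the lookup. */
	if (!held_throughout)
		pthread_mutex_unlock(&t->global_lock);

	c->segments++;
	/* SYN/FIN/RST may change state that affects the global lists. */
	if (kind == SEG_FIN || kind == SEG_RST)
		c->closed = true;

	pthread_mutex_unlock(&c->lock);
	if (held_throughout)
		pthread_mutex_unlock(&t->global_lock);
	return (held_throughout);
}
```

Even in the data/ACK case every segment still serializes briefly on the one global lock, which is why several input threads running this path in parallel would contend with each other.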

The Plan: TCP

The objective is to expand support for, and opportunities for, fully parallel processing of independent TCP connections on multiple CPUs, as well as to continue to support pipelined processing of input data across CPUs.

The plan is to refactor TCP global data structures and the tcbinfo lock as follows:

The Plan: UDP

Unlike TCP, UDP is not stateful for most operations, and so synchronization requirements are even weaker. Much of the infrastructure is shared with TCP, so all TCP plans apply, but we also plan to:

Progress and Results

All TCP work is currently in the planning stages. A new benchmark, tcpp, has been developed to explore the impact of parallelism and concurrency on TCP performance. We are also interested in benchmarks involving web serving, web proxying/caching, NFS over TCP, and database sessions over TCP.

The conversion to rwlocks for pcbinfo and inpcb has been completed in the rwatson_udp branch, with most UDP operations converted to read-only operations. This leads to significant performance improvements for nsd and bind, which make extensive parallel use of a single socket from many threads.
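As a rough user-space illustration of why read-only operations help here, the sketch below uses POSIX pthread_rwlock_t in place of the kernel's rwlocks. The structure and function names (struct pcb, struct pcbinfo_sketch, bind_sketch, lookup_sketch) are hypothetical and do not reflect the code in the rwatson_udp branch. Operations that change the global table take the write lock, while lookups take only read locks and so can run concurrently from many threads.

```c
#define _POSIX_C_SOURCE 200809L
#include <pthread.h>
#include <stddef.h>

#define MAX_PCBS 8

/* Hypothetical stand-ins; these are not the kernel's structures. */
struct pcb {
	pthread_rwlock_t lock;		/* stand-in for the inpcb rwlock */
	unsigned short lport;		/* bound local port */
};

struct pcbinfo_sketch {
	pthread_rwlock_t lock;		/* stand-in for the pcbinfo rwlock */
	struct pcb pcbs[MAX_PCBS];
	size_t npcbs;
};

static void
pcbinfo_init(struct pcbinfo_sketch *pi)
{
	pthread_rwlock_init(&pi->lock, NULL);
	pi->npcbs = 0;
}

/* Caller must hold pi->lock (read or write). Returns an index or -1. */
static int
find_pcb(struct pcbinfo_sketch *pi, unsigned short lport)
{
	for (size_t i = 0; i < pi->npcbs; i++)
		if (pi->pcbs[i].lport == lport)
			return ((int)i);
	return (-1);
}

/* Binding modifies the global table, so it needs the write lock. */
static int
bind_sketch(struct pcbinfo_sketch *pi, unsigned short lport)
{
	int rv = -1;

	pthread_rwlock_wrlock(&pi->lock);
	if (find_pcb(pi, lport) == -1 && pi->npcbs < MAX_PCBS) {
		struct pcb *p = &pi->pcbs[pi->npcbs];

		pthread_rwlock_init(&p->lock, NULL);
		p->lport = lport;
		rv = (int)pi->npcbs++;
	}
	pthread_rwlock_unlock(&pi->lock);
	return (rv);
}

/* A lookup only reads, so any number of threads may run it at once. */
static int
lookup_sketch(struct pcbinfo_sketch *pi, unsigned short lport)
{
	int idx;

	pthread_rwlock_rdlock(&pi->lock);
	idx = find_pcb(pi, lport);
	if (idx != -1) {
		struct pcb *p = &pi->pcbs[idx];

		pthread_rwlock_rdlock(&p->lock);
		/* Read-only use of *p (e.g. its addresses) would go here. */
		pthread_rwlock_unlock(&p->lock);
	}
	pthread_rwlock_unlock(&pi->lock);
	return (idx);
}
```

With a plain mutex in place of each rwlock, many threads sharing one socket would serialize on every lookup. With read locks they exclude only writers such as bind_sketch().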

TCPParallelism (last edited 2008-06-17T21:38:06+0000 by anonymous)