Project/routing status
Stage 1: conversion
The goals of this stage are:
- Remove llentry read lock from fast path
- Use read rmlock for looking up lle data in fast path
- Remove rtentry lock from all data paths
- Use read rmlock for looking up routing next-hop data
Outstanding reviews
- D5009 (IPv4 fast forwarding conversion)
- D5010 (IPv6 forwarding conversion)
- D4794 (Deal with per-ifa output counters)
- D4962 (new LLE lookup functions, no sockaddrs in lltable data path)
- D4751 (move all lltable code to separate files)
LLTABLE llentries
These changes are mostly done. Most interesting commits:
- Removal of llentry read lock from fast path: IPv4: D3688 r291853, IPv6: D3780 r292155
- Struct llentry reshaping (addresses instead of sockaddrs, all necessary data in first 64 bytes): r286624
- Simplify attaching the link-level header in IPv6 and make nd6_resolve() behave like arpresolve(): D1469 r287861
- Make lltable code use unified per-af callbacks: r286577
- Split IPv4/IPv6 lltable init/hash/lookup, permitting different hash functions and hash sizes for each address family: r286616
LLTABLE rmlock(9)
No major dependencies block the conversion. Waiting on D4751 to time out or be approved, and on the new LLE lookup functions patch (D4962).
The only "tricky" thing is that lltable needs to be protected by its own (per-lltable) set of locks, instead of (ab)using the AFDATA interface lock.
Data path changes
The goal was to exclude struct rtentry and struct llentry from if_output() routines and provide an abstract way of passing arbitrary prepend data. Mostly done.
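One way to picture "arbitrary prepend data": the output routine receives an opaque byte string plus its length and copies it in front of the payload, without knowing which routing or link-level structure produced it. The sketch below is a userland illustration under that assumption; struct prepend and frame_output() are hypothetical names, not the kernel interface.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical: opaque, precomputed link-level header for the output routine. */
struct prepend {
    const uint8_t *data;    /* e.g. a ready-made Ethernet header */
    size_t         len;
};

/*
 * Build a frame: prepend bytes followed by the payload.
 * Returns the frame length, or 0 if the buffer is too small.
 */
size_t
frame_output(uint8_t *frame, size_t framesz, const struct prepend *pp,
    const uint8_t *payload, size_t paylen)
{
    assert(frame != NULL && pp != NULL && payload != NULL);
    if (pp->len > framesz || paylen > framesz - pp->len)
        return (0);
    if (pp->len > 0)
        memcpy(frame, pp->data, pp->len);
    memcpy(frame + pp->len, payload, paylen);
    return (pp->len + paylen);
}
```

The output routine depends only on the bytes and the length, so whatever computes the header can change without touching it.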
The last remaining step is to deal with per-ifa output traffic accounting (numbers from netstat -i).
Given that this cannot be done quickly (no stable ifa pointer can be returned, at least for IPv4), there is a proposal to remove these counters: D4794
Most interesting commits:
- Link precomputation API (remove lle cache from packet output routines): D4102 r292978
- Struct route changes (remove rtentry dereference from packet output routines): r275196 r293544
New routing KPI conversion status
All rtalloc*() callers (along with code dereferencing struct rtentry) have to be converted to the new KPI. This is a prerequisite for stage 2 (multipath, other advanced routing features). It is also required for switching the data path read lock to rmlock(9).
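The shape of the conversion is roughly "stop handing out pointers to a shared routing entry; fill in a small caller-owned copy of the next-hop data instead". The following userland sketch illustrates that shape with a linear longest-prefix match over a tiny static table. All names (struct nhop_basic, fib_lookup_basic(), the table contents) are hypothetical and only loosely modelled on the idea; this is not the fibX implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical caller-owned result, filled in by the lookup. */
struct nhop_basic {
    int      ifindex;   /* outgoing interface */
    uint16_t mtu;
    uint32_t gateway;   /* 0 means directly connected */
};

/* Hypothetical routing entry; callers never get a pointer to it. */
struct rt_ent {
    uint32_t          prefix;
    uint8_t           plen;
    struct nhop_basic nh;
};

static const struct rt_ent rtable[] = {
    { 0x00000000,  0, { 1, 1500, 0xc0000201 } },    /* default via 192.0.2.1 */
    { 0x0a000000,  8, { 2, 9000, 0 } },             /* 10.0.0.0/8, connected */
    { 0x0a010000, 16, { 3, 1500, 0x0a0000fe } },    /* 10.1.0.0/16 via 10.0.0.254 */
};

static uint32_t
plen_to_mask(uint8_t plen)
{
    return (plen == 0 ? 0 : 0xffffffffu << (32 - plen));
}

/*
 * Longest-prefix match (linear scan). Copies the next hop into *nh.
 * Returns 0 on success, -1 if no route matches.
 */
int
fib_lookup_basic(uint32_t dst, struct nhop_basic *nh)
{
    const struct rt_ent *best = NULL;

    assert(nh != NULL);
    for (size_t i = 0; i < sizeof(rtable) / sizeof(rtable[0]); i++) {
        const struct rt_ent *e = &rtable[i];

        if ((dst & plen_to_mask(e->plen)) != e->prefix)
            continue;
        if (best == NULL || e->plen > best->plen)
            best = e;
    }
    if (best == NULL)
        return (-1);
    *nh = best->nh;     /* copy out; nothing stays locked or referenced */
    return (0);
}
```

Because the caller holds only a copy, nothing has to be unlocked or released after the call, which is what makes a table-wide read lock sufficient on the data path.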
Most interesting commits:
- Introduce fibX KPI: r291993
- Transparently support current RADIX_MPATH implementation in fibX: r293657
- Last user changing rte (rt_mtu) w/o proper locking eliminated: D1125 r274611
Non-output consumers
Here we have lightweight routing KPI consumers: simple uRPF (reverse path verification), MTU checks, gateway or source address checks, etc. Most of them are already converted.
The only huge problem is SCTP, which caches struct rtentry and (which is much worse) actively uses its fields.
| Done | Desc | Rev | Code |
| ✓ | Need to return mask (e.g. RIB) | r293914 | sys/netgraph/netflow/netflow.c: rt = rtalloc1_fib((struct sockaddr *)&sin, 0, 0, r->fib); |
| ✓ | PF: basic is OK | D4763, r293311 | sys/netpfil/pf/pf.c: in_rtalloc_ign(&ro, 0, rtableid); |
| ✓ | IPFW: basic is OK | r293626 | sys/netpfil/ipfw/ip_fw_table_algo.c: return (rtalloc1_fib(s, 0, 0, fib)); |
| ✓ | TOE: uRPF checks | r293309 | sys/dev/cxgb/ulp/tom/cxgb_cpl_io.c: * rtalloc1, RT_UNLOCK on rt. |
|  | TOE: tod_connect() | https://people.freebsd.org/~melifaro/toe_rt_to_nh.diff (sent to np@) | sys/netinet/tcp_offload.c: rt = rtalloc1(nam, 0, 0); |
| ✓ | ND6: proxyarp... | r293159 | sys/netinet6/nd6_nbr.c: rtalloc_mpath_fib((struct route *)&ro, ntohl(taddr6.s6_addr32[3]), |
| ✓ | ND6: need rib (return prefix) | r293159 | sys/netinet6/nd6.c: rt = in6_rtalloc1((struct sockaddr *)&pr->ndpr_prefix, |
| ✓ | MTU checks, basic is OK, nearly done | r293169 | sys/netinet6/ip6_output.c: in6_rtalloc(ro_pmtu, fibnum); |
| ✓ | rtsock RIB, should be easy | r293159 | sys/net/rtsock.c: rtalloc_ign_fib(&gw_ro, 0, fibnum); |
| ✓ | TCP MTU, _ext is OK | r294712 | sys/netinet/tcp_subr.c: in_rtalloc_ign(&sro, 0, inc->inc_fibnum); |
|  | pcb: ext is OK |  | sys/netinet/in_pcb.c: in_rtalloc_ign(&sro, 0, inp->inp_inc.inc_fibnum); |
|  | SCTP: NHOPS | Discussing w/ rrs@/tuexen@ | sys/netinet/sctp_pcb.c: SCTP_RTALLOC((sctp_route_t *) & net->ro, |
|  | IB: _ext should be OK |  | sys/ofed/drivers/infiniband/core/addr.c: rte = rtalloc1(dst_in, 1, 0); |
| ✓ | rib lookup | r293159 | sys/netinet/in.c: |
| ✓ | NFS: IPv6 SAS | r294084 | sys/fs/nfsclient/nfs_clport.c: rt = rtalloc1_fib((struct sockaddr *)&sad, 0, 0UL, |
| ✓ | IPF: ext is OK | D4764, r293628 | sys/contrib/ipfilter/netinet/ip_fil_freebsd.c: in_rtalloc(ro, M_GETFIB(m0)); |
| ✓ | gif: uRPF | r291993 | sys/netinet/in_gif.c: rt = in_rtalloc1((struct sockaddr *)&sin, 0, |
| ✓ | mcast: ifp lookup | r292015 | sys/netinet/in_mcast.c: in_rtalloc_ign(&ro, 0, inp ? inp->inp_inc.inc_fibnum : 0); |
| ✓ | lltable: rib rtcheck | r292015 | sys/netinet/if_ether.c: rt = in_rtalloc1((struct sockaddr *)&sin, 0, 0UL, 0); |
| ✓ | mcast: ifp lookup | r292015 | sys/netinet6/in6_mcast.c: rtalloc_ign_fib((struct route *)&ro6, 0, |
| ✓ | in6_ hlim | r292015 | sys/netinet6/in6_src.c: in6_rtalloc(&ro6, in6p->inp_inc.inc_fibnum); |
| ✓ | icmp6 redirect | r292015 | sys/netinet6/icmp6.c: rt = in6_rtalloc1((struct sockaddr *)&sin6, 0, 0UL, RT_DEFAULT_FIB); |
| ✓ | ipfw: uRPF | r291993 | sys/netpfil/ipfw/ip_fw2.c: in_rtalloc_ign(&ro, 0, fib); |
| ✓ | if_stf: uRPF | r292331 | sys/net/if_stf.c: rt = rtalloc1_fib((struct sockaddr *)&sin, 0, |
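Most of the consumers listed above need only a yes/no answer or a single field from the lookup result. A strict uRPF check, for instance, accepts a packet only when the route back to its source leaves through the interface the packet arrived on. The sketch below shows that logic in userland C; struct urpf_nhop, the lookup callback type and toy_lookup() are hypothetical placeholders for whatever lookup routine is actually used.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical minimal lookup result: only the outgoing interface. */
struct urpf_nhop {
    int ifindex;
};

/* Hypothetical lookup callback: returns 0 and fills *nh, or -1 if no route. */
typedef int (*fib_lookup_fn)(uint32_t addr, struct urpf_nhop *nh);

/*
 * Strict uRPF: the route back to src must leave via the receiving interface.
 * Returns 1 if the packet passes the check, 0 otherwise.
 */
int
urpf_strict_ok(fib_lookup_fn lookup, uint32_t src, int recv_ifindex)
{
    struct urpf_nhop nh;

    assert(lookup != NULL);
    if (lookup(src, &nh) != 0)
        return (0);             /* no route back to the source */
    return (nh.ifindex == recv_ifindex);
}

/* Toy lookup for illustration: 10.0.0.0/8 is reachable via interface 2. */
int
toy_lookup(uint32_t addr, struct urpf_nhop *nh)
{
    if ((addr & 0xff000000u) != 0x0a000000u)
        return (-1);
    nh->ifindex = 2;
    return (0);
}
```

The check never touches a routing entry directly, which is why consumers of this kind are the easy ones to convert.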
Output/Forwarding
This is the code that actually performs packet output. IPv4 fast forwarding and IPv6 forwarding can be converted (more or less) easily; for the other code there are two showstoppers:
- rtentry caching in SCTP/flowtable. The plan is to implement simple nexthops (initially planned for stage 2) and cache nexthop references instead of caching struct rtentry.
- Per-ifa output packet accounting. There is no good/easy way to make this accounting work. D4794 is the review with the discussion.
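The "cache nexthop references" plan can be pictured as a small immutable, reference-counted next-hop object: a consumer that wants to cache a route keeps a reference to the object and never sees the routing entry. The sketch below is a single-threaded userland illustration of that idea only; struct nhop, struct conn_cache and all function names are hypothetical, and a real implementation would need atomic reference counting.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical immutable next hop; shared by reference, never modified. */
struct nhop {
    unsigned refcnt;
    int      ifindex;
    uint16_t mtu;
    uint32_t gateway;
};

/* Allocate a next hop holding one reference (owned by the caller). */
struct nhop *
nhop_alloc(int ifindex, uint16_t mtu, uint32_t gateway)
{
    struct nhop *nh = malloc(sizeof(*nh));

    if (nh == NULL)
        return (NULL);
    nh->refcnt = 1;
    nh->ifindex = ifindex;
    nh->mtu = mtu;
    nh->gateway = gateway;
    return (nh);
}

struct nhop *
nhop_ref(struct nhop *nh)
{
    assert(nh->refcnt > 0);
    nh->refcnt++;
    return (nh);
}

/* Drop one reference. Returns 1 if the object was freed, 0 otherwise. */
int
nhop_unref(struct nhop *nh)
{
    assert(nh->refcnt > 0);
    if (--nh->refcnt == 0) {
        free(nh);
        return (1);
    }
    return (0);
}

/* Hypothetical per-connection cache, as a protocol or flow cache might keep. */
struct conn_cache {
    struct nhop *nh;
};

/* Replace the cached next hop, taking a new reference and dropping the old. */
void
conn_cache_set(struct conn_cache *c, struct nhop *nh)
{
    struct nhop *old = c->nh;

    c->nh = (nh != NULL) ? nhop_ref(nh) : NULL;
    if (old != NULL)
        nhop_unref(old);
}
```

When a route changes, the routing table would publish a new object and drop its own reference; cached copies stay valid until their holders release them.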
| Done | Desc | Rev | Code |
|  | fib6_lookup_nhop |  | sys/netinet6/ip6_forward.c: rin6.ro_rt = in6_rtalloc1((struct sockaddr *)dst, 0, 0, M_GETFIB(m)); |
|  | selectroute(). NHOPS + lookup_prepend() |  | sys/netinet6/in6_src.c: in6_rtalloc(ron, fibnum); /* multi path case? */ |
|  | NHOPS + lookup_prepend() |  | sys/net/flowtable.c: rtalloc_mpath_fib(ro, hash, fibnum); |
|  | fib4_lookup_prepend() |  | sys/netinet/ip_fastfwd.c: in_rtalloc_ign(ro, 0, M_GETFIB(m)); |
|  | ip_forward() -- not so easy: need to pass cached route to ip_output(). NHOPS + lookup_prepend() |  | sys/netinet/ip_input.c: rtalloc_mpath_fib(&ro, |
|  | NHOPS + lookup_prepend() |  | sys/netinet/ip_output.c: rtalloc_mpath_fib(ro, |
Routing rmlock(9)
Depends on the conversion of the rtalloc*() callers.