This page describes the Subversion branch projects/ifnet, also known as the "opaque ifnet" project.
- Introduction: why do we need it, what's already in head
- General ideas for new ''struct ifnet''
- Converting drivers to new KPI
- Heavy tasks, not yet done, but must be done
- Maintaining compat shims in the future
Introduction: why do we need it, what's already in head
The ifnet(9) structure represents a network interface. Right now, it is exposed to all NIC drivers and all protocols. Hence, whenever we change the structure, we are forced at least to recompile all drivers, and usually to change the drivers' code as well. This significantly slows down development in this area and also makes it hard, or sometimes even impossible, to merge changes back to the stable branches.
After years of development, it has become clear that the structure should be opaque to drivers and, in theory, to the network protocols. Such opaqueness could be achieved in two ways:
Quick way: just hide the structure and manually replace direct access to structure fields with helper functions. Such a conversion could be automated with sed(1).
Proper way: design a proper KPI that gives as much freedom as possible to change struct ifnet afterwards, while still providing NIC drivers with all the access to ifnet data they need.
The disadvantage of the quick way is that, although it avoids recompilation when fields are added to the struct, the KPI still effectively dictates the structure's layout. In any case, the quick way has already been committed to FreeBSD as Juniper's drvapi patch:
Nevertheless, the long-term plan is to go the proper way. The proper way is being developed in the Subversion repository:
General ideas for new ''struct ifnet''
<net/if_var.h> contains the declaration of struct ifnet, along with all other structures and methods that are opaque to NIC drivers. In the future, the entire file should be under the _KERNEL ifdef.
<net/if.h> contains all KPI definitions for drivers. NIC drivers must not include if_var.h. In-system interface implementations (netgraph, lagg(4), vlan(4), etc.) may include if_var.h and thus be aware of struct ifnet and internal methods; however, this is discouraged and should be avoided if possible.
Historically, struct ifnet has been a storage place for all kinds of things that apply to a network interface. As a result, struct ifnet has grown to over 1 Kbyte on 64-bit platforms. However, most of the information contained in this structure is not specific to an interface instance but to the driver. Another portion of the structure consists of pointers to the software contexts of various networking facilities that are normally not used. All these parts could be removed from the structure. The softc pointers could be optimized later, once the structure is already opaque. In contrast, the driver-specific fields should be removed early, at the KPI redesign stage. Here is an approximate list of driver-specific fields:
- All interface methods: input, output, ioctl, etc ...
- Driver name, pointer to if_clone(9).
- Header length, address length.
- Interface type, DLT type.
- TSO limits.
All of this should be moved into struct ifdriver, which is allocated statically by a NIC driver, with all instances of a given driver pointing at the same struct ifdriver.
We want to extract interface methods (or operations) from struct ifdriver into a separate struct ifops. The goal is to make operations overloadable for lagg(4) and for proper ALTQ integration into the stack.
struct ifdriver and struct ifops are going to be declared in <net/if.h> and thus are visible to drivers. These structures will have spare fields.
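The layout described above can be sketched in userland C as follows. This is only an illustration, not the definitions from the branch: apart from .ifdrv_maxqlen, every field name here (ifop_transmit, ifdrv_hdrlen, if_drv and so on), the mock struct ifnet and the dispatch helper if_transmit() are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>

struct mbuf;				/* stand-in for the kernel mbuf */
struct ifnet;
typedef struct ifnet *if_t;

/* Operations, kept in their own structure so that they can be overloaded. */
struct ifops {
	int	(*ifop_transmit)(if_t, struct mbuf *);
	int	(*ifop_ioctl)(if_t, unsigned long, void *);
	void	*ifop_spare[4];		/* spare fields for future growth */
};

/* Everything that is identical for all instances of one driver. */
struct ifdriver {
	struct ifops	 ifdrv_ops;
	const char	*ifdrv_name;
	unsigned	 ifdrv_hdrlen;
	unsigned	 ifdrv_addrlen;
	unsigned	 ifdrv_maxqlen;
	void		*ifdrv_spare[4];
};

/* Mock of what remains per instance (hypothetical layout). */
struct ifnet {
	const struct ifdriver	*if_drv;
	void			*if_softc;
	unsigned		 if_mtu;
};

static int
xxx_transmit(if_t ifp, struct mbuf *m)
{
	(void)ifp; (void)m;
	return (0);
}

static int
xxx_ioctl(if_t ifp, unsigned long cmd, void *data)
{
	(void)ifp; (void)cmd; (void)data;
	return (0);
}

/* Allocated statically, once per driver. */
static struct ifdriver xxx_ifdrv = {
	.ifdrv_ops = {
		.ifop_transmit = xxx_transmit,
		.ifop_ioctl = xxx_ioctl,
	},
	.ifdrv_name = "xxx",
	.ifdrv_hdrlen = 14,
	.ifdrv_addrlen = 6,
	.ifdrv_maxqlen = 256,
};

/* Two instances of the same driver point at the same struct ifdriver. */
static struct ifnet xxx0 = { .if_drv = &xxx_ifdrv, .if_mtu = 1500 };
static struct ifnet xxx1 = { .if_drv = &xxx_ifdrv, .if_mtu = 9000 };

/* The stack reaches a method through the shared driver description. */
static int
if_transmit(if_t ifp, struct mbuf *m)
{
	assert(ifp->if_drv->ifdrv_ops.ifop_transmit != NULL);
	return (ifp->if_drv->ifdrv_ops.ifop_transmit(ifp, m));
}
```

In this sketch the per-instance structure shrinks to a pointer plus instance data, while names, lengths and methods live in one static object per driver.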
Structure allocation and initialization
Drivers call if_attach() in their device_attach driver(9) DEVMETHOD. The context allows sleeping, therefore if_attach() normally cannot fail. The only reason for it to fail is an interface name conflict, but the vast majority of drivers do not request a specific name. If a specific name is required, the driver must be prepared for a possible error returned from if_attach().
The only argument to if_attach() is a pointer to struct if_attach_args. This structure points to struct ifdriver and also contains all information specific to the instance of the interface being attached. struct if_attach_args has a field ifat_version that records the version of the KPI at compile time. This allows the kernel to understand which version of the KPI the driver uses and, if needed, fix up the driver's struct ifdriver to make it compatible with the new network stack.
if_attach() returns a pointer of type if_t, which is actually an opaque pointer to struct ifnet.
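A minimal userland sketch of this attach flow is shown below. It assumes a hypothetical IF_ATTACH_VERSION constant, hypothetical ifat_drv, ifat_softc and ifat_mtu fields, and a hypothetical accessor if_getmtu_sketch(); only if_attach(), struct if_attach_args, struct ifdriver, ifat_version and if_t come from the text above. The driver part is placed before the definition of struct ifnet on purpose, so it compiles only against the opaque type.

```c
#include <assert.h>
#include <stdlib.h>

#define	IF_ATTACH_VERSION	1	/* hypothetical version constant */

struct ifnet;				/* incomplete: drivers cannot look inside */
typedef struct ifnet *if_t;

struct ifdriver {
	const char	*ifdrv_name;
};

struct if_attach_args {
	unsigned char	 ifat_version;	/* first field, fixed at compile time */
	struct ifdriver	*ifat_drv;	/* hypothetical field names below */
	void		*ifat_softc;
	unsigned	 ifat_mtu;
};

if_t		if_attach(struct if_attach_args *);
unsigned	if_getmtu_sketch(if_t);	/* hypothetical accessor */

/* ---- driver side: sees only the declarations above ---- */

struct xxx_softc {
	if_t	xxx_ifp;
};

static struct ifdriver xxx_ifdrv = { .ifdrv_name = "xxx" };

static int
xxx_attach(struct xxx_softc *sc)
{
	/* Static values first ... */
	struct if_attach_args ifat = {
		.ifat_version = IF_ATTACH_VERSION,
		.ifat_drv = &xxx_ifdrv,
		.ifat_mtu = 1500,
	};

	/* ... hardware initialization would happen here ... */

	/* ... then the instance-specific part, and attach. */
	ifat.ifat_softc = sc;
	sc->xxx_ifp = if_attach(&ifat);	/* no specific name requested */
	return (0);
}

/* ---- stack side: the only code that knows the layout ---- */

struct ifnet {
	struct ifdriver	*if_drv;
	void		*if_softc;
	unsigned	 if_mtu;
};

if_t
if_attach(struct if_attach_args *ifat)
{
	struct ifnet *ifp;

	assert(ifat->ifat_version == IF_ATTACH_VERSION);
	ifp = calloc(1, sizeof(*ifp));
	assert(ifp != NULL);	/* a kernel allocation could sleep instead */
	ifp->if_drv = ifat->ifat_drv;
	ifp->if_softc = ifat->ifat_softc;
	ifp->if_mtu = ifat->ifat_mtu;
	return (ifp);
}

unsigned
if_getmtu_sketch(if_t ifp)
{
	return (ifp->if_mtu);
}
```

Because the driver holds only an if_t, fields can be added to or removed from struct ifnet without touching xxx_attach().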
Converting drivers to new KPI
Normal converting procedure
The vast majority of Ethernet drivers in FreeBSD are copy-pasted from each other, so the procedure is much the same for all of them. The typical TODO list includes the following steps:
- Remove if_var.h, bpf.h, if_arp.h, if_types.h, if_vlan_var.h from the includes list.
- Declare struct ifdriver at the beginning of a file. You might use the already converted drivers as an example.
- Convert from xxx_start(ifp) to xxx_transmit(ifp, m). The conversion itself is quite straightforward:
- In the ifdriver declaration, define .ifdrv_maxqlen, taking the value previously passed to the IFQ_SET_MAXLEN() macro.
- In the ifops part of the ifdriver declaration, define the if_transmit function.
- Rename xxx_start() to xxx_transmit() and change its prototype.
- The newly renamed xxx_transmit() should do the following:
- Try to if_snd_enqueue() the mbuf, and return if that fails.
- mtx_lock() the driver softc.
- Process the queue in a loop starting with if_snd_dequeue(). The loop is based on the previous loop that did IFQ_DRV_IS_EMPTY/IFQ_DRV_DEQUEUE.
- Do not do any statistics accounting in xxx_transmit(). This should be done in the TX completion interrupt.
- Drivers are now responsible for accounting IFCOUNTER_IBYTES. Put the update next to the update of IFCOUNTER_IPACKETS, in the receive interrupt handler.
- Drivers should now account both IFCOUNTER_OBYTES and IFCOUNTER_OPACKETS in the transmit completion interrupt handlers. A function if_inc_txcounter() can be used.
- Rewrite the device attach method xxx_attach(). The new method should do the following:
- Declare the struct if_attach_args on the stack and initialize it with static values.
- Perform all important hardware initialization. No modification is needed here, except for mii_attach(), which no longer accepts an if_t argument.
- When all hardware is successfully initialized, fill in the rest of if_attach_args and call if_attach(). It can't fail.
- Miibus no longer modifies ifnet(9). This means the following:
- You need to initialize the baudrate and if_link_state in xxx_attach() and later maintain them both in the miibus_statchg method.
- Forget about IFF_DRV_RUNNING. Many drivers utilize this flag intensively as their internal state storage. In this case, simply move this logic into softc flags, for instance by using sed(1).
- Rewrite the xxx_ioctl method. In general, the driver no longer writes to if_flags, if_mtu, if_capenable, etc. It only acknowledges or declines new values. All generic checks are handled by the stack. The driver performs the final check; therefore, if xxx_ioctl() returns 0, the new value is written to the ifnet by the stack. Details:
SIOCSIFFLAGS: The IFF_CANTCHANGE check is done by the stack. Normally the driver wants to store the flags from the request inside its own softc. This is safe, since the KPI now guarantees that flags cannot be changed without a call to xxx_ioctl(SIOCSIFFLAGS).
SIOCSIFMTU: The (IF_MINMTU <= mtu <= IF_MAXMTU) check is performed by the stack. The check whether the MTU has actually changed is also performed by the stack. Storing the MTU in the softc is safe.
SIOCSIFCAP: The check that the new capenable is a subset of the driver capabilities is done by the stack. The TSO dependency on hardware checksumming and VLAN tagging is also handled by the stack. A no-change SIOCSIFCAP is handled by the stack. The vlan(4) dependencies are handled by the stack. The driver must analyze ifr_reqcap and perform all important hardware initialization. The driver may alter ifr_reqcap. The driver may take the current capabilities from ifr_curcap. If the driver doesn't return an error, it may save ifr_reqcap in its softc. The driver must fill in ifr_hwassist with a new value. If a driver supports polling(4), it should switch its interrupts on or off depending on the IFCAP_POLLING flag. However, registering and unregistering with the polling infrastructure is handled by the stack.
On unknown ioctl the driver must return EOPNOTSUPP.
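The division of labour for SIOCSIFMTU can be sketched in userland as follows. This is a simplified mock, not the branch's code: the command numbers with the _SK suffix, the xxx_hw_maxmtu field, the if_setmtu_sketch() helper and the simplified xxx_ioctl() signature (taking the softc and a plain value) are all assumptions. IF_MINMTU and IF_MAXMTU use the values from FreeBSD's <net/if.h>.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define	SIOCSIFMTU_SK	1	/* hypothetical stand-in for SIOCSIFMTU */
#define	SIOCUNKNOWN_SK	99	/* some request the driver does not know */
#define	IF_MINMTU	72
#define	IF_MAXMTU	65535

struct xxx_softc {
	unsigned	xxx_mtu;	/* driver's private copy */
	unsigned	xxx_hw_maxmtu;	/* hypothetical hardware limit */
};

struct ifnet {
	struct xxx_softc *if_softc;
	unsigned	  if_mtu;
};
typedef struct ifnet *if_t;

/* Driver side: acknowledge or decline; never write to the ifnet. */
static int
xxx_ioctl(struct xxx_softc *sc, unsigned long cmd, unsigned value)
{
	switch (cmd) {
	case SIOCSIFMTU_SK:
		if (value > sc->xxx_hw_maxmtu)
			return (EINVAL);	/* final, hardware-specific check */
		sc->xxx_mtu = value;		/* safe to cache in the softc */
		return (0);
	default:
		return (EOPNOTSUPP);		/* unknown ioctl */
	}
}

/* Stack side: generic checks first, store only if the driver agrees. */
static int
if_setmtu_sketch(if_t ifp, unsigned mtu)
{
	int error;

	assert(ifp->if_softc != NULL);
	if (mtu < IF_MINMTU || mtu > IF_MAXMTU)
		return (EINVAL);
	if (mtu == ifp->if_mtu)
		return (0);			/* no change: driver not called */
	error = xxx_ioctl(ifp->if_softc, SIOCSIFMTU_SK, mtu);
	if (error == 0)
		ifp->if_mtu = mtu;		/* written by the stack only */
	return (error);
}
```

The same pattern would apply to flags and capabilities: the stack filters and stores, the driver only validates against its hardware and updates its softc.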
- If a driver wants to traverse the multicast address list of the interface, it should use if_foreach_maddr().
- The xxx_init() method disappears from the KPI. It used to be a source of ambiguity, since it was usually called internally and sometimes by the stack. Now it is called only internally, so drivers must declare xxx_init() static and call it when SIOCSIFFLAGS says that the interface goes IFF_UP.
- grep for struct ifnet * and replace it with if_t. This is a cosmetic change that makes 'grep ifnet if_xxx.c' find nothing in a converted driver, which allows a script to monitor the conversion progress.
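The xxx_start() to xxx_transmit() steps listed above can be sketched in userland as follows. Everything here is a mock under stated assumptions: the signatures and queue implementation of if_snd_enqueue() and if_snd_dequeue(), the ENOBUFS return on a full queue, the integer used in place of a mutex, and the counter fields are hypothetical stand-ins for the real KPI.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define	XXX_MAXQLEN	4	/* would come from .ifdrv_maxqlen */

struct mbuf { unsigned m_len; };

struct xxx_softc {
	int		xxx_locked;	/* stand-in for a struct mtx */
	unsigned	xxx_tx_free;	/* free TX descriptors */
	unsigned	xxx_tx_sent;	/* mbufs handed to the hardware */
};

struct ifnet {
	struct xxx_softc *if_softc;
	struct mbuf	 *if_snd[XXX_MAXQLEN];	/* mock software queue */
	unsigned	  if_snd_head, if_snd_len;
	unsigned long	  if_opackets, if_obytes;
};
typedef struct ifnet *if_t;

/* Mock enqueue: fails when the software queue is full. */
static int
if_snd_enqueue(if_t ifp, struct mbuf *m)
{
	unsigned tail;

	if (ifp->if_snd_len == XXX_MAXQLEN)
		return (ENOBUFS);
	tail = (ifp->if_snd_head + ifp->if_snd_len) % XXX_MAXQLEN;
	ifp->if_snd[tail] = m;
	ifp->if_snd_len++;
	return (0);
}

/* Mock dequeue: returns NULL when the queue is empty. */
static struct mbuf *
if_snd_dequeue(if_t ifp)
{
	struct mbuf *m;

	if (ifp->if_snd_len == 0)
		return (NULL);
	m = ifp->if_snd[ifp->if_snd_head];
	ifp->if_snd_head = (ifp->if_snd_head + 1) % XXX_MAXQLEN;
	ifp->if_snd_len--;
	return (m);
}

static int
xxx_transmit(if_t ifp, struct mbuf *m)
{
	struct xxx_softc *sc = ifp->if_softc;
	int error;

	if ((error = if_snd_enqueue(ifp, m)) != 0)
		return (error);

	assert(!sc->xxx_locked);
	sc->xxx_locked = 1;		/* mtx_lock() in a real driver */
	while (sc->xxx_tx_free > 0 && (m = if_snd_dequeue(ifp)) != NULL) {
		sc->xxx_tx_free--;	/* program a descriptor for m */
		sc->xxx_tx_sent++;
	}
	sc->xxx_locked = 0;		/* mtx_unlock() in a real driver */
	return (0);			/* no statistics accounting here */
}

/* TX completion handler: the place where output counters are updated. */
static void
xxx_txeof(if_t ifp, unsigned npkts, unsigned nbytes)
{
	ifp->if_opackets += npkts;	/* IFCOUNTER_OPACKETS */
	ifp->if_obytes += nbytes;	/* IFCOUNTER_OBYTES */
	ifp->if_softc->xxx_tx_free += npkts;
}
```

When the hardware ring is full, packets simply stay on the software queue; once that fills up too, the enqueue fails and xxx_transmit() returns the error without taking the lock.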
miibus(4) is no longer aware of struct ifnet, which was a layering violation between the device layer and the network stack. This requires a few extra changes to the NIC drivers that use miibus(4).
- Drivers must call mii_attach() before if_attach(). The struct ifnet * argument to mii_attach() has been removed.
- Drivers must do baudrate and link state management themselves in the miibus_statchg DEVMETHOD.
- Some PHYs need to know the MTU of the interface. Drivers that expect such PHYs must implement the miibus_readvar DEVMETHOD and return the MTU for MIIVAR_MTU from that method (see bge(4) for an example).
Conversion progress is tracked on a separate page, projects/ifnet/progress.
Heavy tasks, not yet done, but must be done
Apart from tons of drivers, there are more things to be done:
ALTQ strongly relies on struct ifqueue being embedded into struct ifnet. In the old world order that led to ALTQ usually conflicting with high-performance TX enhancements for NIC drivers. Some NICs dropped ALTQ support. For others, a drbr shim layer was introduced. In the new world order the ifqueue is removed, as well as drbr. The plan for ALTQ is that it hijacks the if_ops of an ifnet and runs all packets through its own altq_transmit, which in turn may queue the packet or put it on ALTQ's queue. Unlike the older implementation, this will require zero changes to drivers, but it will require quite a lot of work on ALTQ itself. Note that KAME is no more, OpenBSD has deleted ALTQ, and NetBSD has moved it to userland. So we own our copy of ALTQ, and should move it out of contrib and maintain it.
lagg(4) will definitely require special care.
VIMAGE is probably broken in the current state of projects/ifnet. The project is a good opportunity to improve the handling of ifnets moving between vnets.
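The ALTQ plan of hijacking the operations of an ifnet can be illustrated with a small userland mock. It is only a guess at the mechanism: the if_ops pointer, the ifop_next back-link, altq_enable(), altq_disable() and the toy "hold large packets" policy are all hypothetical, and the sketch assumes that altq_transmit either passes a packet on to the driver or parks it on ALTQ's own queue.

```c
#include <assert.h>
#include <stddef.h>

struct mbuf { unsigned m_len; };
struct ifnet;
typedef struct ifnet *if_t;

struct ifops {
	int		(*ifop_transmit)(if_t, struct mbuf *);
	struct ifops	*ifop_next;	/* hypothetical: the overloaded ops */
};

struct ifnet {
	struct ifops	*if_ops;
	unsigned	 if_drv_pkts;	/* packets that reached the driver */
	unsigned	 if_altq_held;	/* packets parked by the ALTQ mock */
};

/* The driver's transmit method is left completely unchanged. */
static int
xxx_transmit(if_t ifp, struct mbuf *m)
{
	(void)m;
	ifp->if_drv_pkts++;
	return (0);
}

static struct ifops xxx_ops = { .ifop_transmit = xxx_transmit };

/* Either hold the packet or pass it on to the original operations. */
static int
altq_transmit(if_t ifp, struct mbuf *m)
{
	if (m->m_len > 1000) {		/* toy policy: delay large packets */
		ifp->if_altq_held++;
		return (0);
	}
	return (ifp->if_ops->ifop_next->ifop_transmit(ifp, m));
}

static struct ifops altq_ops = { .ifop_transmit = altq_transmit };

static void
altq_enable(if_t ifp)
{
	assert(ifp->if_ops != &altq_ops);	/* not already enabled */
	altq_ops.ifop_next = ifp->if_ops;
	ifp->if_ops = &altq_ops;
}

static void
altq_disable(if_t ifp)
{
	assert(ifp->if_ops == &altq_ops);
	ifp->if_ops = ifp->if_ops->ifop_next;
}

/* The stack always dispatches through whatever ops are installed. */
static int
if_transmit(if_t ifp, struct mbuf *m)
{
	return (ifp->if_ops->ifop_transmit(ifp, m));
}
```

The point of the sketch is that enabling or disabling the overlay swaps one pointer and never touches xxx_transmit().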
Maintaining compat shims in the future
Since struct if_attach_args begins with a version field, we can identify a driver compiled for an older kernel. We can keep compat shims in net/if_compat.c, where we will keep the history of older versions of struct if_attach_args, struct ifaddr and struct ifops, along with all the compat shims required to convert them to the newer ones, allowing older drivers to attach.
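A compat shim of this kind might look roughly like the sketch below. The structure history is invented for illustration: a hypothetical version 1 without an ifat_capabilities field, a hypothetical current version 2, and a hypothetical helper if_attach_args_upgrade(). The only property taken from the text is that the version is the first field, so it can be read before the layout is known.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

#define	IF_ATTACH_VERSION	2	/* hypothetical current version */

/* Hypothetical historical layout, kept only for the shim. */
struct if_attach_args_v1 {
	unsigned char	 ifat_version;	/* == 1 */
	void		*ifat_softc;
	unsigned	 ifat_mtu;
};

/* Hypothetical current layout: one field was added. */
struct if_attach_args {
	unsigned char	 ifat_version;	/* == IF_ATTACH_VERSION */
	void		*ifat_softc;
	unsigned	 ifat_mtu;
	unsigned	 ifat_capabilities;
};

/*
 * Convert whatever the driver passed into the current layout.
 * The version is the first byte of every layout, so it is safe
 * to read it before deciding how to interpret the rest.
 */
static int
if_attach_args_upgrade(const void *arg, struct if_attach_args *out)
{
	const struct if_attach_args_v1 *v1;

	assert(arg != NULL && out != NULL);
	switch (*(const unsigned char *)arg) {
	case 1:
		v1 = arg;
		out->ifat_version = IF_ATTACH_VERSION;
		out->ifat_softc = v1->ifat_softc;
		out->ifat_mtu = v1->ifat_mtu;
		out->ifat_capabilities = 0;	/* default for old drivers */
		return (0);
	case IF_ATTACH_VERSION:
		*out = *(const struct if_attach_args *)arg;
		return (0);
	default:
		return (EINVAL);		/* unknown (newer?) driver */
	}
}
```

With such a helper, the attach path would work on the upgraded copy only, so the rest of the stack never sees an old layout.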