Infrastructure Needs
We should consider building out the following infrastructure:
- Reservation system for machines (for Jenkins to use)
- Machines referenced by attributes (1g, 10g, 100g, X Cores, Y CPU Packages)
- Network configuration tool (for setting up something like GlimmerGlass)
- Configuration
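As a rough illustration of "machines referenced by attributes", the sketch below shows how a reservation request could name minimum attributes rather than a specific host. The class names, the host names (`lab-a`, `lab-b`), and the first-match policy are all assumptions made for illustration, not a description of any existing tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Machine:
    name: str                 # hypothetical host name
    nic_gbps: int             # fastest NIC speed: 1, 10, or 100
    cores: int
    cpu_packages: int
    reserved_by: Optional[str] = None


class Pool:
    """In-memory machine pool; a real system would persist reservations."""

    def __init__(self, machines):
        self.machines = list(machines)

    def reserve(self, owner, nic_gbps=0, cores=0, cpu_packages=0):
        """Reserve the first free machine meeting all minimums, or return None."""
        for m in self.machines:
            if (m.reserved_by is None
                    and m.nic_gbps >= nic_gbps
                    and m.cores >= cores
                    and m.cpu_packages >= cpu_packages):
                m.reserved_by = owner
                return m
        return None

    def release(self, machine):
        machine.reserved_by = None


# Hypothetical inventory, for illustration only.
pool = Pool([
    Machine("lab-a", nic_gbps=10, cores=8, cpu_packages=1),
    Machine("lab-b", nic_gbps=100, cores=32, cpu_packages=2),
])
```

A Jenkins job would then call something like `pool.reserve("jenkins-job-1", nic_gbps=100)` and release the machine when the bench finishes.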
Initial buildout
- 3 Machines with 100G NICs connected back to back will allow for a few different configurations:
- TCP and UDP packet forwarding (iperf from machine A-B-C)
- pktgen to test pure forwarding perf without going up the stack (talk to gnn@ about this)
- Packet serving (machine B can serve a static file to machines A and C, with a load generator such as hey driving the requests)
Add a Phabricator option for bench results
Here is a crazy idea about the perfect system:
- A user submits a performance-related patch for review in Phabricator and clicks "please bench"
- Once this "bench" option is selected, more options become available, such as "what kind of bench?" with a list offering "TCP host, forwarding, or firewalling (pf/ipfw/ipf)"
- If it is a firewall bench, we need further options such as "how many UDP flows?" with choices like "1K, 10K, 100K, 1M, 5M"
- Optionally, "on what kind of hardware?", listing the hardware available in the lab
- And an option for "do you want to collect PMC data?"
- Then we should generate 2 kernels (before and after the patch), run the benches, and post the ministat output.
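The option flow above could be modelled as a small request object whose validation encodes the dependency between options (the UDP flow count only matters for firewall benches). This is a sketch only: the field names and the string identifiers for bench kinds are assumptions, and nothing here reflects an actual Phabricator API.

```python
from dataclasses import dataclass
from typing import Optional

# Identifiers below are illustrative; the choices come from the list above.
FIREWALLS = {"pf", "ipfw", "ipf"}
BENCH_KINDS = {"tcp-host", "forwarding"} | FIREWALLS
UDP_FLOW_CHOICES = {1_000, 10_000, 100_000, 1_000_000, 5_000_000}


@dataclass
class BenchRequest:
    kind: str                        # what kind of bench?
    udp_flows: Optional[int] = None  # only meaningful for firewall benches
    hardware: Optional[str] = None   # optional: which lab hardware
    collect_pmc: bool = False        # collect PMC data?

    def validate(self):
        """Return a list of problems; an empty list means the request is usable."""
        errors = []
        if self.kind not in BENCH_KINDS:
            errors.append(f"unknown bench kind: {self.kind}")
        elif self.kind in FIREWALLS:
            if self.udp_flows not in UDP_FLOW_CHOICES:
                errors.append("firewall bench needs a UDP flow count from the list")
        elif self.udp_flows is not None:
            errors.append("UDP flow count only applies to firewall benches")
        return errors
```

For example, `BenchRequest("pf", udp_flows=10_000, collect_pmc=True).validate()` returns an empty list, while a `pf` request without a flow count is rejected.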
Bench automation
A network performance bench needs these inputs:
- A minimum of 2 kernel and distribution sets to compare
- A configuration set (depending on whether you want to test a TCP host, forwarding, or a firewall) [related to the hardware to be benched]
- Some packet generator configuration (number of UDP flows for netmap pkt-gen) or other data for TCP tools [if MAC addresses are needed, related to the hardware to be benched]
Running the bench
We need to run multiple series (at least 5) with a reboot between each test. Beware of long BIOS POST delays here: on my big hardware the bench itself takes between 30 and 60 seconds, but the full reboot takes about 4 minutes. I therefore prefer fast-booting hardware like Supermicro over slow-booting Dell/HP servers. Even once the server boots using PXE, it can crash, so we need a timeout that triggers an IPMI power cycle.
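The reboot-and-timeout loop described above might look like the following sketch. All four callbacks (`run_bench`, `reboot`, `is_up`, `power_cycle`) are hypothetical hooks that the lab tooling would supply, for example by having `power_cycle` invoke an IPMI client. The default timeout and retry limit are assumptions, not measured values.

```python
import time


def run_series(run_bench, reboot, is_up, power_cycle,
               runs=5, boot_timeout=600.0, poll=5.0, max_power_cycles=3,
               sleep=time.sleep, clock=time.monotonic):
    """Run `runs` benches, rebooting before each one.

    If the machine is not up within `boot_timeout` seconds, force an IPMI
    power cycle and wait again, giving up after `max_power_cycles` attempts.
    """
    results = []
    for _ in range(runs):
        reboot()
        cycles = 0
        deadline = clock() + boot_timeout
        while not is_up():
            if clock() >= deadline:
                if cycles >= max_power_cycles:
                    raise RuntimeError("machine did not come back after power cycles")
                power_cycle()          # boot hung or crashed: force a restart
                cycles += 1
                deadline = clock() + boot_timeout
            sleep(poll)
        results.append(run_bench())
    return results
```

Passing `sleep` and `clock` as parameters keeps the loop testable without waiting for real reboots. The collected results for each kernel can then be written one value per line and compared with ministat.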