29 Oct 2015
MPTCP
Nigel Williams joins the call. MPTCP is now public. The code is up on a web site. The most recent code is a complete rewrite that was finished two months ago. The past four weeks have all been spent in testing. The plan is to move into expanded testing. There are still panics on shutdown due to race conditions; those are currently being addressed. George will get Nigel access to Phabricator so that the code can be proposed as a patch. Randall asks if MPTCP can be integrated as part of the general rework to modularize the network stack. We all agree that this is not a bad idea, and Randall says he expects to have changes to review in a couple of weeks. Nigel says that his changes are minimally invasive, but he does have a shim mptcp structure that sits between the stock stack and his code. The largest set of changes is in tcp_do_segment(). The code has mostly been tested on VMs and dummynet.
Recent Code Changes
- Hiren has committed his change to initial cwnd.
- Hiren and Randall committed their changes to correctly count inflight/pipe in r290122.
More cwnd questions
- Hiren has posted to the mailing list asking what we should do in the SACK case. Randall points out that we only retransmit one SACK segment but that we could move to sending 1/2 the congestion window. There is still significant rework to be done here. Randall suggests that Hiren continue down this path.
- When should the CC module take control? We need to make sure that the CC sets the cwnd. What we have now is only half right.
- George asks whether this will all work for the delay-based CC algorithms. Randall and Hiren think that it works correctly in the current code.
- Both Hiren and Randall are saying they can run the new code in production.
- George asks whether we can share some data that shows how we're doing.
- There is nothing currently to share.
- Lawrence has something based on NS3 to validate data.
- George will write up some strawman DTrace scripts to share with the transport list.
- Michael asks if anyone is looking at single packet loss or something else.
- Benno says that his concerns are around data center workloads and that they care about latency spikes.
- Hiren mentions packetdrill and that he wants to look at delaying specific packets.
- Michael says that it would be easy to create a packetdrill script for deterministic packet drops. He can't delay a packet, but he can script the return packet, so injecting the return packets is one way of emulating delay.
- Jason Eggeleston points out that modifying the network reaction is his preferred way of testing. One interesting idea for dummynet is to have it be able to push packets into different delay queues.
- Hiren talked to Luigi about the above approach. It is not impossible but it is also non-trivial.
- Hiren will look into what Lawrence did with NS and also work on dummynet. George says that he will help Hiren out with the dummynet solution.
- Matt Macy has put the RFC notes together and folks should go and review them.
- George suggests that we have people present on the RFCs.
- George will present the work thus far at the Vendor Summit and try to get feedback and buy-in.
- Mellanox is planning to commit their new driver in the near future [Scott].