To improve performance and reduce CPU overhead for network applications, data centers have introduced programmable switches and NICs to offload virtualized network functions, transport protocols, key-value stores, distributed consensus, and resource disaggregation. Compared to general-purpose processors, programmable switches and NICs have more limited resources and support only a more constrained programming model. Developers therefore typically split a network function into a data plane that processes common-case packets and a control plane that handles the remaining cases. The data plane function is then implemented in a packet processing language (e.g. P4) and offloaded into hardware.

Writing packet programs to offload network applications is laborious. First, even if the protocol specification (or source code) is available, the developer needs to read the thousand-page book (or code) and figure out which parts are the common cases. Second, many implementations have subtle variations from the specification, so the developer often needs to examine packet traces and manually reverse-engineer the implementation-specific behaviors. Third, the offloaded function needs to be rewritten whenever the application is updated (e.g. regular expressions in a firewall).

We design P4Coder, a system that automatically synthesizes the data plane by learning the behavior of a reference network application. No formal specification or source code is required. The developer only needs to design a few data-plane test cases and run the reference application. P4Coder captures the input and output packets, and searches for a packet program that produces identical output packets for the sequence of input packets. Passing the test cases, however, does not imply that the program will generalize correctly to other inputs.
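The search over captured input and output packets can be illustrated with a minimal sketch. It is not P4Coder's actual algorithm: the packet model (a dictionary of integer fields), the tiny candidate grammar, and the names `candidates` and `synthesize` are all assumptions made for illustration.

```python
# Hypothetical packet model: a packet is a dict mapping field name -> int.

def candidates(in_fields):
    """Yield (description, function) pairs from a tiny assumed grammar:
    copy an input field, or copy it and subtract one."""
    for f in in_fields:
        yield f"in.{f}", (lambda p, f=f: p[f])
        yield f"in.{f} - 1", (lambda p, f=f: p[f] - 1)

def synthesize(examples):
    """For each output field, return the first candidate expression that
    reproduces that field on every captured (input, output) pair,
    or None when no candidate fits."""
    in_fields = sorted(examples[0][0])
    out_fields = sorted(examples[0][1])
    program = {}
    for out in out_fields:
        program[out] = None
        for desc, fn in candidates(in_fields):
            if all(fn(i) == o[out] for i, o in examples):
                program[out] = desc
                break
    return program

# Captured traces of a made-up application that swaps addresses
# and decrements the TTL.
EXAMPLES = [
    ({"src": 10, "dst": 20, "ttl": 64}, {"src": 20, "dst": 10, "ttl": 63}),
    ({"src": 7, "dst": 9, "ttl": 5}, {"src": 9, "dst": 7, "ttl": 4}),
]
PROGRAM = synthesize(EXAMPLES)
print(PROGRAM)
```

With only these two traces the sketch recovers the address swap and the TTL decrement, but nothing prevents a different program from fitting the same traces, which is the generalization problem noted above.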

We follow the generate-and-test approach: P4Coder generates variations of the input packets and observes how the reference application behaves on them. P4Coder may never be able to discover some corner cases or learn some complicated cases. Such packets are forwarded to the control plane, as in hand-written data-plane offloading programs. After the control plane modifies data-plane behavior, P4Coder redirects data-plane traffic to the reference application, then probes and learns the updated data-plane behavior.
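As a rough illustration of generate-and-test probing, the sketch below mutates one field of a seed packet at a time and records the inputs on which a learned candidate disagrees with the reference. The `reference_app` function is a stand-in for the real black-box application, and the field domains and all names are assumptions of this sketch.

```python
def reference_app(pkt):
    """Stand-in for the reference application (assumed behavior): forward
    IPv4 packets with TTL > 1 after decrementing the TTL, and return None
    for everything else (not handled in the common case)."""
    if pkt["version"] != 4 or pkt["ttl"] <= 1:
        return None
    return {**pkt, "ttl": pkt["ttl"] - 1}

def learned_program(pkt):
    """Candidate learned from the initial test cases; it has no guards yet."""
    return {**pkt, "ttl": pkt["ttl"] - 1}

def probe(seed, field_domains):
    """Vary one field of the seed packet at a time and collect the inputs
    on which the candidate and the reference produce different outputs."""
    counterexamples = []
    for field in sorted(field_domains):
        for value in field_domains[field]:
            pkt = {**seed, field: value}
            if learned_program(pkt) != reference_app(pkt):
                counterexamples.append(pkt)
    return counterexamples

SEED = {"version": 4, "ttl": 64}
DOMAINS = {"version": [4, 6], "ttl": [0, 1, 2, 64, 255]}
COUNTEREXAMPLES = probe(SEED, DOMAINS)
print(COUNTEREXAMPLES)
```

In this toy setting the counterexamples are the packets with an expiring TTL or a non-IPv4 version. A synthesizer could either learn a guard that excludes them or leave them to the control plane.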

Another problem is that infinitely many programs can produce the expected output. We follow the Occam’s Razor principle and choose the program with the minimal description length. When multiple programs have the same length, P4Coder generates test cases to determine which one is correct. The program generated by P4Coder cannot be guaranteed correct in all cases. That said, P4Coder produces a concise and human-readable representation of the common-case behavior of a network application, saving tremendous human labor in understanding the protocol.
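The selection rule above can be sketched as follows. Description length is approximated here by a token count, which is an assumption of this sketch rather than P4Coder's actual metric; the candidate list, the probe packets, and the stand-in `reference_app` are likewise hypothetical.

```python
def description_length(source):
    """Crude proxy for description length: number of whitespace-separated tokens."""
    return len(source.split())

def consistent(candidates, examples):
    """Keep the candidates that reproduce every (input, output) example."""
    return [(s, f) for s, f in candidates if all(f(i) == o for i, o in examples)]

def shortest(candidates):
    """Keep the candidates that share the minimal description length."""
    best = min(description_length(s) for s, _ in candidates)
    return [(s, f) for s, f in candidates if description_length(s) == best]

def distinguishing_input(tied, probes):
    """Return the first probe packet on which the tied candidates disagree."""
    for pkt in probes:
        if len({f(pkt) for _, f in tied}) > 1:
            return pkt
    return None

def reference_app(pkt):
    """Stand-in for the black-box reference application (assumed behavior)."""
    return pkt["ttl"] - 1

CANDIDATES = [
    ("in.ttl - 1", lambda p: p["ttl"] - 1),
    ("in.hop - 1", lambda p: p["hop"] - 1),
    ("in.ttl - in.hop + 63", lambda p: p["ttl"] - p["hop"] + 63),
]
examples = [({"ttl": 64, "hop": 64}, 63)]

# All three candidates fit the single example; two of them tie on length.
tied = shortest(consistent(CANDIDATES, examples))
probe = distinguishing_input(tied, [{"ttl": 64, "hop": 64}, {"ttl": 10, "hop": 20}])

# Ask the reference application about the distinguishing packet and re-filter.
examples.append((probe, reference_app(probe)))
CHOSEN = shortest(consistent(CANDIDATES, examples))
print([s for s, _ in CHOSEN])
```

The extra test case is chosen so that the tied candidates give different answers, so a single query to the reference application is enough to eliminate at least one of them.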

In general, program synthesis from examples is considered hard because of the large search space. Fortunately, packet programs that can be offloaded into hardware are typically simple. Commodity programmable switches and NICs do not support loops or recursion. Each packet can perform only one atomic operation on each persistent state in the data plane. The logical depth from an input field to an output field is limited by the pipeline depth of the hardware. These limitations greatly reduce the program search space. As an optimization, P4Coder can generate test cases to rule out potential search directions.
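To give a feel for why a bound on logical depth matters, the following back-of-the-envelope sketch counts loop-free expression trees built from a fixed set of leaves and binary operators. The grammar and the numbers are illustrative assumptions and do not describe P4Coder's real search space.

```python
def count_expressions(num_leaves, num_binary_ops, max_depth):
    """Count expression trees of depth at most max_depth, where an expression
    is either a leaf or a binary operator applied to two shallower
    expressions: N(0) = leaves, N(d) = leaves + ops * N(d-1)^2."""
    count = num_leaves
    for _ in range(max_depth):
        count = num_leaves + num_binary_ops * count * count
    return count

# Assumed toy grammar: 3 input fields as leaves, 2 binary operators.
for depth in range(4):
    print(depth, count_expressions(3, 2, depth))
```

The count is finite for any fixed depth but grows roughly quadratically with each extra level, so a small hardware-imposed depth keeps enumeration tractable while test-driven pruning helps with what remains.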

P4Coder is capable of specializing a wide range of applications whose data plane can be implemented in the P4 programming language. P4Coder can learn packet field mappings (e.g. input source IP corresponds to output destination IP), transformations (e.g. decrement TTL) and constraints (e.g. IP version must be 4). P4Coder can infer the dependencies among packet fields (e.g. parsing the IP header depending on whether the IP version is 4 or 6) and variable-sized fields. P4Coder allows users to define customized transformation functions that are too hard to learn, which enables P4Coder to synthesize crypto protocols. P4Coder can derive persistent states (e.g. packet counters and TCP connection state) and the state machine. Even stateful protocols as complicated as Paxos can be synthesized by P4Coder.
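As a simplified illustration of deriving a persistent state, the sketch below checks whether an output field behaves like a packet counter across a sequence of output packets. Real state inference is considerably more involved; the function names and the trace are hypothetical.

```python
def infer_counter(outputs, field):
    """Return (initial, step) if the field advances by a constant non-zero
    step across the output sequence, which suggests a persistent counter.
    Return None for short traces, constant fields, or irregular sequences."""
    values = [o[field] for o in outputs]
    if len(values) < 2:
        return None
    step = values[1] - values[0]
    if step == 0:
        return None  # a constant field needs no persistent state
    if all(b - a == step for a, b in zip(values, values[1:])):
        return values[0], step
    return None

def replay(initial, step, n):
    """Replay the inferred counter for n packets, one update per packet."""
    state = initial
    produced = []
    for _ in range(n):
        produced.append(state)
        state += step
    return produced

# Hypothetical output trace: 'seq' looks stateful, 'ttl' does not.
TRACE = [{"seq": 100, "ttl": 63}, {"seq": 101, "ttl": 63}, {"seq": 102, "ttl": 63}]
print(infer_counter(TRACE, "seq"), infer_counter(TRACE, "ttl"))
```

Each replayed packet reads and updates the counter exactly once, in keeping with the hardware limit of one atomic operation per persistent state per packet.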