Re: [PATCH bpf-next v2 0/8] Support defragmenting IPv(4|6) packets in BPF

From: Edward Cree
Date: Mon Feb 27 2023 - 15:38:49 EST


On 27/02/2023 19:51, Daniel Xu wrote:
> However, when policy is enforced through BPF, the prog is run before the
> kernel reassembles fragmented packets. This leaves BPF developers in an
> awkward place: implement reassembly (possibly poorly) or use a stateless
> method as described above.

Just out of curiosity - what stops BPF progs using the middle ground of
stateful validation? I'm thinking of something like:

  First-frag: run the usual checks on L4 headers etc, if we PASS then
  save IPID and maybe expected next frag-offset into a map. But don't
  try to stash the packet contents anywhere for later reassembly, just
  PASS it.

  Subsequent frags: look up the IPID in the map. If we find it, validate
  and update the frag-offset in the map; if this is the last fragment
  then delete the map entry. If the frag-offset was bogus or the IPID
  wasn't found in the map, DROP; otherwise PASS.

(If re-ordering is prevalent then use something more sophisticated than
just expected next frag-offset, but the principle is the same. And of
course you might want to put in timers for expiry etc.)

So this avoids the need to stash the packet data and modify/consume SKBs,
because you're not actually doing reassembly; the down-side is that the
BPF program can't so easily make decisions about the application-layer
contents of the fragmented datagram, but for the common case (we just
care about the 5-tuple) it's simple enough.
But I haven't actually tried it, so maybe there's some obvious reason why
it can't work this way.
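
For concreteness, the first-frag / subsequent-frag scheme above could be
modelled roughly as below. This is a plain userspace C sketch of the
decision logic only, not a BPF program: the static array stands in for a
BPF map, and struct frag, frag_check() and the l4_ok flag are hypothetical
names made up for illustration. It keys on IPID alone, as in the
description (a real filter would presumably also key on addresses and
protocol), assumes in-order fragments, has no expiry timers, and leaves
the map entry in place when an offset is bogus.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum verdict { DROP = 0, PASS = 1 };

/* Hypothetical summary of one IP fragment (normally parsed from the header). */
struct frag {
    uint16_t ipid;       /* IP identification field */
    uint32_t offset;     /* fragment offset, in bytes */
    uint32_t len;        /* payload bytes carried by this fragment */
    bool     more_frags; /* MF bit */
    bool     l4_ok;      /* outcome of the usual L4 checks (first frag only) */
};

/* Stand-in for a BPF hash map: IPID -> expected next fragment offset. */
struct entry {
    bool     used;
    uint16_t ipid;
    uint32_t next_offset;
};

#define MAX_FLOWS 64
static struct entry flows[MAX_FLOWS];

static struct entry *flow_find(uint16_t ipid)
{
    for (size_t i = 0; i < MAX_FLOWS; i++)
        if (flows[i].used && flows[i].ipid == ipid)
            return &flows[i];
    return NULL;
}

static struct entry *flow_alloc(uint16_t ipid)
{
    struct entry *e = flow_find(ipid);
    if (e)
        return e; /* reuse an existing entry for this IPID */
    for (size_t i = 0; i < MAX_FLOWS; i++)
        if (!flows[i].used)
            return &flows[i];
    return NULL; /* table full */
}

enum verdict frag_check(const struct frag *f)
{
    assert(f != NULL);

    if (f->offset == 0) {
        /* First fragment (or an unfragmented packet): policy decides. */
        if (!f->l4_ok)
            return DROP;
        if (f->more_frags) {
            struct entry *e = flow_alloc(f->ipid);
            if (!e)
                return DROP; /* no room to track it, so fail closed */
            e->used = true;
            e->ipid = f->ipid;
            e->next_offset = f->len;
        }
        return PASS; /* the packet itself is never stashed */
    }

    /* Subsequent fragment: must match state left by a passed first frag. */
    struct entry *e = flow_find(f->ipid);
    if (!e || f->offset != e->next_offset)
        return DROP;

    if (f->more_frags)
        e->next_offset += f->len; /* expect the next contiguous fragment */
    else
        e->used = false;          /* last fragment: delete the map entry */
    return PASS;
}
```

In an actual BPF program the table would more likely be something like an
LRU hash map so that stale entries get evicted, but that choice is an
assumption and not something the scheme above depends on.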

-ed