Re: PROBLEM: DoS Attack on Fragment Cache

From: Willy Tarreau
Date: Sun Apr 18 2021 - 00:42:57 EST


On Sat, Apr 17, 2021 at 10:26:30PM -0400, Matt Corallo wrote:
> Sure, there are better ways to handle the reassembly cache overflowing, but
> that is pretty unrelated to the fact that waiting 30 full seconds for a
> fragment to come in doesn't really make sense in today's networks (the 30
> second delay that is used today appears to even be higher than RFC 791
> suggested in 1981!).

Not exactly actually, because you forget the TTL here. With most hosts
sending an initial TTL around 64, after crossing 10-15 hops it's still
around 50 so that would result in ~50 seconds by default, even according
to the 40 years old RFC791. The 15s there was the absolute minimum. While
I do agree that we shouldn't keep them that long nowadays, we can't go
too low without risking to break some slow transmission stacks (SLIP/PPP
over modems for example). In addition even cutting that in 3 will remain
trivially DoSable.

> You get a lot more bang for your buck if you don't wait
> around so long (or we could restructure things to kick out the oldest
> fragments, but that is a lot more work, and probably extra indexes that just
> aren't worth it).

Kicking out oldest ones is a bad approach in a system made only of
independent elements, because it tends to result in a lot of damage once
all of them behave similarly. I.e. if you need to kick out an old entry
in valid traffic, it's because you do need to wait that long, and if all
datagrams need to wait that long, then new datagrams systematically
prevent the oldest one from being reassembled, and none gest reassembled.
With a random approach at least your success ratio converges towards 1/e
(i.e. 36%) which is better.

Willy