Re: Routing loops & TTL tracking with tunnel devices

From: Eric Dumazet
Date: Fri Apr 29 2022 - 16:54:36 EST



On 4/28/22 17:37, Jason A. Donenfeld wrote:
Hey Eric,

On Tue, Nov 17, 2015 at 03:41:35AM +0100, Jason A. Donenfeld wrote:
On Mon, Nov 16, 2015 at 11:28 PM, Eric Dumazet <eric.dumazet@xxxxxxxxx> wrote:
There is very little chance we'll accept a new member in sk_buff, unless
proven needed.
I actually have no intention of doing this! I'm wondering if there
already is a member in sk_buff that moonlights as my desired ttl
counter, or if there's another mechanism for avoiding routing loops. I
want to work with what's already there, rather than meddling with the
innards of important and memory sensitive structures such as sk_buff.
Well, 7 years later... Maybe you have a better idea now of what I was
working on then. :)

As an update on this issue, it's still quasi problematic. To review, I
can't use the TTL value, because the outer packet always must get the
TTL of the route to the outer destination, not the inner packet minus
one. I can't rely on reaching MTU size, because people want this to work
with fragmentation (see [1] for my attempt to disallow fragmentation for
this issue, which resulted in hoots and hollers). I can't use the
per-cpu xmit_recursion variable, because I use threads.
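
To illustrate the per-cpu point above, here is a minimal userspace sketch. It assumes the problem is that a packet handed to a worker thread is picked up only after the original transmit call has returned, which is one reading of "because I use threads". Every name here (sketch_xmit_depth, SKETCH_RECURSION_LIMIT, and so on) is hypothetical and the limit is arbitrary; this is not the kernel's code.

```c
#include <assert.h>
#include <stdbool.h>

#define SKETCH_NR_CPUS 4          /* hypothetical CPU count */
#define SKETCH_RECURSION_LIMIT 8  /* hypothetical nesting limit */

/* One nesting counter per simulated CPU, standing in for a per-cpu variable. */
static unsigned int sketch_xmit_depth[SKETCH_NR_CPUS];

/* Enter the transmit path on a CPU; false means the guard drops the packet. */
static bool sketch_xmit_enter(unsigned int cpu)
{
	if (sketch_xmit_depth[cpu] >= SKETCH_RECURSION_LIMIT)
		return false;
	sketch_xmit_depth[cpu]++;
	return true;
}

/* Leave the transmit path on a CPU. */
static void sketch_xmit_exit(unsigned int cpu)
{
	sketch_xmit_depth[cpu]--;
}

/*
 * Synchronous loop: each pass re-enters on the same CPU before the
 * previous pass has returned, so the counter keeps growing.
 * Returns the number of passes completed before the guard fires.
 */
static unsigned int sketch_loop_same_cpu(unsigned int cpu, unsigned int max_passes)
{
	unsigned int done;

	if (max_passes == 0 || !sketch_xmit_enter(cpu))
		return 0;
	done = 1 + sketch_loop_same_cpu(cpu, max_passes - 1);
	sketch_xmit_exit(cpu);
	return done;
}

/*
 * Deferred loop: each pass queues the packet and returns, and a worker
 * (possibly on another CPU) starts the next pass with a fresh counter.
 */
static unsigned int sketch_loop_deferred(unsigned int max_passes)
{
	unsigned int done = 0;

	for (unsigned int pass = 0; pass < max_passes; pass++) {
		unsigned int cpu = pass % SKETCH_NR_CPUS;

		if (!sketch_xmit_enter(cpu))
			break;
		sketch_xmit_exit(cpu); /* transmit returns once the packet is queued */
		done++;
	}
	return done;
}
```

In this sketch the synchronous loop is cut off at the limit, while the deferred loop runs for as many passes as it is given, because the counter has already unwound each time the packet comes back.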

What I can sort of use is taking advantage of what looks like a bug in
pskb expansion, such that it always allocates too much, and pretty
quickly fails allocations after a few loops. Only powerpc64 and s390x
don't appear to have this bug. See [2] for an in-depth description of
this that I sent you a few months ago.


Hmm, I will take a look later I think. Thanks for the reminder.



Anyway, it'd be nice if there were a free u8 somewhere in sk_buff that I
could use for counting a packet's trips through the stack. Other kernels have this
but afaict Linux still does not. I looked into trying to overload some
existing fields -- tstamp/skb_mstamp_ns or queue_mapping -- which I was
thinking might be totally unused on TX?


If skbs are stored in some internal wireguard queue, can't you use skb->cb[],
like many other layers do?
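
As a rough illustration of the skb->cb[] idea, here is a userspace sketch of keeping a small pass counter in a per-packet control buffer. struct sketch_skb, struct sketch_loop_cb, SKETCH_MAX_PASSES and the helpers are all hypothetical stand-ins, not kernel or wireguard definitions; only the 48-byte, 8-byte-aligned control buffer mirrors sk_buff. Note that cb[] belongs to whichever layer currently holds the skb, so this covers only the case where the packet sits in the driver's own queue; it does not show the value surviving another trip through the stack.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define SKETCH_CB_SIZE 48     /* same size as the control buffer in sk_buff */
#define SKETCH_MAX_PASSES 4   /* hypothetical loop limit */

/* Stand-in for sk_buff: only the control buffer matters for this sketch. */
struct sketch_skb {
	_Alignas(8) unsigned char cb[SKETCH_CB_SIZE];
};

/* Hypothetical private state that one layer keeps in the control buffer. */
struct sketch_loop_cb {
	unsigned char passes; /* the "free u8": trips seen so far */
};

/* The private state must fit inside the control buffer. */
_Static_assert(sizeof(struct sketch_loop_cb) <= SKETCH_CB_SIZE,
	       "private state does not fit in cb[]");

/* Start with a clean control buffer, as a layer would on taking ownership. */
static void sketch_skb_init(struct sketch_skb *skb)
{
	memset(skb->cb, 0, sizeof(skb->cb));
}

/*
 * Record one more pass. Returns false once the limit has been reached,
 * meaning the caller should drop the packet instead of sending it again.
 * Kernel code usually casts cb[] to the private struct; memcpy is used
 * here to keep the userspace sketch free of aliasing problems.
 */
static bool sketch_note_pass(struct sketch_skb *skb)
{
	struct sketch_loop_cb state;

	memcpy(&state, skb->cb, sizeof(state));
	if (state.passes >= SKETCH_MAX_PASSES)
		return false;
	state.passes++;
	memcpy(skb->cb, &state, sizeof(state));
	return true;
}
```

A caller would run sketch_skb_init() once when the packet first enters the queue and sketch_note_pass() each time it is about to be sent again, dropping the packet when the latter returns false.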



Any ideas about this?

Thanks,
Jason

[1] https://lore.kernel.org/wireguard/CAHmME9rNnBiNvBstb7MPwK-7AmAN0sOfnhdR=eeLrowWcKxaaQ@xxxxxxxxxxxxxx/
[2] https://lore.kernel.org/netdev/CAHmME9pv1x6C4TNdL6648HydD8r+txpV4hTUXOBVkrapBXH4QQ@xxxxxxxxxxxxxx/