Re: [PATCH RFC net] net: Prevent sk_bound_dev_if causing packet to be rerouted back into tunnel

From: Steffen Klassert
Date: Thu Apr 17 2025 - 05:07:17 EST


On Tue, Apr 15, 2025 at 04:50:51PM +1200, Thomas Winter wrote:
> We have found a situation where packets going into an IPsec tunnel get
> encapsulated twice. For example, this happens with an ICMP socket bound
> to a tunnel via SO_BINDTODEVICE plus mangle rules that implement
> policy-based routing.
> After the first ESP encapsulation and running through the mangle table
> again, a difference in skb->mark causes ip_route_me_harder to be called
> but skb->sk->sk_bound_dev_if is still the tunnel. This causes the ESP
> packet to get routed back into the tunnel and get xfrm'd again using
> the same SA. The double-encapsulated packet is then routed correctly
> out the physical interface.
>
> With an xfrmi interface on the other side, the receiver dropped the
> packet with LINUX_MIB_XFRMINTMPLMISMATCH. An ipvti interface would
> accept it.
> However the transmitting side should not have been doing the double
> ESP encapsulation in the first place.
>
> A potential fix for this is to drop the reference to skb->sk using
> skb_orphan before transmission. scrub_packet would do this but only
> if the packet is traversing namespaces. Orphaning the skb lets
> ip_route_me_harder select the correct route for the ESP packet without
> being fooled by an sk_bound_dev_if that points at the tunnel itself, so
> the packet is forwarded out the physical interface.
>
> Signed-off-by: Thomas Winter <Thomas.Winter@xxxxxxxxxxxxxxxxxxx>

This looks ok to me.
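
For illustration only, the rerouting behaviour described in the patch can
be modeled in userspace. The sketch below is a toy model, not kernel code:
toy_sock, toy_skb, toy_reroute, toy_orphan and the interface indices are
all hypothetical names standing in for struct sock, struct sk_buff,
ip_route_me_harder and skb_orphan. It assumes a single default route via
the physical interface and shows only why a lingering socket reference
steers the ESP packet back into the tunnel.

```c
#include <stddef.h>

/* Hypothetical interface indices for this toy model. */
enum { IF_NONE = 0, IF_PHYS = 1, IF_TUNNEL = 2 };

/* Stand-in for struct sock: only the bound-device field matters here. */
struct toy_sock {
    int bound_dev_if;   /* IF_NONE when the socket is not bound */
};

/* Stand-in for struct sk_buff: owning socket plus the packet mark. */
struct toy_skb {
    struct toy_sock *sk;
    unsigned int mark;
};

/*
 * Stand-in for the re-route step taken after a mark change.
 * If the skb still references a socket bound to a device, that device
 * wins; otherwise the (assumed) default route via IF_PHYS is used.
 */
static int toy_reroute(const struct toy_skb *skb)
{
    if (skb->sk != NULL && skb->sk->bound_dev_if != IF_NONE)
        return skb->sk->bound_dev_if;
    return IF_PHYS;
}

/* Stand-in for skb_orphan: drop the skb's reference to its socket. */
static void toy_orphan(struct toy_skb *skb)
{
    skb->sk = NULL;
}
```

In this model, an already-encapsulated packet whose sk is bound to
IF_TUNNEL is rerouted to IF_TUNNEL again (the double encapsulation case),
while the same packet after toy_orphan() is rerouted to IF_PHYS.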