Re: [PATCH] tun: Fix use-after-free in tun_net_xmit

From: Cong Wang
Date: Mon Apr 29 2019 - 12:39:08 EST


On Sun, Apr 28, 2019 at 7:23 PM Jason Wang <jasowang@xxxxxxxxxx> wrote:
>
>
> On 2019/4/29 1:59 AM, Cong Wang wrote:
> > On Sun, Apr 28, 2019 at 12:51 AM Jason Wang <jasowang@xxxxxxxxxx> wrote:
> >>> tun_net_xmit() doesn't have the chance to
> >>> access the change because it is holding the rcu_read_lock().
> >>
> >>
> >> The problem is the following code:
> >>
> >>
> >> --tun->numqueues;
> >>
> >> ...
> >>
> >> synchronize_net();
> >>
> >> We need to make sure the decrement of tun->numqueues is visible to readers
> >> after synchronize_net(). And in tun_net_xmit():
> >
> > It doesn't matter at all. Readers are okay to read it even if they still use
> > the stale tun->numqueues; as long as the tfile is not freed, readers can read
> > whatever they want...
>
> This is only true if we set SOCK_RCU_FREE, isn't it?


Sure, this is how RCU is supposed to work.

>
> >
> > The decrement of tun->numqueues is just how we unpublish the old
> > tfile, it is still valid for readers to read it _after_ unpublish, we only need
> > to worry about free, not about unpublish. This is the whole spirit of RCU.
> >
>
> The point is we don't convert tun->numqueues to RCU but use
> synchronize_net().

Why does tun->numqueues need RCU? It is an integer, and reading a stale
value is _perfectly_ fine.

If you actually meant to say tun->tfiles[] itself, no, it is a fixed-size array,
it doesn't shrink or grow, so we don't need RCU for it. This is also why
a stale tun->numqueues is fine, as long as it never goes out of bounds.
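
To illustrate the point about the fixed-size array, here is a minimal
user-space sketch, not the actual tun code. All names in it (struct dev,
struct queue_file, MAX_QUEUES, pick_queue, detach_queue) are hypothetical,
and it deliberately leaves out RCU, memory barriers and freeing. It only
shows that an index derived from a stale count still lands inside the array:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_QUEUES 4                      /* hypothetical fixed capacity */

struct queue_file { int id; };            /* stand-in for a tfile */

struct dev {
    struct queue_file *files[MAX_QUEUES]; /* fixed size: never shrinks or grows */
    unsigned int numqueues;               /* number of published slots */
};

/*
 * Reader side: works from a snapshot of numqueues that may already be
 * stale. Any snapshot <= MAX_QUEUES yields an in-bounds index.
 */
static struct queue_file *pick_queue(const struct dev *d,
                                     unsigned int numqueues_snapshot,
                                     unsigned int hash)
{
    if (numqueues_snapshot == 0)
        return NULL;
    return d->files[hash % numqueues_snapshot];
}

/*
 * Writer side: unpublish slot idx by moving the last published entry
 * into it and decrementing the count. Returns the unpublished entry,
 * which in real code must not be freed until readers are done with it.
 */
static struct queue_file *detach_queue(struct dev *d, unsigned int idx)
{
    struct queue_file *old = d->files[idx];

    d->files[idx] = d->files[d->numqueues - 1];
    --d->numqueues;
    return old;
}
```

A reader that sampled numqueues == 2 just before detach_queue() ran will
still index slot 0 or 1, both inside files[]. Whether the pointer found
there is safe to dereference depends on when it is freed, which is the
part this sketch does not model.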


>
> > You need to rethink my SOCK_RCU_FREE patch.
>
> The code was written before SOCK_RCU_FREE was introduced and assumes no
> dereference from the device after synchronize_net(). It doesn't hurt to
> figure out the root cause, which may give us more confidence in the fix
> (e.g. SOCK_RCU_FREE).

I believe SOCK_RCU_FREE is the fix for the root cause, not just a
cover-up.


>
> I don't object to fixing it with SOCK_RCU_FREE, but then we should remove
> the redundant synchronize_net(). But I still prefer to synchronize
> everything explicitly like (completely untested):

I agree that synchronize_net() can be removed. However, I don't
understand your untested patch at all; it looks like it fixes a completely
different problem rather than this use-after-free.

Thanks.