Re: KASAN: use-after-free Read in netdevice_event_work_handler

From: Jason Gunthorpe
Date: Wed Aug 05 2020 - 13:00:48 EST


On Tue, Aug 04, 2020 at 01:00:13PM -0700, Rustam Kovhaev wrote:
> On Sun, Aug 02, 2020 at 07:22:26PM -0300, Jason Gunthorpe wrote:
> > On Fri, Jul 31, 2020 at 02:11:22PM -0700, Rustam Kovhaev wrote:
> >
> > > IB roce driver receives NETDEV_UNREGISTER event, calls dev_hold() and
> > > schedules work item to execute, and before wq gets a chance to complete
> > > it, we return to ip_tunnel.c:274 and call free_netdev(), and then later
> > > we get UAF when scheduled function references already freed net_device
> > >
> > > i added verbose logging to ip_tunnel.c to see pcpu_refcnt:
> > > + pr_info("about to free_netdev(dev) dev->pcpu_refcnt %d", netdev_refcnt_read(dev));
> > >
> > > and got the following:
> > > [ 410.220127][ T2944] ip_tunnel: about to free_netdev(dev) dev->pcpu_refcnt 8
> >
> > I think there is a missing call to netdev_wait_allrefs() in the
> > rollback_registered_many().
> calling it there leads to rtnl deadlock, i think we should call
> net_set_todo(), so that later when we call rtnl_unlock() it will
> execute netdev_run_todo() and there it will proceed to calling
> netdev_wait_allrefs(), but in ip tunnel i will need get
> free_netdev() to be called after we unlock rtnl mutex
> i'll try to send a new patch for review
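
To make the ordering above concrete, here is a rough user-space model of
the two paths: freeing right away versus deferring the free until after
the lock is dropped. It is only a sketch. Every model_* name is
hypothetical and none of this is kernel code; the workqueue is collapsed
into a single pending flag and the rtnl mutex is not modelled at all.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct net_device; NOT the kernel type. */
struct model_dev {
    int  refcnt;           /* models what netdev_refcnt_read() reports */
    bool work_pending;     /* models the queued work item */
    bool on_todo;          /* models the effect of net_set_todo() */
    bool freed;            /* set once the model "frees" the device */
    bool used_after_free;  /* set if work touches a freed device */
};

/* Notifier side: take a reference and queue work (dev_hold() + schedule). */
static void model_notify_unregister(struct model_dev *d)
{
    d->refcnt++;
    d->work_pending = true;
}

/* The queued work: touches the device, then drops its reference. */
static void model_run_work(struct model_dev *d)
{
    if (!d->work_pending)
        return;
    if (d->freed)
        d->used_after_free = true;  /* this is the KASAN report */
    d->refcnt--;
    d->work_pending = false;
}

/* Unconditional free, like calling free_netdev() on the error path. */
static void model_free(struct model_dev *d)
{
    d->freed = true;
}

/* Deferral: only mark the device; nothing is freed here. */
static void model_set_todo(struct model_dev *d)
{
    d->on_todo = true;
}

/* Models rtnl_unlock() running the todo list: wait for refs, then free. */
static void model_unlock_and_run_todo(struct model_dev *d)
{
    if (!d->on_todo)
        return;
    /* Stand-in for waiting until the workqueue has drained. */
    while (d->refcnt > 0 && d->work_pending)
        model_run_work(d);
    /* The kernel would keep waiting; the model just insists on zero. */
    assert(d->refcnt == 0);
    d->on_todo = false;
    model_free(d);
}

/* Path reported in this thread: free before the work has run. */
static bool model_buggy_path(void)
{
    struct model_dev d = {0};

    model_notify_unregister(&d);
    model_free(&d);
    model_run_work(&d);
    return d.used_after_free;
}

/* Proposed path: queue on the todo list, free only after refs drop. */
static bool model_deferred_path(void)
{
    struct model_dev d = {0};

    model_notify_unregister(&d);
    model_set_todo(&d);
    model_unlock_and_run_todo(&d);
    return d.used_after_free;
}
```

In this model model_buggy_path() reports a use-after-free and
model_deferred_path() does not, because the free only happens once the
reference taken by the notifier has been dropped.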

Oh the whole register is called under rtnl? Yikes..

This is probably a systemic problem with register_netdevice error
unwind, not just ip tunnel

The other way to handle it would be to organize things so that
register cannot fail once it starts calling notifiers?
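
Roughly what I mean, as a toy model only. The names are hypothetical
and this says nothing about how register_netdevice is actually laid out;
it just shows the shape where every step that can fail runs before the
first notifier call:

```c
#include <stdbool.h>

/* Hypothetical model state; not a kernel structure. */
struct reg_model {
    int  notifier_calls;  /* how many times notifiers were invoked */
    bool registered;      /* true once registration has committed */
};

/* Stand-in for all fallible setup; inject_failure forces an error. */
static bool model_prepare(bool inject_failure)
{
    return !inject_failure;
}

/* Two-phase register: fail early, notify only past the point of no return. */
static int model_register(struct reg_model *m, bool inject_failure)
{
    /* Phase 1: anything that can fail. No notifier has seen the device. */
    if (!model_prepare(inject_failure))
        return -1;

    /* Phase 2: cannot fail, so there is no unwind after notifiers run. */
    m->notifier_calls++;
    m->registered = true;
    return 0;
}
```

With that ordering a failed register never leaves a notifier holding a
reference, so the error path has nothing to wait for.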

Jason