Re: Locking Between User Context and Soft IRQs in 2.4.0

From: Andrew Morton (andrewm@uow.edu.au)
Date: Sat Nov 04 2000 - 20:28:55 EST


Andi Kleen wrote:
>
> On Sat, Nov 04, 2000 at 12:07:34PM -0500, Jeff Garzik wrote:
> > Andi Kleen wrote:
> > > All the MOD_INC/DEC_USE_COUNT are done inside the modules themselves. There
> > > is nothing that would a driver prevent from being unloaded on a different
> > > CPU while it is already executing in ->open but has not yet executed the add
> > > yet or after it has executed the _DEC but it is still running in module code
> > > Normally the windows are pretty small, but very long running interrupt
> > > on one CPU hitting exactly in the wrong moment can change that.
> >
> > Module unload calls unregister_netdev, which grabs rtnl_lock.
> > dev->open runs under rtnl_lock.
> >
> > Given this, how can the driver be unloaded if dev->open is running?
>
> It does not help, because when the semaphore synchronizes it is already
> too late -- free_module already did the zero module count check and
> nothing is going to stop it from unloading.

aaarrrggh!!!

          CPU0 CPU1

        rtnl_lock()
         dev_ifsioc()
          dev_change_flags()
           dev_open();
            dev->open();
            vortex_open()
                                        sys_delete_module()
                                        if (!__MOD_IN_USE)
                                         free_module()
                                          mod->cleanup()
                                          vortex_cleanup()
                                           pci_unregister_driver()
            [ time passes ] drv->remove();
                                            vortex_remove_one()
                                             unregister_netdev()
                                              unregister_netdevice()
                                              rtnl_lock() /* blocks */
            ...
            MOD_INC_USE_COUNT;
            ...
        rtnl_unlock()
                                              ...
                                         module_unmap(); /* Not good */

We can't even fix this with a lock_kernel wrapped around
the dev->owner stuff in dev_open(), because the netdevice's
open() can sleep.

<subliminalmessage>prumpf's patch</sumliminalmessage>

Perhaps the best thing to do here is to create a system-wide
semaphore for module unloading. So we do a down()/up()
in sys_delete_module() and do this in dev_open:

        /*
         * Call device private open method
         */

        down(&mod_unload_sem); /* sync with sys_delete_module() */
        if (dev->owner == 0) {
                if (dev->open)
                        ret = dev->open(dev);
        } else {
                if (try_inc_mod_count(dev->owner)) {
                        if (dev->open) {
                                if ((ret = dev->open(dev)) != 0)
                                        __MOD_DEC_USE_COUNT(dev->owner);
                        }
                } else
                        ret = -ENODEV;
        }
        up(&mod_unload_sem);
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Tue Nov 07 2000 - 21:00:16 EST