Re: [PATCH] udev enhancements to use kernel event queue

From: Greg KH (greg@kroah.com)
Date: Thu Jun 12 2003 - 17:50:40 EST


On Thu, Jun 12, 2003 at 03:03:35PM -0700, Andrew Morton wrote:
> Greg KH <greg@kroah.com> wrote:
> >
> > > 3) /sbin/hotplug events can occur out of order, eg: remove event occurs,
> > > /sbin/hotplug sleeps waiting for something, insert event occurs and
> > > completes immediately. Then the remove event completes, after the
> > > insert, resulting in out-of-order completion and a broken /dev. I have
> > > seen this several times with udev.
> >
> > I responded:
> > Yes this happens. I have a fix for this for udev itself. No
> > kernel changes are needed. I'll show it at OLS in July if you
> > want to see it :)
>
> This is a significantly crappy aspect of the /sbin/hotplug callout. I'd be
> very interested in reading an outline of how you propose fixing it, without
> waiting until OLS, thanks.

Sure, I knew someone would probably want to :)

Anyway, here's what I've come up with, feel free to shoot it full of
holes:

<handwaving>
        - serialize the hotplug events in userspace:
                - udev daemon running listening on named pipe
                - small event generator kicked off from /sbin/hotplug
                  call to write event to udev pipe

          This alone solves the major memory issues that people have
          complained about, and allows us to keep a ram database of
          present devices and their names, which a lot of people want to
          have. It also makes the /sbin/hotplug binary even smaller
          than 6k :)

        - apply debounce on events:
                - get event, delay Tbus amount of time.
                - after time expires, check queue to see if we have any
                  other events for this device.
                - if not, this is the only one, act on it.
                - if so, delay Tbus amount of time again.
                - continue delaying until no new events for this device
                  are present.
                - count up events for this device, and throw away the
                  odd ones (even vs. odd, i.e. 2 adds and 1 remove
                  really mean add the device.)
                - if both counts are even, then leave device at its
                  current state (added or removed) but check device
                  attributes to see if we had named it a "special" name.
                  If so, need to make sure "special" name is still
                  correct. If not, fix it.

Now the whole trick is coming up with the Tbus time limit :)

For all physical busses, it takes a decent amount of time to add or
remove a device (in the seconds for PCI and USB). It's pretty hard to
get these events out of order in the first place, except on a _very_
heavily loaded system (I've tried.) It's easier to get events out of
order for virtual devices (like scsi-debug). That's why a different
time value for different busses makes sense.

So if Tbus is too small, we do get events out of order, make Tbus too
big, and we start delaying too long, and get a real deep queue.
So it's better to leave Tbus too big to be safe, more testing of proper
values is essential before I even start to claim that this will work for
all people, but I do think it is possible.

One other thing that I think will work is making it a sliding scale (if
we get an event, for example, for add, and the device is already there,
increase Tbus and throw it back on the queue (and don't delete the other
ones) and sleep again). This too needs a lot of testing.

The above sequence has seemed to work pretty well for me so far, but it
needs a lot more work and tweaking with real life loads.

I've also been sidetracked recently with other work, but should have
code to show by OLS...

</handwaving>

thanks,

greg k-h
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Sun Jun 15 2003 - 22:00:34 EST