Re: USB device cannot be reconnected and khubd "blocked for more than 120 seconds"

From: Alan Stern
Date: Wed Jan 16 2013 - 12:01:46 EST


On Wed, 16 Jan 2013, Tejun Heo wrote:

> Hello, Alan.
>
> On Tue, Jan 15, 2013 at 11:01:15PM -0500, Alan Stern wrote:
> > > The current domain implementation is somewhere in between. It's
> > > not a completely simplistic system and at the same time not
> > > developed enough to do properly stacked flushing.
> >
> > I like your idea of chronological synchronization: Insist that anybody
> > who wants to flush async jobs must get a cookie, and then only allow
> > them to wait for async jobs started after the cookie was issued.
> >
> > I don't know if this is possible with the current implementation. It
> > would require changing every call to async_synchronize_*(), and in a
> > nontrivial way. But it might provide a proper solution to all these
> > problems.
>
> The problem here is that "flush everything which comes before me" is
> used to order async jobs. e.g. after async jobs probe the hardware
> they order themselves by flushing before registering them, so unless

I don't fully understand this example. What is the point -- to make
sure that asynchronously probed devices are registered in the order of
their discovery?

If so, here's how to do it safely: Start up the async jobs in reverse
order of discovery. Have each job acquire a cookie when it starts.
Then each job needs to wait only for tasks that started after its
cookie was issued.
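
As a rough illustration only, here is a single-threaded userspace
model of that scheme. It is not kernel code and does not use the
existing async API; cookie_t, struct job, job_start(),
job_may_register() and probe_and_register() are all hypothetical
names made up for this sketch, and the "waiting" is simulated by
polling rather than by real blocking.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model, not the kernel's async API. */
typedef unsigned long cookie_t;

static cookie_t next_cookie = 1;  /* monotonically increasing counter */

struct job {
    int device;        /* discovery index of the device being probed */
    cookie_t cookie;   /* issued when the job starts */
    int registered;    /* set once the job has registered its device */
};

/* Start a job: hand it the next cookie in chronological order. */
void job_start(struct job *j, int device)
{
    j->device = device;
    j->cookie = next_cookie++;
    j->registered = 0;
}

/* A job may register once every job started after it has registered. */
int job_may_register(const struct job *j, const struct job *all, size_t n)
{
    for (size_t i = 0; i < n; i++)
        if (all[i].cookie > j->cookie && !all[i].registered)
            return 0;
    return 1;
}

/*
 * Start the jobs in reverse discovery order, then let each one register
 * as soon as the rule allows.  The device indices are written to out[]
 * in registration order.
 */
void probe_and_register(size_t n, struct job *jobs, int *out)
{
    size_t done = 0;

    for (size_t i = 0; i < n; i++)
        job_start(&jobs[i], (int)(n - 1 - i));  /* last found starts first */

    while (done < n) {
        size_t before = done;

        for (size_t i = 0; i < n; i++) {
            if (!jobs[i].registered &&
                job_may_register(&jobs[i], jobs, n)) {
                jobs[i].registered = 1;
                out[done++] = jobs[i].device;
            }
        }
        /* The newest unregistered job is always eligible. */
        assert(done > before);
    }
}
```

With three devices, the job for device 2 starts first and therefore
waits for the other two, while the job for device 0 starts last and
waits for nothing, so registration happens in discovery order.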

> we build accurate flushing dependencies, those dependencies will reach
> beyond the time window we're interested in and bring in deadlocks.

The flushing-dependency principle can be very simple: No async task
should ever have to wait for another async task that started before it.
The "cookie" approach satisfies this requirement (unless an earlier
task passes its cookie to a later task or subverts the mechanism in
another way).

> And, as Linus pointed out, tracking dependency through
> request_module() is tricky no matter what we do. I think it can be
> done by matching the ones calling request_module() and the ones
> actually loading modules but it's gonna be nasty.

This shouldn't matter. Dependencies don't need to be tracked
explicitly, because we know that any async work done by
request_module() must start _after_ request_module() is called. Thus,
if async task A calls request_module(), which starts up async task B,
then we know that A can safely wait for B and B cannot safely wait for
A.
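
To make the A/B case concrete, here is a minimal userspace sketch of
the rule. Everything in it is an assumption for illustration:
struct task, task_start(), model_request_module() and may_wait_for()
are hypothetical names, and model_request_module() merely stands in
for "request_module() ends up starting some async work"; it does not
reflect how the real function is implemented.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
typedef unsigned long cookie_t;

struct task {
    const char *name;
    cookie_t cookie;   /* start time, issued from a global counter */
};

static cookie_t clock_now;

/* Starting a task stamps it with the next chronological cookie. */
struct task task_start(const char *name)
{
    struct task t = { name, ++clock_now };
    return t;
}

/*
 * Stand-in for request_module(): whatever async work it triggers can
 * only start after the caller is already running.
 */
struct task model_request_module(const struct task *caller,
                                 const char *worker_name)
{
    struct task worker = task_start(worker_name);

    assert(worker.cookie > caller->cookie);  /* later by construction */
    return worker;
}

/* The whole rule: a cookie holder may wait only for later tasks. */
bool may_wait_for(cookie_t cookie, const struct task *target)
{
    return target->cookie > cookie;
}
```

If A is started with task_start() and B comes from
model_request_module(&A, ...), then may_wait_for(A.cookie, &B) is true
and may_wait_for(B.cookie, &A) is false, with no explicit dependency
tracking. The same check also shows the subversion case: if B were
handed A's cookie, may_wait_for(A.cookie, &B) would tell B to wait for
itself.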

> There aren't too many which use async anyway so changing stuff
> shouldn't be too difficult but I think the simplicity or dumbness is
> one of the major attractions of async, so it'd be nice to keep things that
> way and the PF_USED_ASYNC hack seems to be able to hold things
> together for now.

Nesting won't matter for the chronological approach. I really think
you should consider it more fully. It's not a hack, and it doesn't
need to be complicated.

Alan Stern
