Re: [RFC v2 3/6] kthread: warn on kill signal if not OOM

From: James Bottomley
Date: Tue Sep 09 2014 - 18:26:11 EST


On Wed, 2014-09-10 at 06:42 +0900, Tejun Heo wrote:
> Hey, James.
>
> On Tue, Sep 09, 2014 at 12:35:46PM -0700, James Bottomley wrote:
> > I don't have very strong views on this one. However, I've got to say
> > from a systems point of view that if the desire is to flag when the
> > module is having problems, probing and initializing synchronously in a
> > thread spawned by init which the init process can watchdog and thus can
> > flash up warning messages seems to be more straightforwards than an
> > elaborate asynchronous mechanism with completion signalling which
> > achieves the same thing in a more complicated (and thus bug prone)
> > fashion.
>
> We no longer report back error on probe failure on module load.

Yes, we do; for every probe failure of a device on a driver we'll print
a warning (see drivers/base/dd.c). Now if someone is proposing we
should report this in a better fashion, that's probably a good idea, but
I must have missed that patch.

> It
> used to make sense to indicate error for module load on probe failure
> when the hardware was a lot simpler and drivers did their own device
> enumeration. With the current bus / device setup, it doesn't make any
> sense and driver core silently suppresses all probe failures. There's
> nothing the probing thread can monitor anymore.

Except the length of time taken to probe. That seems to be what systemd
is interested in, hence this whole thread, right?

> In that sense, we already separated out device probing from module
> loading simply because the hardware reality mandated it and we have
> dynamic mechanisms to listen for device probes exactly for the same
> reason, so I think it makes sense to separate out the waiting too, at
> least in the long term. In a modern dynamic setup, the waits are
> essentially arbitrary and doesn't buy us anything.

But that's nothing to do with sync or async. Nowadays we register a
driver, the driver may bind to multiple devices. If one of those
devices encounters an error during probe, we just report the fact in
dmesg and move on. The module_init thread currently returns when all
the probe routines for all enumerated devices have been called, so
module_init has no indication of any failures (because they might be
mixed with successes); successes are indicated as the device appears but
we have nothing other than the kernel log to indicate the failures. How
does moving to async probing alter this? It doesn't as far as I can
see, except that module_init returns earlier but now we no longer have
an indication of when the probe completes, so we have to add yet another
mechanism to tell us if we're interested in that. I really don't see
what this buys us.

James


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/