Re: [RFC PATCH v5 2/2] Use kernfs_break_active_protection() for device online store callbacks

From: Rafael J. Wysocki
Date: Wed Apr 23 2014 - 06:42:28 EST


On Wednesday, April 23, 2014 01:03:42 PM Li Zhong wrote:
> On Tue, 2014-04-22 at 16:44 -0400, Tejun Heo wrote:
> > Hello,
> >
> > On Tue, Apr 22, 2014 at 11:34:39AM +0800, Li Zhong wrote:
> > > > Is this assumption true? If so, can we add lockdep assertions in
> > > > places to verify and enforce this? If not, aren't we just feeling
> > > > good when the reality is broken?
> > >
> > > It does not seem to be true ... I think there are devices that have no
> > > online/offline concept; we just add and remove them, like Ethernet
> > > cards.
> > >
> > > Maybe we could change the comments above, like:
> > > /* We assume device_hotplug_lock must be acquired before
> > > * removing devices, which have online/offline sysfs knob,
> > > * and some locks are needed to serialize the online/offline
> > > * callbacks and device removing. ...
> > > ?
> > >
> > > And we could add lockdep assertions in cpu and memory related code? e.g.
> > > remove_memory(), unregister_cpu()
> > >
> > > Currently, remove_memory() has comments for the function:
> > >
> > > * NOTE: The caller must call lock_device_hotplug() to serialize hotplug
> > > * and online/offline operations before this call, as required by
> > > * try_offline_node().
> > > */
> > >
> > > maybe it could be removed with the lockdep assertion.
> >
> > I'm confused about the overall locking scheme. What's the role of
> > device_hotplug_lock? Is that solely to prevent the sysfs deadlock
> > issue? Or does it serve other synchronization purposes depending on
> > the specific subsystem? If the former, the lock no longer needs to
> > exist. The only thing necessary would be synchronization between
> > device_del() deleting the sysfs file and the unbreak helper invoking
> > device-specific callback. If the latter, we probably should change
> > that. Sharing hotplug lock across multiple subsystems through driver
> > core sounds like a pretty bad idea.
>
> I think it's the latter.

Actually, no, this is not the case if I understand you correctly.

> I think device_{on|off}line is better done under some sort of lock that
> prevents the device from being removed, including any preparation work
> that needs to be done before device_del().

Quite frankly, you should be confident that you understand the code you're
trying to modify or please don't touch it.

I'll have a deeper look at this issue later today or tomorrow and will get
back to you then.

Thanks!

--
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
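
Illustrative note, not part of the message above: the thread proposes
replacing the "caller must call lock_device_hotplug()" comment on
remove_memory() with an assertion that the lock is held. Below is a minimal
userspace sketch of that idea in C11. Every name in it
(hotplug_lock_sketch, lock_hotplug_sketch(), remove_memory_sketch(), and so
on) is hypothetical and only stands in for the kernel objects discussed. It
returns an error code where kernel code would presumably use a lockdep
assertion, and it is not the kernel implementation.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for device_hotplug_lock: a tiny spinlock. */
static atomic_flag hotplug_lock_sketch = ATOMIC_FLAG_INIT;

/* Per-thread record of whether this thread currently holds the lock. */
static _Thread_local bool hotplug_lock_held;

/* Acquire the lock and remember that the calling thread owns it. */
static void lock_hotplug_sketch(void)
{
    while (atomic_flag_test_and_set_explicit(&hotplug_lock_sketch,
                                             memory_order_acquire))
        ; /* spin until the current holder releases the lock */
    hotplug_lock_held = true;
}

/* Release the lock; releasing without holding it is a caller bug. */
static void unlock_hotplug_sketch(void)
{
    assert(hotplug_lock_held);
    hotplug_lock_held = false;
    atomic_flag_clear_explicit(&hotplug_lock_sketch, memory_order_release);
}

/*
 * Hypothetical removal path. The locking rule is checked in code rather
 * than documented in a comment: returns 0 if the caller holds the lock,
 * -1 otherwise.
 */
static int remove_memory_sketch(void)
{
    if (!hotplug_lock_held)
        return -1; /* locking rule violated by the caller */
    /* ... the actual removal work would go here ... */
    return 0;
}
```

A caller that skips lock_hotplug_sketch() gets -1 back from
remove_memory_sketch(), so a violated locking rule is caught at run time
instead of relying on the reader noticing a comment.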