Re: Sysfs and suicidal attributes

From: Tejun Heo
Date: Tue Jul 10 2007 - 11:11:51 EST


Hello,

Alan Stern wrote:
> Instead we should broadcast a notification to all threads which might
> try to add a device, telling them not to do it until the suspend is
> over. After that notification is sent and current additions have
> completed, the driver core can simply fail further calls to
> device_add().
>
>> IMHO, what's needed is a giant rwlock protecting the whole device
>> addition/removal while suspend is in progress (addition/removal during
>> resume should be safe as long as we put the new devices on the already
>> processed list). The lock can probably replace dpm_sem and dpm_list_sem.

Sans notification, the plan proposed in the first paragraph and rwlock
with down_read_trylock() for addition/removal is pretty similar.

> Addition during resume is _not_ safe. The new device can easily get
> added to the already-processed list in the wrong position, so that the
> next time the system is suspended the PM core tries to suspend the new
> device's parent before suspending the new device.

When a bus device is being woken up and rescanning the bus, all
prerequisites for the bus should have been resumed already. As long as
the newly added device is placed after the bus device itself, things
should be safe. Well, that's the theory at least.

>> It's probably necessary to use trylock from addition/removal to avoid
>> deadlock. Drivers which support hotplugging need to rescan/revalidate
>> the bus during resume anyway so failing addition/removal during suspend
>> shouldn't be a problem.
>
> I had envisioned using SRCU to synchronize existing device additions
> with the beginning of a suspend. down_read_trylock should work just as
> well (aside from cacheline bouncing).

How high performance device addition/removal should be? Wouldn't rwlock
be enough?

> Device removal isn't as much of a problem as addition. We could allow
> it with no difficulty. Failing it isn't really an option because
> device_del() returns void. And anyway, my idea was to have the PM core
> acquire all the device semaphores at the start of the suspend -- this
> would automatically block any attempted removal until the semaphore was
> released.

Hmmm... locking all device mutexes sounds scary to me but I tend to be a
chicken and lockdep is great, so it might just work. :-)

>> Regardless of the suspend problem, I like the above idea. Actually, I
>> was thinking about doing exactly that for suicidal nodes when I first
>> found out the problem. All that's needed is an owner field which allows
>> the owning thread to reenter (what was the name for this type of mutex?
>> BSD folks like it). It's probably better to put some restrictions to
>> allow only the specific case tho.
>
> What with all the sysfs changes, I wouldn't know where to begin
> looking. The idea occurred to me mainly because I was considering
> blocking all access to sysfs during a suspend -- the problem being that
> a suspend is itself triggered by a write to a sysfs attribute!
> Somehow that write would have to wait for all _other_ existing accesses
> to complete and block new ones.

Why is that necessary? What happens if there's some long-running
read/write operation using bin attr?

>> I like it because it shifts complexity from the drivers into driver
>> core. IOW, the driver model is kinder to drivers that way - the driver
>> writer doesn't have to care whether something is suicidal or not - and I
>> think that's the way we should be headed although we're not good at it yet.
>
> It doesn't shift all the complexity. The method still has to tell the
> sysfs core that it's going to unregister its attribute and that the
> unregistration has finished. But this is simpler than the callback
> method we use now.

Right, it's nothing major. I just dislike such limitation is visible
from individual drivers and "echo 1 > /sys/whatever/kill-yourself"
returns before the suicide is actually complete.

Thanks.

--
tejun
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/