Re: [PATCH v2 ] devcoredump : Serialize devcd_del work

From: Thomas Gleixner
Date: Mon Apr 25 2022 - 13:00:18 EST


On Mon, Apr 25 2022 at 18:39, Mukesh Ojha wrote:
> v1->v2:
> - Added del_wk_queued to serialize the race between devcd_data_write()
> and disabled_store().

How so?

Neither the flag nor the mutex can prevent the race between the work
being executed in parallel.

disabled_store() worker()

class_for_each_device(&devcd_class, NULL, NULL, devcd_free)
...
while ((dev = class_dev_iter_next(&iter)) {
devcd_del()
device_del()
put_device() <- last reference
error = fn(dev, data) devcd_dev_release()
devcd_free(dev, data) kfree(devcd)
mutex_lock(&devcd->mutex);


There is zero protection of the class iterator against the work being
executed and removing the device and freeing its data. IOW, at the
point where fn(), i.e. devcd_free(), dereferences 'dev' to acquire the
mutex, it might be gone. No?

If disabled_store() really needs to flush all instances immediately,
then it requires global serialization, not device specific serialization.

Johannes, can you please explain whether this immediate flush in
disabled_store() is really required and if so, why?

Thanks,

tglx