Re: [PATCH v2 1/5] HID: bigben_remove: manually unregister leds

From: Benjamin Tissoires
Date: Thu Feb 09 2023 - 03:56:51 EST


Hi Pietro,

On Jan 31 2023, Pietro Borrello wrote:
> Unregister the LED controllers before device removal, as
> bigben_set_led() may schedule bigben->worker after the structure has
> been freed, causing a use-after-free.
>
> Fixes: 4eb1b01de5b9 ("HID: hid-bigbenff: fix race condition for scheduled work during removal")
> Signed-off-by: Pietro Borrello <borrello@xxxxxxxxxxxxxxxx>
> ---
> drivers/hid/hid-bigbenff.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/drivers/hid/hid-bigbenff.c b/drivers/hid/hid-bigbenff.c
> index e8b16665860d..d3201b755595 100644
> --- a/drivers/hid/hid-bigbenff.c
> +++ b/drivers/hid/hid-bigbenff.c
> @@ -306,9 +306,14 @@ static enum led_brightness bigben_get_led(struct led_classdev *led)
>
> static void bigben_remove(struct hid_device *hid)
> {
> + int n;
> struct bigben_device *bigben = hid_get_drvdata(hid);
>
> bigben->removed = true;
> + for (n = 0; n < NUM_LEDS; n++) {
> + if (bigben->leds[n])
> + devm_led_classdev_unregister(&hid->dev, bigben->leds[n]);
> + }
> cancel_work_sync(&bigben->worker);

I don't think this is the correct fix. It assumes that the devm
mechanism releases things in the wrong order, whereas
devm_led_classdev_unregister() should already be called *before* the
devm-managed struct bigben_device is freed.

However, you can trigger a bug, and thus we can analyse a little bit
further what is happening:

* user calls a function on the LED
* bigben_set_led() is called
* .remove() is being called at roughly the same time:
  - bigben->removed is set to true
  - cancel_work_sync() is called
* at that point, bigben_set_led() cannot crash because
  led_classdev_unregister() flushes all of its workers, and thus
  prevents the call to devm_kfree() for struct bigben_device
* but now bigben_set_led() calls schedule_work()
* led_classdev_unregister() is now done and devm_kfree() is called for
  struct bigben_device
* now the led worker kicks in, tries to access struct bigben_device,
  and dereferences it to get the value of bigben->removed (and
  bigben->report), which crashes.

So without your patch, the problem seems to be that schedule_work()
can be called *after* we have set bigben->removed to true and called
cancel_work_sync().

And if you look at the hid-playstation driver, you'll see that the
schedule_work() call is encapsulated in a spinlock and a check to
ds->output_worker_initialized.

And this is why you cannot reproduce the bug on the hid-playstation
driver: it is guarded against scheduling a worker while the driver is
being removed.
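
To make the guard concrete, here is a rough userspace model of the
pattern, not the actual hid-playstation code. Every name in it
(model_device, model_schedule_work(), model_remove(), ...) is
hypothetical, the atomic_flag loop only stands in for a kernel
spinlock, and a counter stands in for the queued work. The point is
just that the "removed" check and the scheduling happen under the same
lock, so nothing can be queued once removal has started:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a kernel spinlock (illustration only). */
struct model_lock {
	atomic_flag held;
};

static void model_lock_acquire(struct model_lock *l)
{
	while (atomic_flag_test_and_set_explicit(&l->held, memory_order_acquire))
		; /* spin until the lock is free */
}

static void model_lock_release(struct model_lock *l)
{
	atomic_flag_clear_explicit(&l->held, memory_order_release);
}

/* Hypothetical device state; pending_work models the queued work item. */
struct model_device {
	struct model_lock lock;
	bool removed;      /* written and read only under lock */
	int pending_work;  /* number of work items "queued" */
};

#define MODEL_DEVICE_INIT \
	{ .lock = { ATOMIC_FLAG_INIT }, .removed = false, .pending_work = 0 }

/* Queue work only if removal has not started; returns true if queued. */
static bool model_schedule_work(struct model_device *dev)
{
	bool scheduled = false;

	model_lock_acquire(&dev->lock);
	if (!dev->removed) {
		dev->pending_work++;  /* would be schedule_work() in a driver */
		scheduled = true;
	}
	model_lock_release(&dev->lock);

	return scheduled;
}

/* Mark the device as going away, then drop whatever is still queued. */
static void model_remove(struct model_device *dev)
{
	model_lock_acquire(&dev->lock);
	dev->removed = true;
	model_lock_release(&dev->lock);

	dev->pending_work = 0;  /* would be cancel_work_sync() in a driver */
}

/* Small usage example: returns the work still queued after removal. */
static int model_demo(void)
{
	struct model_device dev = MODEL_DEVICE_INIT;

	assert(model_schedule_work(&dev));   /* device alive: work is queued */
	model_remove(&dev);
	assert(!model_schedule_work(&dev));  /* removal started: refused */

	return dev.pending_work;
}
```

With the flag set under the lock before the cancel step, a late caller
either queues its work before the cancel (and gets cancelled) or sees
the flag and queues nothing.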

I much prefer the playstation solution: having to manually call a devm
release function always feels wrong in a normal path. Also, by doing
so, you might paper over another problem that could happen on an error
path in probe, for instance. This also means that the pattern you saw
is specific to some drivers, not all of them, depending on how they
make use of workers.

Would you mind respinning that series with those comments?

Cheers,
Benjamin


> hid_hw_stop(hid);
> }
>
> --
> 2.25.1