Re: Re: [PATCH] drivers: nfc: nfcmrvl: fix double free bug in nfcmrvl_nci_unregister_dev()

From: duoming
Date: Sun Apr 10 2022 - 08:33:46 EST


Hello,

On Sun, 10 Apr 2022 11:27:09 +0200 Krzysztof Kozlowski wrote:

> > There is a potential double bug in nfcmrvl usb driver between
> > unregister and resume operation.
> >
>
> Thank you for your patch. There is something to discuss/improve.
>
> > The race that cause that double free bug can be shown as below:
>
> Your patch solves the most visible race, but because of lack of locking,
> I believe race still might exist:
>
> (FREE) | (USE)
> | nfcmrvl_resume
> | nfcmrvl_submit_bulk_urb
> | nfcmrvl_bulk_complete
> | nfcmrvl_nci_recv_frame
> | nfcmrvl_fw_dnld_recv_frame
> | queue_work
> | fw_dnld_rx_work
> nfcmrvl_disconnect |
> nfcmrvl_nci_unregister_dev |
> nfcmrvl_fw_dnld_deinit |
> wait for the workqueue to finish |
> | fw_dnld_over
> | release_firmware
> | kfree(fw);
> | no synchronization //(1)
> if (fw_download_in_progress)
> - no synchronization, so CPU sees old value
> nfcmrvl_fw_dnld_abort |
> fw_dnld_over | ...
> if (priv->fw_dnld.fw) |
> release_firmware |
> kfree(fw); //(2) |
> ... | fw = NULL;
>
> The kfree() from (2) would still free old value. Even if fw=NULL happens
> earlier, it is not propagated back to the other CPU, unless there are
> some implicit barriers due to workqueue?
>
> Is it safe then to rely on such implicit barriers from workqueue?

I think it is safe to rely on such barriers from destroy_workqueue().
The destroy_workqueue will wait all work items to finish, the code
behind it will not execute until all work items are finished.
The progress is shown below:

destroy_workqueue()-->drain_workqueue()-->flush_workqueue()-->wait_for_completion().

The function drain_workqueue() will wait until the workqueue becomes empty.
It sets wq->flags to __WQ_DRAINING, this could ensure the new coming work items
will not be queued into the draining workqueue. Because __queue_work will check
the __WQ_DRAINING flag.

Then drain_workqueue() calls flush_workqueue() to ensure that any scheduled work
has run to completion. Because wait_for_completion() is called in flush_workqueue(),
it will block current thread and wait work items to completion.

Best regards,
Duoming Zhou.