RE: [PATCHv2] firmware: Correct handling of fw_state_wait_timeout() return value

From: linux-kernel-dev
Date: Wed Jan 18 2017 - 01:35:20 EST


>From: Jakub Kicinski [mailto:jakub.kicinski@xxxxxxxxxxxxx]
>Sent: Tuesday, January 17, 2017 22:18
>
>On Tue, Jan 17, 2017 at 12:53 PM, Luis R. Rodriguez <mcgrof@xxxxxxxxxx> wrote:
>> On Tue, Jan 17, 2017 at 10:04:20AM -0800, Jakub Kicinski wrote:
>>> On Tue, Jan 17, 2017 at 9:30 AM, Luis R. Rodriguez <mcgrof@xxxxxxxxxx> wrote:
>>> > On Tue, Jan 17, 2017 at 08:30:37AM -0800, Jakub Kicinski wrote:
>>> >> Adding a NULL-check would just paper over the
>>> >> issue and can cause trouble down the line.
>>> >
>>> > We typically bail on errors and use similar code to bail out. Here it's
>>> > no different. The *real* issue is that we have a waiting timeout which
>>> > can race against a user-imposed error-out on the sysfs interface. There
>>> > is one catch:
>>> >
>>> > We already lock with the big fw_lock and use this to be able to check
>>> > for the status of the fw, so once aborted we technically should not have
>>> > to abort again. A proper way to address this would then have been to check
>>> > for the status of the fw prior to aborting again, given we also lock on the
>>> > big fw_lock. A problem with this, though, is that the status is part of the
>>> > buf, which is set to NULL after we are done aborting.
>>>
>>> Yes, I've seen that too :\ This race seems to have been there prior
>>> to 4.9, though. I guess we could fix both issues with the NULL-check
>>> although I would prefer if we had both patches.
>>>
>>> FWIW I think the NULL-check could be put in the existing conditional:
>>>
>>>  	 * There is a small window in which user can write to 'loading'
>>>  	 * between loading done and disappearance of 'loading'
>>>  	 */
>>> -	if (fw_state_is_done(&buf->fw_st))
>>> +	if (!buf || fw_state_is_done(&buf->fw_st))
>>>  		return;
>>>
>>>  	list_del_init(&buf->pending_list);
>>>
>>> Note that the comment above seems to be mentioning the race we're
>>> trying to solve.
>>
>> Right, I think another approach is to *enable* the state of the buf
>> to be used to avoid further use on the sysfs interface instead. Fortunately
>> other sysfs interfaces already use fw_state_is_done() to bail out,
>> so all that would be needed I think would be:
>>
>> diff --git a/drivers/base/firmware_class.c b/drivers/base/firmware_class.c
>> index b9ac348e8d33..30ccf7aea3ca 100644
>> --- a/drivers/base/firmware_class.c
>> +++ b/drivers/base/firmware_class.c
>> @@ -558,9 +558,6 @@ static void fw_load_abort(struct firmware_priv *fw_priv)
>>  	struct firmware_buf *buf = fw_priv->buf;
>>
>>  	__fw_load_abort(buf);
>> -
>> -	/* avoid user action after loading abort */
>> -	fw_priv->buf = NULL;
>> }
>>
>> static LIST_HEAD(pending_fw_head);
>> @@ -713,7 +710,7 @@ static ssize_t firmware_loading_store(struct device *dev,
>>
>>  	mutex_lock(&fw_lock);
>>  	fw_buf = fw_priv->buf;
>> -	if (!fw_buf)
>> +	if (!fw_buf || fw_state_is_aborted(&fw_buf->fw_st))
>>  		goto out;
>>
>>  	switch (loading) {
>
>IMHO this one is nice! I think you can even drop the !fw_buf check in
>this case because AFAICS the only case where fw_buf is set to NULL is
>in the abort function.
>
I can confirm that the patch looks good and works on my setup, even without the !fw_buf check.
Feel free to grab everything you need from my commit log, if it helps.
Unfortunately there is a crazy spam filter between us, so you can't rely on me.
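
For anyone following along, here is a rough user-space sketch of the idea in the patch above: the abort path only marks the state as aborted and leaves the buf pointer in place, and the 'loading' store checks that state under the lock before acting. This is not the kernel code. The pthread mutex standing in for fw_lock, the enum values, the fw_timeout_abort() helper, and the 0/-1 return convention are all assumptions made for illustration; only the function and field names quoted in the diffs come from the thread.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types named in the thread. */
enum fw_status { FW_STATUS_LOADING, FW_STATUS_DONE, FW_STATUS_ABORTED };

struct fw_state { enum fw_status status; };
struct firmware_buf { struct fw_state fw_st; };
struct firmware_priv { struct firmware_buf *buf; };

/* User-space substitute for the kernel's fw_lock mutex (assumption). */
static pthread_mutex_t fw_lock = PTHREAD_MUTEX_INITIALIZER;

static bool fw_state_is_aborted(const struct fw_state *st)
{
	return st->status == FW_STATUS_ABORTED;
}

/*
 * Abort only marks the state; the buf pointer is NOT cleared, so later
 * callers can still inspect the state. Caller must hold fw_lock.
 */
static void fw_load_abort(struct firmware_priv *fw_priv)
{
	assert(fw_priv && fw_priv->buf);
	fw_priv->buf->fw_st.status = FW_STATUS_ABORTED;
}

/* Hypothetical model of the timeout path aborting the load. */
static void fw_timeout_abort(struct firmware_priv *fw_priv)
{
	pthread_mutex_lock(&fw_lock);
	fw_load_abort(fw_priv);
	pthread_mutex_unlock(&fw_lock);
}

/*
 * Model of a write to the sysfs 'loading' file.
 * Returns 0 if the write was acted on, -1 if it was ignored
 * (return convention is an assumption of this sketch).
 */
static int firmware_loading_store(struct firmware_priv *fw_priv, int loading)
{
	struct firmware_buf *fw_buf;
	int ret = -1;

	pthread_mutex_lock(&fw_lock);
	fw_buf = fw_priv->buf;
	/* Once aborted, ignore any further user action on this buf. */
	if (!fw_buf || fw_state_is_aborted(&fw_buf->fw_st))
		goto out;

	switch (loading) {
	case 1:
		fw_buf->fw_st.status = FW_STATUS_LOADING;
		ret = 0;
		break;
	case 0:
		fw_buf->fw_st.status = FW_STATUS_DONE;
		ret = 0;
		break;
	case -1:
		fw_load_abort(fw_priv);
		ret = 0;
		break;
	default:
		break;
	}
out:
	pthread_mutex_unlock(&fw_lock);
	return ret;
}
```

In this model, a user write of -1 that arrives after fw_timeout_abort() has run sees the aborted state and returns without touching the buf again, which is the second abort the thread is trying to avoid. Whether the !fw_buf half of the check can really be dropped depends on no other path clearing the pointer, which this sketch does not try to establish.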