Re: [Linaro-acpi] [PATCH] Mailbox: Complete wait event only if Tx was successful

From: Ashwin Chaugule
Date: Tue Dec 16 2014 - 08:00:48 EST


On 16 December 2014 at 06:36, Sudeep Holla <sudeep.holla@xxxxxxx> wrote:
> On Fri, Dec 12, 2014 at 05:47:26PM +0000, Ashwin Chaugule wrote:
>> On 12 December 2014 at 03:43, Sudeep Holla <sudeep.holla@xxxxxxx> wrote:
>> > On Thursday 11 December 2014 01:46 AM, Ashwin Chaugule wrote:
>> >>
>> >> If a wait_for_completion_timeout() call returns due to a timeout,
>> >> the mbox code can still call complete() after returning from the wait.
>> >> This can cause subsequent transmissions on a channel to fail, since
>> >> the wait_for_completion_timeout() sees the completion variable
>> >> is !=0, caused by the erroneous complete() call, and immediately
>> >> returns without waiting for the time as expected by the client.
>> >>
>> >> Fix this by calling complete() only if the TX was successful.
>> >>
>> >> Signed-off-by: Ashwin Chaugule <ashwin.chaugule@xxxxxxxxxx>
>> >> ---
>> >> drivers/mailbox/mailbox.c | 2 +-
>> >> 1 file changed, 1 insertion(+), 1 deletion(-)
>> >>
>> >> diff --git a/drivers/mailbox/mailbox.c b/drivers/mailbox/mailbox.c
>> >> index 17e9e4a..4acaddb 100644
>> >> --- a/drivers/mailbox/mailbox.c
>> >> +++ b/drivers/mailbox/mailbox.c
>> >> @@ -101,7 +101,7 @@ static void tx_tick(struct mbox_chan *chan, int r)
>> >>         if (mssg && chan->cl->tx_done)
>> >>                 chan->cl->tx_done(chan->cl, mssg, r);
>> >>
>> >> -       if (chan->cl->tx_block)
>> >> +       if ((!r) && chan->cl->tx_block)
>> >>                 complete(&chan->tx_complete);
>> >
>> >
>> > Just curious to check if there's another possible race which is
>> > a different issue.
>> >
>> > Suppose the timer fired and indicated that the Tx is complete, then
>> > it tries to execute complete while the wait_for_completion_timeout timed
>> > out. Does that make sense ?
>> >
>> > So if yes, how about adding !completion_done(..) to the check while you
>> > are at this ?
>>
>> Yea. Seems like another race condition. I'll add it along with this..
>>
>
> Thanks !

IIUC, adding the !completion_done() check will not really fix this
race. Once wait_for_completion_timeout() has timed out and released
its internal lock, the completion count is still zero, so
completion_done() returns 0 and we call complete() anyway. That is
not what we want, since the "wait" is already over (after the timeout),
and the stale complete() makes the next wait on the channel return
immediately. I think the only right thing to do here is to increase the
timeout passed to wait_for_completion_timeout(). Thoughts?
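
To make the sequence concrete, here is a rough userspace model of it.
This is only a sketch, not kernel code: the model_* names are made up
for illustration, only the "done" counter of a completion is modelled,
and the steps are replayed sequentially rather than racing between the
timer and the waiter.

```c
#include <stdbool.h>

/*
 * Hypothetical userspace stand-in for the kernel's struct completion.
 * Only the 'done' counter is modelled; no locking, no sleeping.
 */
struct model_completion {
	unsigned int done;
};

/* Models completion_done(): false while nothing has been posted. */
static bool model_completion_done(const struct model_completion *x)
{
	return x->done != 0;
}

/* Models complete(): post one completion. */
static void model_complete(struct model_completion *x)
{
	x->done++;
}

/*
 * Models the outcome of wait_for_completion_timeout(): consume a posted
 * completion and return nonzero, or return 0 to indicate a timeout.
 */
static unsigned long model_wait_timeout(struct model_completion *x)
{
	if (x->done) {
		x->done--;
		return 1;	/* woken before the timeout */
	}
	return 0;		/* timed out */
}

/* Replays: timeout, late guarded complete(), then the next wait. */
static void model_replay_race(unsigned long *first, unsigned long *second)
{
	struct model_completion c = { .done = 0 };

	/* 1. First transfer: nothing arrives in time, the wait times out. */
	*first = model_wait_timeout(&c);

	/* 2. Late tx_tick(): the guard passes because done is still 0. */
	if (!model_completion_done(&c))
		model_complete(&c);

	/* 3. Second transfer: no Tx finished, yet the wait returns at once. */
	*second = model_wait_timeout(&c);
}
```

In this model the first wait reports a timeout and the second one
reports success without any Tx having completed, which is the failure
mode the !completion_done() check was supposed to prevent.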

Cheers,
Ashwin