Re: [PATCH v3] cfi: fix deadloop in cfi_cmdset_0002.c do_write_buffer

From: Boris Brezillon
Date: Fri Mar 01 2019 - 12:43:24 EST


On Sat, 2 Mar 2019 01:59:41 +0900
"Tokunori Ikegami" <ikegami.t@xxxxxxxxx> wrote:

> > [...]
> > >>>>> In function do_write_buffer(), in the for loop, there is a
> > >>>>> case chip_ready() returns 1 while chip_good() returns 0, so
> > >>>>> it never break the loop.
> > >>>>> To fix this, chip_good() is enough and it should timeout if
> > >>>>> it stay bad for a while.
> > >>>>>
> > >>>>> Fixes: dfeae1073583("mtd: cfi_cmdset_0002: Change write
> > >>>>> buffer to check correct value")
> > >>>>> Signed-off-by: Yi Huaijie <yihuaijie@xxxxxxxxxx>
> > >>>>> Signed-off-by: Liu Jian <liujian56@xxxxxxxxxx>
> > >>>>> Reviewed-by: Tokunori Ikegami <ikegami_to@xxxxxxxxxxx>
> > >>>>> ---
> > >>>>> v2->v3:
> > >>>>> Follow Vignesh's advice:
> > >>>>> add one more check for check_good() even when time_after()
> > >>>>> returns
> > >> true.
> > >>>>>
> > >>>>> drivers/mtd/chips/cfi_cmdset_0002.c | 2 +-
> > >>>>> 1 file changed, 1 insertion(+), 1 deletion(-)
> > >>>>>
> > >>>>> diff --git a/drivers/mtd/chips/cfi_cmdset_0002.c
> > >>>>> b/drivers/mtd/chips/cfi_cmdset_0002.c
> > >>>>> index 72428b6..3da2376 100644
> > >>>>> --- a/drivers/mtd/chips/cfi_cmdset_0002.c
> > >>>>> +++ b/drivers/mtd/chips/cfi_cmdset_0002.c
> > >>>>> @@ -1876,7 +1876,7 @@ static int __xipram
> > >>>>> do_write_buffer(struct map_info *map, struct flchip *chip,
> > >>>>> continue;
> > >>>>> }
> > >>>>>
> > >>>>> - if (time_after(jiffies, timeo)
> > >>>>> && !chip_ready(map, adr))
> > >>>>> + if (time_after(jiffies, timeo)
> > >>>>> && !chip_good(map, adr, datum))
> > >>>>
> > >>>> Just another idea to understand easily.
> > >>>>
> > >>>> unsigned long now = jiffies;
> > >>>>
> > >>>> if (chip_good(map, adr, datum)) {
> > >>>> xip_enable(map, chip, adr);
> > >>>> goto op_done;
> > >>>> }
> > >>>>
> > >>>> if (time_after(now, timeo) {
> > >>>> break;
> > >>>> }
> > >>>>
> > >>>
> > >>> Thank you~. It is more easier to understandï
> > >>> If there are no other comments, I will send new patch again ):
> > >>
> > >> Except this version no longer does what Vignesh suggested. See
> > >> how you no longer test if chip_good() is true if time_after()
> > >> returns true. So, imagine the thread entering this function is
> > >> preempted just after the first chip_good() test, and resumed a
> > >> few ms later. time_after() will return true, but chip_good()
> > >> might also return true, and you ignore it.
> > >
> > > I think that the following 3 versions will be worked for
> > > time_after()
> > as a same result and follow the Vignesh-san suggestion.
> > >
> >
> > As Boris explained above version 3 does not really follow my
> > suggestion... Please see below
> >
> > > 1. Original Vignesh-san suggestion
> > >
> > > if (chip_good(map, adr, datum)) {
> > > xip_enable(map, chip, adr);
> > > goto op_done;
> > > }
> > >
> > > if (time_after(jiffies, timeo)) {
> > > /* Test chip_good() if TRUE incorrectly again so
> > > write
> > failure by time_after() can be avoided. */
> > > if (chip_good(map, adr, datum)) {
> > > xip_enable(map, chip, adr);
> > > goto op_done;
> > > }
> > > break;
> > > }
> > >
> >
> >
> > Right, here we check chip_good() once _even when time_after() is
> > true_ to avoid _spurious_ timeout
> >
> > > 2. Liujian-san v3 patch
> > >
> > > /* Test chip_good() if FALSE correctly so write failure
> > > by
> > time_after() can be avoided. */
> > > if (time_after(jiffies, timeo) && !chip_good(map, adr))
> > > break;
> > >
> > > if (chip_good(map, adr, datum)) {
> > > xip_enable(map, chip, adr);
> > > goto op_done;
> > > }
> > >
> >
> > This is a better version of 1
> >
> > > 3. My idea
> > >
> > > /* Save current jiffies value before chip_good() to avoid
> > > write
> > failure by time_after() without testing chip_good() again. */
> > > unsigned long now = jiffies;
> > >
> > > if (chip_good(map, adr, datum)) {
> > > xip_enable(map, chip, adr);
> > > goto op_done;
> > > }
> > >
> >
> > What if thread gets pre-empted at this point and is re-scheduled
> > exactly after timeo jiffies have elapsed? Below check would be true
> > and exit loop
>
> I think that the jiffies value now is save before chip_good() so
> below check would be false and not exit loop.

True, I overlooked that part, and so Vignesh did. This proves one
thing: code is not easier to follow with your version. IMO, if we want
to make things clear, we should add a comment to Liujian's explaining
why the extra test after time_after(jiffies, timeo) is needed.