Re: PROBLEM: CONFIG_NO_HZ could cause software timeouts

From: Marcin Slusarz
Date: Sat Sep 05 2009 - 14:20:02 EST


Norbert van Bolhuis wrote:
>
> The problem occurs when e.g. drivers use time_after(jiffies, timeout).
>
> CONFIG_NO_HZ could make jiffies advance by more than 1.
> This is done by:
> tick_nohz_update_jiffies->tick_do_update_jiffies64->do_timer
>
> If drivers use a timeout value of jiffies+1,
> "time_after(jiffies, timeout)" will be true after 1 interrupt
> (given that it advances jiffies by at least 2).
>
> This is exactly what happens in cfi_cmdset_0002.c:do_write_buffer
> for our case (Powerpc MPC8313, linux-2.6.28, CONFIG_HZ=250,
> CONFIG_NO_HZ=y).
>
> do_write_buffer does the following:
> unsigned long uWriteTimeout = ( HZ / 1000 ) + 1;
> ...
> timeo = jiffies + uWriteTimeout;
> ...
> for (;;) {
>         ...
>         if (time_after(jiffies, timeo) && !chip_ready(map, adr))
>                 break;
>         if (chip_ready(map, adr)) {
>                 xip_enable(map, chip, adr);
>                 goto op_done;
>         }
>         UDELAY(map, chip, adr, 1);
> }
> /* software timeout */
> ret = -EIO;
> op_done:
> ...
>
> I've seen a few software timeouts after the for-loop
> looped only 13 times (= 13 us delay, instead of the expected 1 ms). Typically

Are you sure? UDELAY may call schedule(), which can return to this thread
after a much longer time than 13 us...
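
For reference, the arithmetic in the quoted code can be checked outside the
kernel. The following is only a user-space sketch, not driver code: the
time_after() macro mirrors the kernel's wraparound-safe comparison without
its typecheck, HZ is taken from the report (CONFIG_HZ=250), and
expired_after_update() is a hypothetical helper invented for illustration.

```c
#include <stdbool.h>

/* User-space stand-in for the kernel's time_after(): true when a is
 * later than b, even across an unsigned long wraparound. */
#define time_after(a, b) ((long)((b) - (a)) < 0)

/* CONFIG_HZ from the report; an assumption of this sketch. */
#define HZ 250UL

/* Hypothetical helper: compute the deadline the way the quoted code does,
 * then report whether it has expired once jiffies has advanced by `jump`
 * in a single update. */
static bool expired_after_update(unsigned long start, unsigned long jump)
{
        /* Integer division: 250 / 1000 == 0, so the timeout is 1 jiffy. */
        unsigned long write_timeout = (HZ / 1000) + 1;
        unsigned long timeo = start + write_timeout;
        unsigned long jiffies = start + jump;

        return time_after(jiffies, timeo);
}
```

Under these assumptions, expired_after_update(1000, 1) is false and
expired_after_update(1000, 2) is true, so one update that advances jiffies
by 2 is enough to pass the deadline. This says nothing about how much
wall-clock time has elapsed in the loop; it only shows the comparison.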


Marcin