Re: [PATCH] sky2: Use deferrable timer for watchdog

From: Arjan van de Ven
Date: Thu Dec 20 2007 - 14:14:08 EST



My interpretation of the api is:
* round_jiffies() - timer wants to wake up but isn't precise about when, so schedule
it on the next whole second, when the system will wake up anyway;
e.g. why meetings are usually scheduled on the hour

* deferrable - timer doesn't really have to wake up but wants to happen near
a particular time. e.g. "I'll meet you at the pub around 8pm"

this is not correct.

deferrable means "if you're busy wake me up at this time. But if not, don't bother waking up for me, get to it
later".

The "later" can be a LONG time later, several seconds easily, if not more.
(timers are on a per-CPU basis, and you may end up with a several-core system where the common timers are all on another CPU
than this one)
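
To make that concrete, here is a rough userspace model in C. It is not kernel code: struct model_timer, next_idle_wakeup() and actual_run_time() are made-up names for illustration, and the model assumes a single CPU that stays idle except for timer wakeups. The only point it shows is that a deferrable timer never contributes to the idle CPU's next wakeup; it just runs along with whatever non-deferrable timer wakes the CPU next.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NO_WAKEUP (-1L)   /* hypothetical marker: nothing will wake the CPU */

/* Hypothetical model of a pending timer on one otherwise idle CPU. */
struct model_timer {
    long expires;      /* requested expiry, in ticks */
    bool deferrable;   /* true: never wakes an idle CPU on its own */
};

/* Next tick at which the idle CPU must wake: deferrable timers are skipped. */
static long next_idle_wakeup(const struct model_timer *t, size_t n)
{
    long next = NO_WAKEUP;

    for (size_t i = 0; i < n; i++) {
        if (t[i].deferrable)
            continue;
        if (next == NO_WAKEUP || t[i].expires < next)
            next = t[i].expires;
    }
    return next;
}

/* Tick at which timer t[idx] actually runs if the CPU stays idle. */
static long actual_run_time(const struct model_timer *t, size_t n, size_t idx)
{
    long run = NO_WAKEUP;

    assert(idx < n);
    if (!t[idx].deferrable)
        return t[idx].expires;

    /*
     * A deferrable timer piggybacks on the first wakeup caused by a
     * non-deferrable timer at or after its own expiry.
     */
    for (size_t i = 0; i < n; i++) {
        if (t[i].deferrable || t[i].expires < t[idx].expires)
            continue;
        if (run == NO_WAKEUP || t[i].expires < run)
            run = t[i].expires;
    }
    return run;
}
```

In this model, a deferrable timer due at tick 100 next to a normal timer due at tick 5000 runs at tick 5000. If every pending timer is deferrable, both functions return NO_WAKEUP: nothing on this CPU ever forces the wakeup.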



If this is the case then the whole usage of round_jiffies() is bogus. All users of round_jiffies()
should just be converted to deferrable?? I am a bit concerned that if deferrable gets used everywhere
then a strange situation would occur where all timers were waiting for some other timer to finally
happen, kind of a weird timelock situation. Like the old chip/dale cartoon:
"you first, no you first, after you mister chip, no after you mister dale,..."



that's a dangerous situation indeed and I'd really like to know what the limits
are for deferring deferrable timers.... Arjan, do you know? Anyone?

there is NO limit to deferring a timer. Do NOT use a deferrable timer if you can't afford for the timer to not happen
within, say, 10 to 100 seconds (or more)!
They are really meant for things where you CAN afford for it to not happen when you're idle....
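
For contrast, round_jiffies() only moves an expiry to a nearby whole-second boundary, so the shift stays under a second. Below is a simplified userspace sketch of that rounding, not the kernel function itself: model_round_jiffies() and MODEL_HZ are illustrative names, HZ is assumed to be 1000, the current time is passed in explicitly, and the small per-CPU skew and jiffies wraparound are left out.

```c
#include <assert.h>

#define MODEL_HZ 1000UL   /* assumed tick rate: 1000 ticks per second */

/*
 * Simplified model of rounding an absolute expiry to a whole second.
 * j   - requested expiry, in ticks
 * now - current time, in ticks (the kernel reads jiffies instead)
 */
static unsigned long model_round_jiffies(unsigned long j, unsigned long now)
{
    unsigned long rem = j % MODEL_HZ;
    unsigned long rounded;

    if (rem < MODEL_HZ / 4)
        rounded = j - rem;               /* just past a second: round down */
    else
        rounded = j - rem + MODEL_HZ;    /* otherwise round up */

    /* Never return a time that is not in the future; keep the original. */
    return rounded > now ? rounded : j;
}
```

With now = 4000, an expiry of 5100 becomes 5000 and an expiry of 5300 becomes 6000. The timer is still a normal timer and still wakes the CPU; it is only aligned with other rounded timers, which is a very different guarantee from deferrable.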



I don't see a danger just yet on normal systems - I get something like 10 wakeups
per second from just the kernel (acpi, ahci, usb) on most of my systems, which
guarantees that the watchdog runs often enough, but for embedded systems and
critical timers in other drivers this may quickly become an issue.

on my work desktop test box the average time between cpu wakeups is 1.4 seconds
(and that's single core). It would be higher if it wasn't for some hpet limit issues.

