Re: [PATCH 3/3] clocksource: fix can not set tsc as clocksourcebug

From: Thomas Gleixner
Date: Thu Jul 04 2013 - 06:55:35 EST


On Thu, 4 Jul 2013, Alex Shi wrote:
> commit 5d33b883aed81c6fbcd09c6f7c3619eee850a7e2
> clocksource: Always verify highres capability
>
> This commit will reject a clock to be system clocksource if it has no
> CLOCK_SOURCE_VALID_FOR_HRES flags. Then the tsc to be rejected as

No. It rejects the clocksource if the system is in oneshot mode and
the clocksource does not support HIGHRES.

So at boot time, the tick device mode is PERIODIC and the clocksource
is set to jiffies.

In clocksource_done_booting() we select the best clocksource from the
registered list. We are still in PERIODIC mode, so the selection in
clocksource_find_best() should grab TSC whether the HIGHRES_VALID flag
is set or not. The relevant check in find_best() is:

if (oneshot && !(cs->flags & CLOCK_SOURCE_VALID_FOR_HRES))
continue;

And oneshot should be definitely false at that point. So we don't care
about the HRES flag at all.

So if we select TSC from clocksource_done_booting() that prevents the
system from switching into highres mode as long as the VALID_FOR_HRES
flag is not set by the watchdog. If it gets set then
tick_clock_notify() tells the tick code to reevaluate.

So now the question is why you are observing that HPET is selected in
the first place.

Can you add instrumentation to the code and provide the data please? I
try to reproduce myself. What's the environment you're using?

> diff --git a/kernel/time/clocksource.c b/kernel/time/clocksource.c
> index 9d6c333..3ad9f29 100644
> --- a/kernel/time/clocksource.c
> +++ b/kernel/time/clocksource.c
> @@ -308,6 +308,8 @@ static void clocksource_watchdog(unsigned long data)
> * transition into high-res mode:
> */
> tick_clock_notify();
> + if (finished_booting)
> + schedule_work(&watchdog_work);

This only makes sense, when the clocksource which gets the FLAG set
has the highest rating, but was not selected because the system was in
ONESHOT mode already.

And I can't see why that should suddenly happen.

> static int clocksource_watchdog_kthread(void *data)
> {
> struct clocksource *cs, *tmp;
> @@ -412,11 +415,14 @@ static int clocksource_watchdog_kthread(void *data)
>
> mutex_lock(&clocksource_mutex);
> spin_lock_irqsave(&watchdog_lock, flags);
> - list_for_each_entry_safe(cs, tmp, &watchdog_list, wd_list)
> + list_for_each_entry_safe(cs, tmp, &watchdog_list, wd_list) {
> if (cs->flags & CLOCK_SOURCE_UNSTABLE) {
> list_del_init(&cs->wd_list);
> list_add(&cs->wd_list, &unstable);
> }
> + if (cs->flags & CLOCK_SOURCE_VALID_FOR_HRES)
> + clocksource_select();

Unlikely, but if we have 3 watched clocksources which have the HRES
bit set, you'd call 3 times clocksource_select().

Thanks,

tglx

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/