Re: [patch] CFS scheduler, -v19

From: Linus Torvalds
Date: Wed Jul 18 2007 - 12:05:26 EST




On Tue, 17 Jul 2007, Ingo Molnar wrote:
>
> * Ian Kent <raven@xxxxxxxxxx> wrote:
> >
> > In several places I have code similar to:
> >
> > wait.tv_sec = time(NULL) + 1;
> > wait.tv_nsec = 0;

Ok, that definitely should work.

Does the patch below help?

> ah! It passes in a low-res time source into a high-res time interface
> (pthread_cond_timedwait()). Could you change the time(NULL) + 1 to
> time(NULL) + 2, or change it to:
>
> gettimeofday(&wait, NULL);
> wait.tv_sec++;

This is wrong. It's wrong for two reasons:

- it really shouldn't be needed. I don't think "time()" has to be
*exactly* in sync, but I don't think it can be off by a third of a
second or whatever (as the "30% CPU load" would seem to imply)

- gettimeofday works on a timeval, pthread_cond_timedwait() works on a
timespec.

So if it actually makes a difference, it makes a difference for the
*wrong* reason: the time is still totally nonsensical in the tv_nsec field
(because it actually got filled in with msecs!), but now the tv_sec field
is in sync, so it hides the bug.

Anyway, hopefully the patch below might help. But we probably should make
this whole thing a much more generic routine (ie we have our internal
"getnstimeofday()" that still is missing the second-overflow logic, and
that is quite possibly the one that triggers the "30% off" behaviour).

Ingo, I'd suggest:
- ger rid of "timespec_add_ns()", or at least make it return a return
value for when it overflows.
- make all the people who overflow into tv_sec call a "fix_up_seconds()"
thing that does the xtime overflow handling.

Linus

---
Subject: time: make sure sys_gettimeofday() and sys_time() are in sync
From: Ingo Molnar <mingo@xxxxxxx>

make sure sys_gettimeofday() and sys_time() results are coherent.

Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
---
kernel/time/timekeeping.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

Index: linux/kernel/time/timekeeping.c
===================================================================
--- linux.orig/kernel/time/timekeeping.c
+++ linux/kernel/time/timekeeping.c
@@ -92,6 +92,19 @@ static inline void __get_realtime_clock_
} while (read_seqretry(&xtime_lock, seq));

timespec_add_ns(ts, nsecs);
+ /*
+ * Make sure xtime.tv_sec [returned by sys_time()] always
+ * follows the gettimeofday() result precisely. This
+ * condition is extremely unlikely, it can hit at most
+ * once per second:
+ */
+ if (unlikely(xtime.tv_sec != ts->tv_sec)) {
+ unsigned long flags;
+
+ write_seqlock_irqsave(&xtime_lock, flags);
+ update_wall_time();
+ write_sequnlock_irqrestore(&xtime_lock, flags);
+ }
}

/**

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/