Re: [git pull] scheduler updates

From: Ingo Molnar
Date: Sat Nov 08 2008 - 13:58:26 EST



* Ingo Molnar <mingo@xxxxxxx> wrote:

> For that one, i chickened out, because we have this use in
> arch/x86/kernel/vsyscall_64.c:
>
> now = vread();
> base = __vsyscall_gtod_data.clock.cycle_last;
> mask = __vsyscall_gtod_data.clock.mask;
> mult = __vsyscall_gtod_data.clock.mult;
> shift = __vsyscall_gtod_data.clock.shift;
>
> which can be triggered by gettimeofday() on certain systems.
>
> And i couldn't convince myself that this sequence couldn't result in
> userspace-observable GTOD time warps there, so i went for the
> obvious fix first.
>
> If the "now = vread()"'s RDTSC instruction is speculated to after it
> reads cycle_last, and another vdso call shortly after this does
> another RDTSC in this same sequence, the two RDTSC's could be mixed
> up in theory, resulting in negative time?

the fuller sequence is:

	do {
		seq = read_seqbegin(&__vsyscall_gtod_data.lock);
		[...]
		now = vread();
		base = __vsyscall_gtod_data.clock.cycle_last;
		mask = __vsyscall_gtod_data.clock.mask;
		mult = __vsyscall_gtod_data.clock.mult;
		shift = __vsyscall_gtod_data.clock.shift;

		tv->tv_sec = __vsyscall_gtod_data.wall_time_sec;
		nsec = __vsyscall_gtod_data.wall_time_nsec;
	} while (read_seqretry(&__vsyscall_gtod_data.lock, seq));

now here we could have another race as well: a timer IRQ running on
another CPU can update __vsyscall_gtod_data.wall_time_[n]sec while we
are in the middle of this sequence.

now __vsyscall_gtod_data updates are protected via the
__vsyscall_gtod_data.lock seqlock, but that assumes that all
instructions within that sequence honor the seqlock's barriers.

Except for RDTSC, which can be speculatively executed outside that
region of code.
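
as a rough illustration of what the seqlock gives the reader (and what
it cannot give RDTSC), here is a userspace sketch using C11 atomics.
the struct and the sketch_*() names are made up for this example and
are not the kernel's seqlock API, and it assumes a single writer:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* hypothetical stand-in for __vsyscall_gtod_data, not the kernel layout */
struct gtod_sketch {
	atomic_uint seq;		/* odd while the writer is mid-update */
	_Atomic uint64_t cycle_last;
	_Atomic uint64_t wall_nsec;
};

static unsigned sketch_read_begin(struct gtod_sketch *g)
{
	unsigned s;

	/* spin while an update is in progress (odd count) */
	while ((s = atomic_load_explicit(&g->seq, memory_order_acquire)) & 1u)
		;
	return s;
}

static int sketch_read_retry(struct gtod_sketch *g, unsigned start)
{
	/* order the data loads before re-checking the counter */
	atomic_thread_fence(memory_order_acquire);
	return atomic_load_explicit(&g->seq, memory_order_relaxed) != start;
}

/* single writer assumed (think: the timer IRQ path) */
static void sketch_write(struct gtod_sketch *g, uint64_t cycles, uint64_t nsec)
{
	unsigned s = atomic_load_explicit(&g->seq, memory_order_relaxed);

	assert((s & 1u) == 0);	/* no update may already be in flight */
	atomic_store_explicit(&g->seq, s + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&g->cycle_last, cycles, memory_order_relaxed);
	atomic_store_explicit(&g->wall_nsec, nsec, memory_order_relaxed);
	atomic_store_explicit(&g->seq, s + 2, memory_order_release);
}

static void sketch_read(struct gtod_sketch *g, uint64_t *cycles, uint64_t *nsec)
{
	unsigned s;

	do {
		s = sketch_read_begin(g);
		*cycles = atomic_load_explicit(&g->cycle_last, memory_order_relaxed);
		*nsec = atomic_load_explicit(&g->wall_nsec, memory_order_relaxed);
	} while (sketch_read_retry(g, s));
}
```

the fences here only order memory accesses. a cycle counter read placed
inside the do/while loop is not a memory access, so nothing in this
sketch would pin it between the two counter checks.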

RDTSC has no 'explicit' data dependency - there's no MESI-like
coherency guarantee for stuffing a cycle counter into a register and
then putting that into __vsyscall_gtod_data.clock.cycle_last. So we
create one, by using the combination of MFENCE and LFENCE. (because
RDTSC implementations on Intel and AMD CPUs honor different fence
instructions.)
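
a minimal userspace sketch of the fencing idea, not the kernel's actual
helper: bracket the counter read with fences so it cannot drift out of
the surrounding code. fenced_cycles() is a made-up name, and the sketch
assumes a CPU on which LFENCE orders RDTSC; on other CPUs a different
fence may be the one that works. on non-x86-64 builds it falls back to
a plain clock read just so the example still compiles:

```c
#include <assert.h>
#include <stdint.h>
#include <time.h>
#if defined(__x86_64__)
#include <x86intrin.h>
#endif

/* hypothetical helper, not a kernel API */
static uint64_t fenced_cycles(void)
{
#if defined(__x86_64__)
	uint64_t t;

	_mm_lfence();	/* earlier loads complete before RDTSC executes */
	t = __rdtsc();
	_mm_lfence();	/* RDTSC completes before later loads begin */
	return t;
#else
	/* no RDTSC on this target: use the C11 wall clock instead */
	struct timespec ts;
	int ok = timespec_get(&ts, TIME_UTC);

	assert(ok == TIME_UTC);
	return (uint64_t)ts.tv_sec * 1000000000u + (uint64_t)ts.tv_nsec;
#endif
}
```

in the sequence above, a read like this would take the place of the
bare counter read, so that the sample is taken after the seqlock
counter has been read and before the cycle_last load.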

all in all, i think it's still needed to avoid negative GTOD jumps.
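
to make the "negative jump" concrete, here is a small sketch of the
cycles-to-nanoseconds step. the formula ((now - base) & mask) * mult >>
shift is an assumption about how the fields read above get combined,
and both helper names are hypothetical:

```c
#include <assert.h>
#include <stdint.h>

/* hypothetical helper: masked distance from cycle_last to the sample */
static uint64_t sketch_cycle_delta(uint64_t now, uint64_t base, uint64_t mask)
{
	return (now - base) & mask;	/* wraps around if now < base */
}

/* hypothetical helper: scale a cycle delta to nanoseconds */
static uint64_t sketch_cycles_to_nsec(uint64_t delta, uint32_t mult, uint32_t shift)
{
	assert(shift < 64);
	return (delta * mult) >> shift;
}
```

with base = 1000 and now = 1010 the delta is 10 cycles, as expected. if
the counter sample was really taken earlier, say now = 990 against the
same base, then under a full 64-bit mask the delta wraps to 2^64 - 10,
which is -10 when read as a signed value.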

Ingo