Re: [RFC][PATCH 1/2] Remove stop_machine from change_clocksource

From: john stultz
Date: Thu Jul 29 2010 - 16:49:38 EST

On Thu, 2010-07-29 at 09:11 +0200, Martin Schwidefsky wrote:
> On Wed, 28 Jul 2010 09:12:49 -0700
> john stultz <johnstul@xxxxxxxxxx> wrote:
> >
> > I do agree that there can be subtle side effects when dealing with
> > clocksources (part of why I'm being so cautious introducing this
> > change), and when the stop_machine code was added it seemed reasonable.
> > But given the limitations of stop_machine, the more I look at the
> > clocksource_change code, the more I suspect stop_machine is overkill and
> > we can safely just take the write lock on xtime_lock.
> >
> > If I'm still missing something, do let me know.
> What about a clocksource_unregister while a cpu is in the middle of a
> read_seqbegin/timekeeping_get_ns/read_seqretry? The clocksource structure
> is "free" after the successful call to the unregister. At least in theory
> this could be a use after free. The race window is tiny but on virtual
> systems there can be an arbitrary delay in the ktime_get sequence.

Huh. At first I thought "but we don't yet implement
clocksource_unregister!" but of course do now (I guess that TODO in the
clocksource.c header can be yanked :).

So yes, unregister has been contentious in the past for this very
reason. Once registered, its really hard to find a safe point when it
can be un-registered. Stop machine mostly solves this (although one
should note: vsyscall enabled clocksources really can't be freed, as
their vread() page needs to be statically mapped into userspace).

So while stop_machine is a solution here, it would make more sense to me
to use stop_machine (or maybe even a different method, as it sort of
screams RCU to me) to make sure all the cpus are out of the xtime_lock
critical section prior to returning from unregister_clocksource, rather
then stopping everything for the clocksource change.

> I agree that stop_machine is the big gun and restricts the code in the way
> how the clocksource functions may be call. But it is safe, no?

Actually, my reading of stop_machine makes me hesitate a bit, as I'm not
sure if with kernel preemption, we're sure to avoid stopping a thread
mid-syscall to gettimeofday. Anyone have a clue if that's avoided?

Regardless, we need some other method then stop_machine to register
clocksources, as stop_machine is just too limiting.


To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at