Re: [PATCH] raise tsc clocksource rating
From: Ingo Molnar
Date: Tue Oct 30 2007 - 03:15:32 EST
* Dan Hecht <dhecht@xxxxxxxxxx> wrote:
>> but if there's a perfect TSC available (there is such hardware) then
>> the TSC _is_ the best clocksource. Paravirt now turns it off
>> unconditionally in essence.
>
> Not really. In the case hardware TSC is perfect, the paravirt time
> counter can be implemented directly in terms of hardware TSC; there is
> no loss in optimization. This is done transparently. And virtual TSC
> can be implemented this way too.
Of course if you duplicate all (or part) of the TSC clocksource driver
in the paravirt guest code then the "paravirt clocksource" is at least
as good as the TSC. But that argument is playing word-games: _of course_
if you use the same (or similar) code it's at least as good. The real
question is about clocksources that communicate out to the hypervisor,
and hence have higher overhead than a native, TSC-based clocksource - and
clocksources that use the TSC in a broken way.
> The real improvement that a paravirt clocksource offers over the TSC
> clocksource is that the guest does not need to measure the TSC
> frequency itself against some other constant frequency source (which
> is problematic on a virtual machine). [...]
hey, you need not tell me, i've implemented a hyper-clocksource driver
myself. But calibration is a boot-only issue and there's no reason why
calibration _has_ to be fragile. For example we could easily extend the
TSC clocksource driver to not calibrate in the guest but take
calibration information from the host. It's in essence a trivial and
obvious extension to calibration. That way we get the highest possible
performance _and_ we share much of the clocksource driver with the host.
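A minimal sketch of that idea, in userspace C and not actual kernel code: the guest skips its own calibration and derives the cycles-to-nanoseconds scaling purely from a TSC frequency handed over by the host. The names guest_tsc_clock, guest_tsc_clock_init(), guest_tsc_cycles_to_ns() and the host_tsc_khz parameter are hypothetical, and how the host would actually pass the frequency to the guest is left open here.

```c
#include <stdint.h>

/* Hypothetical scaling state: ns = (cycles * mult) >> shift */
struct guest_tsc_clock {
	uint32_t mult;
	uint32_t shift;
};

/*
 * Set up the scaling from a host-provided TSC frequency (in kHz)
 * instead of measuring the TSC against another timer in the guest.
 * Returns 0 on success, -1 if the frequency is unusable.
 */
static int guest_tsc_clock_init(struct guest_tsc_clock *c,
				uint32_t host_tsc_khz)
{
	uint64_t mult;

	if (host_tsc_khz == 0)
		return -1;

	c->shift = 22;
	/* ns per cycle is 10^6 / kHz; scale by 2^shift, round to nearest */
	mult = (((uint64_t)1000000 << c->shift) + host_tsc_khz / 2)
		/ host_tsc_khz;
	if (mult > UINT32_MAX)
		return -1;	/* frequency too low for this shift */

	c->mult = (uint32_t)mult;
	return 0;
}

/*
 * Convert a cycle delta to nanoseconds. The multiplication can
 * overflow 64 bits for very large deltas, so this is only meant
 * for short intervals.
 */
static uint64_t guest_tsc_cycles_to_ns(const struct guest_tsc_clock *c,
				       uint64_t cycles)
{
	return (cycles * c->mult) >> c->shift;
}
```

With a host-reported frequency of 1000000 kHz (1 GHz) this gives mult = 2^22, so 1000 cycles convert to exactly 1000 ns. Everything after the init step is the same arithmetic a native TSC clocksource would do, which is the point: only the source of the frequency changes.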
also, the way the TSC is used by guests like Xen is fundamentally
fragile on SMP. So i have a good reason to distrust the approach of
hypervisors to timekeeping. The maintenance problem to me is that
everyone in the paravirt space is busy coding away in their own (often
broken) direction, replicating the essence of the TSC clocksource driver
four times over, with subtle bugs in each variant, even in
cases where the TSC readout can be trusted perfectly well.
"Consolidation" and "sharing code" are not particularly strong points of
the paravirt projects ;-) (ok, KVM is a notable exception there.)
anyway, i do agree that this patch is wrong currently, mainly due to TSC
calibration not being reliable in guest-space at the moment - but the
whole concept of putting a separate clocksource driver into each
paravirt guest, even in the case where the TSC is perfect, is madness.
That code, once the hardware gets sane (and there are good signs for
that), and once calibration can be passed from host to guest reliably,
_will_ be consolidated, because it makes perfect technical sense.
Ingo