Re: [PATCH] Skip tsc synchronization checks if CONSTANT_TSC bit is set.

From: Andi Kleen
Date: Thu Oct 23 2008 - 04:03:47 EST


> >
> > Or are you saying time is always broken on VMware & Linux?
>
> The acpi_pm timer wrap problem has come up only with the clocksource and
> NO_HZ kernels, without NO_HZ there were periodic interrupts which caused
> the guest to be scheduled before ACPI_PM could wrap around.

ACPI_PM should just be fixed. My old independent noidlehz implementation
just always limited the sleep times to half the wrap time of the
timer. I suspect this needs to be done here too.
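
As a rough illustration of the half-wrap limit (a sketch only, not the
actual noidlehz code): the function names below are hypothetical, and the
example assumes the common 24-bit ACPI PM timer running at 3.579545 MHz,
which wraps after roughly 4.7 seconds.

```c
#include <assert.h>
#include <stdint.h>

/* Time in nanoseconds until a free-running counter of the given
 * width (in bits) and frequency (in Hz) wraps around. */
uint64_t counter_wrap_ns(unsigned bits, uint64_t hz)
{
    /* Keep the multiplication below within 64 bits. */
    assert(bits >= 1 && bits <= 32 && hz > 0);
    return ((UINT64_C(1) << bits) * UINT64_C(1000000000)) / hz;
}

/* Longest idle sleep allowed: half the wrap time, so that a single
 * wrap can never go unnoticed between two reads of the counter. */
uint64_t max_idle_sleep_ns(unsigned bits, uint64_t hz)
{
    return counter_wrap_ns(bits, hz) / 2;
}

/* Clamp a requested idle sleep to the half-wrap limit. */
uint64_t clamp_idle_sleep_ns(uint64_t requested_ns, unsigned bits, uint64_t hz)
{
    uint64_t limit = max_idle_sleep_ns(bits, hz);

    return requested_ns < limit ? requested_ns : limit;
}
```

With these assumptions, a 10 second idle request would be cut to about
2.3 seconds, while a 1 ms request passes through unchanged.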



>
> > > So TSC is the ideal clocksource from performance and correctness point
> > > of view for VMware.
> >
> > But you don't seem to emulate it "ideal"ly otherwise you wouldn't
> > need all these hacks you're adding?
>
> "All these hacks" ? i guess you are talking about only this particular,

Everything that requires vmware detection means your hardware
emulation is not good enough.

> skipping the tsc_sync checks.
> Rest of them are valid bugs as i have mentioned.

The tsc frequency one didn't sound like a valid bug.

> > or implement
> > a real vmware PV timer and just say it's PV and not fully virtualized.
> > But doesn't the vmware paravirt ops have that already anyways?
>
> That's for 32bit only.

I thought there were some efforts to make it 64bit too?
Or is there no VMI ROM on 64bit? Perhaps you could do the
timer without the ROM then.

> Apart from the tsc_sync problem i doubt we have
> any other issue with the TSC as clocksource, so adding a similar
> clocksource is something that i would avoid.
>
> >
> > But I personally think it wouldn't really scale to add detection for
> > more and more "nearly PV" hypervisors to the standard native kernel.
>
> I think we anyways need a way to detect if we are running on a
> hypervisor.

For PV sure. But not for non PV.

> That's the only way we can move towards having a single
> image which runs well on both native hardware and a virtualized
> environment.

If a hypervisor is not good enough to simulate hardware closely
enough, it should just set up the respective paravirt ops (or register
its own clock drivers etc.), but not complicate the native code with a
weird half-PV, half-fully-emulated mix.
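
To make the "register own clock drivers" idea concrete, here is a minimal
userspace sketch. It is loosely modeled on the rating idea of the kernel
clocksource framework but is not kernel code: struct clock_driver,
clock_register(), clock_select(), both example drivers, and the rating
values are all hypothetical. The point is that the selection code never
asks which hypervisor it is running on.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical clock driver descriptor: higher rating wins. */
struct clock_driver {
    const char *name;
    int rating;
    uint64_t (*read)(void);
};

#define MAX_CLOCKS 8

static const struct clock_driver *clocks[MAX_CLOCKS];
static size_t nr_clocks;

/* Add a driver to the registry; returns 0 on success, -1 on error. */
int clock_register(const struct clock_driver *drv)
{
    if (!drv || !drv->read || nr_clocks == MAX_CLOCKS)
        return -1;
    clocks[nr_clocks++] = drv;
    return 0;
}

/* Pick the best registered driver purely by rating. */
const struct clock_driver *clock_select(void)
{
    const struct clock_driver *best = NULL;
    size_t i;

    for (i = 0; i < nr_clocks; i++)
        if (!best || clocks[i]->rating > best->rating)
            best = clocks[i];
    return best;
}

/* Stub read functions; a real driver would read a counter here. */
static uint64_t native_read(void) { return 0; }
static uint64_t hv_read(void) { return 0; }

/* Registered by the generic native code. */
const struct clock_driver native_clock = { "native-tsc", 300, native_read };
/* Registered by hypervisor-specific (PV) setup code only. */
const struct clock_driver hv_clock = { "hv-clock", 400, hv_read };
```

In this sketch the hypervisor-specific setup path calls
clock_register(&hv_clock) on its own, and the higher rating makes
clock_select() prefer it without any change to the native driver.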


>
> I guess, the only thing that you don't agree over here is the enabling
> of CONSTANT_TSC bit when VMware is detected, right ?

My POV is that code that is supposed to drive real hardware shouldn't
have any "is hypervisor X|Y|Z" hacks. We already have a whole
lot of infrastructure for PV hypervisors.

For tsc_sync I suspect the fix is to either completely trust CONSTANT_TSC
or make the check accept more offset or possibly a combination of both.
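
The two options could be combined along these lines. This is only a
sketch of the decision logic, not the kernel's tsc_sync code: the struct,
the function name, and the idea of expressing the tolerated offset as a
cycle count are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical policy knobs for the TSC synchronization check. */
struct tsc_sync_policy {
    bool trust_constant_tsc;   /* skip the check if CONSTANT_TSC is set */
    uint64_t max_warp_cycles;  /* largest offset still accepted */
};

/* Returns true if the TSC should be considered usable on this CPU. */
bool tsc_sync_acceptable(const struct tsc_sync_policy *policy,
                         bool cpu_has_constant_tsc,
                         uint64_t observed_warp_cycles)
{
    /* Option 1: completely trust the CONSTANT_TSC bit. */
    if (policy->trust_constant_tsc && cpu_has_constant_tsc)
        return true;

    /* Option 2: tolerate a bounded offset between CPUs. */
    return observed_warp_cycles <= policy->max_warp_cycles;
}
```

Setting max_warp_cycles to 0 with trust_constant_tsc disabled gives a
strict check that rejects any observed offset, so the two knobs cover
the pure variants as well as the combination.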

-Andi
--
ak@xxxxxxxxxxxxxxx