Re: + stupid-hack-to-make-mainline-build.patch added to -mm tree

From: Thomas Gleixner
Date: Wed Mar 07 2007 - 03:31:56 EST


On Tue, 2007-03-06 at 18:08 -0800, Dan Hecht wrote:
> > IMO the paravirt interfaces should use nanoseconds anyway for both
> > readout and next event programming. That way the conversion is done in
> > the hypervisor once and the clocksources and clockevents are simple and
> > unified (except for the underlying hypervisor calls).
> >
>
> I disagree. The clocksource/clockevents layers are always going to have
> to convert nanoseconds to/from hardware units, so why not use it? And,
> some guests (say, a future version of linux that does trace-based
> process accounting) may want higher resolution than nanoseconds for
> certain uses.

That's a purely academic exercise. When we are at the point where
nanoseconds are too coarse - sometime after we have both retired - the
internal resolution will be femtoseconds or whatever fits.

Again: paravirt should use a common infrastructure for this. Virtual
clocksource and virtual clockevent devices, which operate on ktime_t and
not on some artificial clock chip emulation frequency. The backend
implementation will still be per hypervisor, but we have _ONE_ device
emulation model, which is exposed to the kernel instead of five.
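
A minimal sketch of what such a common model could look like, written
as plain C so it compiles outside the kernel. Every name in it
(struct virt_timer_backend, virt_timer_register(), the fake_* backend)
is hypothetical and not an existing kernel interface, and uint64_t
nanoseconds stand in for ktime_t:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-hypervisor backend. Both hooks speak nanoseconds. */
struct virt_timer_backend {
	const char *name;
	uint64_t (*read_ns)(void);                      /* clocksource readout */
	int (*set_next_event_ns)(uint64_t expires_ns);  /* next event, absolute */
};

static const struct virt_timer_backend *virt_backend;

/* The one device model the kernel sees; the backend is swapped per hypervisor. */
static int virt_timer_register(const struct virt_timer_backend *b)
{
	if (b == NULL || b->read_ns == NULL || b->set_next_event_ns == NULL)
		return -1;
	virt_backend = b;
	return 0;
}

static uint64_t virt_clock_read(void)
{
	assert(virt_backend != NULL);
	return virt_backend->read_ns();
}

/* Program the next event delta_ns from now; no frequency conversion involved. */
static int virt_clock_program(uint64_t delta_ns)
{
	assert(virt_backend != NULL);
	return virt_backend->set_next_event_ns(virt_clock_read() + delta_ns);
}

/* Stand-in backend, for illustration only; a real one would issue hypercalls. */
static uint64_t fake_now_ns = 5000;
static uint64_t fake_programmed_ns;

static uint64_t fake_read_ns(void)
{
	return fake_now_ns;
}

static int fake_set_next_event_ns(uint64_t expires_ns)
{
	fake_programmed_ns = expires_ns;
	return 0;
}

static const struct virt_timer_backend fake_backend = {
	.name = "fake",
	.read_ns = fake_read_ns,
	.set_next_event_ns = fake_set_next_event_ns,
};
```

Registering fake_backend and calling virt_clock_program(1000) hands the
backend an absolute expiry in nanoseconds. Only the two hooks differ
between hypervisors; the device model above them stays the same.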

On a Linux-based host, you probably end up with a hrtimer on the host
side to schedule the next event on the guest. So why do we need to
convert ktime_t to some virtual frequency in the guest so we can convert
it back into ktime_t on the host?

Abstraction for abstraction's sake is braindead. There is no real
reason to implement 128-bit math in that path just to make the virtual
clockevent device look like real hardware.
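
To make the round trip concrete, here is a rough sketch in plain C. The
1193182 Hz rate, the shift of 32 and all function names are assumptions
picked for illustration; they do not describe any particular
hypervisor's interface:

```c
#include <assert.h>
#include <stdint.h>

#define NSEC_PER_SEC   1000000000ULL
#define VIRT_TIMER_HZ  1193182ULL   /* assumed virtual rate, illustration only */
#define VIRT_SHIFT     32

/* Scaled factor so that ticks = (ns * mult) >> VIRT_SHIFT. */
static uint64_t virt_mult(void)
{
	return (VIRT_TIMER_HZ << VIRT_SHIFT) / NSEC_PER_SEC;
}

/* Guest side: nanoseconds to virtual device ticks. */
static uint64_t guest_ns_to_ticks(uint64_t delta_ns)
{
	uint64_t mult = virt_mult();

	/* Beyond this bound the 64-bit product overflows; lifting the
	 * bound is what requires wider (128-bit) math. */
	assert(delta_ns <= UINT64_MAX / mult);
	return (delta_ns * mult) >> VIRT_SHIFT;
}

/* Host side: virtual device ticks back to nanoseconds for its own timer. */
static uint64_t host_ticks_to_ns(uint64_t ticks)
{
	assert(ticks <= UINT64_MAX / NSEC_PER_SEC);
	return ticks * NSEC_PER_SEC / VIRT_TIMER_HZ;
}

/* What the host ends up with after the detour through virtual ticks. */
static uint64_t roundtrip_ns(uint64_t delta_ns)
{
	return host_ticks_to_ns(guest_ns_to_ticks(delta_ns));
}

/* What the host ends up with when the interface is nanoseconds. */
static uint64_t direct_ns(uint64_t delta_ns)
{
	return delta_ns;
}
```

For a 1 ms request, roundtrip_ns(1000000) truncates twice and comes back
slightly short of 1000000, while direct_ns(1000000) is exact and needs
neither the multiply nor the overflow bound.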

The abstraction of clockevents helps you to get rid of hardwired
hardware assumptions, but you insist on creating them artificially for
reasons which are beyond my grasp.

> In any case, this is beside the point; I'd prefer to
> stick to using the clockevents interface in the way it was intended
> rather than reaching into ->next_event.

Sigh. The gain is that you still have a good reason why you can't move
to the clockevents interface.

Jeremy spent a couple of hours getting NO_HZ running for Xen yesterday
instead of writing up lengthy excuses for why it is soooo hard and takes
sooo much time and the current interface is sooo insufficient.

tglx

