Re: + stupid-hack-to-make-mainline-build.patch added to -mm tree

From: Zachary Amsden
Date: Wed Mar 07 2007 - 18:25:46 EST


Thomas Gleixner wrote:
> On the other hand we yet see things like:
>
> /* We use normal irq0 handler on cpu0. */
> time_init_hook();
>
> Which is just reaching into the kernel code directly and does not handle
> the clock event interrupt self contained. clockevents is not bound to
> IRQ0 and this kind of hackery is exactly what we need to avoid in order
> to get this maintainable.
>
> Once this is used by paravirt implementations a change to the
> mach-default implementation will break stuff left and right.

We've fixed that already. Thanks for pointing it out. We were just trying to re-use code.

> Also the whole LAPIC business is so horrible, that it hurts. The generic
> interrupt layer is there since almost a year and we still see the crude
> emulation of hardware and assumptions of irq0 setup all over the place.
>
> We carefully need to define, which existing kernel interfaces are used /
> hooked in which way.
>
> If the paravirt implementations actually use the already available
> abstractions in the way in which those abstractions are designed, then
> we get into a maintainable design. If there are shortcomings on those
> abstractions we need to fix them in a sane way or provide a _common_
> workaround (e.g. 128 bit math back and forth library) without impacting
> the main kernel code.
>
> Looking at vmitimer.c and the number of hardcoded assumptions are
> telling me, that we are heading in exactly the opposite direction.

No, the VMI timer is unique because, for SMP, it is based on the APIC. On i386, SMP is hardwired to depend on the APIC, so we simply re-use the pieces of it that are there, with the same assumptions about IRQs and hardware behavior, good or bad. We just have a different way of telling the LAPIC when to deliver interrupts.

The alternative is to pretty much completely copy apic.c into vmi.c or vmitimer.c, which seems a rather bad idea, since now two copies of nearly identical code need to be maintained.

> Yes, if they are used in a sane and self contained way without reaching
> all over the place and expecting that those functions, which are not
> part of the paravirt interfaces will work for ever.

But we definitely need pieces of the core APIC-dependent code. Xen needs pieces of it too, though only very select pieces for SMP boot. The ugliness you point out is there, but the reason is not that the paravirt code is cluttered; it is that the i386 code is so hardwired to the APIC model that separating from it is painful.

The correct solution here is to properly separate the APIC, SMP, and timer code, so that the logic we want to reuse is separated from the hardware dependence. Clock events and clocksources take care of most of the timer issues, but there is still ugliness from SMP timer events depending on part of the APIC infrastructure for wiring the interrupt gates.

> No it's not an absolute blocker, as long as we can take care, that the
> number of incarnations is
>
> - designed to be shareable between hypervisors which have the same time
>   model
> - common code like the 128 bit math is in a shared library
> - self contained and not reaching out into core kernel code for no good
>   reason
>
> Same goes for clock events, interrupts and other core facilities.
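
As an illustration of the "shared library" item in the list above: the thread does not spell out what the 128 bit math does, so the following is only a sketch under the assumption that it is the usual timekeeping scaling, (value * mul) >> shift, computed with a 128-bit intermediate so the product cannot overflow. The name scale_u64 is hypothetical, and unsigned __int128 is a GCC/Clang extension available on 64-bit targets.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical shared helper: compute (value * mul) >> shift using a
 * 128-bit intermediate product. The result is truncated to 64 bits, so
 * callers must pick mul and shift such that the scaled value fits.
 */
static inline uint64_t scale_u64(uint64_t value, uint64_t mul, unsigned int shift)
{
	unsigned __int128 product;

	assert(shift < 128);	/* shifting by >= the type width is undefined */
	product = (unsigned __int128)value * mul;
	return (uint64_t)(product >> shift);
}
```

With one such helper in a common place, each hypervisor timer implementation could call it instead of carrying its own copy of the wide multiply.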

I think that is what everyone wants. This is an iterative process. We certainly don't want to reach out into core kernel code unless there is a good reason to do so, and with every development of clock events, sources, and interrupts, we have less of a reason to do so, and the code gets cleaner and more maintainable.

Zach