Re: 3.2.1 Unable to reset IRR messages on boot

From: Konrad Rzeszutek Wilk
Date: Mon Mar 19 2012 - 15:43:10 EST


On Mon, Mar 19, 2012 at 09:30:46AM -0400, Josh Boyer wrote:
> On Mon, Mar 12, 2012 at 11:36:33AM -0700, Suresh Siddha wrote:
> > On Mon, 2012-03-12 at 09:24 -0400, Josh Boyer wrote:
> > > On Wed, Feb 01, 2012 at 12:00:30AM -0800, Suresh Siddha wrote:
> > > > Yes, it was helpful. Something like the appended patch should ignore the
> > > > bogus io-apic entry all together. As I can't test this, can you or the
> > > > reporter give the appended patch a try and ack please?
> > >
> > > Hi Suresh,
> > >
> > > Apologies for the delay. The original reporter had to return the
> > > machine he was using. We've since had another report where this
> > > happened and your patch below does indeed fix the issue.
> > >
> > > I'd suggest pushing this soon.
> > >
> > > https://bugzilla.redhat.com/show_bug.cgi?id=801501
> > >
> >
> > Thanks Josh. Peter/Ingo, please queue the appended patch for -tip.
>
> Hi Suresh,
>
> Seems this patch and Xen don't get along very well. See the bug link
> below. I've CC'd Konrad and hopefully he'll have some insight as to why
> that might be.

A quick glance at the code tells me that, with the patch,
'mp_register_ioapic' won't increment gsi_top. That value (gsi_top) is used
in 'get_nr_irqs_gsi()', and that function is used here:

#ifdef CONFIG_X86_IO_APIC
	/*
	 * For an HVM guest or domain 0 which see "real" (emulated or
	 * actual respectively) GSIs we allocate dynamic IRQs
	 * e.g. those corresponding to event channels or MSIs
	 * etc. from the range above those "real" GSIs to avoid
	 * collisions.
	 */
	if (xen_initial_domain() || xen_hvm_domain())
		first = get_nr_irqs_gsi();
#endif

So we get 'first' to be 16 instead of the proper GSI number. Or perhaps
it is some other bizarre number. I would need to instrument this to be sure.
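
To make the suspected failure concrete, here is a small user-space model in
C. It is not the kernel code. It assumes that the value returned by
'get_nr_irqs_gsi()' is the legacy IRQ count plus gsi_top, which is consistent
with 'first' ending up at 16 when no IO-APIC is registered. Everything with a
model_ prefix, and the 24-pin IO-APIC used in the note below, is made up for
illustration.

```c
#include <stdbool.h>

#define MODEL_NR_IRQS_LEGACY 16	/* the 16 legacy ISA IRQs */

/* Hypothetical stand-in for the kernel's IO-APIC bookkeeping. */
struct model_state {
	unsigned int gsi_top;	/* one past the highest GSI registered so far */
};

/*
 * Model of registering one IO-APIC. 'regs_all_ones' mimics the check
 * added by the patch: a bogus IO-APIC is skipped, so gsi_top is left
 * untouched.
 */
static void model_register_ioapic(struct model_state *s,
				  unsigned int gsi_base,
				  unsigned int nr_pins,
				  bool regs_all_ones)
{
	unsigned int gsi_end;

	if (regs_all_ones)
		return;		/* skipped: gsi_top is NOT incremented */

	gsi_end = gsi_base + nr_pins;
	if (gsi_end > s->gsi_top)
		s->gsi_top = gsi_end;
}

/* Assumed relation: dynamic IRQs start above legacy IRQs plus real GSIs. */
static unsigned int model_get_nr_irqs_gsi(const struct model_state *s)
{
	return MODEL_NR_IRQS_LEGACY + s->gsi_top;
}
```

In this model, registering a 24-pin IO-APIC at GSI base 0 gives 40 as the
start of the dynamic range, while the skipped case gives 16, so dynamic IRQs
would start inside the range of the real GSIs.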


Now, the reason the read of the IOAPIC reports:
[ 0.000000] I/O APIC 0xfec00000 regs return all ones, skipping!

is that we aren't supposed to read the APIC entries at all. There was
some discussion between Ingo (or Peter?) and Jeremy years ago about a pvops
call to do a hypercall to read said entries, but it was established that the
initial domain should have no such business. As such, the code does this:

	memset(dummy_mapping, 0xff, PAGE_SIZE);

and:
	/*
	 * We just don't map the IO APIC - all access is via
	 * hypercalls. Keep the address in the pte for reference.
	 */
	pte = pfn_pte(PFN_DOWN(__pa(dummy_mapping)), PAGE_KERNEL);
	break;

[Ignore that comment; there are no hypercalls for it.] This is in
arch/x86/xen/mmu.c. So every read of the IO_APIC returns all ones (0xff
in every byte).
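
A rough user-space illustration of why every register read comes back as all
ones: the page mapped in place of the IO-APIC is filled with 0xff, so any
32-bit load from it yields 0xffffffff. The model_ names and the 4096-byte
page size are assumptions for the sketch. The offsets 0x00 and 0x10 are the
IO-APIC's index register and data window, used here only as sample offsets.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define MODEL_PAGE_SIZE 4096	/* assumed page size for the sketch */

/* Stand-in for the page that gets mapped instead of the real IO-APIC. */
static uint8_t model_dummy_mapping[MODEL_PAGE_SIZE];

/* Same idea as the memset() quoted above: fill the page with 0xff. */
static void model_init_dummy_mapping(void)
{
	memset(model_dummy_mapping, 0xff, MODEL_PAGE_SIZE);
}

/* Read a 32-bit value at a byte offset inside the dummy page. */
static uint32_t model_read_reg(unsigned int offset)
{
	uint32_t val;

	memcpy(&val, &model_dummy_mapping[offset], sizeof(val));
	return val;
}

/* True when both sampled registers read back as all ones. */
static bool model_regs_all_ones(void)
{
	return model_read_reg(0x00) == 0xffffffffu &&
	       model_read_reg(0x10) == 0xffffffffu;
}
```

Once model_init_dummy_mapping() has run, model_regs_all_ones() returns true,
which is the condition the "regs return all ones, skipping!" message reports.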

