RE: [PATCH] fix x2apic defect that Linux kernel doesn't mask 8259A interrupt during the time window between changing VT-d table base address and initializing these VT-d entries (smpboot.c and apic.c)

From: Zhang, Lin-Bao (Linux Kernel R&D)
Date: Wed Oct 10 2012 - 19:06:28 EST



Hi Suresh,

Now I want to clarify two types of errors:
a) An interrupt doesn't work through the interrupt-remapped IO-APIC.
b) An interrupt arrives and there is an IO-APIC entry that says which CPU to route it to,
but the VT-d entry is empty, so ITP reports that the Present bit is clear.

The BIOS should provide simple IO-APIC entries and VT-d entries, although we don't trust them.
The issue we have run into is the second case.

1.
> As I mentioned earlier, the current design already ensures that all the IO-APIC
> RTE's are masked between the time we enable interrupt-remapping to the time
> when the IO-APIC RTE's are configured correctly.
>
> So I looked at why you are seeing the problem with v2.6.32 but not with the
> recent kernels. And I think I found out the reason.

I want to know what masking the IO-APIC means.
I know that, from the BIOS point of view, the IO-APIC can be disabled, as if there were no IO-APIC device at all.
So is masking the IO-APIC entries equivalent to disabling the IO-APIC? When an interrupt arrives,
does the system bypass the IO-APIC and deliver it directly to the Local APIC?
If yes, I have a question.
First, let me draw a timeline:
a)                          b)                          c)
|---------------------------|---------------------------|
Change VT-d table           Clear IO-APIC table         Create IO-APIC table and VT-d
(empty table)               (empty IO-APIC table)       table and initialize them
What if an interrupt arrives between a) and b)? I would expect the error "VT-d entry's Present bit is clear".
What if an interrupt arrives between b) and c)? ??
What if an interrupt arrives after c)? I would expect everything to work fine.
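
The three windows can be written down as a small model. This is only a sketch under stated assumptions, not kernel code: it assumes the standard IO-APIC redirection table entry layout (bit 16 of the low dword is the mask bit) and the VT-d interrupt remapping table entry layout (bit 0 is the Present bit). It also assumes that "clearing" the IO-APIC table leaves every entry masked, as the quoted text above describes, and it ignores the 8259A path entirely. The names model_irq, rte_mask, rte_unmask and model_self_check are hypothetical, as is the example vector 0x30.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit 16 of the low dword of an IO-APIC redirection table entry: mask bit. */
#define IOAPIC_RTE_MASK (1u << 16)
/* Bit 0 of a VT-d interrupt remapping table entry: Present bit. */
#define IRTE_PRESENT    (1ull << 0)

enum irq_outcome {
    IRQ_BLOCKED,   /* RTE masked: the IO-APIC does not forward the interrupt */
    IRQ_FAULT,     /* forwarded, but the IRTE is not present: remapping fault */
    IRQ_DELIVERED  /* forwarded and remapped to a CPU */
};

/* Hypothetical helpers: set or clear the mask bit of one RTE. */
static uint32_t rte_mask(uint32_t rte_low)   { return rte_low | IOAPIC_RTE_MASK; }
static uint32_t rte_unmask(uint32_t rte_low) { return rte_low & ~IOAPIC_RTE_MASK; }

/* Hypothetical model: classify what happens to one interrupt. */
static enum irq_outcome model_irq(uint32_t rte_low, uint64_t irte_low)
{
    if (rte_low & IOAPIC_RTE_MASK)
        return IRQ_BLOCKED;
    if (!(irte_low & IRTE_PRESENT))
        return IRQ_FAULT;
    return IRQ_DELIVERED;
}

/* Walk the timeline a) .. c) under the assumptions stated above. */
void model_self_check(void)
{
    uint32_t rte_from_bios = 0x30; /* assumed: unmasked entry left by the BIOS */
    uint64_t irte_empty = 0;       /* new, still empty VT-d table */

    /* Between a) and b): BIOS RTE still unmasked, VT-d entry empty. */
    assert(model_irq(rte_from_bios, irte_empty) == IRQ_FAULT);

    /* Between b) and c): RTE masked, VT-d entry still empty. */
    assert(model_irq(rte_mask(rte_from_bios), irte_empty) == IRQ_BLOCKED);

    /* After c): RTE unmasked again and the VT-d entry is present. */
    assert(model_irq(rte_unmask(rte_from_bios), IRTE_PRESENT) == IRQ_DELIVERED);
}
```

In this model, masking an entry only stops that entry from forwarding interrupts; it does not reroute them around the IO-APIC. Whether the real hardware and the 2.6.32 code path behave this way in each window is exactly the open question.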





2.
We know the BIOS may provide a simple IO-APIC table for us, although we (the OS) don't trust it.
I also did a test:

- Disable x2apic in the BIOS.
Take kernel source 3.3.8 and remove that patch (comment out the key line).
The kernel then displays:
Kernel panic - not syncing: timer doesn't work through Interrupt-remapped IO-APIC.
But with x2apic disabled in the BIOS, RHEL 6.2 doesn't need any patch; it just runs. Very strange.

- Enable x2apic in the BIOS.
Remove your patch from 3.3.8: same issue as ours, a VT-d entry error (Present bit 0 is clear).
But it doesn't display the panic above (it does not report that the timer fails through the IO-APIC,
which suggests that the IO-APIC entry exists).


-- Bob(LinBao Zhang)
HP Linux kernel engineer
