Re: MSI interrupt for xhci still lost on 5.6-rc6 after cpu hotplug

From: Thomas Gleixner
Date: Tue May 05 2020 - 17:47:35 EST


Ashok,

"Raj, Ashok" <ashok.raj@xxxxxxxxx> writes:
> On Tue, May 05, 2020 at 09:36:04PM +0200, Thomas Gleixner wrote:
>> The way it works is:
>>
>> 1) New vector on different CPU is allocated
>>
>> 2) New vector is installed
>>
>> 3) When the first interrupt on the new CPU arrives then the cleanup
>> IPI is sent to the previous CPU which tears down the old vector
>>
>> So if the interrupt is sent to the original CPU just before #2 then this
>> will be correctly handled on that CPU after the set affinity code
>> reenables interrupts.
>
> I'll have this test tried out.. but in msi_set_affinity() the check below
> for lapic_vector_set_in_irr(cfg->vector), this is the new vector correct?
> don't we want to check for old_cfg.vector instead?
>
> msi_set_affinit ()
> {
> ....
> unlock_vector_lock();
>
> /*
> * Check whether the transition raced with a device interrupt and
> * is pending in the local APICs IRR. It is safe to do this outside
> * of vector lock as the irq_desc::lock of this interrupt is still
> * held and interrupts are disabled: The check is not accessing the
> * underlying vector store. It's just checking the local APIC's
> * IRR.
> */
> if (lapic_vector_set_in_irr(cfg->vector))
> irq_data_get_irq_chip(irqd)->irq_retrigger(irqd);

No. This catches the transitional interrupt to the new vector on the
original CPU, i.e. the one which is running that code.

Again the steps are:

1) Allocate new vector on new CPU

2) Set new vector on original CPU

3) Set new vector on new CPU

So we have 3 points where an interrupt can fire:

A) Before #2

B) After #2 and before #3

C) After #3

#A is hitting the old vector which is still valid on the old CPU and
will be handled once interrupts are enabled with the correct irq
descriptor - Normal operation (same as with maskable MSI)

#B This must be checked in the IRR because the there is no valid vector
on the old CPU.

#C is handled on the new vector on the new CPU

Thanks,

tglx