Re: [patch] x64: Avoid irq_chip mask/unmask in fixup_irqs forinterrupt-remapping

From: Suresh Siddha
Date: Fri Jun 05 2009 - 21:00:07 EST


On Fri, 2009-06-05 at 15:20 -0700, Gary Hade wrote:
> On Thu, Jun 04, 2009 at 06:18:14PM -0700, Suresh Siddha wrote:
> > On Thu, 2009-06-04 at 16:13 -0700, Eric W. Biederman wrote:
> > > Suresh Siddha <suresh.b.siddha@xxxxxxxxx> writes:
> > >
> > > > From: Suresh Siddha <suresh.b.siddha@xxxxxxxxx>
> > > > Subject: x64: Avoid irq_chip mask/unmask in fixup_irqs for interrupt-remapping
> > > >
> > > > In the presence of interrupt-remapping, irqs will be migrated in the
> > > > process context and we don't do (and there is no need to) irq_chip mask/unmask
> > > > while migrating the interrupt.
> > > >
> > > > Similarly fix the fixup_irqs() that get called during cpu offline and avoid
> > > > calling irq_chip mask/unmask for irqs that are ok to be migrated in the
> > > > process context.
> > > >
> > > > While we didn't observe any race condition with the existing code,
> > > > this change takes complete advantage of interrupt-remapping in
> > > > the newer generation platforms and avoids any potential HW lockup's
> > > > (that often worry Eric :)
> > >
> > > You now apparently fail to migrate the irq threads in tandem with
> > > the rest of the irqs.
> >
> > Eric, Are you referring to Gary's issues? As far as I understand, they
> > don't happen in the presence of interrupt-remapping.
>
> Suresh,
> We do not currently have the h/w on which to test this assertion
> but it seems like there is a good chance that at least the race that
> http://lkml.org/lkml/2009/6/2/377
> fixes could reproduce there.

No. In the presence of interrupt-remapping, migration of the irq will be
done atomically from the process context. So we don't depend for the
next interrupt to arrive for the migration.

In the particular case that you mentioned above, we are calling the
send_cleanup_vector() from the set_affinity() itself in case of
interrupt-remapping. And this will reset the cfg->move_in_progress.

But I agree that this bug is pretty nasty for non interrupt-remapping
cases and we are very lucky so far for not hitting it with all the
irqbalance changes and with suspend/resume code path.

While I agree with your online fix, I have to catch up with Eric's
concerns.

>
> The other problem that is repaired by
> http://lkml.org/lkml/2009/6/2/378

oh! This one is the famous Eric's rant about broken fixup_irqs() in the
presence of non interrupt-remapping.

Long time back I have proposed a solution to Eric to resolve this for
non interrupt-remapping cases but don't think I never addressed that.
Again let me catch up with Eric's comments and see if we can comeup with
a solution that is acceptable to everyone.

> depends on the i/o redirection table write with remote IRR bit
> set lockup anomaly that the interrupt-remapping code may avoid
> or perhaps is not even present with that h/w. My proposed fix
> for the problem is based on previous interrupt-remapping code
> that you recently removed with
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=0280f7c416c652a2fd95d166f52b199ae61122c0

I cleaned up my code for a reason (that I never liked it and is
complex). So I would def recommend not to go down that path.

This second issue also doesn't happen with interrupt-remapping, as we avoid
touching the io-apic RTE's and use a virtual vector in the RTE
(which is same irrespective of the destination).

Will work with you next week in coming up with clean fixes.

thanks,
suresh

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/