Re: [Xen-devel] [PATCH] xen: reuse the same pirq allocatedwhen driver load first time

From: Jan Beulich
Date: Thu May 23 2013 - 02:32:00 EST


>>> On 22.05.13 at 18:41, Konrad Rzeszutek Wilk <konrad.wilk@xxxxxxxxxx> wrote:
> On Wed, May 22, 2013 at 04:25:10PM +0100, Jan Beulich wrote:
>> Okay, that clarifies it quite a bit. For one, I'll leave any of the
>> emuirq stuff to Stefano, who wrote this originally. And then, from
>> the beginning of this thread, I'm not convinced that freeing a pirq
>> is really the right thing here: unmap_pirq() is the counterpart of
>> map_pirq(), not get_free_pirq(). I would think that is a guest
>> allocates a pirq and then unmaps it without first mapping it, it's
>> the guest's fault that it now lost one pirq resource. It should not
>> have allocated one in the first place if it didn't mean to use it for
>> anything.
>
> It does use it, but if you do run this in a loop:
> rmmod e1000e;modprobe e1000e
>
> it ends up doing thse three hypercalls: PHYSDEVOP_get_free_pirq,
> PHYSDEVOP_map_pirq, PHYSDEVOP_unmap_pirq and so on.

Yeah, my comment was more towards your simplified repro
module...

> The reason is that
> drivers/xen/events.c keeps track of the Linux IRQ <-> PIRQ just as long
> as needed - if the driver does a free_irq, well, then the mapping is
> de-allocated and lost.
>
> One patch I posted (for Linux) keeps track of the PIRQ so that if
> free_irq is called and we remove the Linux IRQ <-> PIRQ association,
> we still have the PIRQ saved away and can re-use it.
>
> In other words, the loop ends up doing:
> PHYSDEVOP_map_pirq, PHYSDEVOP_unmap_pirq

And that's the right solution here imo. Remember that
get_free_pirq was introduced only for the pv-ops kernel, the
classic one never needed it (and hence couldn't leak pIRQ-s).

>> That setting to a negative value is not causing the slot to be
>> permanently lost, it merely defers its freeing. It was the only
>> way I could find back then to reasonably handle an unmap
>> being done before the matched unbind.
>
> Ah, so pt_irq_destroy_bind (XEN_DOMCTL_unbind_pt_irq) is the counterpart
> to PHYSDEVOP_get_free_pirq in some form. Which looks to be on QEMU side
> only called when the PCI device is put in sleep or pulled out of the guest.
>
> It probably shouldn't be called when the device is merely de-activated.

Right.

Jan

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/