Re: [PATCH v5 0/4] kvm: level irqfd and new eoifd

From: Gleb Natapov
Date: Sun Jul 22 2012 - 11:09:18 EST


On Fri, Jul 20, 2012 at 01:07:32PM +0300, Michael S. Tsirkin wrote:
> On Thu, Jul 19, 2012 at 12:48:07PM -0600, Alex Williamson wrote:
> > On Thu, 2012-07-19 at 20:45 +0300, Michael S. Tsirkin wrote:
> > > On Thu, Jul 19, 2012 at 11:29:38AM -0600, Alex Williamson wrote:
> > > > On Thu, 2012-07-19 at 19:59 +0300, Michael S. Tsirkin wrote:
> > > > > On Mon, Jul 16, 2012 at 02:33:38PM -0600, Alex Williamson wrote:
> > > > > > v5:
> > > > > > - irqfds now have a one-to-one mapping with eoifds to prevent users
> > > > > > from consuming all of kernel memory by repeatedly creating eoifds
> > > > > > from a single irqfd.
> > > > > > - implement a kvm_clear_irq() which does a test_and_clear_bit of
> > > > > > the irq_state, only updating the pic/ioapic if changes and allowing
> > > > > > the caller to know if anything was done. I added this onto the end
> > > > > > as it's essentially an optimization on the previous design. It's
> > > > > > hard to tell if there's an actual performance benefit to this.
> > > > > > - dropped eoifd gsi support patch as it was only an FYI.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Alex
> > > > >
> > > > >
> > > > > So 3/4, 4/4 are racy and I think you convinced me it's best to drop it for
> > > > > now. I hope that fact that we already scan all vcpus under spinlock for
> > > > > level interrupts is enough to justify adding a lock here.
> > > > >
> > > > > To summarize issues still outstanding with 1/2, 2/2:
> > > > (a)
> > > > > - source id lingering after irqfd was destroyed/deassigned
> > > > > prevents assigning a new irqfd
> > > > (b)
> > > > > - if same irqfd is deassigned and re-assigned, this
> > > > > seems to succeed but does not give any more EOIs
> > > > (c)
> > > > > - document that user needs to re-inject interrupts
> > > > > injected by level IRQFD after migration as they are cleared
> > > > >
> > > > > Hope this helps!
> > > >
> > > > Thanks, I'm refining and testing a re-write. One thing I also noticed
> > > > is that we don't do anything when the eoifd is closed. We'll cleanup
> > > > when kvm is closed, but that can leave a lot of stray eoifds, and
> > > > therefore used irq_source_ids tied up. So, I think I need to pull in a
> > > > lot of the irqfd code just to be able to catch the POLLHUP and do
> > > > cleanup.
> > >
> > > I don't think it's worth it. With ioeventfd we have the same issue
> > > and we don't care: userspace should just DEASSIGN before close.
> > > With irqfd we committed to support cleanup by close but
> > > it happens kind of naturally since we poll irqfd anyway.
> > >
> > > It's there for irqfd for historical reasons.
> >
> > You're not dealing with such a limited resource for ioeventfds though.
> > It's pretty easily conceivable we could run out of irq source IDs.
>
> Running out of fds is also very conceivable. Not deassigning
> before close is a userspace bug anyway.
>
Close should free all resources allocated by an fd. What if the code that
closes the fd has no idea what cleanup should be done (e.g. the fd was
passed over a unix socket)? Heck, the code may not even have permission
to call the ioctl to deassign.
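
To make the fd-passing case concrete, here is a minimal userspace sketch,
assuming Python 3.9+ on Linux. A pipe stands in for the eventfd and a
socketpair stands in for two separate processes; the function name is
hypothetical and nothing here touches KVM. It only shows that the side
receiving the fd can do nothing but close() it, and that the kernel object
is released once the last reference goes away.

```python
import os
import socket


def pass_fd_and_close():
    # The socketpair stands in for two cooperating processes.
    owner, helper = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

    # Assumption: the pipe's write end plays the role of the eventfd.
    read_end, write_end = os.pipe()

    # The owner hands the fd over using SCM_RIGHTS ancillary data.
    socket.send_fds(owner, [b"x"], [write_end])
    _msg, fds, _flags, _addr = socket.recv_fds(helper, 1, 1)
    received = fds[0]

    # The owner drops its own reference.
    os.close(write_end)

    # The helper knows nothing about how the fd was set up.
    # close() is the only cleanup it can perform.
    os.close(received)

    # With the last writer gone, the reader sees EOF: the kernel
    # released the object on close alone, with no extra call.
    data = os.read(read_end, 1)

    os.close(read_end)
    owner.close()
    helper.close()
    return data
```

If cleanup depended on an explicit deassign call, the helper in this sketch
would have no way to trigger it.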

--
Gleb.