Re: [PATCH v5 2/4] kvm: KVM_EOIFD, an eventfd for EOIs

From: Michael S. Tsirkin
Date: Tue Jul 17 2012 - 11:13:02 EST


On Tue, Jul 17, 2012 at 08:57:04AM -0600, Alex Williamson wrote:
> On Tue, 2012-07-17 at 17:42 +0300, Michael S. Tsirkin wrote:
> > On Tue, Jul 17, 2012 at 08:29:43AM -0600, Alex Williamson wrote:
> > > On Tue, 2012-07-17 at 17:10 +0300, Michael S. Tsirkin wrote:
> > > > On Tue, Jul 17, 2012 at 07:59:16AM -0600, Alex Williamson wrote:
> > > > > On Tue, 2012-07-17 at 13:21 +0300, Michael S. Tsirkin wrote:
> > > > > > On Mon, Jul 16, 2012 at 02:33:55PM -0600, Alex Williamson wrote:
> > > > > > > + if (args->flags & KVM_EOIFD_FLAG_LEVEL_IRQFD) {
> > > > > > > + struct _irqfd *irqfd = _irqfd_fdget_lock(kvm, args->irqfd);
> > > > > > > + if (IS_ERR(irqfd)) {
> > > > > > > + ret = PTR_ERR(irqfd);
> > > > > > > + goto fail;
> > > > > > > + }
> > > > > > > +
> > > > > > > + gsi = irqfd->gsi;
> > > > > > > + level_irqfd = eventfd_ctx_get(irqfd->eventfd);
> > > > > > > + source = _irq_source_get(irqfd->source);
> > > > > > > + _irqfd_put_unlock(irqfd);
> > > > > > > + if (!source) {
> > > > > > > + ret = -EINVAL;
> > > > > > > + goto fail;
> > > > > > > + }
> > > > > > > + } else {
> > > > > > > + ret = -EINVAL;
> > > > > > > + goto fail;
> > > > > > > + }
> > > > > > > +
> > > > > > > + INIT_LIST_HEAD(&eoifd->list);
> > > > > > > + eoifd->kvm = kvm;
> > > > > > > + eoifd->eventfd = eventfd;
> > > > > > > + eoifd->source = source;
> > > > > > > + eoifd->level_irqfd = level_irqfd;
> > > > > > > + eoifd->notifier.gsi = gsi;
> > > > > > > + eoifd->notifier.irq_acked = eoifd_event;
> > > > > >
> > > > > > OK so this means eoifd keeps a reference to the irqfd.
> > > > > > And since this is the case, can't we drop the reference counting
> > > > > > around source ids now? Everything is referenced through irqfd.
> > > > >
> > > > > Holding a reference and using it as a reference count are not the same
> > > > > thing. What if another module holds a reference to this eventfd? How
> > > > > do we do anything on release?
> > > >
> > > > We don't as there is no release, and using kref on source id does not
> > > > help: it just never gets invoked.
> > >
> > > Please work out how you think it should work and let me know, I don't
> > > see it. We have an irq source id that needs to be allocated by irqfd
> > > and returned when it's unused. It becomes unused when neither irqfd nor
> > > eoifd are making use of it. irqfd and eoifd may be closed in any order.
> > > Use of the source id is what we're reference counting, which is why it's
> > > in struct _irq_source. How can I use an eventfd reference for the same?
> > > I don't know when it's unused. I don't know who else holds a reference
> > > to it... Doesn't make sense to me. Feels like attempting to squat on
> > > someone else's object.
> > >
> > >
> >
> > eoifd should prevent irqfd from being released.
>
> Why? Note that this is actually quite difficult too. We can't fail a
> release, nobody checks close(3p) return. Blocking a release is likely
> to cause all sorts of problems, so what you mean is that irqfd should
> linger around until there are no references to it... but that's exactly
> what struct _irq_source is for, is to hold the bits that we care about
> references to and automatically release it when there are none.

No no. You *already* prevent it. You take a reference to the eventfd
context.

> > It already keeps
> > a reference to it so it prevents irqfd from going away by userspace
> > closing the fd.
>
> Wrong, eoifd holds a reference to the eventfd for the irqfd, so it
> prevents the fd from going away, not the irqfd.

When the fd is no going away an ioctl is the only other way for
it to go away.

> > But it can still be released with deassign.
> > An easy solution is to fail deassign of irqfd if there is
> > eoifd bound to it.
>
> I don't know why we would impose such a bizarre usage model when
> reference counting on struct _irq_source seems to handle this nicely
> already.

Well eventfd gets an irqfd. What does it mean if said irqfd gets
deassigned, and potentially assigned an unrelated interrupt?
I think what I would expect is for it to handle the new interrupt.
This is hard to implement so let us fail this.

--
MST
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/