Re: [KVM PATCH v9 0/5] irqfd fixes and enhancements

From: Michael S. Tsirkin
Date: Mon Jul 06 2009 - 12:14:25 EST


On Mon, Jul 06, 2009 at 10:56:02AM -0400, Gregory Haskins wrote:
> Avi Kivity wrote:
> > On 07/02/2009 06:50 PM, Avi Kivity wrote:
> >> On 07/02/2009 06:37 PM, Gregory Haskins wrote:
> >>> (Applies to kvm.git/master:1f9050fd)
> >>>
> >>> The following is the latest attempt to fix the races in
> >>> irqfd/eventfd, as
> >>> well as restore DEASSIGN support. For more details, please read the
> >>> patch
> >>> headers.
> >>>
> >>> As always, this series has been tested against the kvm-eventfd unit
> >>> test
> >>> and everything appears to be functioning properly. You can download
> >>> this
> >>> test here:
> >>
> >> Applied, thanks.
> >>
> >
> > ... and unapplied. There's a refcounting mismatch in irqfd_cleanup: a
> > reference is taken for each irqfd, but dropped for each guest. This
> > causes an oops if a guest with no irqfds is created and destroyed:
>
> I was able to reproduce this issue. The problem turned out to be that I
> inadvertently always did a flush_workqueue(), even if the work-queue was
> never initialized.
>
> The following interdiff applied to the reverted patch has been confirmed
> to fix the issue:

Could you document the init boolean and its locking rules?
The best place to put that would be where the field is declared, btw.
Is it true that init === list_empty(&kvm->irqfds.items)?
If yes, maybe we don't need this field at all.
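
To make the question concrete, here is a rough userspace sketch of what
dropping the field could look like, assuming the flag carries no
information beyond whether the list currently has entries (which is
exactly the open question). All names here (list_node, irqfds_model,
irqfds_model_in_use) are made up for illustration and are not the kernel
types.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Minimal circular doubly linked list, loosely modelled on list_head. */
struct list_node {
	struct list_node *next, *prev;
};

static void list_node_init(struct list_node *head)
{
	head->next = head;
	head->prev = head;
}

static bool list_node_empty(const struct list_node *head)
{
	return head->next == head;
}

static void list_node_add_tail(struct list_node *item, struct list_node *head)
{
	item->prev = head->prev;
	item->next = head;
	head->prev->next = item;
	head->prev = item;
}

static void list_node_del(struct list_node *item)
{
	item->prev->next = item->next;
	item->next->prev = item->prev;
	item->next = item->prev = item;
}

/* Hypothetical stand-in for kvm->irqfds, without a separate init flag. */
struct irqfds_model {
	struct list_node items;
};

/* State derived from the list instead of being stored in a boolean. */
static bool irqfds_model_in_use(const struct irqfds_model *s)
{
	assert(s != NULL);
	return !list_node_empty(&s->items);
}
```

Note that the derived state goes back to false once the last entry is
removed, whereas a sticky flag would stay set. If the real field is meant
to be sticky, the two are not interchangeable.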


> -------------------
>
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index fcc3469..52b0e04 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -318,6 +318,9 @@ kvm_irqfd_deassign(struct kvm *kvm, int fd, int gsi)
> struct _irqfd *irqfd, *tmp;
> struct eventfd_ctx *eventfd;
>
> + if (!kvm->irqfds.init)
> + return -ENOENT;
> +
> eventfd = eventfd_ctx_fdget(fd);
> if (IS_ERR(eventfd))
> return PTR_ERR(eventfd);

Wouldn't it be cleaner to error out in the for-each loop if we don't
find an entry to deactivate? It might be helpful for apps to get an error
if they didn't deassign anything.
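
Something along these lines is what I have in mind. This is only a
userspace sketch: irqfd_model and irqfd_model_deassign are hypothetical
stand-ins, an int replaces the eventfd_ctx pointer, and an array replaces
the list. The point is just that the return value starts as -ENOENT and
is cleared only when a matching entry is actually deactivated.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct _irqfd; not the kernel type. */
struct irqfd_model {
	int eventfd;	/* stands in for the eventfd_ctx pointer */
	int gsi;
	int active;	/* 1 until deactivated */
};

/*
 * Deactivate every active entry matching (eventfd, gsi).
 * Returns 0 if at least one entry was deactivated, -ENOENT otherwise.
 */
static int irqfd_model_deassign(struct irqfd_model *items, size_t n,
				int eventfd, int gsi)
{
	int ret = -ENOENT;
	size_t i;

	assert(items != NULL || n == 0);

	for (i = 0; i < n; i++) {
		if (items[i].active &&
		    items[i].eventfd == eventfd && items[i].gsi == gsi) {
			items[i].active = 0;
			ret = 0;
		}
	}

	return ret;
}
```

With that shape an empty list falls out naturally as -ENOENT, so no
separate early-exit check is needed for this path.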

> @@ -360,6 +363,9 @@ kvm_irqfd_release(struct kvm *kvm)
> {
> struct _irqfd *irqfd, *tmp;
>
> + if (!kvm->irqfds.init)
> + return;
> +

So here, I recall some old comment saying that the flush below was
needed even if the list is empty. Is this no longer true?
If it is no longer needed, it might be cleaner to only flush if the list
is not empty.
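
As a rough illustration of the "only flush if the list is not empty"
variant, here is a userspace model. release_model and its counters are
hypothetical; flush_calls merely stands in for calling flush_workqueue().
One thing the sketch makes visible is that the emptiness check has to be
sampled before the loop drains the list.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of the release path; not the kernel structures. */
struct release_model {
	int n_items;		/* entries still on the list */
	int flush_calls;	/* how often the flush stand-in ran */
};

static void release_model_run(struct release_model *m)
{
	bool had_items;

	assert(m != NULL);

	/* Sample emptiness first: the loop below empties the list. */
	had_items = m->n_items > 0;

	/* Stands in for deactivating each entry under the lock. */
	while (m->n_items > 0)
		m->n_items--;

	/* Stands in for flush_workqueue(); skipped if nothing was queued. */
	if (had_items)
		m->flush_calls++;
}
```

Whether skipping the flush is actually safe depends on whether work can
still be pending when the list is already empty, which is what the old
comment seemed to be about.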


> spin_lock_irq(&kvm->irqfds.lock);
>
> list_for_each_entry_safe(irqfd, tmp, &kvm->irqfds.items, list)
>
> ---------------------
>
> You can pick up this fix folded into the original v9:5/5 patch here:
>
> git pull
> git://git.kernel.org/pub/scm/linux/kernel/git/ghaskins/linux-2.6-hacks.git
> for-avi
>
> Sorry for the sloppy patch in v9. :( Will strive to do better next time.
>
> Regards,
> -Greg
>

