Re: [KVM PATCH v9 2/2] KVM: add iosignalfd support

From: Michael S. Tsirkin
Date: Tue Jul 07 2009 - 08:49:01 EST


On Tue, Jul 07, 2009 at 08:15:08AM -0400, Gregory Haskins wrote:
> Michael S. Tsirkin wrote:
> > Some comments below. Sorry, I know it's late in the series, but
> > previously I've been too confused with complicated locking to notice
> > anything else :(.
> >
> > On Mon, Jul 06, 2009 at 04:33:21PM -0400, Gregory Haskins wrote:
> >
> >
> >> +#define KVM_IOSIGNALFD_FLAG_PIO (1 << 1) /* is a pio (otherwise mmio) */
> >> +#define KVM_IOSIGNALFD_FLAG_DEASSIGN (1 << 2)
> >> +
> >> +struct kvm_iosignalfd {
> >> + __u64 trigger;
> >>
> >
> > for length <8, it's the 8*len least significant bits that are used, right?
> > That's a bit ugly ... Maybe just put an 8 byte array here instead, then
> > the first len bytes are valid.
> >
>
> The original series uses an array that I kmalloc'ed to size, or left it
> NULL for a wildcard. Avi didn't like this and requested a u64 for all
> values. I don't care either way, but since you two are asking for
> conflicting designs, I will let you debate.

It turns out the io bus will be changed in the future.
Let's just document the usage, and check that unused bits
are zeroed.

> >> + struct kvm_io_device dev;
> >> + int wildcard:1;
> >>
> >
> > don't use bitfields
> >
>
> bool?

sure

> >> +}
> >> +
> >> +/*
> >> + * MMIO/PIO writes trigger an event if the addr/val match
> >> + */
> >>
> >
> > single line comment can look like this:
> > /* MMIO/PIO writes trigger an event if the addr/val match */
> >
> >
> Ack
>
> >> +static int
> >> +iosignalfd_write(struct kvm_io_device *this, gpa_t addr, int len,
> >> + const void *val)
> >> +{
> >> + struct _iosignalfd *p = to_iosignalfd(this);
> >> +
> >> + if (!iosignalfd_in_range(p, addr, len, val))
> >> + return -EOPNOTSUPP;
> >> +
> >> + eventfd_signal(p->eventfd, 1);
> >> + return 0;
> >> +}
> >> +
> >> +/*
> >> + * This function is called as KVM is completely shutting down. We do not
> >> + * need to worry about locking just nuke anything we have as quickly as possible
> >> + */
> >> +static void
> >> +iosignalfd_destructor(struct kvm_io_device *this)
> >> +{
> >> + struct _iosignalfd *p = to_iosignalfd(this);
> >> +
> >> + iosignalfd_release(p);
> >> +}
> >> +
> >> +static const struct kvm_io_device_ops iosignalfd_ops = {
> >> + .write = iosignalfd_write,
> >> + .destructor = iosignalfd_destructor,
> >> +};
> >> +
> >> +static bool
> >> +iosignalfd_overlap(struct _iosignalfd *lhs, struct _iosignalfd *rhs)
> >>
> >
> > this checks both region overlap and data collision.
> > better split into two helpers?
> >
> >
> Why?

Because it says iosignalfd_overlap but that's not what it does?

> >> +{
> >> + /*
> >> + * Check for completely non-overlapping regions. We can simply
> >> + * return "false" for non-overlapping regions and be done with
> >> + * it.
> >> + */
> >>
> >
> > done with it -> ignore the value
> >
> >
> I think that is a valid expression (at least here in the states),

I'm always interested in improving my english :) here's what I thought:

"done with it" says we skip the rest of the function. "ignore the value"
says skip the value check. So I was trying to say let's be more
specific.

Isn't that what it means?

> but
> ok. Note that "false" means we are accepting the request, not ignoring
> any value. I will construct a better comment either way.

Maybe name a function in a way that makes this explicit?

> >> +
> >> + if ((lhs->addr + lhs->length) <= rhs->addr)
> >>
> >
> > this math can overflow.
> >
> >
> Well, we should probably have vetted that during assign since
> addr+length that overflows is not a valid region. I will put a check in
> for that.

add = ~0x0ULL, len = 1 should be valid.
You'll have to make the math not overflow.

> >> +
> >> + /*
> >> + * If we get here, we know there is *some* overlap, but we don't
> >> + * yet know how much.
> >>
> >
> > how much?
> >
> Well, as stated we don't know yet.

So why tell me :)?
What I mean is this statement (especially the the part of the statement
after the comma) does not add useful information.

> >
> >> Make sure its a "precise" overlap, or
> >>
> >
> > precise overlap -> address/len pairs match
> >
> >
>
> Precisely.

so say so :)

> >> + * its rejected as incompatible
> >> + */
> >>
> >
> > "rejected" does not seem to make sense in the context of a boolean
> > function
> >
> >
>
> Why? true = rejected, false = accepted. Seems boolean to me.

How should I know that? Rename it to iosignalfd_rejected?

> Whats
> wrong with that?

If you want to return 0 on success, != 0 on failure,
make the function int.

> >> +
> >> +/* assumes kvm->slots_lock write-lock held */
> >>
> >
> > it seems you only need read lock?
> >
> >
>
> The caller needs write-lock, so we just inherit that state. I can
> update the comment though (I just ran a find/replace on "kvm->lock held"
> while updating to your new interface, thus the vague comment)

I guess so.

> >> +static bool
> >> +iosignalfd_check_collision(struct kvm *kvm, struct _iosignalfd *p)
> >> +{
> >> + struct _iosignalfd *_p;
> >> +
> >> + list_for_each_entry(_p, &kvm->iosignalfds, list)
> >> + if (iosignalfd_overlap(_p, p))
> >>
> >
> > This looks wrong: let's assume I want one signal with length 1 and one
> > with length 2 at address 0. They never trigger together, so it should
> > be ok to have 2 such devices.
> >
>
> We have previously decided to not support mis-matched overlaps. It's
> more complicated to handle, and there isn't a compelling use-case for it
> that I am aware of.

That's not what I propose. By all means len = X should only catch
len = X and not len = X - 1.

However, I think it makes sense to
allow both len = 2 and len = 1 at the same address even though
they seem to overlap. And all you really need to do is simplify
the code: replace the tricky overlap logic with simple

if (rhs->addr == lhs->addr && rhs->len == lhs->len)
reject
else
accept

(+add wildcard thing)

> >> + if (p->eventfd != eventfd)
> >> + continue;
> >>
> >
> > So for deassign, you ignore all arguments besides fd? Is this
> > intentional? Looks strange - think of multiple addresses triggering a
> > single eventfd. But if so, it's better to have a different ioctl with
> > just the fields we need.
> >
>
> Hmm... I suspect the check for a range-match got lost along the way. I
> agree we should probably qualify this with more than just the eventfd.
>
> >
> >> +
> >> + __kvm_io_bus_unregister_dev(bus, &p->dev);
> >> + iosignalfd_release(p);
> >>
> >
> > a single deassign removed multiple irqfds? Looks ugly.
> >
>
> Avi requested this general concept.

Really? Avi, could you explain? I would think each
assign needs to be matched with 1 deassign. No?

--
MST
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/