Re: [PATCH v5 1/1] usb: gadget: add raw-gadget interface

From: Andrey Konovalov
Date: Wed Feb 05 2020 - 12:25:40 EST


On Wed, Feb 5, 2020 at 5:42 PM Felipe Balbi <balbi@xxxxxxxxxx> wrote:
>
>
> Hi,
>
> Andrey Konovalov <andreyknvl@xxxxxxxxxx> writes:
> >> >> > +static int raw_event_queue_add(struct raw_event_queue *queue,
> >> >> > + enum usb_raw_event_type type, size_t length, const void *data)
> >> >> > +{
> >> >> > + unsigned long flags;
> >> >> > + struct usb_raw_event *event;
> >> >> > +
> >> >> > + spin_lock_irqsave(&queue->lock, flags);
> >> >> > + if (WARN_ON(queue->size >= RAW_EVENT_QUEUE_SIZE)) {
> >> >> > + spin_unlock_irqrestore(&queue->lock, flags);
> >> >> > + return -ENOMEM;
> >> >> > + }
> >> >> > + event = kmalloc(sizeof(*event) + length, GFP_ATOMIC);
> >> >>
> >> >> I would very much prefer dropping GFP_ATOMIC here. Must you have this
> >> >> allocation under a spinlock?
> >> >
> >> > The issue here is not the spinlock, but that this might be called in
> >> > interrupt context. The number of atomic allocations here is restricted
> >> > by 128, and we can reduce the limit even further (until some point in
> >> > the future when and if we'll report more different events). Another
> >> > option would be to preallocate the required number of objects
> >> > (although we don't know the required size in advance, so we'll waste
> >> > some memory) and use those. What would you prefer?
> >>
> >> I think you shouldn't do either :-) Here's what I think you should do:
> >>
> >> 1. support O_NONBLOCK. This just means conditionally removing your
> >> wait_for_completion_interruptible().
> >
> > I don't think non blocking read/writes will work for us. We do
> > coverage-guided fuzzing and need to collect coverage for each syscall.
> > In the USB case "syscall" means processing a USB request from start to
> > end, and thus we need to wait until the kernel finishes processing it,
> > otherwise we'll fail to associate coverage properly (It's actually a
> > bit more complex, as we collect coverage for the whole initial
> > enumeration process as for one "syscall", but the general idea stands,
> > that we need to wait until the operation finishes.)
>
> Fair enough, but if the only use case that this covers, is a testing
> scenario, we don't gain much from accepting this upstream, right?

We gain a lot, even though it's just for testing. For one thing, once
the patch is upstream, all syzbot instances that target upstream-ish
branches will start fuzzing USB, and there won't be any need for me to
maintain a dedicated USB fuzzing branch manually. Another thing, is
that syzbot will be able to do fix/cause bisection (at least for the
bugs that are fixed/introduced after this patch is merged). And
finally, once this is upstream, we'll be able to backport this to
Android kernels and start testing them as well.

> We can
> still support both block and nonblock, but let's at least give the
> option.
>
> >> 2. Every time user calls write(), you usb_ep_alloc(), allocate a buffer
> >> with the write size, copy buffer to kernel space,
> >> usb_ep_queue(). When complete() callback is called, then you free the
> >> request. This would allow us to amortize the cost of copy_from_user()
> >> with several requests being queued to USB controller.
> >
> > I'm not sure I really get this part. We'll still need to call
> > copy_from_user() and usb_ep_queue() once per each operation/request.
> > How does it get amortized? Or do you mean that having multiple
> > requests queued will allow USB controller to process them in bulk?
>
> yes :-)
>
> > This makes sense, but again, we"ll then have an issue with coverage
> > association.
>
> You can still enqueue one by one, but this would turn your raw-gadget
> interface more interesting for other use cases.
>
> >> 3. Have a pre-allocated list of requests (128?) for read(). Enqueue them
> >> all during set_alt(). When user calls read() you will:
> >>
> >> a) check if there are completed requests to be copied over to
> >> userspace. Recycle the request.
> >>
> >> b) if there are no completed requests, then it depends on O_NONBLOCK
> >>
> >> i) If O_NONBLOCK, return -EWOULDBLOCK
> >> ii) otherwise, wait_for_completion
> >
> > See response to #1, if we prequeue requests, then the kernel will
> > start handling them before we do read(), and we'll fail to associate
> > coverage properly. (This will also require adding another ioctl to
> > imitate set_alt(), like the USB_RAW_IOCTL_CONFIGURE that we have.)
>
> set_alt() needs to be supported if we're aiming at providing support for
> various USB classes to be implemented on top of what you created :-)

What do you mean by supporting set_alt() here? AFAIU set_alt() is a
part of the composite gadget framework, which I don't use for this.
Are there some other actions (besides sending/receiving requests) that
need to be exposed to userspace to implement various USB classes? The
one that I know about is halting endpoints, it's mentioned in the TODO
section in documentation.

>
> >> I think this can all be done without any GFP_ATOMIC allocations.
> >
> > Overall, supporting O_NONBLOCK might be a useful feature for people
> > who are doing something else other than fuzzing, We can account for
> > potential future extensions that'll support it, so detecting
> > O_NONBLOCK and returning an error for now makes sense.
> >
> > WDYT?
>
> If that's the way you want to go, that's okay. But let's, then, prepare
> the code for extension later on. For example, let's add an IOCTL which
> returns the "version" of the ABI. Based on that, userspace can detect
> features and so on.

This sounds good to me. Let's concentrate on implementing the part
that is essential for testing/fuzzing, as it was the initial reason
why I started working on this, instead of using e.g. GadgetFS. I'll
add such IOCTL in v6.

Re GFP_ATOMIC allocations, if we're using the blocking approach,
should I decrease the limit of the number of such allocations or do
something else?

Re licensing comments, do I need to change anything after all?