Re: [PATCH 5/5] aio: Refactor aio_read_evt, use cmpxchg(), fix bug

From: Zach Brown
Date: Wed Oct 10 2012 - 17:43:15 EST


> True. But that could be solved with a separate interface that either
> doesn't use a context to submit a call synchronously, or uses an
> implicit per thread context.

Sure, but why bother if we can make the one submission interface fast
enough to satisfy quick callers? Less is more, and all that.

> I don't have a _strong_ opinion there, but my intuition is that we
> shouldn't be creating new types of handles without a good reason. I
> don't think the annoyances are for the most part particular to file
> descriptors, I think they tend to be applicable to handles in general and
> at least with file descriptors they're known and solved.

I strongly disagree. That descriptors are an expensive, limited
resource is a perfectly good reason to not make them required to access
the ring.

> That would be awesome, though for it to be worthwhile there couldn't be
> any kernel notion of a context at all and I'm not sure if that's
> practical. But the idea hadn't occurred to me before and I'm sure you've
> thought about it more than I have... hrm.
>
> Oh hey, that's what acall does :P

:)

> For completions though you really want the ringbuffer pinned... what do
> you do about that?

I don't think the kernel has to mandate that, no. The code has to deal
with completions faulting, but they probably won't. In acall it
happened that completions always came from threads that could block so
its coping mechanism was to just use put_user() :).

If userspace wants the rings locked, it can mlock() the memory.
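
A rough userspace sketch of that idea, for illustration only. This is not
code from acall or from this patch set: struct ring_event, struct user_ring
and the user_ring_*() helpers are made-up names, and the event layout is
an assumption rather than the kernel's struct io_event. The ring lives in
ordinary anonymous memory, and locking it is an optional extra step that
is allowed to fail.

```c
#define _DEFAULT_SOURCE          /* for MAP_ANONYMOUS under strict -std= modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical completion-event layout; not the kernel's struct io_event. */
struct ring_event {
	unsigned long long cookie;
	long long result;
};

/* Hypothetical ring handle, owned entirely by userspace. */
struct user_ring {
	struct ring_event *events;
	size_t nr_events;
	size_t bytes;
	int locked;              /* nonzero if mlock() succeeded */
};

/*
 * Allocate a ring in ordinary anonymous memory.  If want_locked is set,
 * try to pin it with mlock(); failure there (RLIMIT_MEMLOCK, say) is not
 * fatal, the ring simply stays pageable.
 * Returns 0 on success, -1 if the mapping itself failed.
 */
int user_ring_init(struct user_ring *ring, size_t nr_events, int want_locked)
{
	void *mem;

	assert(ring != NULL && nr_events > 0);

	ring->events = NULL;
	ring->nr_events = nr_events;
	ring->bytes = nr_events * sizeof(struct ring_event);
	ring->locked = 0;

	mem = mmap(NULL, ring->bytes, PROT_READ | PROT_WRITE,
		   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (mem == MAP_FAILED)
		return -1;
	ring->events = mem;

	if (want_locked && mlock(ring->events, ring->bytes) == 0)
		ring->locked = 1;
	return 0;
}

void user_ring_destroy(struct user_ring *ring)
{
	/* munmap() also drops any memory lock held on the range. */
	munmap(ring->events, ring->bytes);
	ring->events = NULL;
}
```

A caller that cares about never faulting would pass want_locked = 1 and
check ring->locked afterwards. Everyone else skips the lock and pays
nothing against their mlock limit.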

Think about it from another angle: the current mechanism of creating an
aio ring is a way to allocate pinned memory outside of the usual mlock
accounting. This could be abused, so aio grew an additional tunable to
limit the number of total entries in rings in the system.
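
For reference, a small sketch of how userspace can look at that limit.
It assumes the tunable meant here is the fs.aio-max-nr sysctl, with the
current usage reported in fs.aio-nr; read_ulong_file() and
aio_ring_headroom() are hypothetical helper names, not an existing API.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed locations of the system-wide aio limit and current usage. */
#define AIO_MAX_NR_PATH "/proc/sys/fs/aio-max-nr"
#define AIO_NR_PATH     "/proc/sys/fs/aio-nr"

/* Read a single unsigned decimal value from a file.  0 on success, -1 on error. */
int read_ulong_file(const char *path, unsigned long *out)
{
	FILE *f;
	int ok;

	assert(path != NULL && out != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;
	ok = (fscanf(f, "%lu", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * Compute how many more ring entries the system would currently allow.
 * Returns 0 and fills *headroom on success, -1 if either file is unreadable.
 */
int aio_ring_headroom(unsigned long *headroom)
{
	unsigned long max_nr, nr;

	if (read_ulong_file(AIO_MAX_NR_PATH, &max_nr) != 0 ||
	    read_ulong_file(AIO_NR_PATH, &nr) != 0)
		return -1;

	*headroom = max_nr > nr ? max_nr - nr : 0;
	return 0;
}
```

That this bookkeeping has to exist at all, separate from RLIMIT_MEMLOCK,
is the point being made here.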

By putting the ring in normal user memory we avoid that problem
entirely.

- z