Re: kvm aio wishlist

From: Avi Kivity
Date: Tue Nov 25 2008 - 05:49:37 EST


Suparna Bhattacharya wrote:
Why not extend io_submit() to use a thread pool when going through a non-aio-ready path? Yet another new interface, with another round of integrating it with the previous interfaces, is not a comforting thought. I still haven't got used to the fact that aio can work with fd polling.

Even paths that provide fop->aio_read/write can be synchronous underneath
(like non-O_DIRECT filesystem reads/writes), and then there could be multiple
blocking points.

If they are known to be synchronous when execution starts, they could just return -ENOSYS and fall back to threads, until someone implements a truly async path.
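A rough userspace sketch of that fallback, for illustration only. Everything here is hypothetical: struct demo_req, the demo_* helpers and the two sample operations are made-up names, not kernel or libaio interfaces, and a single helper thread per request stands in for a real thread pool.

```c
#include <errno.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical request descriptor; this is not the kernel's struct kiocb. */
struct demo_req {
    int (*async_op)(struct demo_req *); /* 0 if handled, -ENOSYS if it would be synchronous */
    int (*sync_op)(struct demo_req *);  /* blocking implementation */
    int result;
    int used_thread;
    pthread_t worker;
};

/* Sample ops: one path that admits it is synchronous, one that is truly async. */
static int demo_async_unsupported(struct demo_req *req) { (void)req; return -ENOSYS; }
static int demo_async_native(struct demo_req *req) { req->result = 3; return 0; }
static int demo_blocking_op(struct demo_req *req) { (void)req; return 7; }

static void *demo_worker(void *arg)
{
    struct demo_req *req = arg;
    req->result = req->sync_op(req); /* blocking is fine here, we are off the submitter */
    return NULL;
}

/* Try the native async path first; on -ENOSYS hand the request to a helper thread. */
static int demo_submit(struct demo_req *req)
{
    int ret = req->async_op ? req->async_op(req) : -ENOSYS;
    if (ret != -ENOSYS)
        return ret;
    req->used_thread = 1;
    ret = pthread_create(&req->worker, NULL, demo_worker, req);
    return ret ? -ret : 0; /* pthread_create returns a positive errno value */
}

/* Wait for completion and fetch the result, whichever path was taken. */
static int demo_wait(struct demo_req *req)
{
    if (req->used_thread)
        pthread_join(req->worker, NULL);
    return req->result;
}
```

The submitter never needs to know which path ran; it only calls demo_submit() and later demo_wait(), which is the point of keeping the io_submit() interface unchanged.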

BTW, Ben had implemented a fallback approach that spawned kernel threads
- it was an initial patch and didn't do any thread pooling at that time.

I had a fallback path for pollable fds which did not require thread pools
http://lwn.net/Articles/216443/ (limited to fds which support non blocking semantics)

These are good solutions for the complex-blocking and never-blocking cases.

OR

Maybe we could use a very simple version of syslets to do an io_submit
in libaio :)

Does the syslet approach of continuing in a different thread (different
thread id) affect kvm ?

Yes, we like to pthread_kill() threads from time to time, and even expose the thread IDs to management tools so they can control pinning.

Perhaps a variant of syslet, that is kernel-only, and does:

- always allocate a new kernel stack at io_submit() time, but not a new thread
- start executing the rarely-blocking path of the request (like block mapping and get_user_pages_fast) on the new stack
- if we block here, clone a new thread and graft the stack onto it
- start the always-blocking portion of the call (enqueuing a bio)
- exit the new thread if we hit the slowpath, or deallocate the stack and longjmp back to the main stack if we did not

This does not expose any new semantics to userspace. It does twist the guts of the kernel in that we have to duplicate thread_info, but if thread_info is only accessed from current, I think that is manageable.
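The control flow of the steps above, as a loose userspace analogy. This is only a sketch under heavy assumptions: a heap-allocated context stands in for the separately allocated kernel stack, nothing here actually switches stacks, and the would_block flag simply simulates the fast path needing to sleep. All graft_* names are hypothetical.

```c
#include <pthread.h>
#include <stdlib.h>

/* Stands in for the new kernel stack allocated at submit time. */
struct graft_ctx {
    int would_block; /* hypothetical: simulates the fast path having to sleep */
    int on_thread;   /* set once the context has been handed to a new thread */
    int queued;      /* set when the always-blocking part has run */
    pthread_t thread;
};

/* Rarely-blocking part (block mapping, pinning pages); nonzero means it must sleep. */
static int graft_fast_path(struct graft_ctx *c) { return c->would_block; }

/* Always-blocking part of the call, e.g. enqueuing the I/O. */
static void graft_queue_io(struct graft_ctx *c) { c->queued = 1; }

static void *graft_thread_main(void *arg)
{
    struct graft_ctx *c = arg;
    /* A real implementation would resume the sleeping fast path on the grafted stack. */
    graft_queue_io(c);
    return NULL;
}

/* Allocate the context up front, but create a thread only if the fast path blocks. */
static struct graft_ctx *graft_submit(int would_block)
{
    struct graft_ctx *c = calloc(1, sizeof(*c));
    if (!c)
        return NULL;
    c->would_block = would_block;
    if (graft_fast_path(c)) {
        c->on_thread = 1; /* slowpath: the new thread now owns the context */
        if (pthread_create(&c->thread, NULL, graft_thread_main, c)) {
            free(c);
            return NULL;
        }
        return c;
    }
    graft_queue_io(c); /* common case: everything ran on the submitter */
    return c;
}

/* Tear down: join the thread if one was created, then drop the context. */
static int graft_finish(struct graft_ctx *c)
{
    int on_thread = c->on_thread;
    if (on_thread)
        pthread_join(c->thread, NULL);
    free(c);
    return on_thread; /* 1 if the slowpath thread was needed */
}
```

In the common case no thread is ever created, so the submitting thread keeps its thread ID throughout, which is what matters for pthread_kill() and pinning.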

(I think I just described fibrils, no? I think that was a good idea. Why can't we go back to it?)

--
error compiling committee.c: too many arguments to function
