Re: [RFC][PATCH] Cross Memory Attach

From: Avi Kivity
Date: Wed Sep 15 2010 - 06:58:42 EST


On 09/15/2010 03:18 AM, Christopher Yeoh wrote:
> The basic idea behind cross memory attach is to allow MPI programs doing
> intra-node communication to do a single copy of the message rather than
> a double copy of the message via shared memory.

If the host has a DMA engine (many modern ones do), you can reduce this to zero copies (at least, zero processor copies).

> The following patch attempts to achieve this by allowing a
> destination process, given an address and size from a source process, to
> copy memory directly from the source process into its own address space
> via a system call. There is also a symmetrical ability to copy from
> the current process's address space into a destination process's
> address space.



Instead of those two syscalls, how about a vmfd(pid_t pid, ulong start, ulong len) system call which returns a file descriptor that represents a portion of the process address space? You can then use preadv() and pwritev() to copy memory, and io_submit(IO_CMD_PREADV) and io_submit(IO_CMD_PWRITEV) for asynchronous variants (especially useful with a DMA engine, since that adds latency).
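
To make the descriptor-based copy concrete, here is a rough sketch. vmfd() does not exist, so the sketch assumes /proc/<pid>/mem as a stand-in: it also gives a file descriptor whose offsets are addresses in the target process, but it covers the whole address space rather than a [start, start+len) window. The names vm_open(), vm_read(), vm_write() and vm_demo() are made up for illustration.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical stand-in for the proposed vmfd(): open /proc/<pid>/mem.
 * Unlike vmfd(), this is not limited to a window of the address space. */
static int vm_open(pid_t pid)
{
    char path[64];

    snprintf(path, sizeof(path), "/proc/%ld/mem", (long)pid);
    return open(path, O_RDWR);
}

/* Copy len bytes from address 'remote' in the target into local 'dst'.
 * The file offset passed to preadv() is the remote virtual address. */
static ssize_t vm_read(int fd, void *dst, const void *remote, size_t len)
{
    struct iovec iov = { .iov_base = dst, .iov_len = len };

    return preadv(fd, &iov, 1, (off_t)(uintptr_t)remote);
}

/* Copy len bytes from local 'src' to address 'remote' in the target. */
static ssize_t vm_write(int fd, void *remote, const void *src, size_t len)
{
    struct iovec iov = { .iov_base = (void *)src, .iov_len = len };

    return pwritev(fd, &iov, 1, (off_t)(uintptr_t)remote);
}

/* Demo: read one of our own buffers back through the descriptor. */
static void vm_demo(void)
{
    char src[] = "cross memory attach";
    char dst[sizeof(src)] = { 0 };
    int fd = vm_open(getpid());

    assert(fd >= 0);
    assert(vm_read(fd, dst, src, sizeof(src)) == (ssize_t)sizeof(src));
    assert(memcmp(src, dst, sizeof(src)) == 0);
    close(fd);
}
```

The demo targets the calling process only to stay self-contained; opening another process's /proc/<pid>/mem is subject to ptrace-style permission checks, which a vmfd() handed over by its owner would presumably not need.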

With some care (and use of mmu_notifiers) you can even mmap() your vmfd and access remote process memory directly.

A nice property of file descriptors is that you can pass them around securely via SCM_RIGHTS. So a process can create a window into its address space and pass it to other processes.
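
For reference, passing a descriptor over a Unix domain socket looks roughly like the sketch below. It works for any file descriptor; a vmfd would presumably be passed the same way. The helper names send_fd() and recv_fd() are made up for illustration, and error handling is minimal.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Send descriptor 'fd' over the Unix domain socket 'sock'.
 * One dummy data byte carries the SCM_RIGHTS control message. */
static int send_fd(int sock, int fd)
{
    char byte = 0;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;               /* forces correct alignment */
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *c;

    assert(fd >= 0);                        /* caller must pass a valid fd */
    memset(&msg, 0, sizeof(msg));
    memset(&u, 0, sizeof(u));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    c = CMSG_FIRSTHDR(&msg);
    c->cmsg_level = SOL_SOCKET;
    c->cmsg_type = SCM_RIGHTS;
    c->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(c), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one descriptor from 'sock'; returns the new fd or -1. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union {
        struct cmsghdr align;
        char buf[CMSG_SPACE(sizeof(int))];
    } u;
    struct msghdr msg;
    struct cmsghdr *c;
    int fd = -1;

    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    c = CMSG_FIRSTHDR(&msg);
    if (c == NULL || c->cmsg_level != SOL_SOCKET ||
        c->cmsg_type != SCM_RIGHTS || c->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(c), sizeof(fd));
    return fd;
}
```

The receiver gets its own descriptor referring to the same open file, so the sender can close its copy afterwards without affecting the receiver.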

(Or you could just use a shared memory object and pass it around.)

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.
