Re: [RFC][PATCH] Cross Memory Attach

From: Ingo Molnar
Date: Wed Sep 15 2010 - 09:52:21 EST



* Avi Kivity <avi@xxxxxxxxxx> wrote:

> On 09/15/2010 03:18 AM, Christopher Yeoh wrote:
>
> > The basic idea behind cross memory attach is to allow MPI programs
> > doing intra-node communication to do a single copy of the message
> > rather than a double copy of the message via shared memory.
>
> If the host has a DMA engine (many modern ones do) you can reduce this
> to zero copies (at least, zero processor copies).
>
> > The following patch attempts to achieve this by allowing a
> > destination process, given an address and size from a source
> > process, to copy memory directly from the source process into its
> > own address space via a system call. There is also a symmetrical
> > ability to copy from the current process's address space into a
> > destination process's address space.
>
> Instead of those two syscalls, how about a vmfd(pid_t pid, ulong
> start, ulong len) system call which returns a file descriptor that
> represents a portion of the process address space. You can then use
> preadv() and pwritev() to copy memory, and io_submit(IO_CMD_PREADV)
> and io_submit(IO_CMD_PWRITEV) for asynchronous variants (especially
> useful with a DMA engine, since that adds latency).
>
> With some care (and use of mmu_notifiers) you can even mmap() your
> vmfd and access remote process memory directly.
>
> A nice property of file descriptors is that you can pass them around
> securely via SCM_RIGHTS. So a process can create a window into its
> address space and pass it to other processes.
>
> (or you could just use a shared memory object and pass it around)

Interesting, but how will that work in a scalable way with lots of
non-thread tasks?

Say we have 100 processes. We'd have to have 100 fds - each has to be
passed to a new worker process.

In that sense a PID is just as good a reference as an fd - it can be
looked up locklessly, etc. - but has the added advantage that it can be
passed along just by number.
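
To make the difference concrete, here is a minimal C sketch (an
illustration under stated assumptions, not code from the patch). It
assumes a connected AF_UNIX stream socket between the two sides; the
helper names send_fd(), recv_fd(), send_pid() and recv_pid() are
hypothetical. An fd has to travel as SCM_RIGHTS ancillary data in its
own sendmsg(), while a PID is just an integer in the payload.

```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control buffer sized and aligned for exactly one descriptor. */
union fd_ctl {
    char buf[CMSG_SPACE(sizeof(int))];
    struct cmsghdr align;
};

/* Pass one fd over a Unix-domain socket via SCM_RIGHTS. 0 on success. */
static int send_fd(int sock, int fd)
{
    char byte = 'F';            /* at least one payload byte must be sent */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one fd sent by send_fd(). Returns the new fd, or -1 on error. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_ctl ctl;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&msg, 0, sizeof(msg));
    memset(&ctl, 0, sizeof(ctl));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = ctl.buf;
    msg.msg_controllen = sizeof(ctl.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;
    cmsg = CMSG_FIRSTHDR(&msg);
    if (cmsg == NULL || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS)
        return -1;
    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}

/* A PID needs no ancillary data: it is an ordinary integer payload. */
static int send_pid(int sock, pid_t pid)
{
    return write(sock, &pid, sizeof(pid)) == (ssize_t)sizeof(pid) ? 0 : -1;
}

/* Read back a PID written by send_pid(). Returns (pid_t)-1 on error. */
static pid_t recv_pid(int sock)
{
    pid_t pid;

    if (read(sock, &pid, sizeof(pid)) != (ssize_t)sizeof(pid))
        return (pid_t)-1;
    return pid;
}
```

With fds, each of the 100 processes would need a send_fd()/recv_fd()
exchange per descriptor with every new worker. With PIDs the same
information fits in a plain array of integers, or even on a command
line.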

Thanks,

Ingo