Re: [openib-general] Re: [PATCH][RFC][0/4] InfiniBand userspaceverbs implementation

From: Andrew Morton
Date: Mon Apr 25 2005 - 19:15:07 EST

Caitlin Bestler <caitlin.bestler@xxxxxxxxx> wrote:
> >
> > > > This is because there is no file descriptor or anything else associated
> > > > with the pages which permits the kernel to clean stuff up on unclean
> > > > application exit. Also there are the obvious issues with permitting
> > > > pinning of unbounded amounts of memory.
> > >
> > > Correct, the driver must be able to determine that the process has died
> > > and clean up after it, so the pinned region in most implementations is
> > > associated with an open file descriptor.
> >
> > How is that association created?
> There is not a file descrptor, but there is an rnic handle. Both DAPL
> and IT-API require that process death will result in the handle and all
> of its dependent objects being released.

What's an "rnic handle", in Linux terms?

> The rnic handle can always be declared to be a "file descriptor" if
> that makes it follow normal OS conventions more precisiely.

Does that mean that the code has not yet been implemented?

Yes, a Linux fd is appropriate. But we don't have any sane way right now
of saying "you need to run put_page() against all these pages in the
->release() handler". That'll need to be coded by yourselves.

> There is also a need for some form of resource manager to approve
> creation of Memory Regions. Obviously you cannot have multiple
> applications claiming half of physical memory.

The kernel already has considerable resource management capabilities.
Please consider using/extending/generalising those before inventing
anything new. RLIMIT_MEMLOCK would be a starting point.

> But if you merely require the user to have root privileges in order
> to create a Memory Region, and then take a first-come first-served
> attitude, I don't think you end up with something that is truly a
> general purpose capability.

We don't want code in the kernel which will permit hostile unprivileged
users to trivially cause the box to lock up. RLIMIT_MEMLOCK and, if
necessary, CAP_IPC_LOCK sound appropriate here.

> A general purpose RDMA capability requires the ability to indefinitely
> pin large portions of user memory. It makes sense to integrate that
> with OS policy control over resource utilization and to integrate it with
> memory suspend/resume capabilities so that hotplug memory can
> be supported. What you can't do is downgrade a Memory Region so
> that it is no longer a memory region. Doing that means that you are
> not truly supporting RDMA.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at
Please read the FAQ at