Re: [PATCH] IPC initialize shmmax and shmall from the current value not the default

From: Marian Marinov
Date: Sun May 04 2014 - 05:30:56 EST


On 05/04/2014 04:20 AM, Davidlohr Bueso wrote:
On Sun, 2014-05-04 at 03:28 +0300, Marian Marinov wrote:
On 05/04/2014 02:53 AM, Davidlohr Bueso wrote:
On Sun, 2014-05-04 at 01:48 +0300, Marian Marinov wrote:
When we create a new IPC namespace that is cloned from the current namespace, it is a good idea to copy the
values of the current shmmax and shmall to the new namespace.

Why is this a good idea?

This would break userspace that relies on the current behavior.
Furthermore we've recently changed the default value of both these
limits to be as large as you can get, thus deprecating them. I don't
like the idea of this being replaced by namespaces.

Thanks,
Davidlohr


The current behavior is create_ipc_ns()->shm_init_ns()

void shm_init_ns(struct ipc_namespace *ns)
{
	ns->shm_ctlmax = SHMMAX;
	ns->shm_ctlall = SHMALL;
	ns->shm_ctlmni = SHMMNI;
	ns->shm_rmid_forced = 0;
	ns->shm_tot = 0;
	ipc_init_ids(&shm_ids(ns));
}

This means that whenever you are creating an IPC namespace it gets its SHMMAX and SHMALL values from the defaults for
the kernel.

This is exactly what I meant by 'current behavior'.

If for some reason you want a smaller (or bigger, for older kernels) limit, this means changing the values in
/proc/sys/kernel/shmmax and /proc/sys/kernel/shmall. However, the program that is started with the new IPC namespace may
lack the privileges to write to these files, and so it cannot modify them.
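
To illustrate the point about privileges, here is a minimal userspace sketch that reads one of these limits and checks
whether the calling process could open the file for writing. The helper names read_limit() and open_for_write_errno()
are made up for this example; only the two /proc/sys paths come from the discussion above. Opening the file without
writing to it does not change the limit.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/*
 * Hypothetical helper: read one unsigned decimal value from a file such as
 * /proc/sys/kernel/shmmax. Returns 0 on success, -1 on any failure.
 */
int read_limit(const char *path, unsigned long long *out)
{
	assert(path != NULL && out != NULL);

	FILE *f = fopen(path, "r");
	if (f == NULL)
		return -1;

	int ok = (fscanf(f, "%llu", out) == 1);
	fclose(f);
	return ok ? 0 : -1;
}

/*
 * Hypothetical helper: try to open the file write-only and report the
 * result. Returns 0 if the open succeeded, otherwise the errno value
 * (for example EACCES when the process lacks the privilege).
 * Nothing is written, so the limit itself is left untouched.
 */
int open_for_write_errno(const char *path)
{
	assert(path != NULL);

	int fd = open(path, O_WRONLY);
	if (fd < 0)
		return errno;

	close(fd);
	return 0;
}
```

A process inside the container could call read_limit("/proc/sys/kernel/shmmax", &v) to see the limit it was given, and
open_for_write_errno() on the same path to find out whether it is allowed to change it.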

I see no reason why namespaces should behave any differently from the rest
of the system, wrt this. And this changes how *and* when these limits
are set, which has an impact at the userspace level with no justification.

What I'm proposing is simply to copy the current values of the host machine, as set by a privileged process before the
namespace creation.
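
To make the proposal concrete, here is a minimal userspace model of the two initialization policies. It is only a
sketch, not the actual patch: struct ipc_ns_model, the DEFAULT_* constants and both function names are hypothetical
stand-ins for struct ipc_namespace, SHMMAX/SHMALL/SHMMNI and shm_init_ns(), and the default values are placeholders.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the shm limit fields of struct ipc_namespace. */
struct ipc_ns_model {
	unsigned long shm_ctlmax;	/* models kernel.shmmax */
	unsigned long shm_ctlall;	/* models kernel.shmall */
	int shm_ctlmni;			/* models kernel.shmmni */
};

/* Placeholder defaults; the real values depend on the kernel version. */
#define DEFAULT_SHMMAX 0x2000000UL
#define DEFAULT_SHMALL 0x200000UL
#define DEFAULT_SHMMNI 4096

/* Models the current behavior: a new namespace always gets the defaults. */
void shm_init_defaults(struct ipc_ns_model *ns)
{
	assert(ns != NULL);
	ns->shm_ctlmax = DEFAULT_SHMMAX;
	ns->shm_ctlall = DEFAULT_SHMALL;
	ns->shm_ctlmni = DEFAULT_SHMMNI;
}

/*
 * Models the proposed behavior: copy shmmax and shmall from the namespace
 * being cloned from. shmmni keeps its default, since only shmmax and
 * shmall are discussed here. Without a parent, fall back to the defaults.
 */
void shm_init_from_parent(struct ipc_ns_model *ns,
			  const struct ipc_ns_model *parent)
{
	assert(ns != NULL);

	if (parent == NULL) {
		shm_init_defaults(ns);
		return;
	}

	ns->shm_ctlmax = parent->shm_ctlmax;
	ns->shm_ctlall = parent->shm_ctlall;
	ns->shm_ctlmni = DEFAULT_SHMMNI;
}
```

With this model, a privileged process that lowers shm_ctlmax on the host before the clone would see the same value in
the child, while shm_init_defaults() would silently reset it.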

Maybe a better approach would be to allow the changes to be done by processes having CAP_SYS_RESOURCE inside the new
namespace?

Why do you need this? Is there any real impact/issue you're seeing?

I'm using Linux Containers and I need to be able to either start containers with a different SHMMAX or set a different
SHMMAX on already running containers without giving them full root access.

-Marian