Re: [PATCH -mm 1/9] unshare system call: system call handler function

From: JANAK DESAI
Date: Thu Dec 15 2005 - 23:34:25 EST


Eric W. Biederman wrote:

JANAK DESAI <janak@xxxxxxxxxx> writes:



Eric W. Biederman wrote:



If it isn't legal how about we deny the unshare call.
Then we can share this code with clone.

Eric






unshare is doing the reverse of what clone does. So if you are unsharing
namespace
you want to make sure that you are not sharing fs. That's why the CLONE_FS flag
is forced (meaning you are also going to unshare fs). Since namespace is shared
by default and fs is not, if clone is called with CLONE_NEWNS, the intent is to
create a new namespace, which means you cannot share fs. clone checks to
makes sure that CLONE_NEWNS and CLONE_FS are not specified together, while
unshare will force CLONE_FS if CLONE_NEWNS is spefcified. Basically you
can have a shared namespace and non shared fs, but not the other way around and
since clone and unshare are doing opposite things, sharing code between them
that
checks constraints on the flags can get convoluted.



I follow but I am very disturbed.

You are leaving CLONE_NEWNS to mean you want a new namespace.

For clone CLONE_FS unset means generate an unshared fs_struct
CLONE_FS set means share the fs_struct with the parent

But for unshare CLONE_FS unset means share the fs_struct with others
and CLONE_FS set means generate an unshared fs_struct

The meaning of CLONE_FS between the two calls in now flipped,
but CLONE_NEWNS is not. Please let's not implement it this way.



CLONE_FS unset for unshare doesn't mean that share the fs_struct with
others. It just means that you leave the fs_struct alone (which may or
maynot be shared). I agree that it is confusing, but I am having difficulty
in seeing this reversal of flag meaning. clone creates a second task and
allows you to pick what you want to share with the parent. unshare allows
you to pick what you don't want to share anymore. The "what" in both
cases can be same and still you can end up with a task with "share" state
for a perticular structure. For example, if we #define FS CLONE_FS,
then clone(FS) will imply that you want to share fs_struct but unshare(FS)
will imply that we want to unshare fs_struct. Does it still appear that
the flag meaning has changed? clone and unshare are doing different
things, the flag just indicates the structure on which they operate.

Part of the problem is the double negative in the name, leading
me to suggest that sys_share might almost be a better name.



I disagree. You cannot start sharing with this system call. You can
only unshare a shared structure. If you don't specify a flag corresponding
to a structure, that structure stays in its current state which may be
shared or unshared. Since the only action you can perform with this
call is unshare, unshare is the more appropriate name.

So please code don't invert the meaning of the bits. This will
allow sharing of the sanity checks with clone.

In addition this leaves open the possibility that routines like
copy_fs properly refactored can be shared between clone and unshare.




umm.. I am not sure how you were thinking of refactoring it, but my attempt at
using copy_* functions introduced race conditions.

Eric





-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/