Re: [RFC] c/r: prctl: Add ability to set new mm_struct::exe_file v3

From: Oleg Nesterov
Date: Fri Mar 09 2012 - 09:33:39 EST


On 03/09, Cyrill Gorcunov wrote:
>
> > > + /*
> > > + * Setting new mm::exe_file is only allowed
> > > + * when no VM_EXECUTABLE vma's left. This is
> > > + * a special C/R case when a restored program
> > > + * need to change own /proc/$pid/exe symlink.
> > > + * After this call mm::num_exe_file_vmas become
> > > + * meaningless. If mm::num_exe_file_vmas will
> > > + * ever increase back from zero -- this code
> > > + * needs to be revised, thus WARN_ here, just
> > > + * to be sure.
> >
> > To be shure in what??
>
> To be sure it's not increased somewhere else before
> down_write taken.

Who can do this? Only another CLONE_VM thread. And _only_ if we
already have the bug in mm_exe accounting logic. And only if that
thread does something to trigger the bug in the small window
between.

> > I'd understand if you add something like
> >
> > WARN_ON(!mm->num_exe_file_vmas && !current->in_exec);
> >
> > into added_exe_file_vma().
> >
> > Or
> > WARN_ON(mm->num_exe_file_vmas <= 0);
> >
> > into removed_exe_file_vma().
>
> This one looks like a good idea for me -- it's cheap and
> not a hot path.

But not in this patch, please.

> > But imho your WARN looks like "OK, I checked it lockless but I
> > am not sure this is correct".
>
> Oleg, I bet if someone will be changing num_exe_file_vmas overall
> idea -- this prctl code will be fixed at last moment (if ever) only
> because it's very specific, so I wanted to not miss such moment
> and add some check that the rest of the kernel is in a good state.
> This test is cheap but may prevent potential problem if one day
> mm::exe_file concept will be reworked.

The test is cheap indeed. If you mean performance-wise.

But it looks confusing, imho. I do not care about a couple of CPU
cycles. The code should be optimized for the reading in the first
place, not for executing ;) Imho, of course.

And once again. Following your logic you need another WARN_ON()
right after we drop mmap_sem. Why? To be sure it's not increased
somewhere else _after_ down_write taken. And another one after
fput.

Sure, bugs are possible. And yes, in theory this WARN_ON() can
catch some problem. But there is tradeoff. Given that you need
another thread to trigger the (potential) bug and the window is
tiny, how high do you estimate the probability it can help?

> Sure I can simply drop this WARN_ON ;)

Oh, keep it if you like it ;)

Yes I hate it, but you are the author and this is almost cosmetic.

Oleg.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/