Re: [RFC PATCH] kernel/fork: fix CLONE_CHILD_CLEARTID regression in nscd

From: Michal Hocko
Date: Fri Aug 12 2016 - 05:43:57 EST


On Wed 03-08-16 23:08:04, Oleg Nesterov wrote:
> sorry for delay, I am travelling till the end of the week.

Same here...

> On 08/01, Michal Hocko wrote:
> >
> > fec1d0115240 ("[PATCH] Disable CLONE_CHILD_CLEARTID for abnormal exit")
>
> almost 10 years ago ;)

Yes, it's been a while... I guess nscd doesn't enable persistent host
caching by default. I just know that our customer wanted to enable this
feature, only to find out that it doesn't work properly. At least that
is my understanding.

> > has caused a subtle regression in nscd which uses CLONE_CHILD_CLEARTID
> > to clear the nscd_certainly_running flag in the shared databases, so
> > that the clients are notified when nscd is restarted.
>
> So iiuc with this patch nscd_certainly_running should be cleared even if
> nscd was killed by !sig_kernel_coredump() signal, right?

Yes.

> > We should also check for vfork because
> > this is killable since d68b46fe16ad ("vfork: make it killable").
>
> Hmm, why? Can't understand... In any case this check doesn't look right, the
> comment says "a killed vfork parent" while tsk->vfork_done != NULL means it
> is a vforked child.
>
> So if we want this change, why we can't simply do
>
> - if (!(tsk->flags & PF_SIGNALED) &&
> + if (!(tsk->signal->flags & SIGNAL_GROUP_COREDUMP) &&
>
> ?

This is what I had initially. But then the comment above the check made
me worried that the parent of a vforked child might get confused if the
flag is cleared. I might have completely misunderstood the point of the
comment though. So if you believe that the vfork_done check is incorrect,
I can drop it. It shouldn't have any effect on the nscd use case AFAIU.
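
To make the alternatives concrete, here is a rough userspace model of the
three variants of the check discussed above. It is only a sketch, not kernel
code: struct model_task, the MODEL_* bit values and the function names are
made up for illustration, only the first half of the condition (the part
before the "&&") is modelled, and the third function is my reading of what
adding a vfork_done test amounts to rather than the exact RFC hunk.

```c
#include <stdbool.h>

/* Illustrative bit values for this model only; not taken from kernel headers. */
#define MODEL_PF_SIGNALED           0x1u /* task was killed by a signal */
#define MODEL_SIGNAL_GROUP_COREDUMP 0x2u /* a core dump is in progress */

/* Hypothetical stand-in for the few task fields the check looks at. */
struct model_task {
	unsigned int flags;        /* models tsk->flags */
	unsigned int signal_flags; /* models tsk->signal->flags */
	bool vfork_done;           /* models tsk->vfork_done != NULL */
};

/* Existing check: skip clearing the child tid after any fatal signal. */
static inline bool clear_tid_current(const struct model_task *t)
{
	return !(t->flags & MODEL_PF_SIGNALED);
}

/* Suggested check: skip the clear only when a core dump is in progress. */
static inline bool clear_tid_coredump_only(const struct model_task *t)
{
	return !(t->signal_flags & MODEL_SIGNAL_GROUP_COREDUMP);
}

/* Same as above, but additionally skip the clear when vfork_done is set. */
static inline bool clear_tid_with_vfork_check(const struct model_task *t)
{
	return clear_tid_coredump_only(t) && !t->vfork_done;
}
```

For a task killed by a non-coredump signal with vfork_done unset, which is
the nscd situation, the last two variants give the same answer and only the
existing check refuses to clear the tid.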

Thanks!

--
Michal Hocko
SUSE Labs