Re: [PATCH] Fix race between oom kill and task exit

From: Sameer Nanda
Date: Fri Dec 06 2013 - 12:55:26 EST


On Fri, Dec 6, 2013 at 7:19 AM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
> On 12/05, David Rientjes wrote:
>>
>> On Thu, 5 Dec 2013, Oleg Nesterov wrote:
>>
>> > > Your v2 series looks good and I suspect anybody trying them doesn't have
>> > > additional reports of the infinite loop? Should they be marked for
>> > > stable?
>> >
>> > Unlikely...
>> >
>> > I think the patch from Sameer makes more sense for stable as a temporary
>> > (and obviously incomplete) fix.
>>
>> There's a problem because none of this is currently even in linux-next. I
>> think we could make a case for getting Sameer's patch at
>> http://marc.info/?l=linux-kernel&m=138436313021133 to be merged for
>> stable,
>
> Probably.
>
> Ah, I just noticed that this change
>
> - if (p->flags & PF_EXITING) {
> + if (p->flags & PF_EXITING || !pid_alive(p)) {
>
> is not needed. !pid_alive(p) obviously implies PF_EXITING.

Ah right.

>
>> but then we'd have to revert it in linux-next
>
> Or perhaps Sameer can just send his fix to stable/gregkh.
>
> Just the changelog should clearly explain that this is the minimal
> workaround for stable. Once again it doesn't (and can't) fix all
> problems even in oom_kill_process() paths, but it helps anyway to
> avoid the easy-to-trigger hang.

I don't mind doing that if that seems to be the consensus. FWIW, I've
already added my patch to the Chrome OS kernel repo.

>
>> before merging your
>> series at http://marc.info/?l=linux-kernel&m=138616217925981.
>
> Just in case, I won't mind to rediff my patches on top of Sameer's
> patch and then add git-revert patch.
>
>> All of the
>> issues you present in that series seem to be stable material, so why not
>> just go ahead with your series and mark it for stable for 3.13?
>
> OK... I can do this too.
>
> I do not really like this because it adds thread_head/node but doesn't
> remove the old ->thread_group. We will do this later, but obviously
> this is not the stable material.
>
> IOW, if we send this to stable, thread_head/node/for_each_thread will
> be only used by oom_kill.c.
>
> And this is risky. For example, 1/4 depends on (at least) another patch
> I sent in preparation for this change, commit 81907739851
> "kernel/fork.c:copy_process(): don't add the uninitialized
> child to thread/task/pid lists", perhaps on something else.
>
> So personally I'd prefer to simply send the workaround for stable.
>
> Oleg.
>



--
Sameer
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/