Re: [PATCH] exit: move exit_task_namespaces() after exit_task_work()

From: Cong Wang
Date: Fri Dec 15 2017 - 19:01:28 EST


On Fri, Dec 15, 2017 at 12:00 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> On Fri, Dec 15, 2017 at 8:35 AM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
>> On Fri, Dec 15, 2017 at 7:56 AM, Eric W. Biederman
>> <ebiederm@xxxxxxxxxxxx> wrote:
>>> Cong Wang <xiyou.wangcong@xxxxxxxxx> writes:
>>>
>>>> syzbot reported we have a use-after-free when mqueue_evict_inode()
>>>> is called on __cleanup_mnt() path, where the ipc ns is already
>>>> freed by the previous exit_task_namespaces(). We can just move
>>>> it after exit_task_work() to avoid this use-after-free.
>>>
>>> How does that possibly work? (I haven't seen this syzbot report.)
>>>
>>> Looking at the code we have get_ns_from_inode, which takes the mq_lock,
>>> sees if the pointer is NULL and takes a reference if it is non-NULL.
>>>
>>> Meanwhile put_ipc_ns calls mq_clear_sbinfo(ns) with the mq_lock held
>>> when the count drops to zero.
>>>
>>> Where is the race in that?
>>>
>>> The rest of mqueue_evict_inode uses the returned pointer and
>>> tests that the pointer is non-NULL before using it.
>>>
>>> So either syzbot is giving you a bad report or there is a subtle race
>>> there I am not seeing. The change below is not at all the proper way to
>>> fix a subtle race.
>>>
>>> Eric
>>
>> Cong, what was that report? Searching by
>> "exit_task_work|exit_task_namespaces" there are too many of them:
>> https://groups.google.com/forum/#!searchin/syzkaller-bugs/%22exit_task_work$7Cexit_task_namespaces%22%7Csort:date
>>
>> I can only say that syzbot does not make up reports. That's something
>> that actually happened and was provoked by userspace.
>
>
> Ah, found that bug:
> https://groups.google.com/d/msg/syzkaller-bugs/1XBaqnPSXzs/VF-eCSPuCQAJ

Yeah, and it is introduced by:

http://git.cmpxchg.org/cgit.cgi/linux-mmots.git/commit/?id=9c583773d036336176e9e50441890659bc4eeae8
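
For reference, the locking pattern described in the quoted text (the lookup
takes the lock, tests the pointer for NULL and takes a reference; the final
put clears the pointer under the same lock) can be modelled in user space
roughly as below. This is only a sketch under stated assumptions, not the
kernel code: model_ns, model_sb, model_lock, model_get_ns() and
model_put_ns() are hypothetical names, a pthread mutex stands in for mq_lock,
and the counter is a plain int changed only under the lock, which simplifies
whatever the kernel actually does.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical user-space stand-ins; these are not the kernel's types. */
struct model_ns {
    int count;              /* reference count, guarded by model_lock here */
};

struct model_sb {
    struct model_ns *ns;    /* back-pointer, cleared on the final put */
};

/* Stand-in for the lock the quoted text calls mq_lock. */
static pthread_mutex_t model_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Lookup side: take the lock, check whether the pointer is NULL, and
 * take a reference only if it is non-NULL. Callers must test the
 * returned pointer before using it.
 */
static struct model_ns *model_get_ns(struct model_sb *sb)
{
    struct model_ns *ns;

    pthread_mutex_lock(&model_lock);
    ns = sb->ns;
    if (ns)
        ns->count++;
    pthread_mutex_unlock(&model_lock);
    return ns;
}

/*
 * Release side: drop one reference; when the count reaches zero, clear
 * the back-pointer while still holding the lock. Returns 1 if this was
 * the final put (the caller may then free ns), 0 otherwise.
 */
static int model_put_ns(struct model_sb *sb, struct model_ns *ns)
{
    int last;

    pthread_mutex_lock(&model_lock);
    last = (--ns->count == 0);
    if (last)
        sb->ns = NULL;
    pthread_mutex_unlock(&model_lock);
    return last;
}
```

In this model a lookup either sees a non-NULL pointer and pins the object
before the lock is dropped, or sees NULL after the final put. It says nothing
about code that reaches the namespace through some other pointer that is not
cleared under the lock.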