Re: [PATCH] kernel: fix data race in put_pid

From: Dmitry Vyukov
Date: Thu Sep 17 2015 - 14:00:10 EST


On Thu, Sep 17, 2015 at 7:57 PM, Dmitry Vyukov <dvyukov@xxxxxxxxxx> wrote:
> I can update the patch description, but let me explain it here first.
>
> Here is the essence of what happens:
>
> // thread 1
> 1: pid->foo = 1; // foo is the first word of pid object
> // then it does put_pid
> 2: atomic_dec_and_test(&pid->count) // decrements count to 1 and
> returns false so the function returns
>
> // thread 2
> // executes put_pid
> 3: atomic_load(&pid->count); // returns 1, so proceed to kmem_cache_free
> // then kmem_cache_free does:

Oh, sorry, I messed that up. The line
4: *(void**)pid = head->freelist;
goes here, before step 5.

> 5: head->freelist = (void*)pid;
>
> This can be executed as:
>
> 4: *(void**)pid = head->freelist;
> 1: pid->foo = 1; // foo is the first word of pid object
> 2: atomic_dec_and_test(&pid->count) // decrements count to 1 and
> returns false so the function returns
> 3: atomic_load(&pid->count); // returns 1, so proceed to kmem_cache_free
> 5: head->freelist = (void*)pid;
>
>
> And we get a corrupted allocator freelist.
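
To make the reordered sequence concrete, here is a minimal
single-threaded replay of steps 4, 1, 2, 3, 5 in plain C. The names
struct obj, struct cache and replay_reordered() are made up for
illustration; this is not the kernel's struct pid or the real slab
code, only a sketch of the interleaving under those assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins, not the kernel's struct pid or slab structures. */
struct obj {
	void *first_word; /* "foo": the allocator reuses it as the freelist link */
	int count;        /* plain int: this replay is single-threaded */
};

struct cache {
	void *freelist;
};

/*
 * Replay steps 4, 1, 2, 3, 5 in the reordered order.
 * Returns true if the freed object's link no longer points at the old head.
 */
static bool replay_reordered(struct cache *head, struct obj *pid)
{
	void *old_head;

	assert(head != NULL && pid != NULL);
	old_head = head->freelist;

	pid->first_word = head->freelist;       /* 4: hoisted above the check */
	pid->first_word = (void *)(uintptr_t)1; /* 1: thread 1's pid->foo = 1 */
	pid->count--;                           /* 2: thread 1 drops count 2 -> 1 */
	if (pid->count != 1)                    /* 3: thread 2 sees count == 1 */
		return false;
	head->freelist = pid;                   /* 5: publish the object */

	return pid->first_word != old_head;
}
```

Starting from count == 2 and a freelist with one other object on it,
the function returns true: the object is at the head of the freelist,
but its link word holds 1 instead of the previous head.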

This scenario extends to every other word of the pid object, because
the memory allocator can generally write to all words of a freed
object (e.g. it may write 0xdeaddead into every word and later crash
on discovering 0x1 in one of them).
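
For reference, the pattern under discussion (a fast-path load of the
count followed by the free) can be modelled in userspace with C11
atomics roughly as below. This is only a sketch: struct obj and
obj_put_needs_free() are hypothetical names, and the acquire load is
one conventional way to keep the allocator's stores from being hoisted
above the check, not necessarily what the final patch will use.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a refcounted object. */
struct obj {
	long foo;         /* first word of the object */
	atomic_int count; /* reference count */
};

/*
 * Drop one reference. Returns true if the caller held the last
 * reference and is now responsible for freeing the object.
 */
static bool obj_put_needs_free(struct obj *o)
{
	assert(o != NULL);

	/*
	 * Fast path: if we are the only holder, skip the decrement.
	 * The acquire load pairs with the release half of the
	 * decrement below in other threads, so their earlier stores
	 * to the object happen before anything we do after this load.
	 */
	if (atomic_load_explicit(&o->count, memory_order_acquire) == 1)
		return true;

	/* Slow path: fetch_sub returns the value before the decrement. */
	return atomic_fetch_sub_explicit(&o->count, 1,
					 memory_order_acq_rel) == 1;
}
```

With count == 2, the first call decrements to 1 and returns false; the
second call takes the fast path and returns true, at which point the
caller would free the object.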

> On Thu, Sep 17, 2015 at 7:44 PM, Oleg Nesterov <oleg@xxxxxxxxxx> wrote:
>> On 09/17, Dmitry Vyukov wrote:
>>>
>>> What happens here exactly matches what is described in CONTROL
>>> DEPENDENCIES section of Documentation/memory-barriers.txt. So all the
>>> bad things described there are possible here.
>>
>> And I still can't understand how these bad things connect to put_pid().
>> Probably I should re-read memory-barriers.txt, it changes quite often.
>>
>>> I don't
>>> know what to add to that.
>>
>> OK, let me quote the parts of your changelog,
>>
>> For example, if store to the first word of the object to build a freelist
>> in kmem_cache_free() hoists above the check, stores to the first word
>> in other threads can corrupt the memory allocator freelist.
>>
>> I simply can't parse this. Yes, this is probably because of my bad
>> English, but I'll appreciate it if you can explain at least, say,
>> "stores to the first word in other threads".
>>
>> Did you mean that a freed pid can be reallocated by another thread,
>> then overwritten, and this all can happen before atomic_read(count)?
>>
>>
>> Hmm. or perhaps you meant that the "last" put_pid() which observes
>> atomic_read() == 1 can race with another thread which writes to this
>> pid and does put_pid()? This is another story, and if you meant this
>> the changelog could clearly explain your concerns.
>>
>> Or what?
>>
>>
>> So let me repeat. Since I can't understand you, I leave this to other
>> reviewers. But imho the changelog should be updated in any case.
>>
>> Oleg.
>>
>
>
>
> --
> Dmitry Vyukov, Software Engineer, dvyukov@xxxxxxxxxx
> Google Germany GmbH, Dienerstraße 12, 80331, München
> Geschäftsführer: Graham Law, Christine Elizabeth Flores
> Registergericht und -nummer: Hamburg, HRB 86891
> Sitz der Gesellschaft: Hamburg
> Diese E-Mail ist vertraulich. Wenn Sie nicht der richtige Adressat
> sind, leiten Sie diese bitte nicht weiter, informieren Sie den
> Absender und löschen Sie die E-Mail und alle Anhänge. Vielen Dank.
> This e-mail is confidential. If you are not the right addressee please
> do not forward it, please inform the sender, and please erase this
> e-mail including any attachments. Thanks.


