Re: 3.4-rc7: BUG: Bad rss-counter state mm:ffff88040b56f800 idx:1 val:-59

From: Martin Mokrejs
Date: Wed May 30 2012 - 10:39:33 EST




Konstantin Khlebnikov wrote:
> Konstantin Khlebnikov wrote:
>> Martin Mokrejs wrote:
>>>
>>>
>>> Konstantin Khlebnikov wrote:
>>>> Martin Mokrejs wrote:
>>>>> Andrew Morton wrote:
>>>>>> On Wed, 30 May 2012 00:18:31 +0400
>>>>>> Konstantin Khlebnikov <khlebnikov@xxxxxxxxxx> wrote:
>>>>>>
>>>>>>> Oleg Nesterov wrote:
>>>>>>>> On 05/22, Andrew Morton wrote:
>>>>>>>>>
>>>>>>>>> Also, I have a note here that Oleg was unhappy with the patch. Oleg
>>>>>>>>> happiness is important. Has he cheered up yet?
>>>>>>>>
>>>>>>>> Well, yes, I do not really like this patch ;) Because I think there is
>>>>>>>> a more simple/straightforward fix, see below. In my opinion it also
>>>>>>>> makes the original code simpler.
>>>>>>>>
>>>>>>>> But. Obviously this is subjective, I can't prove my patch is "better",
>>>>>>>> and I didn't try to test it.
>>>>>>>>
>>>>>>>> So I won't argue with Konstantin who dislikes my patch, although I
>>>>>>>> would like to know the reason.
>>>>>>>
>>>>>>> I don't remember why I dislike your patch.
>>>>>>> For now I can only say ACK )
>>>>>>
>>>>>> We'll need a changelogged signed-off patch, please Oleg. And some evidence
>>>>>> that it was tested would be nice ;)
>>>>>
>>>>> I will reboot in a few hours, finally, after a few days ... I am running this first
>>>>> patch. I will try to test the second/alternative patch more quickly. Sorry for
>>>>> the delay.
>>>>>
>>>>
>>>> easiest way to trigger this bug:
>>>>
>>>> #define _GNU_SOURCE
>>>> #include <unistd.h>
>>>> #include <sched.h>
>>>> #include <sys/syscall.h>
>>>> #include <sys/mman.h>
>>>>
>>>> static inline int sys_clone(unsigned long flags, void *stack, int *ptid, int *ctid)
>>>> {
>>>> 	return syscall(SYS_clone, flags, stack, ptid, ctid);
>>>> }
>>>>
>>>> int main(int argc, char **argv)
>>>> {
>>>> 	void *page;
>>>>
>>>> 	page = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
>>>> 	sys_clone(CLONE_VFORK | CLONE_VM | CLONE_CHILD_CLEARTID, NULL, NULL, page);
>>>> }
>>>>
>>>
>>> I am getting segfaults with this.
>>>
>>> (gdb) where
>>> #0 0x0000000000000000 in ?? ()
>>> #1 0x00007f430f70a7e0 in __elf_set___libc_subfreeres_element_free_mem__ () from /lib64/libc.so.6
>>> #2 0x00007f430f70a7e8 in __elf_set___libc_atexit_element__IO_cleanup__ () from /lib64/libc.so.6
>>> #3 0x0000000000000001 in ?? ()
>>> #4 0x0000000000000000 in ?? ()
>>> (gdb)
>>>
>>> What number should I give it as an argument? ;-)
>>
>> there are no arguments.
>>
>> yeah, it corrupts the stack. I'm too lazy to write it properly =)
>> but on a non-patched kernel it also triggers this bug:
>> [206732.025131] BUG: Bad rss-counter state mm:ffff88000d8a6c80 idx:1 val:-1
>>
>
> this version works without segfaults =)
>
> #define _GNU_SOURCE
> #include <stdlib.h>
> #include <sched.h>
> #include <sys/mman.h>
>
> int child(void *arg)
> {
> 	return 0;
> }
>
> char stack[4096];
>
> int main(int argc, char **argv)
> {
> 	void *page;
>
> 	page = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
> 	clone(child, stack + sizeof(stack), CLONE_VFORK | CLONE_VM | CLONE_CHILD_CLEARTID, NULL, NULL, NULL, page);
> 	return 0;
> }
>

Thanks, this app does not crash anymore. Re-confirming that both patches fix the issue on my system.
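
For anyone who wants to rerun the test with basic error checking, here is a minimal sketch of the same reproducer. It keeps the clone() flags from Konstantin's version unchanged. The names run_reproducer() and STACK_SIZE, the mmap'ed child stack, and the waitpid() call with __WALL are my own additions for illustration and are not part of either patch or of the original test. It only confirms that the child was created and reaped; whether the kernel logged "Bad rss-counter state" still has to be checked in dmesg.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>

#define STACK_SIZE (64 * 1024)	/* hypothetical size, chosen for illustration */

/* The child does nothing; the interesting part is the kernel's exit path. */
static int child(void *arg)
{
	(void)arg;
	return 0;
}

/* Returns 0 if the child was created and reaped cleanly, -1 otherwise. */
int run_reproducer(void)
{
	void *page;
	char *stack;
	pid_t pid;
	int status;
	int ret = -1;

	/* Never touched from userspace: it is only passed as the ctid pointer. */
	page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
		    MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (page == MAP_FAILED) {
		perror("mmap page");
		return -1;
	}

	/* Separate child stack (assumption: replaces the static array). */
	stack = mmap(NULL, STACK_SIZE, PROT_READ | PROT_WRITE,
		     MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (stack == MAP_FAILED) {
		perror("mmap stack");
		munmap(page, 4096);
		return -1;
	}

	/* Same flags as in the thread; the stack grows down, so pass its top. */
	pid = clone(child, stack + STACK_SIZE,
		    CLONE_VFORK | CLONE_VM | CLONE_CHILD_CLEARTID,
		    NULL, NULL, NULL, page);
	if (pid == -1) {
		perror("clone");
		goto out;
	}

	/* No exit signal was requested, so __WALL is needed to reap the child. */
	if (waitpid(pid, &status, __WALL) == -1) {
		perror("waitpid");
		goto out;
	}

	if (WIFEXITED(status) && WEXITSTATUS(status) == 0)
		ret = 0;
out:
	munmap(stack, STACK_SIZE);
	munmap(page, 4096);
	return ret;
}
```

Call run_reproducer() from main() and then look at dmesg. On a kernel without either patch I would expect the "Bad rss-counter state" line to show up, as it did for Konstantin; with either patch applied it should stay quiet.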

Martin
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/