Re: 3.4-rc7: BUG: Bad rss-counter state mm:ffff88040b56f800 idx:1 val:-59

From: Konstantin Khlebnikov
Date: Wed May 30 2012 - 08:22:38 EST


Martin Mokrejs wrote:


Konstantin Khlebnikov wrote:
Martin Mokrejs wrote:
Andrew Morton wrote:
On Wed, 30 May 2012 00:18:31 +0400
Konstantin Khlebnikov<khlebnikov@xxxxxxxxxx> wrote:

Oleg Nesterov wrote:
On 05/22, Andrew Morton wrote:

Also, I have a note here that Oleg was unhappy with the patch. Oleg
happiness is important. Has he cheered up yet?

Well, yes, I do not really like this patch ;) Because I think there is
a more simple/straightforward fix, see below. In my opinion it also
makes the original code simpler.

But. Obviously this is subjective, I can't prove my patch is "better",
and I didn't try to test it.

So I won't argue with Konstantin who dislikes my patch, although I
would like to know the reason.

I don't remember why I dislike your patch.
For now I can only say ACK )

We'll need a changelogged signed-off patch, please Oleg. And some evidence
that it was tested would be nice ;)

I will reboot in a few hours, finally after a few days ... I am running the first
patch. I will try to test the second/alternative patch more quickly. Sorry for
the delay.


The easiest way to trigger this bug:

#define _GNU_SOURCE
#include <unistd.h>
#include <sched.h>
#include <sys/syscall.h>
#include <sys/mman.h>

static inline int sys_clone(unsigned long flags, void *stack, int *ptid, int *ctid)
{
	return syscall(SYS_clone, flags, stack, ptid, ctid);
}

int main(int argc, char **argv)
{
	void *page;

	page = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
	sys_clone(CLONE_VFORK | CLONE_VM | CLONE_CHILD_CLEARTID, NULL, NULL, page);
}


I am getting segfaults with this.

(gdb) where
#0 0x0000000000000000 in ?? ()
#1 0x00007f430f70a7e0 in __elf_set___libc_subfreeres_element_free_mem__ () from /lib64/libc.so.6
#2 0x00007f430f70a7e8 in __elf_set___libc_atexit_element__IO_cleanup__ () from /lib64/libc.so.6
#3 0x0000000000000001 in ?? ()
#4 0x0000000000000000 in ?? ()
(gdb)

What number should I give it as an argument? ;-)

There are no arguments.

Yeah, it corrupts the stack. I'm too lazy to write it properly =)
But on a non-patched kernel it also triggers this bug:
[206732.025131] BUG: Bad rss-counter state mm:ffff88000d8a6c80 idx:1 val:-1
