Re: 3.4-rc7: BUG: Bad rss-counter state mm:ffff88040b56f800 idx:1 val:-59

From: Konstantin Khlebnikov
Date: Wed May 30 2012 - 08:54:35 EST


Konstantin Khlebnikov wrote:
Martin Mokrejs wrote:


Konstantin Khlebnikov wrote:
Martin Mokrejs wrote:
Andrew Morton wrote:
On Wed, 30 May 2012 00:18:31 +0400
Konstantin Khlebnikov<khlebnikov@xxxxxxxxxx> wrote:

Oleg Nesterov wrote:
On 05/22, Andrew Morton wrote:

Also, I have a note here that Oleg was unhappy with the patch. Oleg
happiness is important. Has he cheered up yet?

Well, yes, I do not really like this patch ;) Because I think there is
a more simple/straightforward fix, see below. In my opinion it also
makes the original code simpler.

But. Obviously this is subjective, I can't prove my patch is "better",
and I didn't try to test it.

So I won't argue with Konstantin who dislikes my patch, although I
would like to know the reason.

I don't remember why I dislike your patch.
For now I can only say ACK )

We'll need a changelogged signed-off patch, please Oleg. And some evidence
that it was tested would be nice ;)

I will reboot in a few hours, finally, after a few days ... I am running the first
patch. I will try to test the second/alternative patch more quickly. Sorry for
the delay.


easiest way to trigger this bug:

#define _GNU_SOURCE
#include <unistd.h>
#include <sched.h>
#include <sys/syscall.h>
#include <sys/mman.h>

static inline int sys_clone(unsigned long flags, void *stack, int *ptid, int *ctid)
{
	return syscall(SYS_clone, flags, stack, ptid, ctid);
}

int main(int argc, char **argv)
{
	void *page;

	page = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
	sys_clone(CLONE_VFORK | CLONE_VM | CLONE_CHILD_CLEARTID, NULL, NULL, page);
}


I am getting segfaults with this.

(gdb) where
#0 0x0000000000000000 in ?? ()
#1 0x00007f430f70a7e0 in __elf_set___libc_subfreeres_element_free_mem__ () from /lib64/libc.so.6
#2 0x00007f430f70a7e8 in __elf_set___libc_atexit_element__IO_cleanup__ () from /lib64/libc.so.6
#3 0x0000000000000001 in ?? ()
#4 0x0000000000000000 in ?? ()
(gdb)

What number should I give it as an argument? ;-)

there are no arguments.

yeah, it corrupts the stack. I'm too lazy to write it properly =)
but on a non-patched kernel it also triggers this bug:
[206732.025131] BUG: Bad rss-counter state mm:ffff88000d8a6c80 idx:1 val:-1
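
For anyone grepping logs for this message, here is a minimal sketch (not from the thread) that pulls the three fields out of such a line. It assumes the layout seen in the log line above, and it assumes the 3.4-era counter order MM_FILEPAGES, MM_ANONPAGES, MM_SWAPENTS for the idx field. The names struct rss_bug, rss_counter_name() and parse_rss_bug() are made up for illustration.

```c
#include <stdio.h>
#include <string.h>

/* Parsed fields of a "Bad rss-counter state" line (hypothetical helper type). */
struct rss_bug {
	unsigned long long mm;	/* mm_struct address as printed */
	int idx;		/* index of the rss counter */
	long val;		/* leftover value of that counter */
};

/* Counter names, ASSUMING the 3.4-era enum order in the kernel headers. */
static const char *rss_counter_name(int idx)
{
	switch (idx) {
	case 0: return "MM_FILEPAGES";
	case 1: return "MM_ANONPAGES";
	case 2: return "MM_SWAPENTS";
	default: return "unknown";
	}
}

/* Returns 0 on success, -1 if the line does not carry the message. */
static int parse_rss_bug(const char *line, struct rss_bug *out)
{
	/* Skip any dmesg timestamp prefix such as "[206732.025131] ". */
	const char *p = strstr(line, "BUG: Bad rss-counter state");

	if (!p)
		return -1;
	if (sscanf(p, "BUG: Bad rss-counter state mm:%llx idx:%d val:%ld",
		   &out->mm, &out->idx, &out->val) != 3)
		return -1;
	return 0;
}
```

Under that assumed ordering, idx:1 would be the anonymous-page counter, which fits a reproducer built around an anonymous mapping.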


this version works without segfaults =)

#define _GNU_SOURCE
#include <stdlib.h>
#include <sched.h>
#include <sys/mman.h>

int child(void *arg)
{
	return 0;
}

char stack[4096];

int main(int argc, char **argv)
{
	void *page;

	page = mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0);
	clone(child, stack + sizeof(stack), CLONE_VFORK | CLONE_VM | CLONE_CHILD_CLEARTID, NULL, NULL, NULL, page);
	return 0;
}
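
A more defensive variant of the reproducer above, as a sketch only: it checks the return values and reaps the child. The helper name run_reproducer(), the 64 KiB stack size and the alignment are my own choices, not something from this thread. I am also assuming that the page passed as ctid should stay untouched from user space, so that the kernel's write at child exit is the first access to it; that reading of the bug is an assumption.

```c
#define _GNU_SOURCE
#include <sched.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>

/* Child stack; size and alignment are arbitrary, conservative choices. */
static char child_stack[64 * 1024] __attribute__((aligned(16)));

static int child(void *arg)
{
	(void)arg;
	return 0;	/* the child terminates when this function returns */
}

/* Returns 0 if the child was created and reaped, -1 on any error. */
static int run_reproducer(void)
{
	pid_t pid;
	/* Deliberately never read or written here (see the assumption above). */
	void *page = mmap(NULL, 4096, PROT_READ | PROT_WRITE,
			  MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);

	if (page == MAP_FAILED) {
		perror("mmap");
		return -1;
	}

	/* Stack grows down on the usual targets, so pass the top of the buffer. */
	pid = clone(child, child_stack + sizeof(child_stack),
		    CLONE_VFORK | CLONE_VM | CLONE_CHILD_CLEARTID,
		    NULL, NULL, NULL, page);
	if (pid == -1) {
		perror("clone");
		munmap(page, 4096);
		return -1;
	}

	/* No SIGCHLD was requested in the flags, so wait with __WALL. */
	if (waitpid(pid, NULL, __WALL) == -1) {
		perror("waitpid");
		munmap(page, 4096);
		return -1;
	}

	munmap(page, 4096);
	return 0;
}
```

Call run_reproducer() from main() and then look at dmesg for the "Bad rss-counter state" line on a non-patched kernel.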

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/