Re: Ugly rmap NULL ptr deref oopsie on hibernate (was Linux 2.6.34-rc3)

From: Linus Torvalds
Date: Fri Apr 02 2010 - 14:14:12 EST



I think this is likely due to the new scalable anon_vma linking by Rik.
Nothing else I can imagine should have introduced anything like it.

Rik: the pictures have the information, but you need to look at several to
see both the oops and the backtrace. Here's a condensed version:

shrink_all_memory ->
do_try_to_free_pages ->
shrink_zone ->
shrink_inactive_list ->
shrink_page_list ->
page_referenced

where page_referenced() oopses due to page_referenced_anon() as per
Borislav's description below.

Added all the usual suspects to the Cc list. Left the full report appended
so that the new people don't have to search for it on lkml.

Linus

On Fri, 2 Apr 2010, Borislav Petkov wrote:
>
> I've got the following oopsie two times now when hibernating - this
> means I don't get it every time I hibernate but only sometimes, say once
> in a blue moon.
>
> And yeah, I couldn't catch it over serial console so I had to make ugly
> pictures. By the way, the numbers in the filenames increment as I scroll
> down the whole oops (yep, it hadn't completely frozen and I still could
> do Shift->PgUp or Shift->PgDn on the console):
>
> http://www.kernel.org/pub/linux/kernel/people/bp/
>
> So, here's what I could decipher from the oopsie, someone else who's
> more knowledgeable in mm, rmap and anon_vma's list traversal should be
> able to tell what goes wrong there.
>
> EIP is at page_referenced+0xee
>
> which is
>
> <disasm>
> 10c4: 41 01 c4 add %eax,%r12d
> 10c7: 83 7d cc 00 cmpl $0x0,-0x34(%rbp)
> 10cb: 74 19 je 10e6 <page_referenced+0xff>
> 10cd: 4d 8b 6d 20 mov 0x20(%r13),%r13
> 10d1: 49 83 ed 20 sub $0x20,%r13
>
> 10d5: 49 8b 45 20 mov 0x20(%r13),%rax <--------------
>
> 10d9: 0f 18 08 prefetcht0 (%rax)
> 10dc: 49 8d 45 20 lea 0x20(%r13),%rax
> 10e0: 48 39 45 80 cmp %rax,-0x80(%rbp)
> </disasm>
>
>
> Corresponding asm:
>
> <asm>
> .loc 1 496 0
> movq 32(%r13), %r13 # <variable>.same_anon_vma.next, __mptr.451
> .LVL295:
> subq $32, %r13 #, avc
> .LVL296:
> .L184:
> .LBE1278:
> movq 32(%r13), %rax # <variable>.same_anon_vma.next, <variable>.same_anon_vma.next <----------------
> prefetcht0 (%rax) # <variable>.same_anon_vma.next
> leaq 32(%r13), %rax #, tmp97
> cmpq %rax, -128(%rbp) # tmp97, %sfp
> jne .L187 #,
> .L186:
> .loc 1 514 0
> movq %r14, %rdi # anon_vma,
> call page_unlock_anon_vma #
> </asm>
>
>
> and the NULL pointer in question is being written into %r13 and then 32
> is subtracted from it (I'm guessing container_of()). This is consistent
> with the register snapshot - %r13 contains 0xffffffffffffffe0 which is
> -32 and with the code dump in the oops, in CIMG1640.JPG code points to
> opcode 49 8b 45 20.
>
> Which is the following piece of code in <mm/rmap.c:page_referenced_anon()>.
>
> <source>
>
> mapcount = page_mapcount(page);
> list_for_each_entry(avc, &anon_vma->head, same_anon_vma) {
> struct vm_area_struct *vma = avc->vma;
> unsigned long address = vma_address(page, vma);
> if (address == -EFAULT)
> continue;
>
> </source>
>
> which tells us that same_anon_vma.next is NULL. Hmm...
>
> --
> Regards/Gruss,
> Boris.
>
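
For anyone following the container_of() arithmetic in the report above,
here is a minimal userspace sketch of it. It is not the kernel code:
struct avc_model, entry_from_link() and next_field_address() are
hypothetical names, and the field order is an assumption chosen so that
same_anon_vma sits at offset 32 on a 64-bit build, matching the 0x20
displacement in the disassembly. The arithmetic is done on integers so
that nothing is actually dereferenced.

```c
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/*
 * Hypothetical model of the chain entry. The field order is an assumption
 * picked so that same_anon_vma lands at offset 32 on a 64-bit build.
 */
struct avc_model {
    void *vma;
    void *anon_vma;
    struct list_head same_vma;
    struct list_head same_anon_vma;
};

/*
 * What container_of() computes from a same_anon_vma link pointer: step
 * back by the member offset. Uses uintptr_t so that a NULL link can be
 * examined without touching memory.
 */
static uintptr_t entry_from_link(uintptr_t link)
{
    return link - offsetof(struct avc_model, same_anon_vma);
}

/* Address the loop reads next: entry->same_anon_vma.next. */
static uintptr_t next_field_address(uintptr_t entry)
{
    return entry + offsetof(struct avc_model, same_anon_vma)
                 + offsetof(struct list_head, next);
}
```

With a NULL link, entry_from_link(0) gives 0xffffffffffffffe0 on a 64-bit
build, which is the %r13 value in the register snapshot, and
next_field_address() of that value wraps back to 0, which is the address
the marked load at 10d5 tries to read.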