Re: OOPS in perf_mmap_close()

From: Al Viro
Date: Thu May 23 2013 - 00:48:32 EST

On Wed, May 22, 2013 at 11:48:51PM -0400, Vince Weaver wrote:
> In case anyone cares, the Oops is happening here:
> 1a56: 48 c1 e8 0c shr $0xc,%rax
> 1a5a: 48 ff c0 inc %rax
> > 1a5d: f0 48 29 45 60 lock sub %rax,0x60(%rbp)
> 1a62: 49 8b 46 40 mov 0x40(%r14),%rax
> Which maps to this in perf_mmap_close() in kernel/events/core.c:
> atomic_long_sub((size >> PAGE_SHIFT) + 1, &user->locked_vm);
> And "user" (%rbp) is RBP: 0000000000000000, hence the problem.
> I'm having trouble tracking the problem back any further as the code is a
> bit convoluted and is not commented at all.

FWIW, at least part of perf_mmap_close() is obvious garbage - increment of
->pinned_vm happens in mmap(), decrement - on the ->close() of the last
VMA clonal to one we'd created in that mmap(), regardless of the address
space it's in. Not that handling of ->pinned_vm made any sense wrt fork()...

Actually... What happens if you mmap() the same opened file of that
kind several times, each time with the same size? AFAICS, on all
subsequent calls we'll get
	if (event->rb) {
		if (event->rb->nr_pages == nr_pages)
			...
		goto unlock;
	...
	if (!ret)
		...

i.e. we bump event->mmap_count *and* event->rb->refcount. munmap()
all of them and each will generate a call of perf_mmap_close(); ->mmap_count
will go down to zero and on all but the last call we'll have nothing else
done. On the last call we'll hit ring_buffer_put(), which will decrement
event->rb->refcount once. Note that by that point we simply don't know
how many times we'd incremented it in those mmap() calls - it's too late
to clean up. IOW, unless I'm misreading that code, we've got a leak in
there. Not the same bug, but...

