[rfc] x86, ptrace: memory accounting for branch tracing

From: Markus Metzger
Date: Tue Dec 16 2008 - 10:25:07 EST


Account memory allocated for the BTS buffer to the traced task's
total_vm and locked_vm.

The mlock ulimit is typically quite low, which becomes a problem when
tracing multiple tasks. By accounting the memory to the tracee, we
should be OK when debugging multiple processes. When debugging multiple
threads, though, the ulimit may still be too low, since all threads of
a process share one mm.
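
To make the arithmetic concrete, here is a small user-space sketch of
the page accounting the patch performs. It is not kernel code: the
helper names bts_pages() and bts_fits() are hypothetical, and the page
size and limit are passed in rather than taken from the task.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Pages charged for a buffer of `size` bytes.
 * Mirrors PAGE_ALIGN(size) >> PAGE_SHIFT for a power-of-two page size.
 * (Hypothetical helper, for illustration only.)
 */
static unsigned long bts_pages(size_t size, unsigned long page_size)
{
	assert(page_size != 0 && (page_size & (page_size - 1)) == 0);
	return (size + page_size - 1) / page_size;
}

/*
 * Returns 1 if charging `pages` more pages keeps the mm within
 * `limit_bytes`, 0 otherwise. The patch rejects the allocation when
 * rlim < locked_vm + pgsz, so "fits" means locked + pages <= limit.
 */
static int bts_fits(unsigned long locked_pages, unsigned long pages,
		    unsigned long long limit_bytes, unsigned long page_size)
{
	unsigned long long limit_pages = limit_bytes / page_size;

	return locked_pages + pages <= limit_pages;
}
```

For example, assuming 4 KiB pages and a 64 KiB RLIMIT_MEMLOCK, the
limit is 16 pages. A 32 KiB buffer costs 8 pages, so two threads
sharing one mm fit exactly and a third does not.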


When the tracer dies, it does not detach properly from the traced
tasks. In particular, ptrace_detach() and ptrace_disable() are not
called.

Thus, BTS resources are not properly released. They will be released
in exit_thread() when the tracee dies. At that time, the task's mm has
already been destroyed, so we don't update total_vm and locked_vm.

While we're not actually leaking memory, we never give it back to
total_vm and locked_vm. As far as I understand the code, we started
with a copy anyway, so the changes are gone when the process
dies. That may be good enough.
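
The exit-time behaviour described above can be sketched in user space
as follows. This is only a model under stated assumptions: the types
mm_model and task_model, the function model_bts_free() and the fixed
4 KiB page size are all hypothetical stand-ins, and free() stands in
for kfree(). The point is merely that the release path skips the
accounting when the mm is already gone.

```c
#include <assert.h>
#include <stdlib.h>

#define MODEL_PAGE_SIZE 4096UL	/* assumed page size for the model */

/* Hypothetical stand-ins for the mm and task fields the patch touches. */
struct mm_model {
	unsigned long total_vm;
	unsigned long locked_vm;
};

struct task_model {
	struct mm_model *mm;	/* NULL once the mm has been torn down */
	void *bts_buffer;
	size_t bts_size;
};

/* Release the buffer; undo the accounting only if an mm still exists. */
static void model_bts_free(struct task_model *t)
{
	unsigned long pages =
		(t->bts_size + MODEL_PAGE_SIZE - 1) / MODEL_PAGE_SIZE;

	if (t->mm) {
		t->mm->total_vm -= pages;
		t->mm->locked_vm -= pages;
	}

	free(t->bts_buffer);
	t->bts_buffer = NULL;
	t->bts_size = 0;
}
```

In this model the buffer memory is always returned, but the counters
are only corrected while the mm is still around.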

If the debugger re-attaches, it will find an already allocated
BTS buffer and either use that one or deallocate it and allocate a
new one. In the latter case, the tracer will get the memory for the
existing buffer subtracted from its total_vm and locked_vm.

Two tasks working together (one allocating the BTS buffer and dying;
the other deallocating the BTS buffer) would thus be able to mlock an
unlimited amount of memory.
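
A toy model of that scenario, reading the paragraphs above literally:
it assumes the charge lands on the task that allocates the buffer and
the uncharge on the task that frees it. The struct acct type and the
functions charge(), uncharge() and loophole_round() are hypothetical
names for illustration, and a signed counter is used so the deficit is
visible (the kernel counters are unsigned and would wrap instead).

```c
#include <assert.h>

/* Hypothetical per-task accounting record. */
struct acct {
	long locked_pages;
};

static void charge(struct acct *a, long pages)
{
	a->locked_pages += pages;
}

static void uncharge(struct acct *a, long pages)
{
	a->locked_pages -= pages;
}

/*
 * One round of the scenario: a short-lived task is charged for the
 * buffer and dies, taking its counters with it; `freer` then releases
 * the buffer and is credited for pages it never paid for.
 * Returns the freer's resulting balance.
 */
static long loophole_round(struct acct *freer, long pages)
{
	struct acct allocator = { 0 };

	charge(&allocator, pages);	/* first task allocates the buffer */
	/* first task dies here; its accounting is discarded */
	uncharge(freer, pages);		/* second task frees the buffer */

	return freer->locked_pages;
}
```

Each round lowers the surviving task's balance by the buffer size, so
repeating it gives that task ever more headroom under the limit.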


I guess my approach is simply wrong. How would one approach this
correctly? Could anyone point me to some code?


Signed-off-by: Markus Metzger <markus.t.metzger@xxxxxxxxx>
---

Index: ftrace/arch/x86/kernel/ptrace.c
===================================================================
--- ftrace.orig/arch/x86/kernel/ptrace.c 2008-12-15 10:52:58.000000000 +0100
+++ ftrace/arch/x86/kernel/ptrace.c 2008-12-15 11:18:23.000000000 +0100
@@ -650,6 +650,56 @@
 	return drained;
 }

+static int ptrace_bts_allocate_buffer(struct task_struct *child, size_t size)
+{
+	unsigned long rlim, vm, pgsz;
+	int error = -ENOMEM;
+
+	pgsz = PAGE_ALIGN(size) >> PAGE_SHIFT;
+
+	down_write(&child->mm->mmap_sem);
+
+	rlim = child->signal->rlim[RLIMIT_AS].rlim_cur >> PAGE_SHIFT;
+	vm = child->mm->total_vm + pgsz;
+	if (rlim < vm)
+		goto out;
+
+	rlim = child->signal->rlim[RLIMIT_MEMLOCK].rlim_cur >> PAGE_SHIFT;
+	vm = child->mm->locked_vm + pgsz;
+	if (rlim < vm)
+		goto out;
+
+	child->bts_buffer = kzalloc(size, GFP_KERNEL);
+	if (!child->bts_buffer)
+		goto out;
+
+	child->bts_size = size;
+
+	child->mm->total_vm += pgsz;
+	child->mm->locked_vm += pgsz;
+
+	error = 0;
+ out:
+	up_write(&child->mm->mmap_sem);
+	return error;
+}
+
+static void ptrace_bts_free_buffer(struct task_struct *child)
+{
+	unsigned long pgsz = PAGE_ALIGN(child->bts_size) >> PAGE_SHIFT;
+
+	down_write(&child->mm->mmap_sem);
+
+	child->mm->total_vm -= pgsz;
+	child->mm->locked_vm -= pgsz;
+
+	up_write(&child->mm->mmap_sem);
+
+	kfree(child->bts_buffer);
+	child->bts_buffer = NULL;
+	child->bts_size = 0;
+}
+
 static int ptrace_bts_config(struct task_struct *child,
 			     long cfg_size,
 			     const struct ptrace_bts_config __user *ucfg)
@@ -679,14 +729,13 @@

 	if ((cfg.flags & PTRACE_BTS_O_ALLOC) &&
 	    (cfg.size != child->bts_size)) {
-		kfree(child->bts_buffer);
+		int error;

-		child->bts_size = cfg.size;
-		child->bts_buffer = kzalloc(cfg.size, GFP_KERNEL);
-		if (!child->bts_buffer) {
-			child->bts_size = 0;
-			return -ENOMEM;
-		}
+		ptrace_bts_free_buffer(child);
+
+		error = ptrace_bts_allocate_buffer(child, cfg.size);
+		if (error < 0)
+			return error;
 	}

 	if (cfg.flags & PTRACE_BTS_O_TRACE)
@@ -701,10 +750,8 @@
 	if (IS_ERR(child->bts)) {
 		int error = PTR_ERR(child->bts);

-		kfree(child->bts_buffer);
+		ptrace_bts_free_buffer(child);
 		child->bts = NULL;
-		child->bts_buffer = NULL;
-		child->bts_size = 0;

 		return error;
 	}
@@ -787,9 +834,7 @@
 		ds_release_bts(child->bts);
 		child->bts = NULL;

-		kfree(child->bts_buffer);
-		child->bts_buffer = NULL;
-		child->bts_size = 0;
+		ptrace_bts_free_buffer(child);
 	}
 #endif /* CONFIG_X86_PTRACE_BTS */
 }