Re: Possible "struct pid" leak from tty_io.c

From: Catalin Marinas
Date: Fri Mar 09 2007 - 11:45:03 EST


Eric,

For a longer explanation, see the second part of this e-mail. In
short, the patch below seems to fix this particular leak. I'm not sure
that's the correct/complete fix as I seem to still get a 2nd report.
Any info is welcomed.

diff --git a/drivers/char/tty_io.c b/drivers/char/tty_io.c
index e453268..4e33dc2 100644
--- a/drivers/char/tty_io.c
+++ b/drivers/char/tty_io.c
@@ -1375,6 +1375,9 @@ static void do_tty_hangup(struct work_struct *work)
}
read_unlock(&tasklist_lock);

+ put_pid(tty->session);
+ put_pid(tty->pgrp);
+
tty->flags = 0;
tty->session = NULL;
tty->pgrp = NULL;

On 08/03/07, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
"Catalin Marinas" <catalin.marinas@xxxxxxxxx> writes:
> The /sbin/init application calls sys_clone() a few times but only one
> leak is reported (see below). Looking at the reported pid object (at
> 0xc7c14500), count is 2 and nr is 296 but no process with pid 296
> exists any more.
[...]
> unreferenced object 0xc7c14500 (size 36):
> comm "init", pid 245, jiffies 4294939289
> backtrace:
> [<c0070c18>] kmem_cache_alloc
> [<c003a528>] alloc_pid
> [<c0026468>] do_fork
> [<c00153b0>] sys_clone
> [<c0010f80>] ret_fast_syscall

I think this is the path that all pid structures come from so
unfortunately that doesn't help tracing this problem down.

No, indeed, but that's the only thing kmemleak can report. Anyway, I
got some more information now, after adding several printk's:

The difference from other pid objects is that this one (with nr 296)
is passed as a parameter to proc_set_tty(). The __proc_set_tty()
function increments the pid->count twice via get_pid(), and, with two
other get_pid calls, the pid->count for this object gets to 5 (1 being
the initial value). The prints below are function name, struct pid
address (different from the runs yesterday though), pid->nr and
pid->count (after get_pid incrementing). It also show the return
address and symbol (the calling function):

alloc_pid: c7c149d8, 296, 1
get_pid: c7c149d8, 296, 2
return: c0122d64 (proc_set_tty+0x34/0x54)
get_pid: c7c149d8, 296, 3
return: c0122d64 (proc_set_tty+0x34/0x54)
get_pid: c7c149d8, 296, 4
return: c002b328 (do_exit+0x2e4/0x7f8) - this is actually the get_pid
in disassociate_ctty but it is reported like this because of get_pid
inlining
get_pid: c7c149d8, 296, 5
return: c0124a0c (tty_vhangup+0x14/0x18)

On the exit path (see below), however, put_pid is called twice before
free_pid and once via release_task -> detach_pid -> free_pid -> ... ->
__rcu_process_callbacks -> delayed_put_pid -> put_pid. Note that
free_pid is called with pid->nr == 3 and the last put_pid gets called
with nr == 3 as well (but it decrements it to 2 and that's what I find
at that memory location). In the trace below, the pid->count is
printed before put_pid modifies it:

put_pid: c7c149d8, 296, 5
return: c0124b5c (disassociate_ctty+0x14c/0x230)
put_pid: c7c149d8, 296, 4
return: c0124ba8 (disassociate_ctty+0x198/0x230)
detach_pid: c7c149d8, 296, 3
return: c002a230 (release_task+0x1c0/0x358)
detach_pid: c7c149d8, 296, 3
return: c002a248 (release_task+0x1d8/0x358)
detach_pid: c7c149d8, 296, 3
return: c002a254 (release_task+0x1e4/0x358)
free_pid: c7c149d8, 296, 3
return: c003a990 (detach_pid+0xac/0xc8)
...
delayed_put_pid: c7c149d8, 296, 3
return: c003af68 (__rcu_process_callbacks+0x19c/0x25c)
put_pid: c7c149d8, 296, 3
return: c003a8cc (delayed_put_pid+0x54/0x6c)

In the above disassociate_ctty() function the code below (line 1542)
doesn't seem to get called:

tty = get_current_tty();
if (tty) {
put_pid(tty->session);
put_pid(tty->pgrp);
tty->session = NULL;
tty->pgrp = NULL;
} else {

and I get the following error if TTY_DEBUG_HANGUP is defined - "error
attempted to write to tty [0x00000000] = NULL".

It looks like the tty_vhangup() call in in disassociate_ctty() sets
current->signal->tty to NULL in the do_each_pid_task loop in
do_tty_hangup (p->signal->tty = NULL). The second call to
get_current_tty() in disassociate_ctty() return NULL and therefore no
put_pid on tty->session and tty->pgrp (which are also set to NULL in
the previous function).

Regards.

--
Catalin
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/