Re: gdb strangeness Under 2.3.11-pre1

Alexander Viro (viro@math.psu.edu)
Sun, 18 Jul 1999 16:27:48 -0400 (EDT)


On Sun, 18 Jul 1999, Tim Waugh wrote:

> On Sun, 18 Jul 1999, Alexander Viro wrote:
>
> > > do the messages go away?
> >
> > See if the following patch helps. It boots and runs, and the leak is gone. No
> > problems so far. I'm submitting it to Linus.
>
> Yes, that works fine here. No worrying messages.
>
> But lazy-tlb tasks don't show up in 'ps'. That is, unless you apply this
> patch.. ;-)
>
> --- linux/fs/proc/array.c~ Sun Jul 18 20:49:57 1999
> +++ linux/fs/proc/array.c Sun Jul 18 20:54:47 1999
> @@ -459,7 +459,7 @@
>          p = find_task_by_pid(pid);
>          read_unlock(&tasklist_lock);    /* FIXME!! This should be done after the last use */
>
> -        if (!p || !p->mm)
> +        if (!p || !p->mm || (p->flags & PF_LAZY_TLB))
>                  return 0;
>          return get_array(p, p->mm->env_start, p->mm->env_end, buffer);
>  }
> @@ -1378,7 +1378,8 @@
>                  ok = p->dumpable;
>                  if(!cap_issubset(p->cap_permitted, current->cap_permitted))
>                          ok=0;
> -                if(!p->mm)      /* Scooby scooby doo where are you ? */
> +                if(!(p->flags & PF_LAZY_TLB) && !p->mm)
> +                        /* Scooby scooby doo, where are you ? */
>                          p=NULL;
>          }
>
> I think that deals with all of the things to watch for in proc/array.c
> now.

Looks sane. The only thing that still worries me is the fact that we are
making a potentially blocking call (mmput()) from the tail of schedule().
It's an SMP-only problem, but there it may give us a lot of shit.
See how it can happen:
	CPU#1                   CPU#2
	Process A               Process B
	A blocks
	Lazy thread C           Process B
	                        B blocks
	Lazy thread C           Process A
	                        A exits
	Lazy thread C           Process B
	C blocks
Whatever gets control next will have to mmput() the last reference to the
memory context. It may block. IOW, it can use up a lot of stack space and
sleep in schedule() again. Nothing guarantees that we will not hit the same
situation when it wakes up again. Repeat until the stack overflows ;-/
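
A minimal user-space sketch of the reference counting in the timeline
above. Everything here (struct mm_sim, mm_get(), mm_put(), the function
name) is a hypothetical stand-in for illustration, not the kernel's actual
code; it only shows why the final put lands at the switch away from lazy
thread C:

```c
/* Hypothetical stand-in for the kernel's memory context; only the
 * reference count matters for this illustration. */
struct mm_sim {
    int users;
};

static void mm_get(struct mm_sim *mm)
{
    mm->users++;
}

/* Returns 1 when the caller dropped the last reference, i.e. the point
 * where a real mmput() would go on to tear down the address space. */
static int mm_put(struct mm_sim *mm)
{
    return --mm->users == 0;
}

/* Replays the timeline for A's memory context. Returns 1 if the last
 * reference is dropped when lazy thread C is switched out. */
static int last_put_happens_at_switch_from_c(void)
{
    struct mm_sim mm_a = { 1 };     /* process A holds its own mm */

    mm_get(&mm_a);                  /* CPU#1: A blocks, C borrows A's mm */

    if (mm_put(&mm_a))              /* CPU#2: A exits, drops its own ref */
        return 0;                   /* not reached: C still holds a ref */

    return mm_put(&mm_a);           /* CPU#1: C blocks, borrowed ref dropped */
}
```

In this model the exit on CPU#2 is cheap, and the expensive teardown falls
to whoever switches away from C on CPU#1, which is the tail of schedule().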

It's a problem both with the original code and with the patched variant.
Probably too hard to exploit, but I wouldn't bet on it - the call paths
in mmput() can get *really* deep (mmput -> exit_mmap -> fput -> dput -> iput
-> inode->delete() -> ... and hell knows what if you were doing mmap()
over NFS). Maybe we could just hand them over to a designated thread or
postpone the thing until crossing the ring 3 boundary...
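
A rough sketch of the "designated thread" idea, again in user space and
with hypothetical names throughout (mm_put_deferred(), reap_pending(), the
pending list). It is single-threaded for clarity; a real version would
need a spinlock around the list and a wakeup for the reaper:

```c
#include <stddef.h>

/* Hypothetical stand-in for a memory context. */
struct mm_sim {
    int users;                      /* reference count */
    int torn_down;                  /* set once teardown has run */
    struct mm_sim *next_pending;    /* link in the deferred list */
};

/* Contexts whose last reference is gone but which are not torn down yet. */
static struct mm_sim *pending;

/* What the tail of schedule() would call: never blocks, only queues. */
static void mm_put_deferred(struct mm_sim *mm)
{
    if (--mm->users > 0)
        return;
    mm->next_pending = pending;
    pending = mm;
}

/* Body of the designated thread: free to block and to use deep stacks.
 * Returns the number of contexts torn down. */
static int reap_pending(void)
{
    int n = 0;

    while (pending) {
        struct mm_sim *mm = pending;

        pending = mm->next_pending;
        mm->next_pending = NULL;
        mm->torn_down = 1;          /* stands in for the real teardown path */
        n++;
    }
    return n;
}
```

The point of the split is that the scheduler path only touches a counter
and a list head, while the deep and possibly sleeping teardown runs on the
reaper's own stack.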
