Re: [PATCH] Re: [RFC PATCH] namespaces: fix leak on fork() failure

From: Mike Galbraith
Date: Fri May 04 2012 - 12:57:26 EST


On Fri, 2012-05-04 at 08:36 -0700, Eric W. Biederman wrote:
> Mike Galbraith <efault@xxxxxx> writes:
>
> > On Fri, 2012-05-04 at 07:13 -0700, Eric W. Biederman wrote:
> >> Mike Galbraith <efault@xxxxxx> writes:
>
> >> Did you have HZ=100 in that kernel? 400 tasks at 100Hz all serialized
> >> somehow and then doing synchronize_rcu at a jiffy each would account
> >> for 4 seconds. And the nsproxy certainly has a synchronize_rcu call.
> >
> > HZ=250
>
> Rats. Then none of my theories even approaches holding water.
>
> >> The network namespace is comparatively heavy weight, at least in the
> >> amount of code and other things it has to go through, so that would be
> >> my prime suspect for those 29 seconds. There are 2-4 synchronize_rcu
> >> calls needed to put the loopback device. Still we use
> >> synchronize_rcu_expedited and that work should be out of line and all of
> >> those calls should batch.
> >>
> >> Mike, is this something you are looking at pursuing further?
> >
> > Not really, but I can put it on my good intentions list.
>
> About what I expected. I just wanted to make certain I understood the
> situation.
>
> I will remember this as something weird and when I have time perhaps
> I will investigate and track it.
>
> >> I want to guess the serialization comes from waiting on children to be
> >> reaped but the namespaces are all cleaned up in exit_notify() called
> >> from do_exit() so that theory doesn't hold water. The worst case
> >> I can see is detach_pid from exit_signal running under the task list lock,
> >> but nothing sleeps under that lock. :(
> >
> > I'm up to my ears in zombies with several instances of the testcase
> > running in parallel, so I imagine it's the same with hackbench.
>
> Oh interesting.
>
> > marge:/usr/local/tmp/starvation # taskset -c 3 ./hackbench -namespace& for i in 1 2 3 4 5 6 7 ; do ps ax|grep defunct|wc -l;sleep 1; done
> > [1] 29985
> > Running with 10*40 (== 400) tasks.
> > 1
> > 397
> > 327
> > 261
> > 199
> > 135
> > 72
> > marge:/usr/local/tmp/starvation # Time: 7.675
>
> So if I read your output right the first second is spent running the
> code and the rest of the time is spent reaping zombies.
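
For what it's worth, a rough sketch of the drain rate implied by those
counts. It assumes the samples really are one second apart (sleep 1 plus
the cost of ps, so only approximately), and it skips the first sample,
which was taken before the zombies piled up. The helper name is made up
for illustration.

```python
# Zombie counts printed by the loop above, from the second sample on.
# Assumption: samples are ~1 second apart (sleep 1, plus ps overhead).
ZOMBIES = [397, 327, 261, 199, 135, 72]

HZ = 250                  # from this thread
JIFFY_MS = 1000 / HZ      # 4 ms per jiffy at HZ=250


def drops_per_sample(counts):
    """Return how many zombies disappeared between consecutive samples."""
    return [a - b for a, b in zip(counts, counts[1:])]


if __name__ == "__main__":
    drops = drops_per_sample(ZOMBIES)
    mean = sum(drops) / len(drops)
    ms_per_reap = 1000 / mean
    print("reaped per sample:", drops)
    print(f"mean: {mean:.1f}/s, i.e. {ms_per_reap:.1f} ms "
          f"({ms_per_reap / JIFFY_MS:.1f} jiffies) per zombie")
```

If the sampling assumption holds, that is a fairly steady rate of a few
jiffies per reaped task, rather than one big stall.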

The distance between these is mighty fishy.

marge:~ # grep 'signalfd_cleanup ' /trace2
vsftpd-9628 [003] .... 712.571961: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.575717: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.579698: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.587734: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.591671: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.595695: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.599685: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.603680: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.607682: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.611692: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.615740: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.619705: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.623730: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.627748: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.631712: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.635741: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.643683: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.647685: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.651691: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.655742: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.659738: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.663738: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.667756: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.671693: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.679682: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.683694: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.687750: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.691738: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.695751: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.699740: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.703736: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.707757: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.711685: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.715689: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.719694: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.723742: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.727752: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.731695: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.739687: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.743688: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.747697: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.751689: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.755688: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.759699: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.763705: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.767754: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.771702: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.775749: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.775884: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.783754: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.787754: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.791763: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.795764: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.799755: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.807768: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.835723: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.843695: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.847752: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.851694: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.855711: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.859704: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.863751: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.867754: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.871753: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.875765: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.879706: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.883696: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.887697: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.891711: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.898493: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.911740: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.927755: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.955754: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.975771: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.995826: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.003739: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.003920: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.011710: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.015831: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.023827: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.031694: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.035715: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.039714: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.043816: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.047726: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.051818: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.055724: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.059814: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.063725: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.067824: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.071825: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.075726: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.079709: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.083814: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.087850: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.095859: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.099826: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.103830: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.107726: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.111723: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 713.115874: signalfd_cleanup <-__cleanup_sighand
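
To put numbers on "the distance", here is a small sketch that computes
the gaps between consecutive timestamps and expresses them in jiffies.
The sample is just the first six lines of the trace above; HZ=250 is the
value mentioned earlier in the thread. The function names are made up
for illustration, and reading the gaps as "one call per tick" is my
interpretation, not something the trace proves.

```python
import re

HZ = 250                        # from this thread
JIFFY_US = 1_000_000 // HZ      # 4000 us per jiffy

# First six lines of the trace above, copied verbatim.
SAMPLE = """\
vsftpd-9628 [003] .... 712.571961: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.575717: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.579698: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.587734: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.591671: signalfd_cleanup <-__cleanup_sighand
vsftpd-9628 [003] d... 712.595695: signalfd_cleanup <-__cleanup_sighand
"""

# Matches the "seconds.microseconds:" field of a function-tracer line.
STAMP = re.compile(r"(\d+)\.(\d{6}):")


def stamps_us(text):
    """Return all trace timestamps in integer microseconds."""
    return [int(s) * 1_000_000 + int(us) for s, us in STAMP.findall(text)]


def gaps_us(text):
    """Return the gap between each pair of consecutive timestamps."""
    t = stamps_us(text)
    return [b - a for a, b in zip(t, t[1:])]


if __name__ == "__main__":
    for gap in gaps_us(SAMPLE):
        print(f"{gap:6d} us  ~{gap / JIFFY_US:.2f} jiffies")
```

For these lines the gaps come out at roughly one jiffy each, with one
gap of about two. Piping the whole of /trace2 through gaps_us() would
show whether that holds across the full run.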


