Re: [bug] stuck localhost TCP connections, v2.6.26-rc3+

From: Ilpo Järvinen
Date: Mon May 26 2008 - 13:08:59 EST


On Mon, 26 May 2008, Ingo Molnar wrote:

>
> * Ilpo Järvinen <ilpo.jarvinen@xxxxxxxxxxx> wrote:
>
> > > > Hmm, readfds is NULL isn't it?!? Are you sure you straced the
> > > > right process?
> > >
> > > yes, i'm stracing the task that is hung unexpectedly.
> >
> > But that wasn't the receiving process? (I didn't quickly find into
> > which direction distcc ports go, so I couldn't confirm this). If you
> > still have that situation at hand, could you check which is the
> > receiving process (e.g., using netstat -p, the end which has Recv-Q is
> > the right one) and where it's stuck?
>
> it wasn't the receiving process. There's no receiving process - which is
> weird:
>
> Active Internet connections (w/o servers)
> Proto Recv-Q Send-Q Local Address   Foreign Address  State        PID/Program name
> tcp        0 207232 europe:37198    europe:distcc    ESTABLISHED  19578/distcc
> tcp        0      0 europe:ssh      dione:36284      ESTABLISHED  -
> tcp        0      0 europe:ssh      e2:45910         ESTABLISHED  -
> tcp    72283      0 europe:distcc   europe:37198     ESTABLISHED  -

Just to be sure (please forgive me if you find this nearly an insult :-)),
did you have enough rights to find out the pid (i.e., if that process is not
owned by you, then you need superuser privileges for that)?
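
For reference, a rough sketch of why the PID column can show "-" without
root. As far as I know, netstat -p finds the owner by walking /proc/<pid>/fd
and matching "socket:[inode]" links, and those directories are unreadable
for other users' processes. The helper names below are made up for
illustration, and this is Python, not what netstat itself runs.

```python
import os
import re

SOCKET_LINK = re.compile(r"^socket:\[(\d+)\]$")


def socket_inode(link_target):
    """Return the inode from a 'socket:[12345]' fd link target, else None."""
    match = SOCKET_LINK.match(link_target)
    return int(match.group(1)) if match else None


def find_socket_owners(inode, proc_root="/proc"):
    """Scan proc_root/<pid>/fd for the given socket inode.

    Returns (owners, denied): PIDs holding the socket, and PIDs whose fd
    directory could not be read (typically not ours, and we are not root).
    """
    owners, denied = [], []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return owners, denied
    for entry in entries:
        if not entry.isdigit():
            continue  # skip non-process entries such as "self" or "net"
        fd_dir = os.path.join(proc_root, entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except PermissionError:
            denied.append(int(entry))
            continue
        except OSError:
            continue  # process exited while we were scanning
        for fd in fds:
            try:
                target = os.readlink(os.path.join(fd_dir, fd))
            except OSError:
                continue  # fd was closed in the meantime
            if socket_inode(target) == inode:
                owners.append(int(entry))
                break
    return owners, denied
```

If owners comes back empty while denied is not, the "-" only means "could
not look", so rerunning as root is the real check. If owners is still empty
as root, no process holds the socket.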

> i just gave it as a general example of why sometimes stracing a task can
> 'disturb' the observed system and can kick the TCP state machine out of
> a stall. I did not say it's occurring here.

Yeah, I understood that earlier. Similarly, I just wanted to point out
the end where the problem lies :-).
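
To make "the end which has Recv-Q" concrete, here is a small sketch that
picks such sockets out of /proc/net/tcp instead of netstat output. It
assumes the usual line layout there: hex address:port pairs, a hex state
(01 being ESTABLISHED), tx_queue:rx_queue in hex as the fifth field, and the
inode as the tenth. The function names are hypothetical.

```python
ESTABLISHED = 0x01  # TCP state code as printed in /proc/net/tcp


def parse_tcp_line(line):
    """Parse one data line of /proc/net/tcp; return None for other lines."""
    fields = line.split()
    if len(fields) < 10 or not fields[0].endswith(":"):
        return None  # header line or something we do not understand
    tx_hex, rx_hex = fields[4].split(":")
    return {
        "local_port": int(fields[1].rsplit(":", 1)[1], 16),
        "remote_port": int(fields[2].rsplit(":", 1)[1], 16),
        "state": int(fields[3], 16),
        "send_q": int(tx_hex, 16),
        "recv_q": int(rx_hex, 16),
        "inode": int(fields[9]),
    }


def stalled_receivers(lines):
    """Return ESTABLISHED entries that have unread data queued."""
    parsed = (parse_tcp_line(line) for line in lines)
    return [
        entry
        for entry in parsed
        if entry and entry["state"] == ESTABLISHED and entry["recv_q"] > 0
    ]
```

Passing an open /proc/net/tcp file object to stalled_receivers() should list
the receiving ends with data waiting. I believe an inode of 0 in such an
entry means the socket is no longer attached to any file, which would fit
the "not owned" observation, but I have not verified that for this case.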

> > ...It may still be that the receiving process is stuck due to the
> > non-net related changes you have there.
>
> the socket does not seem to be owned. It should have closed down?
> Refcounting issue?

It's quite possible that, e.g., net namespaces have some bug in the
handling of orphaned TCP sockets.

> find below the sysrq-t dump.

...I'll have a look at that as well (though with such dumps I'm on less
familiar territory, so it will take a moment).


--
i.