Re: [bug] stuck localhost TCP connections, v2.6.26-rc3+

From: Ingo Molnar
Date: Sat May 31 2008 - 02:10:27 EST



* Ingo Molnar <mingo@xxxxxxx> wrote:

> ah, in retrospect i realized that this test had one flaw: some of the
> systems in the build cluster already ran a newer kernel and hence were
> targets for this bug.
>
> so i turned off CONFIG_TCP_CONG_CUBIC on all the testboxes and
> rebooted the cluster boxes into 2.6.25, and the hung sockets are now
> gone. (about 150 successful iterations)
>
> i did another change as well: i removed the localhost distcc
> component. I'll reinstate that now to make sure it's really related to
> TCP_CONG_CUBIC and not to localhost networking.

ok, i added back the localhost distcc component, and the hung kernel
build + stuck TCP socket bug happened again overnight:

Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State
tcp    72187      0 10.0.1.14:3632          10.0.1.14:47910         ESTABLISHED
tcp        0 174464 10.0.1.14:47910         10.0.1.14:3632          ESTABLISHED

so it seems distcc over localhost was the aspect that made it fail.
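
For anyone who wants to scan their own boxes for the same pattern, here is a
rough Python sketch. It assumes "netstat -tn"-style text like the listing
above and flags ESTABLISHED connections whose two endpoints share an IP
address and that have bytes sitting in Recv-Q or Send-Q. The names
parse_netstat() and same_host_backlog() are made up for this illustration;
they are not part of any existing tool.

```python
from typing import List, NamedTuple


class Conn(NamedTuple):
    recv_q: int
    send_q: int
    local: str
    foreign: str
    state: str


def parse_netstat(text: str) -> List[Conn]:
    """Parse netstat-style TCP lines; headers and other lines are skipped."""
    conns = []
    for line in text.splitlines():
        fields = line.split()
        # A TCP connection line has exactly six whitespace-separated fields.
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        _proto, recv_q, send_q, local, foreign, state = fields
        conns.append(Conn(int(recv_q), int(send_q), local, foreign, state))
    return conns


def host(addr: str) -> str:
    """Strip the trailing :port from an address."""
    return addr.rsplit(":", 1)[0]


def same_host_backlog(conns: List[Conn]) -> List[Conn]:
    """ESTABLISHED connections to the same IP with queued, undelivered bytes."""
    return [
        c for c in conns
        if c.state == "ESTABLISHED"
        and host(c.local) == host(c.foreign)
        and (c.recv_q > 0 or c.send_q > 0)
    ]


# The listing from this report, used as sample input.
SAMPLE = """\
Active Internet connections (w/o servers)
Proto Recv-Q Send-Q Local Address Foreign Address State
tcp 72187 0 10.0.1.14:3632 10.0.1.14:47910 ESTABLISHED
tcp 0 174464 10.0.1.14:47910 10.0.1.14:3632 ESTABLISHED
"""

if __name__ == "__main__":
    for c in same_host_backlog(parse_netstat(SAMPLE)):
        print(c.local, "->", c.foreign, "Recv-Q", c.recv_q, "Send-Q", c.send_q)
```

A single snapshot with non-zero queues only shows a backlog, not a hang. To
call a socket stuck you would want two snapshots taken some time apart with
the queue sizes unchanged.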

_Perhaps_ what matters is to have the new post-rc3 TCP code on _both_
sides of the connection. But that is just a theory - it could be timing,
etc.

Ingo