Re: ppoll() stuck on POLLIN while TCP peer is sending

From: Eric Wong
Date: Sat Dec 29 2012 - 06:34:25 EST


Eric Wong <normalperson@xxxxxxxx> wrote:
> Eric Wong <normalperson@xxxxxxxx> wrote:
> > I'm finding ppoll() unexpectedly stuck when waiting for POLLIN on a
> > local TCP socket. The isolated code below can reproduce the issue
> > after many minutes (<1 hour). It might be easier to reproduce on
> > a busy system while disk I/O is happening.
>
> Ugh, I can't seem to reproduce this anymore... Will try something
> else tomorrow.

The good news is I'm not imagining this...

The bad news is that the issue is real and took a long time to
reproduce again. It happens even without preempt and without
tcp_low_latency on 3.7.1.

While running `toosleepy', I also needed to run heavy (not loopback)
network and disk activity (several USB, SATA, and eSATA drives
simultaneously) for many hours before hitting this.

Hopefully this report is helpful in solving the issue. Looking at
the various pieces in the net and select/poll paths, there are several
references to race conditions in the comments, so this is hopefully
familiar territory to someone here...