Re: tty breakage in X (Was: tty vs workqueue oddities)

From: Benjamin Herrenschmidt
Date: Fri Jun 03 2011 - 02:56:58 EST


On Fri, 2011-06-03 at 16:17 +1000, Benjamin Herrenschmidt wrote:

> Some more data: It -looks- like what happens is that the flush_to_ldisc
> work queue entry constantly re-queues itself (because the PTY is full ?)
> and the workqueue thread will basically loop forver calling it without
> ever scheduling, thus starving the consumer process that could have
> emptied the PTY.
>
> At least that's a semi half-assed theory. If I add a schedule() to
> process_one_work() after dropping the lock, the problem disappears.
>
> So there's a combination of things here that are quite interesting:
>
> - A lot of work queued for the kworker will essentially go on without
> scheduling for as long as it takes to empty all work items. That doesn't
> sound very nice latency-wise. At least on a non-PREEMPT kernel.
>
> - flush_to_ldisc seems to be nasty and requeues itself over and over
> again from what I can tell, when it can't push the data out, in this
> case, I suspect because the PTY is full but I don't know for sure yet.

Interesting results from x86. I could not initially reproduce there at
all on my little Atom board (the one from kernel summit last year).

Eventually I looked at the kernel config, switched off PREEMPT_VOLUNTARY
and I can now reproduce on x86 too. Again, if you have both threads/core
running, the problem isn't as visible (you do see "hickups" when cat'ing
a large file, the atom is slow enough I suppose).

But offline a cpu (leave only one up) and cat a large file (dmesg is
enough for me to trigger it) and you see the hangs.

So I think my theory stands that flush_to_ldisc constantly reschedule
itself causing the worker thread to eat all CPU and starve the consumer
of the PTY. I won't have time to dig much deeper today nor probably this
week-end so I'm sending this email for others who want to look.

Cheers,
Ben.


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/