Re: Is it a workqueue related issue in 2.6.37 (Was: Re: [libvirt]blkio cgroup [solved])

From: Tejun Heo
Date: Thu Feb 24 2011 - 09:31:16 EST


Hello,

On Thu, Feb 24, 2011 at 09:23:03AM -0500, Vivek Goyal wrote:
> On Thu, Feb 24, 2011 at 10:18:00AM +0100, Dominik Klein wrote:
>
> Hi Dominik,
>
> Thanks for the tests and reports. I also checked the latest logs, and
> I see that cfq has queued a work item but that work item never runs.
> I never see the cfq_kick_queue() trace message.
>
> I am CCing lkml and Tejun to see if he has any suggestions.
>
> Tejun,
>
> I will give you some details about what we have discussed so far.
>
> Dominik is trying out the blkio throttling feature to throttle some
> virtual machines. He is using 2.6.37 kernels, and once he launches 3
> virtual machines he notices that the system is more or less frozen.
> After running some traces we noticed that CFQ has requests but is no
> longer dispatching them to the devices.
>
> This problem does not show up with the deadline scheduler and also goes
> away with 2.6.38-rc6 kernels.

Hmmm... Maybe the following commit?

commit 7576958a9d5a4a677ad7dd40901cdbb6c1110c98
Author: Tejun Heo <tj@xxxxxxxxxx>
Date: Mon Feb 14 14:04:46 2011 +0100

    workqueue: wake up a worker when a rescuer is leaving a gcwq

    After executing the matching works, a rescuer leaves the gcwq
    whether there are more pending works or not. This may decrease
    the concurrency level to zero and stall execution until a new work
    item is queued on the gcwq.

    Make rescuer wake up a regular worker when it leaves a gcwq if
    there are more works to execute, so that execution isn't stalled.

    Signed-off-by: Tejun Heo <tj@xxxxxxxxxx>
    Reported-by: Ray Jui <rjui@xxxxxxxxxxxx>
    Cc: stable@xxxxxxxxxx
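
To make the stall described in that commit message concrete, here is a
toy userspace model in C. It is only a sketch and not the kernel code:
struct toy_gcwq and every toy_* function are hypothetical names made up
for illustration, and the model assumes a woken regular worker simply
drains whatever is still pending.

```c
#include <stdbool.h>

/* Toy stand-in for a gcwq; hypothetical, not the kernel structure. */
struct toy_gcwq {
    int pending;   /* work items queued on this gcwq */
    int running;   /* workers currently executing work items */
};

/* Assumption: a woken regular worker drains all pending work, then idles. */
static void toy_wake_up_worker(struct toy_gcwq *g)
{
    g->running++;
    g->pending = 0;
    g->running--;
}

/*
 * The rescuer executes only the work items that match its workqueue
 * (`matching` of them) and then leaves the gcwq. With `wake_on_leave`
 * set, it wakes a regular worker if anything is still pending, which
 * is the behaviour the commit message describes.
 */
static void toy_rescuer_pass(struct toy_gcwq *g, int matching,
                             bool wake_on_leave)
{
    g->running++;
    if (matching > g->pending)
        matching = g->pending;
    g->pending -= matching;
    g->running--;               /* rescuer leaves the gcwq */

    if (wake_on_leave && g->pending > 0)
        toy_wake_up_worker(g);
}

/* Stalled: work is pending but nothing is running to pick it up. */
static bool toy_stalled(const struct toy_gcwq *g)
{
    return g->pending > 0 && g->running == 0;
}
```

In this model, calling toy_rescuer_pass() with wake_on_leave == false on
a gcwq holding more work than the rescuer matches leaves toy_stalled()
true until something else queues new work. With wake_on_leave == true
the leftover items are handed to a regular worker instead.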

--
tejun