Re: Possible sandybridge livelock issue

From: Ingo Molnar
Date: Mon May 16 2011 - 02:29:59 EST



* James Bottomley <James.Bottomley@xxxxxxxxxxxxxxxxxxxxx> wrote:

> We've just come off a large round of debugging a kswapd problem over on
> linux-mm:
>
> http://marc.info/?t=130392066000001
>
> The upshot was that kswapd wasn't being allowed to sleep (which we're
> now fixing). However, in spite of intensive efforts, the actual hang
> was only reproducible on sandybridge laptops.
>
> When the hang occurred, kswapd basically pegged one core in 100% system
> time. This looks like there's something specific to sandybridge that
> causes this type of bad interaction. I was wondering if it could be
> something to do with a scheduling problem in turbo mode? Once kswapd
> goes flat out, the core it's on will kick into turbo mode, which causes
> it to get preferentially scheduled there, leading to the live lock.

There's no explicit 'schedule Sandybridge differently' logic in the scheduler.

Thus turbo mode can only affect scheduling by executing code faster. Executing
faster *does* mean more scheduling on that CPU: it's faster to do work so it's
faster back to idle again.

I.e. I can see Sandybridge being special only due to timing and performance
differences.

> The only evidence I have to support this theory is that when I reproduce the
> problem with PREEMPT, the core pegs at 100% system time and stays there even
> if I turn off the load. However, if I can execute work that causes kswapd to
> be kicked off the core it's running on, it will calm back down and go to
> sleep.

At first sight this looks like some sort of kswapd problem: if you put kswapd
into TASK_*INTERRUPTIBLE and schedule() it then the scheduler won't keep it
running, on Sandybridge or elsewhere. The scheduler can't magically make kswapd
runnable unless there's some big bug in it. So you first need to examine why
kswapd never schedules to idle.
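
To illustrate the point, here is a minimal userspace sketch, not kernel code.
Every name in it (toy_task, toy_schedule(), toy_simulate() and so on) is
hypothetical and only loosely mirrors the kernel's task states and schedule().
It assumes a toy model in which a task leaves the runqueue only if it marked
itself as sleeping before calling the scheduler:

```c
#include <assert.h>
#include <stdbool.h>

/* Toy task states, named after the kernel's but NOT the kernel's definitions. */
enum toy_state { TOY_RUNNING, TOY_INTERRUPTIBLE };

/* Hypothetical, heavily simplified stand-in for a task. */
struct toy_task {
    enum toy_state state;
    bool on_runqueue;
    unsigned long ticks_run;   /* CPU "ticks" consumed so far */
};

/* Toy schedule(): only a task that marked itself as sleeping is dequeued. */
void toy_schedule(struct toy_task *t)
{
    if (t->state != TOY_RUNNING)
        t->on_runqueue = false;
}

/* One pass of a hypothetical daemon loop. */
void toy_daemon_iteration(struct toy_task *t, bool work_pending)
{
    if (!t->on_runqueue)
        return;                          /* a sleeping task gets no CPU time */

    t->ticks_run++;                      /* do one tick of work */

    if (!work_pending)
        t->state = TOY_INTERRUPTIBLE;    /* decide to go to sleep */

    toy_schedule(t);                     /* give the scheduler a chance */
}

/* Run n iterations and return how many ticks the task consumed. */
unsigned long toy_simulate(bool work_always_pending, unsigned int n)
{
    struct toy_task t = { TOY_RUNNING, true, 0 };
    unsigned int i;

    for (i = 0; i < n; i++)
        toy_daemon_iteration(&t, work_always_pending);

    return t.ticks_run;
}

/* Self-check covering both cases of the toy model. */
void toy_self_check(void)
{
    /* Never marks itself as sleeping: consumes every tick. */
    assert(toy_simulate(true, 1000) == 1000);
    /* Marks itself as sleeping on the first pass: consumes one tick. */
    assert(toy_simulate(false, 1000) == 1);
}
```

In this model the task that always believes it has work pending consumes all
1000 ticks, while the one that sets its sleeping state before calling
toy_schedule() stops after the first. The scheduler side is identical in both
cases, which is why the question to answer is what keeps the daemon from ever
reaching the "go to sleep" branch.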

Thanks,

Ingo