Re: rcu: INFO: rcu_preempt detected stalls on CPUs/tasks on v3.6

From: Michael Wang
Date: Tue Aug 07 2012 - 01:05:45 EST


On 08/07/2012 04:35 AM, Sasha Levin wrote:
> On 08/06/2012 10:31 PM, John Stultz wrote:
>> On 08/06/2012 11:28 AM, Sasha Levin wrote:
>>> On 08/06/2012 08:20 PM, John Stultz wrote:
>>>> On 08/06/2012 10:21 AM, John Stultz wrote:
>>>>> On 08/05/2012 09:55 AM, Sasha Levin wrote:
>>>>>> On 07/30/2012 03:17 PM, Avi Kivity wrote:
>>>>>>> Possible causes:
>>>>>>> - the APIC calibration in the guest failed, so it is programming too
>>>>>>> low values into the timer
>>>>>>> - it actually needs 1 us wakeups and then can't keep up (esp. as kvm
>>>>>>> interrupt injection is slowing it down)
>>>>>>>
>>>>>>> You can try to find out by changing
>>>>>>> arch/x86/kvm/lapic.c:start_lapic_timer() to impose a minimum wakeup of
>>>>>>> (say) 20 microseconds which will let the guest live long enough for you
>>>>>>> to ftrace it and see what kind of timers it is programming.
>>>>>> I've kept trying to narrow it down, and found out it's triggerable using adjtimex().
>>>> Sorry, one more question: Could you provide details on how it is triggerable using adjtimex?
>>> It triggers after a while of fuzzing using trinity of just adjtimex ('./trinity --quiet -l off -cadjtimex').
>>>
>>> Trinity is available here: http://git.codemonkey.org.uk/?p=trinity.git .
>>>
>>> Let me know if I can help further with reproducing this, I can probably copy over my testing environment to some other host if you'd like.
>> So far no luck. Dmesg mostly just gets filled up with trinity-child OOMs. How much memory are you running with?
>>
>> Are you running trinity as root or as some user that has CAP_SYS_TIME and can actually change values via adjtimex? Or does it trip just by reading the values?
>
> As root in a disposable vm. It triggers at a random point, not after a specific call.

I have tested again with a 3.6.0-rc1 guest, running the command:

./trinity --quiet -l off -cadjtimex --dangerous

As a normal user:
only OOM messages
As root:
the guest hung without any stall info being printed
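
The difference between the two cases above likely comes down to privilege: a read-only adjtimex() call (modes == 0) works for any user, while changing clock parameters needs CAP_SYS_TIME. Below is a minimal sketch of both kinds of call, assuming glibc's <sys/timex.h> wrapper. The helper names query_clock() and try_set_freq() are hypothetical and are not taken from trinity, which passes fuzzed values rather than sane ones.

```c
#include <errno.h>
#include <string.h>
#include <sys/timex.h>

/*
 * Hypothetical helper: read the kernel clock state without changing it.
 * With modes == 0, adjtimex() only fills in *tx.
 * Returns the clock state (TIME_OK, TIME_ERROR, ...) or -1 with errno set.
 */
int query_clock(struct timex *tx)
{
	memset(tx, 0, sizeof(*tx));
	tx->modes = 0;
	return adjtimex(tx);
}

/*
 * Hypothetical helper: write the current frequency offset back unchanged.
 * This exercises the "set" path without altering the clock on purpose.
 * Returns 0 on success or the errno value on failure; a caller without
 * CAP_SYS_TIME is expected to get EPERM.
 */
int try_set_freq(void)
{
	struct timex tx;
	long freq;

	if (query_clock(&tx) == -1)
		return errno;

	freq = tx.freq;
	memset(&tx, 0, sizeof(tx));
	tx.modes = ADJ_FREQUENCY;
	tx.freq = freq;

	return adjtimex(&tx) == -1 ? errno : 0;
}
```

Calling query_clock() as a normal user should succeed, while try_set_freq() should return EPERM unless it is run as root (or with CAP_SYS_TIME).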

I'm not sure how the trinity tool is implemented, but at least it does help
to trigger some rare kernel bugs...

Could you please also describe how you start the guest? Are there
any special options?

Regards,
Michael Wang

>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@xxxxxxxxxxxxxxx
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
