Re: [PATCH v3 2/2] Make hard lockup detection use timestamps

From: ZAK Magnus
Date: Fri Jul 29 2011 - 19:44:42 EST


Are you saying that any call to printk() will touch the watchdogs? I
wasn't aware of that and it doesn't seem to match my observations
very well, either. Then again, at the moment I don't
understand some of the things I'm currently seeing so I could just be
wrong.

On Fri, Jul 29, 2011 at 1:55 PM, Don Zickus <dzickus@xxxxxxxxxx> wrote:
> On Thu, Jul 28, 2011 at 05:16:00PM -0700, ZAK Magnus wrote:
>> No news?
>>
>> I've been testing and looking into issues and I realized dump_stack()
>> calls touch_nmi_watchdog(). That wrecks what the patch is trying to do
>> so I'm changing it to save the trace and print it later after the
>> stall has completed. This would also resolve some other things you
>> were saying weren't so good. Hopefully the logic is similar enough
>> that some things you may have learned still apply.
>
> So yeah, the act of printing was resetting the softlockup counter and
> delaying it forever.  In parallel, rcu has its own stall detector that was
> going off after a minute or two.
>
> Once I routed the printk to trace_printk and disabled dump_stack,
> everything started working as expected.
>
> Now the question is how to avoid shooting ourselves in the foot, i.e. how
> to printk a message without resetting the hard/soft lockup watchdogs.
>
> I'll have to think about how to do that.  If you can come up with any
> ideas let me know.
>
> We almost need a quiet dump_stack that dumps to a buffer instead of the
> console.  But I am not sure that is worth the effort.
>
> Hmm.
>
> Cheers,
> Don
>
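
To make the "save the trace now, print it later" idea concrete, here is a
minimal userspace sketch in C. It is only an illustration under stated
assumptions, not kernel code and not the patch itself: struct deferred_log,
deferred_log_record(), deferred_log_flush() and fake_touch_watchdog() are
hypothetical names invented for this example, and fake_touch_watchdog()
merely stands in for the side effect that dump_stack() has through
touch_nmi_watchdog(). The point it tries to show is that recording into a
buffer leaves the watchdog untouched, and only the later flush touches it.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define DEFERRED_LOG_SIZE 1024

/* Hypothetical buffer for this sketch; not an existing kernel structure. */
struct deferred_log {
	char buf[DEFERRED_LOG_SIZE];
	size_t len;            /* bytes currently stored, excluding the NUL */
	unsigned int touches;  /* times the stand-in watchdog was touched */
};

/* Stand-in for the watchdog touch that printing to the console implies. */
static void fake_touch_watchdog(struct deferred_log *log)
{
	log->touches++;
}

/*
 * Record a message during the stall. Nothing is printed and the stand-in
 * watchdog is not touched. Text that does not fit is truncated.
 * Returns the number of bytes actually stored.
 */
static size_t deferred_log_record(struct deferred_log *log, const char *msg)
{
	size_t room, n;

	assert(log->len < DEFERRED_LOG_SIZE);
	room = DEFERRED_LOG_SIZE - 1 - log->len;
	n = strlen(msg);
	if (n > room)
		n = room;
	memcpy(log->buf + log->len, msg, n);
	log->len += n;
	log->buf[log->len] = '\0';
	return n;
}

/*
 * Print everything recorded so far, once the stall is over. Only here is
 * the stand-in watchdog touched. Returns the number of bytes flushed.
 */
static size_t deferred_log_flush(struct deferred_log *log, FILE *out)
{
	size_t n = log->len;

	if (n == 0)
		return 0;
	fwrite(log->buf, 1, n, out);
	fake_touch_watchdog(log);
	log->len = 0;
	log->buf[0] = '\0';
	return n;
}
```

A caller would zero-initialize a struct deferred_log, call
deferred_log_record() for each line of the trace while the stall is still
being measured, and call deferred_log_flush() after the stall has completed.
A fixed-size buffer with truncation is used here only to keep the sketch
free of allocation; a real implementation would have to decide how much
trace to keep.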