Re: [PATCH 2/2 v2] watchdog: Always return NOTIFY_OK during cpuup/down events

From: WANG Cong
Date: Thu Mar 17 2011 - 08:17:15 EST


On Mon, 07 Mar 2011 16:37:40 -0500, Don Zickus wrote:

> This patch addresses a couple of problems. One was the case when the
> hardlockup failed to start, it also failed to start the softlockup.
> There were valid cases when the hardlockup shouldn't start and that
> shouldn't block the softlockup (no lapic, bios controls perf counters).
>
> The second problem was when the hardlockup failed to start on boxes
> (from a no lapic or bios controlled perf counter case), it reported
> failure to the cpu notifier chain. This blocked the notifier from
> continuing to start other more critical pieces of cpu bring-up (in our
> case based on a 2.6.32 fork, it was the mce). As a result, during soft
> cpu online/offline testing, the system would panic when a cpu was
> offlined because the cpu notifier would succeed in processing a watchdog
> disable cpu event and would panic in the mce case as a result of
> un-initialized variables from a never executed cpu up event.

What I saw was with microcode: its /sys entries failed to come up, which
triggered a warning when those entries were removed as the CPU went
offline again.

>
> I realized the hardlockup/softlockup cases are really just debugging
> aids and should never impede the progress of a cpu up/down event.
> Therefore I modified the code to always return NOTIFY_OK and instead
> rely on printks to inform the user of problems.
>

Yeah, it should also fix the problem I saw.
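
To illustrate the chain behaviour described above, here is a minimal
userspace sketch, not the kernel code. The return-code values are meant to
mirror include/linux/notifier.h; hardlockup_available, later_cb,
watchdog_cb_old, watchdog_cb_new and run_chain are hypothetical names made
up for this example, and the chain walk is heavily simplified.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Return codes intended to mirror include/linux/notifier.h. */
#define NOTIFY_OK        0x0001
#define NOTIFY_STOP_MASK 0x8000
#define NOTIFY_BAD       (NOTIFY_STOP_MASK | 0x0002)

typedef int (*notifier_fn)(int cpu);

/* Hypothetical stand-in for "no lapic" or "BIOS controls perf counters". */
static int hardlockup_available = 0;

/* Counts how many times the later callback actually ran. */
static int later_cb_runs = 0;

/* Old behaviour: a hardlockup start failure is reported to the chain. */
static int watchdog_cb_old(int cpu)
{
    (void)cpu;
    if (!hardlockup_available)
        return NOTIFY_BAD;
    return NOTIFY_OK;
}

/* Patched behaviour: log the problem, but always return NOTIFY_OK. */
static int watchdog_cb_new(int cpu)
{
    if (!hardlockup_available)
        printf("watchdog: hardlockup detector unavailable on cpu %d\n", cpu);
    return NOTIFY_OK;
}

/* Stand-in for a later, more critical callback (mce, microcode, ...). */
static int later_cb(int cpu)
{
    (void)cpu;
    later_cb_runs++;
    return NOTIFY_OK;
}

/*
 * Simplified chain walk: call each callback in order and stop as soon as
 * one returns a value with NOTIFY_STOP_MASK set. Returns the number of
 * callbacks that were called.
 */
static int run_chain(const notifier_fn *chain, size_t n, int cpu)
{
    size_t i;

    assert(chain != NULL);
    for (i = 0; i < n; i++) {
        int ret = chain[i](cpu);

        if (ret & NOTIFY_STOP_MASK)
            return (int)(i + 1);
    }
    return (int)n;
}
```

With a chain of { watchdog_cb_old, later_cb }, run_chain() stops after the
first callback and later_cb never runs, so its per-cpu state is never set
up. With { watchdog_cb_new, later_cb }, both callbacks run.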

Reviewed-by: WANG Cong <xiyou.wangcong@xxxxxxxxx>

Thanks.
