Re: applesmc oops in 3.10/3.11

From: Guenter Roeck
Date: Mon Sep 30 2013 - 23:37:48 EST

On 09/30/2013 06:57 PM, Chris Murphy wrote:

On Sep 27, 2013, at 5:33 PM, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:

On 09/27/2013 11:03 AM, Chris Murphy wrote:

On Sep 27, 2013, at 11:59 AM, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:

On Fri, Sep 27, 2013 at 11:41:42AM -0600, Chris Murphy wrote:

On Sep 27, 2013, at 11:12 AM, Guenter Roeck <linux@xxxxxxxxxxxx> wrote:

On Fri, Sep 27, 2013 at 12:21:04PM -0400, Josh Boyer wrote:
On Thu, Sep 26, 2013 at 2:34 AM, Henrik Rydberg <rydberg@xxxxxxxxxxx> wrote:
This suggests that initialization may be attempted more than once. The key cache
is allocated only once, but the number of keys is read for each attempt.

No idea if that can happen, but if the number of keys can increase after
the first initialization attempt you would have an explanation for the crash.
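The scenario described above can be sketched in userspace C. This is only an illustration, not the driver code: `struct smc_registers`, `struct cache_entry` and `init_registers()` are hypothetical stand-ins, and `calloc()`/`free()` replace the kernel allocators. The point is that once the cache is sized for one key count, a later and larger count must either be rejected or trigger a reallocation before anything indexes the cache.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for one cached key entry. */
struct cache_entry {
	char key[5];
	int valid;
};

/* Hypothetical stand-in for the driver's register state. */
struct smc_registers {
	unsigned int key_count;		/* number of entries the cache holds */
	struct cache_entry *cache;	/* kept across initialization attempts */
};

/*
 * Record the key count reported for this initialization attempt.
 * Returns 0 on success, -1 if the cache could not be allocated.
 *
 * If a cache already exists but was sized for a different count, it is
 * dropped and reallocated, so an index below key_count can never point
 * past the end of the array.
 */
static int init_registers(struct smc_registers *s, unsigned int count)
{
	assert(s != NULL);

	if (s->cache && s->key_count != count) {
		/* A warning about the changed count would be logged here. */
		free(s->cache);
		s->cache = NULL;
	}
	s->key_count = count;

	if (!s->cache)
		s->cache = calloc(s->key_count, sizeof(*s->cache));

	return s->cache ? 0 : -1;
}
```

Calling `init_registers()` twice with the same count keeps the existing cache; calling it again with a different count replaces the cache instead of leaving a stale, undersized one in place.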

Good idea, and easy enough to test with the patch below.

Should we apply this patch even though it may not solve the specific problem ?

Yes, why not - it certainly won't hurt. I am running it right now, so
it is at least run-tested.

Again, not sure if the key count can change, but the current code is at the very
least inconsistent, as it keeps reading the key count without updating or
verifying the cache size.

Yes - I agree that the error state is far-fetched, but it is hard to
see any other logical explanation. There is of course always the
possibility that the problem is somewhere else completely.

Proper patch attached.



From dedefba9167913c46e1896ce0624e68ffe95d532 Mon Sep 17 00:00:00 2001
From: Henrik Rydberg <rydberg@xxxxxxxxxxx>
Date: Thu, 26 Sep 2013 08:33:16 +0200
Subject: [PATCH] hwmon: (applesmc) Check key count before proceeding

After reports from Chris and Josh Boyer of a rare crash in applesmc,
Guenter pointed at the initialization problem fixed below. The patch
has not been verified to fix the crash, but should be applied regardless.

Reported-by: <jwboyer@xxxxxxxxxxxxxxxxx>
Suggested-by: Guenter Roeck <linux@xxxxxxxxxxxx>
Signed-off-by: Henrik Rydberg <rydberg@xxxxxxxxxxx>
drivers/hwmon/applesmc.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

Thanks for the quick reply. I'll get this rolled into our kernels soon.

I sent a pull request to Linus, so you should be able to pull it from
the upstream kernel shortly. Would be great to get feedback if the patch
solves the problem (or doesn't).

I'll start running it when it appears in koji. It's very transient, maybe one oops per week with lots of (other) testing. I'm not even sure if it happens on warm or cold boots or both.

When you do, can you possibly trigger an event based on the warning added
with the patch ? This might help us to identify if the problem fixed
with the patch actually happens.

I don't understand the question. I'm uncertain how to trigger, and also what event.

The patch includes a new warning message.

	pr_warn("key count changed from %d to %d\n",
		s->key_count, count);

It would be great to have a way to detect whether this message shows up
in the kernel log, because that would show that the potential crash condition
fixed by the patch was actually encountered. This would help us determine
whether we actually fixed the problem.
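One way to check for the message is to scan saved kernel log lines for it. The helper below is a minimal sketch with a hypothetical name, `parse_key_count_warning()`. It assumes the line carries the `applesmc: ` prefix, as in the log line quoted later in this thread, and it reads the counts as unsigned values.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Look for the "key count changed" warning in one log line.
 * Returns 1 and stores both counts if the warning is present,
 * 0 otherwise. Any leading timestamp is skipped by searching
 * for the message text rather than matching from the line start.
 */
static int parse_key_count_warning(const char *line,
				   unsigned int *old_count,
				   unsigned int *new_count)
{
	static const char tag[] = "applesmc: key count changed from ";
	const char *p;

	assert(line && old_count && new_count);

	p = strstr(line, tag);
	if (!p)
		return 0;

	/* sizeof(tag) - 1 skips the tag without its terminating NUL. */
	return sscanf(p + sizeof(tag) - 1, "%u to %u",
		      old_count, new_count) == 2;
}
```

Feeding it each line of a saved dmesg output and reporting any line for which it returns 1 would show whether the condition was hit on a given boot.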

Of course, we'll know it wasn't fixed if the system still crashes.

Warning message triggered with 3.12.0-0.rc3.git0.1.fc21.x86_64.

[ 10.886016] applesmc: key count changed from 261 to 1174405121

Explains the crash, but the new key count is very wrong. 1174405121 = 0x46000001.
Which I guess explains the subsequent memory allocation error in the log.
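The arithmetic can be checked with a small helper. This is a sketch only: `split_be32()` and `cache_bytes()` are hypothetical names, and the 16-byte entry size used in the note below is an assumption for illustration, not the size of the driver's real cache entry.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Split a 32-bit value into its four bytes, most significant first. */
static void split_be32(uint32_t v, uint8_t out[4])
{
	assert(out != NULL);

	out[0] = (uint8_t)(v >> 24);
	out[1] = (uint8_t)(v >> 16);
	out[2] = (uint8_t)(v >> 8);
	out[3] = (uint8_t)v;
}

/*
 * Bytes needed for a cache of `count` entries of `entry_size` bytes
 * each. The multiplication is done in 64 bits so it cannot wrap.
 */
static uint64_t cache_bytes(uint32_t count, size_t entry_size)
{
	return (uint64_t)count * entry_size;
}
```

For 1174405121 the bytes come out as 0x46, 0x00, 0x00, 0x01, while the original count of 261 is 0x00000105. With an assumed 16 bytes per entry, a cache for 1174405121 keys would need 18790481936 bytes, roughly 17.5 GiB, so a failed allocation is the expected outcome.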

Henrik, any idea what might be going on ? Is it possible that the previous
command failure leaves some state machine in a bad state ?


Attaching new full dmesg to the bug report:

