Re: mei: cancel stall timers in mei_reset

From: Eugene Shatokhin
Date: Fri Nov 01 2013 - 08:38:24 EST


Hi,

In case my previous mail went to /dev/null: this is about the flood of error messages in the system log, like these:

mei_me 0000:00:16.0: reset: wrong host start response
mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS
mei_me 0000:00:16.0: reset: unexpected enumeration response hbm.
mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS

Even after the patches (https://lkml.org/lkml/2013/9/2/162) that went into kernel 3.10.15, the problem still shows up occasionally on my Lenovo X230 laptop with kernels 3.10.15, 3.10.16 and 3.11.6. It seems to be related to IRQ handling now.

When the problem occurs, mei_reset() is called repeatedly and the system logs grow rapidly until they consume all free disk space.

Most of the messages are output by mei_hbm_dispatch().

I added debug prints to the code of mei_hbm_dispatch() and found the following:

1. When the HOST_START_RES_CMD response is being handled, dev->hbm_state is sometimes 0x2, that is, MEI_HBM_ENUM_CLIENTS (!) rather than MEI_HBM_START, which it probably should be. That is why the error message is printed and mei_reset() is called again.

2. When the HOST_ENUM_RES_CMD response is being handled, dev->hbm_state is 0x1 (MEI_HBM_START) while it should be MEI_HBM_ENUM_CLIENTS. This also results in an error message and a call to mei_reset().

That is, dev->hbm_state contains an unexpected value in both cases. I have not yet figured out why this happens.
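
To make the two checks above concrete, here is a minimal user-space sketch of them. It is not the driver code: fake_dev, fake_reset(), fake_dispatch() and the HBM_NEXT state are hypothetical names made up for illustration, and the assumption that a reset puts the state back to "start" is mine. Only the values 0x1 (MEI_HBM_START) and 0x2 (MEI_HBM_ENUM_CLIENTS) come from the debug prints. It models the checks only and says nothing about why the state is wrong on the real hardware.

```c
#include <assert.h>
#include <stdio.h>

/* 0x1 and 0x2 match the values seen in the debug prints. */
enum hbm_state {
	HBM_IDLE = 0,
	HBM_START = 1,        /* stands for MEI_HBM_START */
	HBM_ENUM_CLIENTS = 2, /* stands for MEI_HBM_ENUM_CLIENTS */
	HBM_NEXT = 3          /* hypothetical: whatever follows enumeration */
};

/* Simplified stand-ins for the two responses discussed above. */
enum hbm_cmd { HOST_START_RES, HOST_ENUM_RES };

struct fake_dev {
	enum hbm_state hbm_state;
	int resets; /* how many times the fake reset was triggered */
};

/* Hypothetical stand-in for mei_reset(): assumed to restart the handshake. */
static void fake_reset(struct fake_dev *dev)
{
	dev->resets++;
	dev->hbm_state = HBM_START;
}

/*
 * Accept a response only if the state matches what that response expects.
 * Returns 0 if the response was accepted, -1 if it caused a reset.
 */
static int fake_dispatch(struct fake_dev *dev, enum hbm_cmd cmd)
{
	assert(dev != NULL);

	switch (cmd) {
	case HOST_START_RES:
		if (dev->hbm_state != HBM_START) {
			printf("reset: wrong host start response (hbm_state: 0x%x)\n",
			       (unsigned int)dev->hbm_state);
			fake_reset(dev);
			return -1;
		}
		dev->hbm_state = HBM_ENUM_CLIENTS;
		return 0;
	case HOST_ENUM_RES:
		if (dev->hbm_state != HBM_ENUM_CLIENTS) {
			printf("reset: unexpected enumeration response (hbm_state: 0x%x)\n",
			       (unsigned int)dev->hbm_state);
			fake_reset(dev);
			return -1;
		}
		dev->hbm_state = HBM_NEXT;
		return 0;
	}
	return -1;
}
```

In this model, a start response handled while the state is already 0x2, or an enumeration response handled while the state is still 0x1, takes the error path and triggers another reset. Those are the two combinations I see in the log.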

Do you know how to fix that?

Messages from the system log related to MEI:
-----------------
13:30:42 systemd-sleep[9364]: System resumed.
13:30:42 kernel: mei_me 0000:00:16.0: reset: properties response hbm wrong status.
13:30:42 kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS
<...>
13:30:47 kernel: mei_me 0000:00:16.0: wait hw ready failed. status = -110
<...>
13:31:17 kernel: mei_me 0000:00:16.0: reset: init clients timeout hbm_state = 1.
13:31:17 kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS
13:31:17 kernel: mei_me 0000:00:16.0: reset: wrong host start response (dev_state: 0x1, hbm_state: 0x2)
13:31:17 kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS
13:31:17 kernel: mei_me 0000:00:16.0: reset: unexpected enumeration response hbm (dev_state: 0x1, hbm_state: 0x1)
13:31:17 kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS
13:31:17 kernel: mei_me 0000:00:16.0: reset: wrong host start response (dev_state: 0x1, hbm_state: 0x2)
13:31:17 kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS
13:31:17 kernel: mei_me 0000:00:16.0: reset: unexpected enumeration response hbm (dev_state: 0x1, hbm_state: 0x1)
13:31:17 kernel: mei_me 0000:00:16.0: unexpected reset: dev_state = INIT_CLIENTS
-----------------

By the way, here is the MEI hardware info from the lspci output:
-----------------
00:16.0 Communication controller [0780]: Intel Corporation 7 Series/C210 Series Chipset Family MEI Controller #1 [8086:1e3a] (rev 04)
Subsystem: Lenovo Device [17aa:21fa]
Flags: bus master, fast devsel, latency 0, IRQ 11
Memory at f2535000 (64-bit, non-prefetchable) [size=16]
Capabilities: [50] Power Management version 3
Capabilities: [8c] MSI: Enable- Count=1/1 Maskable- 64bit+
Kernel modules: mei_me
-----------------

If you need other info, please let me know.

Regards,
Eugene

--
Eugene Shatokhin, ROSA Laboratory.
www.rosalab.com