Re: [Possible REGRESSION, 4.16-rc4] Error updating SMART data during runtime and could not connect to lvmetad at some boot attempts

From: Hans de Goede
Date: Mon Mar 19 2018 - 05:51:04 EST


Hi Thorsten,

On 19-03-18 10:42, Thorsten Leemhuis wrote:
> Hi! On 11.03.2018 09:20, Martin Steigerwald wrote:
>
>> Since 4.16-rc4 (upgraded from 4.15.2 which worked) I have an issue
>> with SMART checks occasionally failing like this:
>
> Martin (or someone else): Could you give a status update? I have this
> issue on my list of regressions, but it's hard to follow as two
> different issues seem to be discussed. Or is it just one issue? Did the
> patch/discussion that Bart pointed to help? Is the issue still showing
> up in rc6?

You're right, there are 2 issues here:

1) The Crucial M500 SSD (at least the 480GB MU03 firmware version) does
not like enabling SATA link power-management at a level of min_power
or at the new(ish) med_power_with_dipm level. This problem exists in
older kernels too, so this is not really a regression.

New in 4.16 is a Kconfig option to enable SATA LPM by default, which
makes this existing problem much more noticeable. Not sure if you want
to count this as a regression. Either way I'm preparing a patch which
fixes this (by blacklisting LPM for this SSD model) and will send it
out right now.
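
For anyone who wants to check which LPM policy their system actually
ended up with, here is a minimal sketch. It assumes the policy is exposed
per host through the link_power_management_policy sysfs attribute under
/sys/class/scsi_host; the function names and the AFFECTED_POLICIES set
are made up for illustration, and only the two levels named above are
treated as problematic.

```python
from pathlib import Path

# The two policy levels reported above as problematic for the M500.
# Assumption: other kernels may expose additional policy names.
AFFECTED_POLICIES = {"min_power", "med_power_with_dipm"}


def read_lpm_policies(sysfs_root="/sys/class/scsi_host"):
    """Return {host_name: policy} for every host exposing an LPM policy."""
    policies = {}
    pattern = "host*/link_power_management_policy"
    for attr in sorted(Path(sysfs_root).glob(pattern)):
        try:
            policies[attr.parent.name] = attr.read_text().strip()
        except OSError:
            # Attribute vanished or is unreadable; skip this host.
            continue
    return policies


def hosts_at_risk(policies):
    """Return the hosts whose current policy is one of the affected levels."""
    return sorted(h for h, p in policies.items() if p in AFFECTED_POLICIES)


if __name__ == "__main__":
    current = read_lpm_policies()
    for host, policy in current.items():
        print(f"{host}: {policy}")
    for host in hosts_at_risk(current):
        print(f"{host} uses a policy the M500 reportedly does not like")
```

Until a fix lands, writing max_performance to that attribute as root
for the affected host should serve as a temporary workaround; it does
not persist across reboots.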

2) There seem to be some latency issues in the MU03 version of the
firmware, triggered by polling SMART data, which cause lvmetad to
time out in some cases. Note I'm not involved in that part of this
thread, but I believe that issue is currently unresolved.

Regards,

Hans