Re: Race to power off harming SATA SSDs

From: Hans de Goede
Date: Mon May 08 2017 - 05:09:53 EST


Hi,

On 08-05-17 11:06, Ricard Wanderlof wrote:

On Mon, 8 May 2017, David Woodhouse wrote:

Our empirical testing trumps your "can never happen" theory :)

I'm sure it does. But what is the explanation then? Has anyone analyzed
what is going on using an oscilloscope to verify the relationship between
the erase command and the supply voltage drop?

Not that I'm aware of. Once we had reached the "it does happen and we
have to cope" stage, there was not a lot of point in working out *why* it
happened.

In fact, the only examples I *personally* remember were on NOR flash,
which takes longer to erase. So it's vaguely possible that it doesn't
happen on NAND. But really, it's not something we should be depending
on and the software mechanisms have to remain in place.

My point is really this: suppose the problem is in fact not that the erase
is cut short by the power failure, but that, for instance, the software
issues a second command before the first erase command has completed.
Then we'd have a concrete situation which we can resolve (i.e., fix the
bug), rather than assuming that it's the hardware's fault and implementing
various software workarounds.
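
To make the hypothesised software bug concrete, here is a minimal toy
simulation. Everything in it is an assumption for illustration: ToyFlash,
its tick-based timing, and both driver functions are hypothetical and do
not model any real chip or the kernel's MTD code. It only shows how a
command issued while an erase is still in flight could leave a block
half-erased, and how polling for ready would avoid that.

```python
class ToyFlash:
    """Hypothetical flash model: an erase needs several ticks to complete."""

    ERASE_TICKS = 3  # assumed erase duration, in arbitrary ticks

    def __init__(self, blocks=4):
        self.data = [0x00] * blocks  # one byte stands in for a whole block
        self.busy_ticks = 0          # ticks left on the in-flight erase
        self.erasing = None          # block currently being erased, if any
        self.corrupt = set()         # blocks whose erase was interrupted

    def ready(self):
        return self.busy_ticks == 0

    def tick(self):
        """Advance time by one tick; finish the erase when it runs out."""
        if self.busy_ticks:
            self.busy_ticks -= 1
            if self.busy_ticks == 0:
                self.data[self.erasing] = 0xFF
                self.erasing = None

    def erase(self, block):
        """Start an erase; a command arriving mid-erase aborts the old one."""
        if not self.ready():
            self.corrupt.add(self.erasing)
        self.erasing = block
        self.busy_ticks = self.ERASE_TICKS


def erase_blocks_buggy(flash, blocks):
    """Issues the next erase after one tick, without checking for ready."""
    for b in blocks:
        flash.erase(b)
        flash.tick()


def erase_blocks_polled(flash, blocks):
    """Waits for the chip to report ready before issuing the next command."""
    for b in blocks:
        flash.erase(b)
        while not flash.ready():
            flash.tick()
```

In this toy model the buggy driver leaves block 0 marked as interrupted
with no power failure involved, which is the kind of concrete, fixable
cause I have in mind.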

You're forgetting that the SSD itself (this thread is about SSDs) also has
a major software component which is doing housekeeping all the time, so even
if the main CPU gets reset the SSD's controller may still happily be erasing
blocks.

Regards,

Hans