Re: Race to power off harming SATA SSDs

From: David Woodhouse
Date: Mon May 08 2017 - 06:12:58 EST


On Mon, 2017-05-08 at 11:06 +0200, Ricard Wanderlof wrote:
>
> My point is really that say that the problem is in fact not that the erase
> is cut short due to the power fail, but that the software issues a second
> command before the first erase command has completed, for instance, or
> some other situation. Then we'd have a concrete situation which we can
> resolve (i.e., fix the bug), rather than assuming that it's the hardware's
> fault and implement various software workarounds.
> fault and implement various software workarounds.

On NOR flash we have *definitely* seen it during powerfail testing.

A block looks like it's all 0xFF when you read it back on mount, but if
you read it repeatedly, you may see bit flips because it wasn't
completely erased. And even if you read it ten times and 'trust' that
it's properly erased, it could start to show those bit flips when you
start to program it.

It was very repeatable, and that's when we implemented the 'clean
markers' written after a successful erase, rather than trusting a block
that "looks empty".
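
To make the idea concrete, here is a rough sketch of the clean-marker
logic as a toy Python model. It is not the actual filesystem code: the
Block class, the marker bytes and all function names are made up for
illustration, and the 'fully_erased' flag stands in for physical cell
state that software cannot observe by reading the block.

```python
ERASED = 0xFF
CLEAN_MARKER = b"CLNM"  # hypothetical marker bytes, chosen for this sketch


class Block:
    """Toy model of one NOR erase block (an assumption, not real hardware)."""

    def __init__(self, size=64):
        self.data = bytearray(size)   # starts out holding old data (all 0x00)
        self.fully_erased = False     # hidden physical state, invisible to reads

    def erase(self, interrupted=False):
        # Even an interrupted erase can read back as all 0xFF.
        self.data[:] = bytes([ERASED]) * len(self.data)
        self.fully_erased = not interrupted

    def program(self, offset, payload):
        # NOR programming can only clear bits (1 -> 0).
        for i, byte in enumerate(payload):
            self.data[offset + i] &= byte


def erase_and_mark(block, interrupted=False):
    """Erase, then write the marker only if the erase ran to completion."""
    block.erase(interrupted=interrupted)
    if interrupted:
        return  # power was lost before the marker could be written
    block.program(0, CLEAN_MARKER)


def looks_empty(block):
    return all(b == ERASED for b in block.data)


def is_trusted_clean(block):
    n = len(CLEAN_MARKER)
    return (bytes(block.data[:n]) == CLEAN_MARKER
            and all(b == ERASED for b in block.data[n:]))


def mount_scan(block):
    """Classify a block at mount time without trusting 'looks empty'."""
    if is_trusted_clean(block):
        return "clean"
    if looks_empty(block):
        return "needs-erase"  # all 0xFF but no marker: erase it again
    return "in-use"
```

The point of the design is that the marker is written only after the
erase has completed, so a block that reads as all 0xFF without a marker
gets erased again on mount instead of being used as-is.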
