Re: [PATCH v5 09/15] scsi: ufs: add error recovery after DL NAC error

From: Hannes Reinecke
Date: Tue Mar 01 2016 - 02:52:16 EST


On 02/28/2016 09:32 PM, Yaniv Gardi wrote:
> Some vendors' UFS devices send back-to-back NACs for the DL data
> frames, causing the host controller to raise the DFES error status.
> Sometimes such UFS devices send back-to-back NACs without waiting for
> a new retransmitted DL frame from the host, and in such cases the host
> UniPro may go into a bad state without raising the DFES error
> interrupt. If this happens, all pending commands time out only after
> their respective SW command timeout (which is generally too large).
>
> This change works around such device behaviour as follows:
> - As soon as SW sees the DL NAC error, it schedules the error
> handler.
> - The error handler sleeps for 50ms to see if any fatal errors are
> raised by the UFS controller.
> - If there are fatal errors, SW does normal error recovery.
> - If there are no fatal errors, SW sends the NOP command to the
> device to check if the link is alive.
> - If the NOP command times out, SW does normal error recovery.
> - If the NOP command succeeds, SW skips the error handling.
>
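
For anyone following the thread, the decision flow in the list above can
be sketched roughly as below. This is a standalone illustration, not code
from the patch: the names nac_action, nac_probe, nac_decide and
NAC_SETTLE_MS are hypothetical, and the real handler would do the 50ms
sleep and send the NOP itself rather than take the results as input.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical outcome of the DL NAC check; not a ufshcd type. */
enum nac_action {
	NAC_SKIP_RECOVERY,	/* link is alive, skip error handling */
	NAC_FULL_RECOVERY,	/* fall back to normal error recovery */
};

/* Hypothetical snapshot of what the error handler observed. */
struct nac_probe {
	bool fatal_error_seen;	/* fatal error raised during the wait */
	bool nop_timed_out;	/* NOP sent to the device timed out */
};

/* Wait time taken from the patch description (assumed to be in ms). */
#define NAC_SETTLE_MS 50

/*
 * Decide what to do after a DL NAC error, given the observations made
 * after the settle period. The NOP result only matters when no fatal
 * error was seen, because a fatal error already forces full recovery.
 */
static enum nac_action nac_decide(const struct nac_probe *p)
{
	assert(p != NULL);

	if (p->fatal_error_seen)
		return NAC_FULL_RECOVERY;
	if (p->nop_timed_out)
		return NAC_FULL_RECOVERY;
	return NAC_SKIP_RECOVERY;
}
```

Only the "no fatal error, NOP succeeded" case skips recovery; every other
combination ends in the normal error recovery path.
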
> If the DL NAC error is seen multiple times with a vendor's UFS
> devices, enable this quirk to initiate quick error recovery and to
> silence the related error logs, reducing kernel log spam.
>
> Signed-off-by: Subhash Jadavani <subhashj@xxxxxxxxxxxxxx>
> Signed-off-by: Yaniv Gardi <ygardi@xxxxxxxxxxxxxx>
>
> ---
> drivers/scsi/ufs/ufshcd.c | 93 +++++++++++++++++++++++++++++++++++++++++++++++
> drivers/scsi/ufs/ufshci.h | 2 +
> 2 files changed, 95 insertions(+)
>
Reviewed-by: Hannes Reinecke <hare@xxxxxxx>

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@xxxxxxx +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)