[PATCH v1 6/6] scsi: ufs: Update the fast abort path in ufshcd_abort() for PM requests

From: Can Guo
Date: Thu May 13 2021 - 01:56:36 EST


If a PM request fails during runtime suspend/resume, the RPM framework
saves the error to dev->power.runtime_error. Until runtime_error is
cleared, runtime PM does not work on that device, leaving it stuck in
either the suspended or the active state.

When a PM request sent during runtime suspend/resume is aborted, the RPM
framework still saves the (TIMEOUT) error, even if the abort succeeds. A
successful abort alone is therefore not enough: error handling has to
recover and clear the runtime_error. So let PM requests take the fast
abort path in ufshcd_abort(), which also invokes the error handler.

Signed-off-by: Can Guo <cang@xxxxxxxxxxxxxx>
---
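Note (not part of the commit message): the runtime_error latching
described above can be illustrated with a small userspace model. This is
only a sketch under stated assumptions, not kernel code. Every name in it
(model_dev, model_rpm_resume, model_eh_recover, model_demo) is
hypothetical, and it assumes that a latched error makes later resume
attempts fail with -EINVAL until something clears it.

```c
/*
 * Simplified userspace model of runtime PM error latching.
 * All names are hypothetical; this is not the kernel implementation.
 */
#include <assert.h>
#include <errno.h>

struct model_dev {
	int runtime_error;	/* stands in for dev->power.runtime_error */
	int suspended;		/* 1 = suspended, 0 = active */
};

/* Try to resume; cb_ret is what the driver's resume callback would return. */
static int model_rpm_resume(struct model_dev *dev, int cb_ret)
{
	if (dev->runtime_error)
		return -EINVAL;	/* latched error: refuse until it is cleared */
	if (!dev->suspended)
		return 0;	/* already active, nothing to do */
	if (cb_ret) {
		/* e.g. -ETIMEDOUT after the PM request was aborted */
		dev->runtime_error = cb_ret;
		return cb_ret;
	}
	dev->suspended = 0;
	return 0;
}

/* Assumed effect of error handling: clear the error, set a known state. */
static void model_eh_recover(struct model_dev *dev, int suspended)
{
	dev->runtime_error = 0;
	dev->suspended = suspended;
}

/* Walks through the failure and the recovery; returns 0 on success. */
int model_demo(void)
{
	struct model_dev dev = { .runtime_error = 0, .suspended = 1 };

	/* The resume callback fails: the error is saved in the device. */
	assert(model_rpm_resume(&dev, -ETIMEDOUT) == -ETIMEDOUT);

	/* A later resume is refused even though the callback would succeed. */
	assert(model_rpm_resume(&dev, 0) == -EINVAL);
	assert(dev.suspended == 1);

	/* Once the error is cleared, runtime PM works again. */
	model_eh_recover(&dev, 1);
	assert(model_rpm_resume(&dev, 0) == 0);
	assert(dev.suspended == 0);
	return 0;
}
```

In this model, aborting the request successfully changes nothing by
itself: only the model_eh_recover() step unblocks later resume attempts,
which is the reason for routing PM requests to the error handler.
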
drivers/scsi/ufs/ufshcd.c | 21 ++++++++++++---------
1 file changed, 12 insertions(+), 9 deletions(-)

diff --git a/drivers/scsi/ufs/ufshcd.c b/drivers/scsi/ufs/ufshcd.c
index a6313cf40..2a814e2 100644
--- a/drivers/scsi/ufs/ufshcd.c
+++ b/drivers/scsi/ufs/ufshcd.c
@@ -2643,7 +2643,7 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)

lrbp = &hba->lrb[tag];
if (unlikely(lrbp->in_use)) {
- if (hba->wl_pm_op_in_progress)
+ if (cmd->request->rq_flags & RQF_PM)
set_host_byte(cmd, DID_BAD_TARGET);
else
err = SCSI_MLQUEUE_HOST_BUSY;
@@ -2690,7 +2690,7 @@ static int ufshcd_queuecommand(struct Scsi_Host *host, struct scsi_cmnd *cmd)
* err handler blocked for too long. So, just fail the scsi cmd
* sent from PM ops, err handler can recover PM error anyways.
*/
- if (hba->wl_pm_op_in_progress) {
+ if (cmd->request->rq_flags & RQF_PM) {
hba->force_reset = true;
set_host_byte(cmd, DID_BAD_TARGET);
goto out_compl_cmd;
@@ -6959,14 +6959,17 @@ static int ufshcd_abort(struct scsi_cmnd *cmd)
}

/*
- * Task abort to the device W-LUN is illegal. When this command
- * will fail, due to spec violation, scsi err handling next step
- * will be to send LU reset which, again, is a spec violation.
- * To avoid these unnecessary/illegal steps, first we clean up
- * the lrb taken by this cmd and mark the lrb as in_use, then
- * queue the eh_work and bail.
+ * This fast path guarantees the cmd always gets aborted successfully,
+ * meanwhile it invokes the error handler. It allows contexts, which
+ * are blocked by this cmd, to fail fast. It serves multiple purposes:
+ * #1 To avoid unnecessary/illegal abort attempts to the W-LU.
+ * #2 To avoid live lock between eh_work and specific contexts, i.e.,
+ * suspend/resume and eh_work itself.
+ * #3 To let eh_work recover runtime PM error in case abort happens
+ * to cmds sent from runtime suspend/resume ops.
*/
- if (lrbp->lun == UFS_UPIU_UFS_DEVICE_WLUN) {
+ if (lrbp->lun == UFS_UPIU_UFS_DEVICE_WLUN ||
+ (cmd->request->rq_flags & RQF_PM)) {
ufshcd_update_evt_hist(hba, UFS_EVT_ABORT, lrbp->lun);
spin_lock_irqsave(host->host_lock, flags);
if (lrbp->cmd) {
--
Qualcomm Innovation Center, Inc. is a member of Code Aurora Forum, a Linux Foundation Collaborative Project.