Re: [PATCH v4] scsi: ufs: Cleanup completed request without interrupt notification

From: Can Guo
Date: Fri Jul 31 2020 - 04:02:44 EST


Hi Bart,

On 2020-07-31 12:06, Bart Van Assche wrote:
> On 2020-07-30 18:30, Stanley Chu wrote:
>> On Mon, 2020-07-27 at 11:18 +0000, Avri Altman wrote:
>>> Looks good to me.
>>> But better wait and see if Bart has any further reservations.
>>
>> Would you have any further suggestions?
>
> Today is the first time that I took a look at ufshcd_abort(). The
> approach of that function looks wrong to me. This is how I think that a
> SCSI LLD abort handler should work:
> (1) Serialize against the completion path
> (__ufshcd_transfer_req_compl()) such that it cannot happen that the
> abort handler and the regular completion path both call
> cmd->scsi_done(cmd) at the same time. I'm not sure whether an existing
> synchronization object can be used for this purpose or whether a new
> synchronization object has to be introduced to serialize scsi_done()
> calls from __ufshcd_transfer_req_compl() and ufshcd_abort().
> (2) While holding that synchronization object, check whether the SCSI
> command is still outstanding. If so, submit a SCSI abort TMR to the device.
> (3) If the command has been aborted, call scsi_done() and return
> SUCCESS. If aborting failed and the command is still in progress, return
> FAILED.
>
> An example is available in srp_abort() in
> drivers/infiniband/ulp/srp/ib_srp.c.
>
> Bart.
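
For reference, the three steps quoted above can be sketched in plain
userspace C. This is only an illustration under stated assumptions, not
ufshcd or srp code: struct abort_demo_cmd, abort_demo_done(), the two
abort_demo_tmr_*() stubs and the DEMO_* return values are all
hypothetical names, and a pthread mutex stands in for whatever
synchronization object a real LLD would use. Returning success when the
command is no longer outstanding is also an assumption of the sketch.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical result codes; they only mirror the role of SUCCESS/FAILED. */
enum abort_demo_result { DEMO_SUCCESS, DEMO_FAILED };

/* Hypothetical stand-in for a command tracked by an LLD. */
struct abort_demo_cmd {
	pthread_mutex_t lock;	/* serializes completion against abort */
	bool outstanding;	/* true until one path finishes the command */
	int done_calls;		/* how often the done callback ran */
};

/* Stand-in for cmd->scsi_done(cmd); caller must hold cmd->lock. */
static void abort_demo_done(struct abort_demo_cmd *cmd)
{
	cmd->done_calls++;
}

/* Regular completion path: finishes the command only if still outstanding. */
static void abort_demo_complete(struct abort_demo_cmd *cmd)
{
	pthread_mutex_lock(&cmd->lock);
	if (cmd->outstanding) {
		cmd->outstanding = false;
		abort_demo_done(cmd);
	}
	pthread_mutex_unlock(&cmd->lock);
}

/* Hypothetical abort TMR stubs: one always succeeds, one always fails. */
static bool abort_demo_tmr_ok(struct abort_demo_cmd *cmd)
{
	(void)cmd;
	return true;
}

static bool abort_demo_tmr_fail(struct abort_demo_cmd *cmd)
{
	(void)cmd;
	return false;
}

/* Abort handler following steps (1) to (3) from the quoted text. */
static enum abort_demo_result
abort_demo_abort(struct abort_demo_cmd *cmd,
		 bool (*send_abort_tmr)(struct abort_demo_cmd *))
{
	enum abort_demo_result ret = DEMO_SUCCESS;

	pthread_mutex_lock(&cmd->lock);		/* (1) serialize */
	if (cmd->outstanding) {			/* (2) still outstanding? */
		if (send_abort_tmr(cmd)) {	/* (3) aborted: finish it */
			cmd->outstanding = false;
			abort_demo_done(cmd);
		} else {			/* (3) abort failed */
			ret = DEMO_FAILED;
		}
	}
	pthread_mutex_unlock(&cmd->lock);
	return ret;
}
```

Because both paths test and clear "outstanding" under the same lock, the
done callback runs at most once per command in this sketch.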


AFAIK, synchronization of scsi_done() is not a problem here, because the
SCSI layer uses an atomic state bit of the SCSI command, namely
SCMD_STATE_COMPLETE, to prevent an abort and a real completion of the
same command from running concurrently.

See scsi_times_out(); I hope it helps.

enum blk_eh_timer_return scsi_times_out(struct request *req)
{
	...
	if (rtn == BLK_EH_DONE) {
		/*
		 * Set the command to complete first in order to prevent a real
		 * completion from releasing the command while error handling
		 * is using it. If the command was already completed, then the
		 * lower level driver beat the timeout handler, and it is safe
		 * to return without escalating error recovery.
		 *
		 * If timeout handling lost the race to a real completion, the
		 * block layer may ignore that due to a fake timeout injection,
		 * so return RESET_TIMER to allow error handling another shot
		 * at this command.
		 */
		if (test_and_set_bit(SCMD_STATE_COMPLETE, &scmd->state))
			return BLK_EH_RESET_TIMER;
		if (scsi_abort_command(scmd) != SUCCESS) {
			set_host_byte(scmd, DID_TIME_OUT);
			scsi_eh_scmd_add(scmd);
		}
	}
}
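
To make the "whoever sets the bit first wins" idea concrete, here is a
minimal userspace sketch using C11 atomics. It is not kernel code:
struct claim_demo_cmd, CLAIM_DEMO_STATE_COMPLETE,
claim_demo_test_and_set_bit() and claim_demo_claim() are hypothetical
names, and atomic_fetch_or() merely stands in for the kernel's
test_and_set_bit().

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct scsi_cmnd; only the state word matters. */
struct claim_demo_cmd {
	atomic_ulong state;
};

/* Bit number; plays the role of SCMD_STATE_COMPLETE in this sketch only. */
#define CLAIM_DEMO_STATE_COMPLETE 0

/* Atomically set bit nr and return whether it was already set. */
static bool claim_demo_test_and_set_bit(unsigned int nr, atomic_ulong *word)
{
	unsigned long mask = 1UL << nr;

	return (atomic_fetch_or(word, mask) & mask) != 0;
}

/*
 * Called by both the completion path and the timeout path in this sketch.
 * Returns true for exactly one caller per command: the one that found the
 * bit clear and set it. The loser must leave the command alone.
 */
static bool claim_demo_claim(struct claim_demo_cmd *cmd)
{
	return !claim_demo_test_and_set_bit(CLAIM_DEMO_STATE_COMPLETE,
					    &cmd->state);
}
```

Whichever path calls claim_demo_claim() first gets true; every later
caller gets false, which corresponds to the BLK_EH_RESET_TIMER return in
the excerpt above.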

Thanks,

Can Guo.