Re: [PATCH] scsi: sd: add runtime pm to open / release

From: Martin Kepplinger
Date: Wed Jul 29 2020 - 11:40:57 EST


On 29.07.20 16:53, James Bottomley wrote:
> On Wed, 2020-07-29 at 07:46 -0700, James Bottomley wrote:
>> On Wed, 2020-07-29 at 10:32 -0400, Alan Stern wrote:
>>> On Wed, Jul 29, 2020 at 04:12:22PM +0200, Martin Kepplinger wrote:
>>>> On 28.07.20 22:02, Alan Stern wrote:
>>>>> On Tue, Jul 28, 2020 at 09:02:44AM +0200, Martin Kepplinger wrote:
>>>>>> Hi Alan,
>>>>>>
>>>>>> Any API cleanup is of course welcome. I just wanted to remind
>>>>>> you of the underlying problem: broken block device runtime
>>>>>> pm. Your initial proposed fix "almost" did it and mounting
>>>>>> works but during file access, it still just looks like a
>>>>>> runtime_resume is missing somewhere.
>>>>>
>>>>> Well, I have tested that proposed fix several times, and on my
>>>>> system it's working perfectly. When I stop accessing a drive
>>>>> it autosuspends, and when I access it again it gets resumed and
>>>>> works -- as you would expect.
>>>>
>>>> that's weird. when I mount, everything looks good, "sda1". But as
>>>> soon as I cd to the mountpoint and do "ls" (on another SD card
>>>> "ls" works but actual file reading leads to the exact same
>>>> errors), I get:
>>>>
>>>> [ 77.474632] sd 0:0:0:0: [sda] tag#0 UNKNOWN(0x2003) Result:
>>>> hostbyte=0x00 driverbyte=0x08 cmd_age=0s
>>>> [ 77.474647] sd 0:0:0:0: [sda] tag#0 Sense Key : 0x6 [current]
>>>> [ 77.474655] sd 0:0:0:0: [sda] tag#0 ASC=0x28 ASCQ=0x0
>>>> [ 77.474667] sd 0:0:0:0: [sda] tag#0 CDB: opcode=0x28 28 00 00
>>>> 00 60 40 00 00 01 00
>>>
>>> This error report comes from the SCSI layer, not the block layer.
>>
>> That sense code means "NOT READY TO READY CHANGE, MEDIUM MAY HAVE
>> CHANGED" so it sounds like it's something we should be
>> ignoring. Usually this signals a problem, like you changed the
>> medium manually (ejected the CD). But in this case you can tell us
>> to expect this by setting
>>
>> sdev->expecting_cc_ua
>>
>> And we'll retry. I think you need to set this on all resumed
>> devices.
>
> Actually, it's not quite that easy, we filter out this ASC/ASCQ
> combination from the check because we should never ignore medium might
> have changed events on running devices. We could ignore it if we had a
> flag to say the power has been yanked (perhaps an additional sdev flag
> you set on resume) but we would still miss the case where you really
> had powered off the drive and then changed the media ... if you can
> regard this as the user's problem, then we might have a solution.
>
> James
>

oh I see what you mean now, thanks for the elaboration.
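
For reference, here is how I read the log lines quoted above: sense key
0x6 is UNIT ATTENTION, ASC/ASCQ 0x28/0x00 is the "medium may have
changed" code James mentions, and opcode 0x28 is a READ(10). A small
userspace sketch of that decoding follows. It is not kernel code, and
the struct and function names are made up for illustration:

```c
#include <stddef.h>
#include <stdint.h>

/* Fields of a READ(10) CDB that matter here (hypothetical helper type). */
struct read10 {
	uint32_t lba;     /* CDB bytes 2..5, big-endian */
	uint16_t nblocks; /* CDB bytes 7..8, big-endian */
};

/* Returns 0 on success, -1 if the buffer is not a READ(10) CDB. */
static int decode_read10(const uint8_t *cdb, size_t len, struct read10 *out)
{
	if (len < 10 || cdb[0] != 0x28)
		return -1;
	out->lba = ((uint32_t)cdb[2] << 24) | ((uint32_t)cdb[3] << 16) |
		   ((uint32_t)cdb[4] << 8) | (uint32_t)cdb[5];
	out->nblocks = (uint16_t)(((unsigned)cdb[7] << 8) | cdb[8]);
	return 0;
}

/* True for UNIT ATTENTION (sense key 0x6) with ASC/ASCQ 0x28/0x00. */
static int is_medium_may_have_changed(uint8_t key, uint8_t asc, uint8_t ascq)
{
	return key == 0x06 && asc == 0x28 && ascq == 0x00;
}
```

Fed with the CDB bytes from the log (28 00 00 00 60 40 00 00 01 00),
this gives LBA 0x6040 and a transfer length of one block, so the
failing command is a plain single-block read after resume.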

if I do the following change, everything looks normal and runtime pm
works. I'm not 100% sure if setting expecting_cc_ua in sd_resume() and
dropping the ASC/ASCQ 0x28/0x00 filter in scsi_check_sense() is
"correct", but it looks like what you're talking about:

(note that this is of course with the one block layer diff applied that
Alan posted a few emails back)


--- a/drivers/scsi/scsi_error.c
+++ b/drivers/scsi/scsi_error.c
@@ -554,16 +554,8 @@ int scsi_check_sense(struct scsi_cmnd *scmd)
 		 * so that we can deal with it there.
 		 */
 		if (scmd->device->expecting_cc_ua) {
-			/*
-			 * Because some device does not queue unit
-			 * attentions correctly, we carefully check
-			 * additional sense code and qualifier so as
-			 * not to squash media change unit attention.
-			 */
-			if (sshdr.asc != 0x28 || sshdr.ascq != 0x00) {
-				scmd->device->expecting_cc_ua = 0;
-				return NEEDS_RETRY;
-			}
+			scmd->device->expecting_cc_ua = 0;
+			return NEEDS_RETRY;
 		}
 		/*
 		 * we might also expect a cc/ua if another LUN on the target
diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index d90fefffe31b..5ad847fed8b9 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -3642,6 +3642,8 @@ static int sd_resume(struct device *dev)
 	if (!sdkp)	/* E.g.: runtime resume at the start of sd_probe() */
 		return 0;
 
+	sdkp->device->expecting_cc_ua = 1;
+
 	if (!sdkp->device->manage_start_stop)
 		return 0;
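
To make the behaviour change explicit, here is a rough userspace model
of the unit-attention check before and after the two hunks above. It is
only a sketch: the types and names are stand-ins I made up, and the real
scsi_check_sense() goes on to further checks where this model simply
reports "not retried".

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in result type; not the kernel's disposition values. */
enum model_result { MODEL_NOT_RETRIED, MODEL_NEEDS_RETRY };

/* Stand-in for struct scsi_device, reduced to the one flag used here. */
struct model_sdev {
	bool expecting_cc_ua;
};

/* Models the sd_resume() hunk: expect one unit attention after resume. */
static void model_resume(struct model_sdev *sdev)
{
	sdev->expecting_cc_ua = true;
}

/* Old check: a 0x28/0x00 unit attention is never squashed. */
static enum model_result ua_check_old(struct model_sdev *sdev,
				      uint8_t asc, uint8_t ascq)
{
	if (sdev->expecting_cc_ua) {
		if (asc != 0x28 || ascq != 0x00) {
			sdev->expecting_cc_ua = false;
			return MODEL_NEEDS_RETRY;
		}
	}
	return MODEL_NOT_RETRIED;
}

/* Patched check: any expected unit attention is retried exactly once. */
static enum model_result ua_check_new(struct model_sdev *sdev,
				      uint8_t asc, uint8_t ascq)
{
	(void)asc;
	(void)ascq;
	if (sdev->expecting_cc_ua) {
		sdev->expecting_cc_ua = false;
		return MODEL_NEEDS_RETRY;
	}
	return MODEL_NOT_RETRIED;
}
```

In this model only the first unit attention after a resume is retried,
and a later 0x28/0x00 on a running device is still not squashed. It
also shows the caveat James raised: a medium that really was swapped
while the device was suspended gets retried the same way.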