[ANNOUNCE] Status of unlocked_qcmds=1 operation for .37

From: Nicholas A. Bellinger
Date: Wed Oct 20 2010 - 16:54:34 EST


Greetings all,

So as we get closer to the .37 merge window, I wanted to take this
opportunity to recap the current status of the drop-host_lock /
unlocked_qcmds=1 patches, and what is required for the next RFCv5 and
hopefully a merge into .37. The last RFCv4 was posted here:

http://marc.info/?l=linux-kernel&m=128563953114561&w=2

Since then, Christof Schmitt has sent a patch to drop struct
scsi_cmnd->serial_number usage in zfcp, and Tim Chen has sent an
important fix to drop an extra host_lock access that I originally missed
in qla2xxx SHT->queuecommand() that certainly would have deadlocked a
running machine. Many thanks to Christof and Tim for your
contributions and review!

So at this point in the game the current score sits at:

*) core drivers/scsi remaining issue(s):

The issue raised by andmike during RFCv4 described as:

"If we skip __scsi_try_to_abort_cmd when REQ_ATOM_COMPLETE is set it
would be correct for the scsi_decide_disposition cases but it would
appear this would stop __scsi_try_to_abort_cmd from being called in the
time out case as REQ_ATOM_COMPLETE is set prior to calling
blk_rq_timed_out."

The complete discussion is here:

http://marc.info/?l=linux-scsi&m=128535319915212&w=2
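
To make the concern quoted above concrete, here is a minimal userspace
model of it. It is only a sketch and not the kernel code: struct
mock_request and the mock_*() helpers are hypothetical stand-ins, and the
only thing modelled is a completion flag that is set before the timeout
handler runs, as the quote describes for REQ_ATOM_COMPLETE and
blk_rq_timed_out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a block request; not the kernel structure. */
struct mock_request {
	bool atom_complete;	/* models REQ_ATOM_COMPLETE */
	int abort_calls;	/* how often the abort handler really ran */
};

/* Sets the completion flag and returns its previous value. */
static bool mock_mark_complete(struct mock_request *rq)
{
	bool was_set;

	assert(rq != NULL);
	was_set = rq->atom_complete;
	rq->atom_complete = true;
	return was_set;
}

/* Models the proposed guard: skip the abort once the flag is set. */
static void mock_try_to_abort(struct mock_request *rq)
{
	assert(rq != NULL);
	if (rq->atom_complete)
		return;
	rq->abort_calls++;
}

/*
 * Models the timeout path as described in the quote: the flag is set
 * first, then the timeout handling (and with it the abort) is invoked.
 * Returns the number of aborts that were actually issued.
 */
static int mock_timeout_path(struct mock_request *rq)
{
	if (mock_mark_complete(rq))
		return rq->abort_calls;	/* a normal completion got there first */
	mock_try_to_abort(rq);		/* guard sees the flag set just above */
	return rq->abort_calls;
}
```

In this model a timed out request never reaches the abort: the flag that
the guard tests was set by the timeout path itself, which is the behaviour
the quote warns about.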

We still need folks with experience to dig into this code, so if you know
the scsi_error.c code please jump in!

*) LLD libraries running by default w/ unlocked_qcmds=1

libiscsi: need ack from mnc
libsas: need ack from jejb
libfc: remaining rport state + host_lock-less issue. Need more input
from mnc for James Smart and Joe on this...
libata: jgarzik thinks this should be OK, review and ack from tejun
would also be very helpful.

The main issue remaining here is the audit of libfc rport (and other..?)
code that assumes host_lock is held to protect state. mnc, do you have
any more thoughts for James Smart and Joe here..?

*) Individual LLDs running by default w/ unlocked_qcmds=1

aic94xx: need ack (maintainer at adaptec..?)
mvsas: need ack (maintainer at marvell..?)
pm8001: need ack Jang Wang
qla4xxx, qla2xxx: need ack Andrew Vasquez
fnic: need ack Joe Eykholt

Aside from the required ACKs, I am not aware of any other mainline LLDs
doing the legacy SHT->queuecommand() -> unlock() -> do_lld_work() ->
lock() that have not already been converted.
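
For anyone unfamiliar with the legacy pattern named above, the following
userspace sketch contrasts it with a handler written for unlocked
operation. Everything here is hypothetical: struct mock_lock stands in for
host_lock, the two queuecommand functions and the dispatch helpers are
illustrative names, and no real LLD is being quoted.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical non-recursive lock standing in for host_lock. */
struct mock_lock {
	bool held;
	int double_acquires;	/* each one would deadlock a real spinlock */
};

static void mock_lock_acquire(struct mock_lock *l)
{
	assert(l != NULL);
	if (l->held)
		l->double_acquires++;
	l->held = true;
}

static void mock_lock_release(struct mock_lock *l)
{
	assert(l != NULL);
	l->held = false;
}

/* Legacy style: entered with the lock held, drops it around the work. */
static int legacy_queuecommand(struct mock_lock *host_lock)
{
	mock_lock_release(host_lock);
	/* do_lld_work() would run here without the lock */
	mock_lock_acquire(host_lock);
	return 0;
}

/* Converted style: entered without the lock, takes it only where needed. */
static int unlocked_queuecommand(struct mock_lock *host_lock)
{
	/* do_lld_work() would run here without the lock */
	mock_lock_acquire(host_lock);
	/* short critical section touching shared host state */
	mock_lock_release(host_lock);
	return 0;
}

/* Caller that holds the lock across the handler (legacy behaviour). */
static int dispatch_locked(struct mock_lock *l, int (*qc)(struct mock_lock *))
{
	int rc;

	mock_lock_acquire(l);
	rc = qc(l);
	mock_lock_release(l);
	return rc;
}

/* Caller that invokes the handler without the lock. */
static int dispatch_unlocked(struct mock_lock *l, int (*qc)(struct mock_lock *))
{
	return qc(l);
}
```

Pairing each handler with the caller it expects leaves the lock released
on return. In this model, calling legacy_queuecommand() through
dispatch_unlocked() returns with the lock still held, and calling
unlocked_queuecommand() through dispatch_locked() records a double
acquire, so a mismatch in either direction is a bug.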

The main question here is if any out of tree SCSI LLDs use this legacy
optimization.. Should we come up with a compile time way to alert
vendors to this..?

*) Individual LLDs converted to use explicit scsi_cmd_get_serial()

mpt2sas: Add scsi_cmd_get_serial() call
mpt/fusion: Add scsi_cmd_get_serial() call
dpt_i2o: Add scsi_cmd_get_serial() call
eata: Add scsi_cmd_get_serial() call
u14-34f: Add scsi_cmd_get_serial() call
zfcp: Remove scsi_cmnd->serial_number from debug traces

Aside from the required ACKs, I am not aware of any other mainline LLDs
that use struct scsi_cmnd->serial_number internally that have not
already been converted. The same applies to out of tree modules, but
that is certainly not critical.
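
As a rough illustration of what an explicit serial number call in an LLD
amounts to, here is a userspace sketch. The mock_* types and functions are
hypothetical and do not reproduce the helper from the patch series; in
particular, the per-host counter, the skipping of zero as a reserved
value, and the absence of any locking around the counter are all
assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's Scsi_Host or scsi_cmnd. */
struct mock_host {
	unsigned long cmd_serial_number;	/* next serial to hand out */
};

struct mock_cmnd {
	unsigned long serial_number;		/* 0 means "none assigned" */
};

/*
 * Assumed behaviour: hand out the next per-host serial number and never
 * hand out zero. Serialization of the counter is deliberately omitted.
 */
static void mock_cmd_get_serial(struct mock_host *host, struct mock_cmnd *cmd)
{
	assert(host != NULL && cmd != NULL);
	cmd->serial_number = host->cmd_serial_number++;
	if (cmd->serial_number == 0)
		cmd->serial_number = host->cmd_serial_number++;
}

/*
 * An LLD that still wants a serial number for its own bookkeeping asks
 * for one explicitly at the top of its queuecommand handler.
 */
static int mock_lld_queuecommand(struct mock_host *host, struct mock_cmnd *cmd)
{
	mock_cmd_get_serial(host, cmd);
	/* driver-internal use of cmd->serial_number would follow here */
	return 0;
}
```

The point of the sketch is only the shape of the conversion: the driver
requests the number itself rather than relying on the caller having
filled in serial_number beforehand.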

So as the clock winds down to get this merged into .37, if anyone else
has any concerns or comments please let me and the other relevant
maintainers know and we will try to address them accordingly.

Thanks!

--nab
