Re: [ANNOUNCE] Status of unlocked_qcmds=1 operation for .37

From: Luben Tuikov
Date: Thu Oct 21 2010 - 15:45:15 EST


--- On Wed, 10/20/10, Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx> wrote:

> From: Nicholas A. Bellinger <nab@xxxxxxxxxxxxxxx>
> Subject: [ANNOUNCE] Status of unlocked_qcmds=1 operation for .37
> To: "linux-kernel" <linux-kernel@xxxxxxxxxxxxxxx>, "linux-scsi" <linux-scsi@xxxxxxxxxxxxxxx>
> Cc: "Vasu Dev" <vasu.dev@xxxxxxxxxxxxxxx>, "Tim Chen" <tim.c.chen@xxxxxxxxxxxxxxx>, "Andi Kleen" <ak@xxxxxxxxxxxxxxx>, "Matthew Wilcox" <willy@xxxxxxxxxxxxxxx>, "James Bottomley" <James.Bottomley@xxxxxxx>, "Mike Christie" <michaelc@xxxxxxxxxxx>, "Jens Axboe" <jaxboe@xxxxxxxxxxxx>, "James Smart" <james.smart@xxxxxxxxxx>, "Andrew Vasquez" <andrew.vasquez@xxxxxxxxxx>, "FUJITA Tomonori" <fujita.tomonori@xxxxxxxxxxxxx>, "Hannes Reinecke" <hare@xxxxxxx>, "Joe Eykholt" <jeykholt@xxxxxxxxx>, "Christoph Hellwig" <hch@xxxxxx>, "Jon Hawley" <warthog9@xxxxxxxxxx>, "MPTFusionLinux" <DL-MPTFusionLinux@xxxxxxx>, "eata.c maintainer" <dario.ballabio@xxxxxxxxx>, "Luben Tuikov" <ltuikov@xxxxxxxxx>, "mvsas maintainer" <kewei@xxxxxxxxxxx>, "pm8001 maintainer Jack Wang" <jack_wang@xxxxxxxxx>, "Brian King" <brking@xxxxxxxxxxxxxxxxxx>, "Mike Anderson" <andmike@xxxxxxxxxxxxxxxxxx>, "Christof Schmitt" <christof.schmitt@xxxxxxxxxx>, "Tejun Heo" <tj@xxxxxxxxxx>, "Andrew Morton"
>  <akpm@xxxxxxxxxxxxxxxxxxxx>, "H. Peter Anvin" <hpa@xxxxxxxxx>
> Date: Wednesday, October 20, 2010, 1:49 PM
> Greetings all,
>
> So as we get closer to the .37 merge window, I wanted to take this
> opportunity to recap the current status of the drop-host_lock /
> unlocked_qcmds=1 patches, and what is required for the next RFCv5 and
> hopefully a merge into .37.  The last RFCv4 was posted here:
>
> http://marc.info/?l=linux-kernel&m=128563953114561&w=2
>
> Since then, Christof Schmitt has sent a patch to drop struct
> scsi_cmnd->serial_number usage in zfcp, and Tim Chen has sent an
> important fix to drop an extra host_lock access that I originally missed
> in qla2xxx SHT->queuecommand() that certainly would have deadlocked a
> running machine.  Many thanks to Christof and Tim for your
> contributions and review!
>
> So at this point in the game the current score sits at:
>
> *) core drivers/scsi remaining issue(s):
>
> The issue raised by andmike during RFCv4 described as:
>
> "If we skip __scsi_try_to_abort_cmd when REQ_ATOM_COMPLETE is set it
> would be correct for the scsi_decide_disposition cases but it would
> appear this would stop __scsi_try_to_abort_cmd from being called in the
> time out case as REQ_ATOM_COMPLETE is set prior to calling
> blk_rq_timed_out."
>
> The complete discussion is here:
>
> http://marc.info/?l=linux-scsi&m=128535319915212&w=2
>
> We still need folks with experience to dig into this code, so if you
> know the scsi_error.c code please jump in!
>
> *) LLD libraries running by default w/ unlocked_qcmds=1
>
> libiscsi: need ack from mnc
> libsas: need ack from jejb
> libfc: remaining rport state + host_lock-less issue.  Need more input
>        from mnc for James Smart and Joe on this...
> libata: jgarzik thinks this should be OK, review and ack from tejun
>         would also be very helpful.
>
> The main issue remaining here is the audit of libfc rport (and
> other..?) code that assumes host_lock is held to protect state.  mnc,
> do you have any more thoughts for James Smart and Joe here..?
>
> *) Individual LLDs running by default w/ unlocked_qcmds=1
>
> aic94xx: need ack (maintainer at adaptec..?)

Adaptec doesn't exist anymore--it's just a memory in the minds of
many good engineers. Anyway, as a former Adaptec employee and
the author of aic94xx and the SAS code in the Linux kernel (which
Bottomley copied from a git tree from an Adaptec server or maybe
from the linux-scsi ML, munged it up and submitted as his own
into linux-scsi git tree) I can give an ACK on both. Both the
aic94xx and the SAS code were written with the host lock as a
kludge and we unlock/lock it as you can see in the code.
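
For readers who have not seen the pattern under discussion, here is a
minimal userspace sketch of it. It is only a model: a pthread mutex
stands in for host_lock, every name in it (model_host,
legacy_queuecommand, unlocked_queuecommand, and so on) is hypothetical,
and none of it is taken from aic94xx, libsas or the SCSI midlayer. It
is single-threaded and only shows who holds the lock at each step.

```c
/* Userspace model only: a pthread mutex stands in for the SCSI host_lock.
 * All names are hypothetical illustrations, not kernel APIs. */
#include <assert.h>
#include <pthread.h>

struct model_host {
    pthread_mutex_t host_lock; /* stand-in for the per-host lock */
    int locked;                /* bookkeeping: 1 while host_lock is held */
    int queued;                /* commands handed to the "hardware" */
};

static void model_lock(struct model_host *h)
{
    pthread_mutex_lock(&h->host_lock);
    h->locked = 1;
}

static void model_unlock(struct model_host *h)
{
    h->locked = 0;
    pthread_mutex_unlock(&h->host_lock);
}

/* Driver work that is expected to run without host_lock held. */
static int do_lld_work(struct model_host *h)
{
    assert(!h->locked);
    h->queued++;
    return 0;
}

/* Legacy shape: entered with host_lock held, so the driver drops it,
 * does its work, and re-takes it before returning to the caller. */
static int legacy_queuecommand(struct model_host *h)
{
    int ret;

    assert(h->locked);
    model_unlock(h);
    ret = do_lld_work(h);
    model_lock(h);
    return ret;
}

/* Caller for the legacy shape: holds host_lock around the call. */
static int dispatch_legacy(struct model_host *h)
{
    int ret;

    model_lock(h);
    ret = legacy_queuecommand(h);
    model_unlock(h);
    return ret;
}

/* Unlocked shape: entered without host_lock, so there is nothing to
 * drop or re-take around the driver work. */
static int unlocked_queuecommand(struct model_host *h)
{
    assert(!h->locked);
    return do_lld_work(h);
}

static int dispatch_unlocked(struct model_host *h)
{
    return unlocked_queuecommand(h);
}

/* Runs one command through each path; returns how many were queued. */
int run_demo(void)
{
    struct model_host h;
    int queued;

    pthread_mutex_init(&h.host_lock, NULL);
    h.locked = 0;
    h.queued = 0;

    dispatch_legacy(&h);
    dispatch_unlocked(&h);

    queued = h.queued;
    pthread_mutex_destroy(&h.host_lock);
    return queued;
}
```

In this model a driver that keeps the unlock/lock pair after its caller
has stopped taking the lock would trip the assert in
legacy_queuecommand(), which is the kind of mismatch each conversion
has to be audited for.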

> mvsas: need ack (maintainer at marvell..?)
> pm8001: need ack Jack Wang
> qla4xxx, qla2xxx: need ack Andrew Vasquez
> fnic:  need ack Joe Eykholt
>
> Aside from the required ACKs, I am not aware of any other mainline LLDs
> doing the legacy SHT->queuecommand() -> unlock() -> do_lld_work() ->
> lock() that have not already been converted.
>
> The main question here is whether any out of tree SCSI LLDs use this
> legacy optimization.  Should we come up with a compile time way to
> alert vendors to this..?
>
> *) Individual LLDs converted to use explicit scsi_cmd_get_serial()
>
> mpt2sas: Add scsi_cmd_get_serial() call
> mpt/fusion: Add scsi_cmd_get_serial() call
> dpt_i2o: Add scsi_cmd_get_serial() call
> eata: Add scsi_cmd_get_serial() call
> u14-34f: Add scsi_cmd_get_serial() call
> zfcp: Remove scsi_cmnd->serial_number from debug traces
>
> Aside from the required ACKs, I am not aware of any other mainline LLDs
> that use struct scsi_cmnd->serial_number internally that have not
> already been converted.  The same applies to out of tree modules for
> this, but is certainly not critical.
>
> So as the clock winds down to get this merged into .37, if anyone else
> has any concerns or comments please let me and the other relevant
> maintainers know and we will try to address them accordingly.
>
> Thanks!
>
> --nab
>
>