Re: [RFC]: Mainline of TCM Core and TCM_Loop for v2.6.35

From: Vladislav Bolkhovitin
Date: Sat May 29 2010 - 13:26:51 EST


Nicholas A. Bellinger, on 05/28/2010 06:01 AM wrote:
On Thu, 2010-05-27 at 22:41 +0400, Vladislav Bolkhovitin wrote:
Nicholas A. Bellinger, on 05/20/2010 05:09 AM wrote:
Greetings James and Co,

I would like to formally request the inclusion of TCM Core v4 codebase
containing the fabric independent configfs infrastructure design and
TCM_Loop SCSI LLD fabric module supporting multi-fabric I_T Nexus and
Port emulation for SAS, FC and iSCSI into mainline for v2.6.35

The plan is to push TCM Core and the TCM_Loop LLD module for use with
existing userspace applications into mainline first, and then focus on
extending the upstream fabric libraries (libiscsi, libfc, libsas) for
new and future TCM modules to support a common set of kernel-level
target mode infrastructure for HW and SW fabric engines once the main
target pieces are in place.

On the userspace fabric <-> kernel backstore side, the TCM_Loop fabric
module is currently running with full SPC-3 PR and ALUA support using
fabric-independent virtual SCSI target port emulation with the STGT
iSCSI userspace fabric code and SG_IO backstores. TCM_Loop is also
being used with SG_IO for QEMU-KVM megasas HBA emulation into Linux and
MSFT x86 guests and is able to run at sustained 10 Gb/sec throughput
into KVM guest.

For the kernelspace fabric <-> userspace backstore side for v2.6.35, the
plan is to extend the existing drivers/scsi/scsi_tgt_[lib,if].c direct
mapped ring interface to the WIP kernel level TCM/STGT subsystem
backstore plugin mentioned previously on linux-scsi. This will allow
projects presenting a userspace block device to access existing TCM
kernel level target module fabric drivers.
I've got 2 questions and 1 note.

1. Is there any evidence that TCM has any value over STGT?

TCM provides a HBA / device model for kernel level backstores with SPC-4
PR and ALUA logic on top of mainline Linux storage subsystems. The TCM
v4 design also provides a fabric independent control plane using a set
of generic struct config_groups and fabric context dependent
CONFIGFS_EATTR() based macros that allow for rapid development of new,
conversion of existing, and extension of existing TCM fabric modules.

When used in combination with STGT userspace fabric modules and SG_IO
backstore (tgt.git:usr/iscsi/ for example) and TCM_Loop Port and Nexus
emulation, it allows any kernel level TCM backstore and associated SPC-4
PR and ALUA logic to be made accessible to STGT fabric module code
running in userspace.

Also, STGT currently does not contain the ability to run in SCSI LLD
mode so it is not possible to access kernel level target functionality
inside of a QEMU-KVM guest using virtio or the new megasas HBA
emulation. Using the TCM_Loop fabric module it is now possible to
access TCM fabric independent SPC-4 logic into the virtualized guest
with any hypervisor (kvm, xen, vmw) that properly supports scsi-generic
and some manner of HBA emulation.

All of the above can be implemented in STGT. Considering that these features were only recently added to TCM, why were they not added to STGT instead?

So far, I've only read unsupported claims, with one reference to my effort on a completely unrelated project.

I have no idea what you are talking about here. As mentioned in my
original email, my efforts for mainlining TCM have been to extend and
complement STGT. In open source you have to build upon what already
exists upstream and move forward.

This sounds very attractive, but it is not practical. Because of fundamental architectural differences, you'd end up with 2 separate, somehow coupled subsystems doing the same things through 2 different interfaces. This ugly end result would definitely not be something that everybody would like or expect.

2. Are there any users of this code using it in production to prove its usability and stability? I mean users other than RisingTide and its customers, because RisingTide's web page clearly states that their target software is "partially available as the open-source LIO Core target".
Wrong. We (RisingTide) validate and maintain a backport tree of TCM and
LIO kernel code for our customers who do not necessarily run on bleeding
edge kernels.

Yeah, you are turning the "partially available" code into the "fully available" code.

Also just FYI, here in North America you can go into almost any major
electronics store and purchase a storage server from multiple different
vendors containing TCM/LIO code directly from lio-core-2.6.git/master.

Names, please.

Do you mean Netgear (http://old.nabble.com/Re%3A-Invalid-module-format---no-symbol-version-formodule_layout-p27116634.html)?

Anyway, it is apparent that the open source LIO/TCM you are pushing has no or very few production users, and there are no signs that this is changing. Users prefer alternative solutions.

Moreover, I have not seen any positive reference to production usage of LIO/TCM anywhere; the only reference I've seen so far was the above negative feedback about the Netgear experience.

But you should really understand that the 'who is using what' has never
been a strong argument for mainline acceptance of any project.

The size of the user base has always been one of the main arguments.

As we can see, the Linux-iSCSI.org development mailing list (http://groups.google.com/group/linux-iscsi-target-dev?hl=en) has near-zero activity.

Wrong again. The LIO-devel list contains series after series of
bisectable commits that are posted in a human readable and reviewable
manner. All of the interesting commits related to the v4 configfs
design and port of LIO-Target, TCM_FC, and TCM_Loop fabric modules have
been posted to linux-scsi over the last months as well.

Yeah, you are the one generating the traffic there.

The note is that the idea of using STGT's scsi_tgt_[lib,if].c direct mapped ring interface to extend TCM into user space and to allow existing STGT user space devices to work with TCM is impractical, because STGT's interface and devices are built around a SCSI target state machine and memory management in user space, while TCM has both of them in the kernel.

I think you are misunderstanding what the TCM STGT backstore subsystem
plugin at lio-core-2.6.git/drivers/target/target_core_stgt.c is supposed
to do, and what I have proposed with the second area of TCM and STGT
compatibility.

Nicholas,

I have been working in the area of SCSI targets since 2003, and I have created the best OSS SCSI target subsystem, which is widely used and getting better and better every day. I am one of the few people among the recipients of this thread who has sufficient knowledge and experience to see the whole picture and to evaluate the quality and consequences of the code and architectural decisions in this area. (Even deep experience in SCSI initiator side development is not quite sufficient for that, because the SCSI initiator and target sides solve completely different tasks [1].) If something isn't clear to me, I can simply look in the source code and quickly find the answers.

<MOANING ON>

Actually, I'd prefer to stay away from all these TCM discussions and let somebody similarly skilled judge. But since no such person has appeared, I have to participate myself to explain the real state of things and let people judge based on _facts_, not the marketing stuff you too often present. So far, too much of what you have written has turned out, on closer examination, to be a misleading half-truth. That is so in this particular case, where the end result isn't going to be what everybody would expect on hearing about "TCM and STGT compatibility". It was so before with the "1 to many" pass-through, which is only sometimes "1 to many" and otherwise a not enforced "1 to 1", welcoming data corruption. And it was so even earlier with Persistent Reservations, which worked as expected only with a single connected initiator, etc. I have to explain to everybody for whom it isn't obvious what is true and what is NOT true in your "half-truth". Competition is a good thing, but without all those undercover dirty marketing games, of which I'm really tired. They are disgusting.

<MOANING OFF>

We will be extending the scsi_tgt_[lib,if].c mapped ring interface to
allow TCM to access userspace backstores transparently with existing
kernel level TCM fabric modules, and using the generic configfs fabric
module infrastructure in target_core_fabric_configfs.c for the port and
I_T nexus control plane just as you would with any TCM backstore
subsystem today.

Again, in open source you have to build upon what already exists and
move forward. The original STGT kernel <-> userspace ring abstraction
and logic in drivers/scsi/scsi_tgt_lib.c:scsi_tgt_queue_command() ->
scsi_tgt_uspace_send_cmd() is already going to do the vast majority of
what is required for handling fabric I/O processing and I_T Nexus and
Port management in kernel space with a userspace backstore. It is
really just a matter of allowing the STGT ring request to optionally be
sent out to userspace as a standalone LUN instead of as a target port.

You'd end up in one of 2 options:

1. You'd make TCM pass requests from its target drivers ("fabric modules" in your terminology) directly through to the STGT core in user space, bypassing TCM's internal memory management and target state machine, i.e. effectively make them behave as STGT target drivers. As a result, we would have 2 separate interfaces (TCM and STGT) doing the same thing, as well as 2 sets of target drivers and 2 sets of backend handlers, one for each interface. That apparently wouldn't be the Linux way of doing things. It wouldn't be moving forward; it would be moving into maintenance hell.

2. You'd just throw away the existing STGT messages and add new ones, then add to STGT a big "TCM compatibility" layer so that STGT and its backends can use the new messages and work with the new model, with memory management and the target state machine in the kernel. Obviously, it would be even uglier than (1).

But it's Open Source, so you can do whatever you want. Show us the code and we will see. My intention is only to _warn_ people that they shouldn't count on your (marketing) plans, because there are fundamental reasons preventing them from being implemented in an acceptable way.

Also, I should note that the decision to extend the fabric libraries with additional target mode specific routines is a very bad move. I already covered this topic in http://lkml.org/lkml/2008/12/10/245. In short, the SCSI initiator and target sides share nearly nothing in the processing code [2], so they should be separated to keep different things apart, as good design practice requires, not heaped together as is currently done and as you are going to continue doing. A good example of how this is already done is the NFS client (fs/nfs/) and server (fs/nfsd/), which share only a few ACL processing routines in fs/nfs_common/.

Vlad

[1] SCSI initiator and target are a client and a server respectively, where one generates requests and parses responses and the other parses requests and generates responses, so they have very little in common. Like apache (server) and links/firefox (client), or sendmail (server) and mutt/thunderbird (client).

[2] Initiator and target modes share only (1) constants and (2) low level memory processing and mapping routines. Both of them are already separated out in the headers and the block subsystem. If hardware supports both initiator and target modes at the same time, target mode support should be done as an add-on through a set of hooks exported by the corresponding initiator module for that hardware, allowing the target add-on to process target mode commands from the hardware. This way the code for both modes would be clearly separated, and the target mode add-on could be loaded only when it is needed.
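
To make the hook arrangement in [2] more concrete, here is a minimal user space sketch in plain C. It is only an illustration under stated assumptions: every name in it (struct tgt_hooks, struct hba, hba_register_tgt_hooks(), hba_unregister_tgt_hooks(), hba_dispatch_tgt_cmd(), count_cmds()) is hypothetical and does not come from any existing driver or subsystem, and a real kernel implementation would additionally need locking, module reference counting and exported symbols.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical hook table supplied by a target mode add-on. */
struct tgt_hooks {
    /* Called for each target mode command coming from the hardware.
     * Returns 0 on success, a negative value on failure. */
    int (*handle_cmd)(void *priv, const unsigned char *cdb, size_t cdb_len);
    void *priv;                     /* add-on private data */
};

/* Hypothetical state owned by the initiator driver. */
struct hba {
    const struct tgt_hooks *tgt;    /* NULL while no add-on is attached */
};

/* "Exported" by the initiator driver: attach a target mode add-on. */
int hba_register_tgt_hooks(struct hba *hba, const struct tgt_hooks *hooks)
{
    assert(hba != NULL);
    if (hooks == NULL || hooks->handle_cmd == NULL)
        return -1;                  /* invalid hook table */
    if (hba->tgt != NULL)
        return -2;                  /* an add-on is already attached */
    hba->tgt = hooks;
    return 0;
}

/* "Exported" by the initiator driver: detach the add-on. */
void hba_unregister_tgt_hooks(struct hba *hba)
{
    assert(hba != NULL);
    hba->tgt = NULL;
}

/* Called by the initiator driver when the hardware delivers a target
 * mode command. The driver itself contains no target logic; it only
 * forwards the command to the add-on, if one is attached. */
int hba_dispatch_tgt_cmd(struct hba *hba, const unsigned char *cdb,
                         size_t cdb_len)
{
    assert(hba != NULL);
    if (hba->tgt == NULL)
        return -3;                  /* target mode not available */
    return hba->tgt->handle_cmd(hba->tgt->priv, cdb, cdb_len);
}

/* Example add-on handler: only counts the commands it has seen. */
int count_cmds(void *priv, const unsigned char *cdb, size_t cdb_len)
{
    (void)cdb;
    (void)cdb_len;
    ++*(int *)priv;
    return 0;
}
```

In this sketch the initiator side keeps only a pointer and a dispatch function, so all target mode processing lives in the add-on, and hba_dispatch_tgt_cmd() simply fails with an error while no add-on is attached.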