RE: PATCH - InfiniBand Access Layer (IBAL)

From: Woodruff, Robert J
Date: Mon Mar 15 2004 - 18:01:22 EST


On Sun, Mar 14, Greg KH wrote:

First, as my boss reminded me this morning, let me make sure I was
clear: there are not 2 different efforts now, only one, openib.org.

1) OpenIB represents a number of companies coming together with lots of
InfiniBand source code, with duplicate code for the access layer and
most of the ULPs
2) the SourceForge work is already part of this
3) the foundation of InfiniBand support will be the Access Layer, so it
needs the community's feedback first
4) we are looking for feedback on both the access layer code in the
current openib snapshot and the access layer code that we submitted a
few weeks ago, to learn which is more acceptable to the community.

Now to answer a couple specific questions.

>Hm, without open source drivers, the Intel stack doesn't seem very
>viable, correct?

Correct, that is why we hope that Mellanox contributes their driver for
IBAL to open source.

>> The comments you have given on IBAL would probably only take a few
>> weeks to change.
>Is that work already underway? Finished? If neither, why not?

Work is, or at least was, underway, but we put it aside last week to
review the rest of the code now in openib.org.
We also need an open source driver.

>What are the issues with the OpenIB stack?

As I stated above, we are part of the openib.org collaboration and
will be working on helping develop a stack that is "best of
breed" from all of the available code. Starting from the bottom up,
we first need to review the various proposals for the
Access Layer and determine which code base to start with.
The initial agreement was to use the TopSpin code for an access layer.
This agreement was made before anyone got to see any code.
After review of this code, we think it needs a lot of work. We were
waiting for the openib.org email lists to open so we could send in
comments there. That way we could work out a lot of details offline
from lkml, since lots of discussion will be needed.

But since you asked, here are a few:

1.) The tsapi APIs look like Windows APIs (at least in the original
drop).
2.) Looking at the API specification document, it is missing lots of
verbs required by the InfiniBand Specification:
CloseCA, ModifyAV, QueryCQ, CreateEEC, ModifyEEC, QueryEEC,
DestroyEEC, QueryMR, ReregisterMR, ReregisterPhysMR, RegisterSharedMR,
AllocMW, QueryMW, BindMW, and FreeMW.
3.) The code is not compliant with the InfiniBand specification and has
proprietary implementations of things like "path records", so it will
only work with the TopSpin subnet manager, which requires you to buy a
TopSpin switch.
4.) Not sure if they have fixed this yet in the 2.6 code, but the 2.4
code has like 18 different loadable modules. This could probably be
collapsed into 5: one for the HCA driver, one for the access layer, one
for the IPoIB driver, one for the SRP driver, and one for the SDP
driver.
5.) There is no user-mode access layer, requiring ULPs to code to the
HCA user-mode driver APIs directly. This will mean that new user-mode
ULPs will need to be developed for each new HCA that comes along.
6.) The VAPI code has extra proprietary verbs that are not specified by
the InfiniBand Specification.
7.) The implementation is deficient in its support for InfiniBand
management services, like the required RMPP protocol, MAD services, and
SA query helper functions.
8.) Some of the message fields of the CM are hard coded.
9.) The CM does not support reliable datagrams.
10.) There is no built-in support for plug and play events: port
up/down, LID change, SM change.
11.) The VAPI call stack is deep and puts a lot of big data structures
on the stack.
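
To illustrate item 5, here is a minimal sketch of what a
hardware-independent access layer provides. All names (hca_ops,
al_register_hca, al_post_send, the mock driver) are hypothetical and are
not taken from IBAL, tsapi, or VAPI; it is plain user-space C, only
meant to show the shape of the idea: each HCA driver registers a table
of operations once, and ULPs code only to the common entry points.

```c
#include <stddef.h>

/* Hypothetical per-HCA operations table; illustrative names only. */
struct hca_ops {
    const char *name;
    int (*post_send)(void *ctx, const void *buf, size_t len);
};

#define MAX_HCAS 4

static const struct hca_ops *registered[MAX_HCAS];
static void *contexts[MAX_HCAS];
static int num_hcas;

/* Called once by each HCA driver. Returns a handle, or -1 on error. */
int al_register_hca(const struct hca_ops *ops, void *ctx)
{
    if (!ops || !ops->post_send || num_hcas >= MAX_HCAS)
        return -1;
    registered[num_hcas] = ops;
    contexts[num_hcas] = ctx;
    return num_hcas++;
}

/* The only send entry point a ULP sees; it dispatches to the driver. */
int al_post_send(int handle, const void *buf, size_t len)
{
    if (handle < 0 || handle >= num_hcas)
        return -1;
    return registered[handle]->post_send(contexts[handle], buf, len);
}

/* A mock vendor driver that just counts the bytes it was asked to send. */
struct mock_hca {
    size_t bytes_sent;
};

int mock_post_send(void *ctx, const void *buf, size_t len)
{
    struct mock_hca *hca = ctx;

    (void)buf; /* a real driver would hand this to the hardware */
    hca->bytes_sent += len;
    return 0;
}

const struct hca_ops mock_ops = { "mock", mock_post_send };
```

A ULP would call al_post_send() with the handle returned by
al_register_hca() and never touch the vendor driver directly, so
supporting a new HCA means writing one new ops table rather than new
ULPs.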

There is more, but as I stated before, we suggest discussing most of
these issues within openib.org first, trying to come to agreement on
what is best, and then reviewing our suggestions with lkml to make sure
we are on track.

>If there are any, how does the Intel stack solve those
>issues?

The SourceForge IBAL code (not just developed by Intel; it has
contributions from several companies, including InfiniCon, Mellanox,
Fujitsu and Intel) is feature complete and compliant with the InfiniBand
specification. It may not be quite as hardened as the TopSpin stack, but
that gap is rapidly closing.
We'd also like to know from the other openib.org people:
what are the issues with the SourceForge IBAL?

We know the issues raised by lkml and think these can be fixed.

The biggest problem I see is that we do not have an open source HCA
driver, and that could be fixed too, if Mellanox wanted to, or someone
could take the VAPI code they open sourced and port it to IBAL.

>Could the Intel solutions be merged
>into the OpenIB stack to solve these issues?

Given there are so many issues with the TSAPI, would it be easier to
fix the issues lkml raised with IBAL and port the "best of breed" ULPs
to it?
Since all the tsAPIs will have to change anyway, to non-Windows-ize
them, all the ULPs will need to be re-ported anyway.
