Re: [PATCH V4 0/5] mlx5 ConnectX control misc driver

From: Jakub Kicinski
Date: Fri Mar 22 2024 - 18:29:34 EST


On Fri, 22 Mar 2024 18:44:23 -0300 Jason Gunthorpe wrote:
> On Fri, Mar 22, 2024 at 01:58:26PM -0700, Jakub Kicinski wrote:
> > > Well said, David.
> > >
> > > I would totally support doing something like this in a fairly generic
> > > way that could be leveraged/instantiated by drivers that will allow
> > > communication/inspection of hardware blocks in the datapath. There are
> > > lots of different ways this could go, so feedback on this would help get
> > > us all moving in the right direction.
> >
> > The more I learn, the more I am convinced that the technical
> > justifications here are just smoke and mirrors.
>
> Let's see some evidence of this then, point to some silicon devices
> in the multibillion gate space that don't have complex FW built into
> their design?

Existence of complex FW does not imply that production systems must
have a backdoor to talk to that FW in kernel-unmitigated fashion.

As an existence proof I give you NICs we use at Meta.
Or old Netronome NICs, you can pick.

> > The main motivation for nVidia, Broadcom, (and Enfabrica?) being to
> > hide as much as possible of what you consider your proprietary
> > advantage in the "AI gold rush".
>
> Despite all of those having built devices like this well before the
> "AI gold rush" and it being a general overall design principle for the
> industry because, yes, the silicon technology available actually
> demands it.
>
> It is not to say you couldn't do otherwise, it is just simply too
> expensive.

I do agree that it is expensive, not sure if it's "too" expensive.
But Linux never promised that our way of doing SW development would
always be the most cost effective option, right? Especially short
term. Or that we'll be competitive on time to market.

> > RDMA is what it is but I really hate how you're trying to pretend
> > that it is somehow an inherent need of advanced technology and
> > we need to lower the openness standards for all of the kernel.
>
> Open hardware has never been an "openness standard" for the kernel.

I was in a meeting with a vendor this morning and when explicitly
asked by an SRE (not from my org nor in any way "primed" by me)
whether configuration of some run of the mill PCI thing can be exposed
via devlink params instead of whatever proprietary thing the vendor was
pitching, the vendor's answer was silence and then a pitch of another
proprietary mechanism.

So no, "open hardware" is certainly not a requirement for the
kernel. But users can't get vendors to implement standard Linux
configuration interfaces, and your proposal will make it a lot worse.