Re: [RFC PATCH 02/14] mm/hms: heterogenenous memory system (HMS) documentation

From: Jerome Glisse
Date: Wed Dec 05 2018 - 13:33:24 EST


On Wed, Dec 05, 2018 at 11:20:30AM -0700, Logan Gunthorpe wrote:
>
>
> On 2018-12-05 11:07 a.m., Jerome Glisse wrote:
> >> Well multiple links are easy when you have a 'link' bus. Just add
> >> another link device under the bus.
> >
> > So you are telling do what i am doing in this patchset but not under
> > HMS directory ?
>
> No, it's completely different. I'm talking about creating a bus to
> describe only the real hardware that links GPUs. Not creating a new
> virtual tree containing a bunch of duplicate bus and device information
> that already exists currently in sysfs.
>
> >>
> >> Technically, the accessibility issue is already encoded in sysfs. For
> >> example, through the PCI tree you can determine which ACS bits are set
> >> and determine which devices are behind the same root bridge the same way
> >> we do in the kernel p2pdma subsystem. This is all bus specific which is
> >> fine, but if we want to change that, we should have a common way for
> >> existing buses to describe these attributes in the existing tree. The
> >> new 'link' bus devices would have to have some way to describe cases if
> >> memory isn't accessible in some way across it.
> >
> > What i am looking at is much more complex than just access bit. It
> > is a whole set of properties attach to each path (can it be cache
> > coherent ? can it do atomic ? what is the access granularity ? what
> > is the bandwidth ? is it dedicated link ? ...)
>
> I'm not talking about just an access bit. I'm talking about what you are
> describing: standard ways for *existing* buses in the sysfs hierarchy to
> describe things like cache coherency, atomics, granularity, etc without
> creating a new hierarchy.
>
> > On top of that i argue that more people would use that information if it
> > were available to them. I agree that i have no hard evidence to back that
> > up and that it is just a feeling but you can not disprove me either as
> > this is a chicken and egg problem, you can not prove people will not use
> > an API if the API is not there to be use.
>
> And you miss my point that much of this information is already available
> to them. And more can be added in the existing framework without
> creating any brand new concepts. I haven't said anything about
> chicken-and-egg problems -- I've given you a bunch of different
> suggestions to split this up into more managable problems and address
> many of them within the APIs and frameworks we have already.

The thing is that what i am considering is not in sysfs, it does not
even have linux kernel driver, it is just chips that connect device
between them and there is nothing to do with those chips it is all
hardware they do not need a driver. So there is nothing existing that
address what i need to represent.

If i add a a fake driver for those what would i do ? under which
sub-system i register them ? How i express the fact that they
connect device X,Y and Z with some properties ?

This is not PCIE ... you can not discover bridges and child, it
not a tree like structure, it is a random graph (which depends
on how the OEM wire port on the chips).


So i have not pre-existing driver, they are not in sysfs today and
they do not need a driver. Hence why i proposed what i proposed
a sysfs hierarchy where i can add those "virtual" object and shows
how they connect existing device for which we have a sysfs directory
to symlink.


Cheers,
Jérôme