Re: [RFC PATCH 00/14] Heterogeneous Memory System (HMS) and hbind()

From: Jerome Glisse
Date: Tue Dec 04 2018 - 19:29:22 EST


On Tue, Dec 04, 2018 at 03:58:23PM -0800, Dave Hansen wrote:
> On 12/4/18 1:57 PM, Jerome Glisse wrote:
> > Fully correct mind if i steal that perfect summary description next time
> > i post ? I am so bad at explaining thing :)
>
> Go for it!
>
> > Intention is to allow program to do everything they do with mbind() today
> > and tomorrow with the HMAT patchset and on top of that to also be able to
> > do what they do today through API like OpenCL, ROCm, CUDA ... So it is one
> > kernel API to rule them all ;)
>
> While I appreciate the exhaustive scope of such a project, I'm really
> worried that if we decided to use this for our "HMAT" use cases, we'll
> be bottlenecked behind this project while *it* goes through 25 revisions
> over 4 or 5 years like HMM did.
>
> So, should we just "park" the enhancements to the existing NUMA
> interfaces and infrastructure (think /sys/devices/system/node) and wait
> for this to go in? Do we try to develop them in parallel and make them
> consistent? Or, do we just ignore each other and make Andrew sort it
> out in a few years? :)

Let have a battle with giant foam q-tip at next LSF/MM and see who wins ;)

More seriously i think you should go ahead with Keith HMAT patchset and
make progress there. In HMAT case you can grow and evolve the NUMA node
infrastructure to address your need and i believe you are doing it in
a sensible way. But i do not see a path for what i am trying to achieve
in that framework. If anyone have any good idea i would welcome it.

In the meantime i hope i can make progress with my proposal here under
staging. Once i get enough stuff working in userspace and convince guinea
pig (i need to find a better name for those poor people i will coerce
in testing this ;)) then i can have some hard evidence of what thing in
my proposal is useful on some concret case with open source stack from
top to bottom. It might means stripping down what i am proposing today
to what turns out to be useful. Then start a discussion about merging the
kernel underlying code into one (while preserving all existing API) and
getting out of staging with real syscall we will have to die with.

I know that at the very least the hbind() and hpolicy() syscall would
be successful as the HPC folks have been been dreaming of this. The
topology thing is harder to know, they are some users today but i can
not say how much more interest it can spark outside of this very small
community that is HPC.

Cheers,
Jérôme