Re: [PATCH v3 1/4] futex: Implement mechanism to wait on any of several futexes

From: Pierre-Loup A. Griffais
Date: Mon Mar 02 2020 - 21:56:19 EST

Next message: Sasha Levin: "[PATCH AUTOSEL 5.4 51/58] bnxt_en: Issue PCIe FLR in kdump kernel to cleanup pending DMAs."
Previous message: Sasha Levin: "[PATCH AUTOSEL 5.4 54/58] csky/smp: Fixup boot failed when CONFIG_SMP"
Next in thread: Peter Zijlstra: "'simple' futex interface [Was: [PATCH v3 1/4] futex: Implement mechanism to wait on any of several futexes]"

On 2/29/20 2:27 AM, Thomas Gleixner wrote:

"Pierre-Loup A. Griffais" <pgriffais@xxxxxxxxxxxxxxxxx> writes:

On 2/28/20 1:25 PM, Thomas Gleixner wrote:

Peter Zijlstra <peterz@xxxxxxxxxxxxx> writes:

Thomas mentioned something like that, the problem is, of course, that we
then want to fix a whole bunch of historical ills, and the problem
becomes much bigger.

We keep piling features on top of an interface and mechanism which is
fragile as hell and horrible to maintain. Adding vectoring, multi size
and whatever is not making it any better.

There is also the long standing issue with NUMA, which we can't address
with the current pile at all.

So I'm really advocating that all involved parties sit down ASAP and
hash out a new and less convoluted mechanism where all the magic new
features can be addressed in a sane way so that the 'F' in Futex really
only means Fast and not some other word starting with 'F'.

Are you specifically talking about the interface, or the mechanism
itself? Would you be OK with a new syscall that calls into the same code
as this patch? It does seem like that's what we want, so if we rewrote a
mechanism I'm not convinced it would come out any different. But, the
interface itself seems fair-game to rewrite, as the current futex
syscall is turning into an ioctl of sorts.

No, you are misreading what I said. How does a new syscall make any
difference? It still adds new crap to a maze which is already in a state
of dubious maintainability.

I was just going by the context added by Peter, which seemed to imply your concerns were mostly around the interface, because I couldn't understand a clear course of action to follow just from your email. And frankly, still can't, but hopefully you can help us get there.

This solves a real problem with a real usecase; so I'd like to stay
practical and not go into deeper issues like solving NUMA support for
all of futex in the interest of users waiting at the other end. Can you
point us to your preferred approach just for the scope of what we're
trying to accomplish?

If we go by the argument that something solves a real use case and take
this as justification to proliferate existing crap, then we never get to
the point where things get redesigned from ground up. Quite the
contrary, we are going to duct tape it to death.

It does not matter at all whether the syscall is multiplexing or split
up into 5 different ones. That's a pure cosmetic exercise.

While all the currently proposed extensions (multiple wait, variable
size) make sense conceptually, I'm really uncomfortable to just cram
them into the existing code. They create an ABI which we have to
maintain forever.

From experience I just know that every time we extended the futex
interface we opened another can of worms which haunted us for years if
not for more than a decade. Guess who has to deal with that. Surely not
the people who drive by and solve their real world usecases. Just go and
read the changelog history of futexes very carefully and you might
understand what kind of complex beasts they are.

At some point we simply have to say stop, sit down and figure out which
kind of functionality we really need in order to solve real world user
space problems and which of the gazillion futex (mis)features are just
there as historical ballast and do not have to be supported in a new
implementation. REQUEUE is just the most obvious example.

I completely understand that you want to stay practical and just want to
solve your particular itch, but please understand that the people who
have to deal with the fallout and have dealt with it for 15+ years have
very practical reasons to say no.

Note that it would have been nice to get that high-level feedback on the first version; instead we just received back specific feedback on the implementation itself, and questions about usecase/motivation that we tried to address, but that didn't elicit any follow-ups.

Please bear with me for a second in case you thought you were obviously very clear about the path forward, but are you saying that:

1. Our usecase is valid, but we're not correct about futex being the right fit for it, and we should design and implement a new primitive to handle it?

2. Our usecase is valid, and our research showing that futex is the right fit for it might be correct, but futex has to be significantly refactored before accepting this new feature (or any new feature)?

If it was 1., I think our new solution would either end up looking more or less exactly like futex, just with some of the more exotic functionality removed (although even that is arguable, since I wouldn't be surprised if we ended up using e.g. requeue for some of the more complex migration scenarios). In which case I assume someone else would ask why we're doing this new thing instead of adding to futex. Or, if intentionally made not futex-like, it would end up not being optimal, which would make it not the right solution and a non-starter to begin with. There's a reason we moved away from eventfd, even ignoring the fd exhaustion problem that some problematic apps fall victim to.

If it's 2., then we'd be hard-pressed to proceed forward without your guidance.

Conceptually it seems like multiple wait is an important missing feature in futex compared to core threading primitives of other platforms. It isn't the first time that the lack of it has come up for us and other game developers. Due to futex being so central and important, I completely understand it is tricky to get right and might be hard to maintain if not done correctly. It seems worthwhile to undertake, at least from our limited perspective. We'd be glad to help upstream get there, if possible.

Thanks,
- Pierre-Loup

Thanks,

tglx
