Re: [RFC][PATCH v2 08/10] smp,irq_work: Use the new irq_work API

From: peterz
Date: Wed Aug 19 2020 - 15:41:41 EST


On Wed, Aug 19, 2020 at 11:50:55AM -0700, Linus Torvalds wrote:
> On Wed, Aug 19, 2020 at 12:22 AM <peterz@xxxxxxxxxxxxx> wrote:
> >
> > That is, the external serialization comes from the non-atomic
> > test-and-set they both have. This works nicely when there is external
> > state that already serializes things, but totally comes apart (and
> > causes trivial list corruption) when you get it wrong.
>
> Quite often, there just isn't any *need* for serialization, because
> there is only ever one op active.
>
> That can be either because the csd is fundamentally a single thing ("I
> will transfer this object to another CPU"), or it can be because the
> CSD is already per-cpu (ie smp_call_function_single).

Note that smp_call_function_single() doesn't use _async(), and while it
does use per-cpu CSDs, the reqular .wait=false case then potentially
waits before (instead of after) in the unlikely case the local CSD was
already taken.

This makes that we cannot use it with IRQs disabled.

> You seem to make that common situation much worse.

Probably -- like I said, I'm not really happy with this either.

> Not only do you add that expensive atomic op, you add that expensive
> "use irq_work queues" for something that doesn't _need_ to use them.

I'm not sure I get the "expensive irq_work queues" argument, I fully
agree with you that adding the atomic op is fairly crap.

The remote irq_work is exactly same llist and IPI vector as remote
smp_call_function_single(). And the new irq_work_queue_remote_static()
is exactly as cheap as smp_call_function_single_async() for not having
an atomic op, queues to the same llist, sends the same IPI, but uses the
embedded data structure instead of pass a data pointer pattern.

Because most of the async users end up having another data structure to
pass arguments around anyway.

> I have to say, I'm not a fan. What are the real advantages? Your
> listed disadvantages are very very questionable.
>
> IOW, what are the actual examples of "totally comes apart" that justifies this?
>
> If the example is theoretical ("if you use csd's wrong") then I think
> they are worthless.

Well, I did use the CSD's wrong. I forgot the gotcha and made a mess of
things and stuff crashed, the wreckage is in the git history :/ I fixed
it, but I got burned.

I then went and looked at a bunch of other _async users and it wasn't
immediately obvious that they were doing it right (they were).

I wanted to improve things, but I'm willing to admit to just making it
worse in these last few patches.

Anyway, the pattern I wanted is more easily expressed with the new
irq_work_queue_remote(), even though I don't need it anymore. I can
throw away all the patches after that without loosing too much sleep.