Re: [PATCH v4] tracepoint: Do not fail unregistering a probe due to memory failure

From: Mathieu Desnoyers
Date: Wed Jan 27 2021 - 15:18:35 EST


----- On Jan 27, 2021, at 2:16 PM, rostedt rostedt@xxxxxxxxxxx wrote:

> On Wed, 27 Jan 2021 13:13:22 -0500 (EST)
> Mathieu Desnoyers <mathieu.desnoyers@xxxxxxxxxxxx> wrote:
>
>> > Thanks for bringing that up.
>>
>> Requiring an RCU synchronize on element removal is quite intrusive, and can
>> be problematic if tracepoint removal is called from e.g. preempt-off context.
>
> But how often do you remove more than one callback from the same
> tracepoint? Or should I say, from a lot of tracepoints?
>
> This will only synchronize for the following case:
>
> Add three callbacks to a single tracepoint.
> Remove the first one.
> <rcu callback to update the counters>
> Remove the second one
> <triggers a synchronization if the counters have not been finished
> updating>
> Remove the third one.
> <no synchronization needed, because it's being freed>
>
> And we may be able to make this work in batch too.
>
> More to come, but I really like this approach over the others because it
> does not increase the size of the kernel for a failure that should never
> happen in practice.

My concern with introducing a scheme based on synchronize_rcu to handle out-of-memory
scenarios is not about how frequently synchronize_rcu will be called, but
about the added complexity this adds to the tracepoints insertion/removal.
This has chances to explode in unlikely scenarios, which will become harder to test
for. This will also introduce constraints on which kernel context can perform
tracepoint removal.

I notice that the error report which leads to this v4 is against v2 of the patch [1],
which has known issues. I wonder whether there are any such issues with v3, which is
using a function stub ? [2]

The main concern I had about v3 of the patch was that the prototype of the
stub (void argument list) does not match the prototype of the called function. As
discussed in [2], there are other scenarios where the kernel expects this to work,
based on the expectation that all architectures supported by the Linux kernel have
a calling convention where the caller performs the clean-up.

So I would prefer the approach taken in v3 rather than adding the kind of complexity
introduced in v4. Ideally we should document, near the stub function, that this
expects the calling convention to have the caller perform the clean-up (cdecl family),
and that the returned type (void) of the function always match. It's only the arguments
which may differ.

There is still one thing that I'm not 100% sure about. It's related to this comment
from Kees Cook [3], hinting that in a CFI build the function prototype of the call
site and the function need to match. So do we need extra annotation of the stub function
to make this work in a CFI build ?

Thanks,

Mathieu

[1] https://lore.kernel.org/bpf/20201117211836.54acaef2@xxxxxxxxxxxxxxxx/
[2] https://lore.kernel.org/bpf/20201118093405.7a6d2290@xxxxxxxxxxxxxxxxxx/
[3] https://lore.kernel.org/bpf/202011171330.94C6BA7E93@keescook/
--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com