Re: [RFC PATCH 1/1] x86/ftrace: make ftrace_int3_handler() not to skip fops invocation

From: Nicolai Stange
Date: Wed Apr 24 2019 - 02:20:17 EST


Steven Rostedt <rostedt@xxxxxxxxxxx> writes:
> On Tue, 23 Apr 2019 20:15:49 +0200
> Nicolai Stange <nstange@xxxxxxx> wrote:
>> Steven Rostedt <rostedt@xxxxxxxxxxx> writes:
>> > For 32 bit, we could add 4 variables on the thread_info and make 4
>> > trampolines, one for each context (normal, softirq, irq and NMI), and
>> > have them use the variable stored in the thread_info for that location:
>> >
>> > ftrace_int3_tramp_normal
>> > push %eax # just for space
>> > push %eax
>> > mov whatever_to_get_thread_info, %eax
>> > mov normal(%eax), %eax
>> > mov %eax, 4(%esp)
>> > pop %eax
>> > jmp ftrace_caller
>> >
>> > ftrace_int3_tramp_softiqr
>> > push %eax # just for space
>> > push %eax
>> > mov whatever_to_get_thread_info, %eax
>> > mov softirq(%eax), %eax
>> > mov %eax, 4(%esp)
>> > pop %eax
>> > jmp ftrace_caller
>> >
>> > etc..
>> >
>> >
>> > A bit crazier for 32 bit but it can still be done. ;-)
>>
>> Ok, but currently CONFIG_HAVE_LIVEPATCH=n for x86 && !x86_64,
>> so I'd rather not invest too much energy into screwing 32 bit up ;)
>>
>
> Probably not something you care about, but something that I do. Which
> means it will have to go on my TODO list. I care about missed functions
> being called. This means, if you have something tracing a function, and
> then enable something else to trace that same function, you might miss
> the first one.

Alright, if there's a use case beyond live patching, I can try to handle
32 bits alongside, of course.

However, some care will be needed when determining the actual context
from ftrace_int3_handler(): tracing anything before the addition or
after the subtraction of HARDIRQ_OFFSET to/from preempt_count in
irq_enter() resp. irq_exit() could otherwise clobber the "normal" state
in thread_info, correct?

How about having a fixed size stack with three entries shared between
"normal", "irq" and "softirq" in thread_info, as well as a dedicated
slot for nmi context? (The stack would be modified/consumed only with
irqs disabled).

Which makes me wonder:
- if we disabled preemption from ftrace_int3_handler() and reenabled it
again from the *_int3_tramp*s, these stacks could be made per-CPU,
AFAICS,
- and why not use this strategy on x86_64, too? The advantages would be
a common implemention between 32 and 64 bit and more importantly, not
relying on that %r11 hack. When doing the initial RFC patch, it
wasn't clear to me that at most one stack slot per context would be
needed, i.e. only four in total...

What do you think?

Nicolai

--
SUSE Linux GmbH, GF: Felix ImendÃrffer, Jane Smithard, Graham Norton,
HRB 21284 (AG NÃrnberg)