Re: [RFC PATCH 0/3] Enable kprobe to monitor sdei event handler

From: Xiongfeng Wang
Date: Fri Apr 26 2019 - 04:20:03 EST


Hi James,

Thanks for your reply!

On 2019/4/25 0:20, James Morse wrote:
> Hi Xiongfeng Wang,
>
> On 12/04/2019 13:04, Xiongfeng Wang wrote:
>> When I use kprobe to monitor an SDEI event handler,
>
> Don't do this! SDEI is like an NMI, it isn't safe to kprobe it as it can interrupt the
> kprobe code, causing it to become re-entrant.
>
>
>> the CPU will hang. It's
>> because when I probe the event handler, the instruction will be replaced with
>> a brk instruction, and the brk exception is unmaskable. But 'vbar_el1' contains
>> 'tramp_vectors' in '_sdei_handler' when SDEI events interrupt userspace, so
>> we will go to the wrong place if a brk exception happens.
>
> This was lucky! It's even more fun if the SDEI event interrupted a guest: the kvm vectors
> will give you a hyp-panic.
>
> The __kprobes and NOKPROBE_SYMBOL() litter should stop you doing this.
>
>
>> I notice that 'ghes_sdei_normal_callback' calls several functions that are not
>> marked as 'nokprobe'.
>
> Bother. We should probably blacklist those too, it's not safe.
>
>
>> So I was wondering if we can enable kprobe in '_sdei_handler'.
>
> I don't think this can be done safely.
>
>
> If you need to monitor your SDEI event handler you can just use printk(). Once nmi_enter()
> has been called these are safe as they stash data in a per-cpu buffer. The SDEI handler
> will exit via the IRQ vector if it can, which will cause this buffer to be flushed to the
> console in a timely manner.
>

Thanks for your advice. I agree it's really not a good idea to take an exception in NMI context.
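
To make the printk() suggestion concrete, here is a rough userspace model of
the deferral described above: messages logged between "enter" and "exit" are
stashed in a per-CPU buffer and only written out later from a safe context.
This is only a sketch under my own assumptions. All demo_* names, the buffer
size, and the CPU count are hypothetical, and the real kernel code is
considerably more involved.

```c
#include <stdio.h>
#include <string.h>

#define DEMO_NR_CPUS  4     /* hypothetical CPU count */
#define DEMO_BUF_SIZE 256   /* hypothetical per-CPU buffer size */

struct demo_nmi_buf {
    char data[DEMO_BUF_SIZE];
    size_t len;
};

static struct demo_nmi_buf demo_bufs[DEMO_NR_CPUS]; /* one buffer per CPU */
static int demo_in_nmi[DEMO_NR_CPUS];               /* set between enter/exit */

void demo_nmi_enter(int cpu) { demo_in_nmi[cpu] = 1; }
void demo_nmi_exit(int cpu)  { demo_in_nmi[cpu] = 0; }

/*
 * Outside NMI context, print directly and return 0.
 * Inside NMI context, stash the message and return the bytes stored.
 */
size_t demo_printk(int cpu, const char *msg)
{
    struct demo_nmi_buf *b = &demo_bufs[cpu];
    size_t room, n;

    if (!demo_in_nmi[cpu]) {
        fputs(msg, stdout);
        return 0;
    }

    room = DEMO_BUF_SIZE - b->len;
    n = strlen(msg);
    if (n > room)
        n = room;           /* truncate rather than overflow */
    memcpy(b->data + b->len, msg, n);
    b->len += n;
    return n;
}

/* Drain the stashed data from a safe context; returns bytes written. */
size_t demo_flush(int cpu)
{
    struct demo_nmi_buf *b = &demo_bufs[cpu];
    size_t n = b->len;

    fwrite(b->data, 1, n, stdout);
    b->len = 0;
    return n;
}
```

In this model, a message logged after demo_nmi_enter(cpu) produces no output
until demo_flush(cpu) runs, which stands in for the flush that happens once the
handler has exited via the IRQ vector.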

>
> Why do you need to kprobe an NMI handler?
>

Because 'Pseudo NMI' has a large impact on performance, we are still planning to
use SDEI for hardlockup detection.

When someone kprobes the functions in _sdei_handler, things will go wrong.
It's not that we want to monitor the SDEI event handler. We just want to make sure
the system keeps working even if someone probes every function available. A test engineer
may exercise 'kprobe' by probing all the functions that are allowed to be kprobed.
Anyway, thanks for your advice. I think I will need to mark all the functions called
in __sdei_handler as 'nokprobe'.
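
For anyone following along, here is a small userspace model of how this kind
of marking works, as I understand it: the macro records the function's address
in a dedicated linker section, and registration is refused for any address
found there. It is a sketch only. It assumes GCC or Clang on an ELF target
(so the linker provides the __start_/__stop_ section symbols), every demo_*
and DEMO_* name is hypothetical, and the -EINVAL return mirrors what I believe
register_kprobe() does for blacklisted addresses.

```c
#include <errno.h>
#include <stddef.h>

typedef void (*demo_fn_t)(void);

/* Section bounds supplied by the linker for the "demo_blacklist" section. */
extern demo_fn_t __start_demo_blacklist[];
extern demo_fn_t __stop_demo_blacklist[];

/*
 * Record fn's address in the "demo_blacklist" section. This loosely mirrors
 * the kernel macro; the section name here is made up for the demo.
 */
#define DEMO_NOKPROBE_SYMBOL(fn)                              \
    static demo_fn_t demo_bl_##fn                             \
        __attribute__((used, section("demo_blacklist"))) =    \
        (demo_fn_t)(fn)

/* Return 1 if fn was marked with DEMO_NOKPROBE_SYMBOL(), else 0. */
int demo_is_blacklisted(demo_fn_t fn)
{
    demo_fn_t *p;

    for (p = __start_demo_blacklist; p < __stop_demo_blacklist; p++)
        if (*p == fn)
            return 1;
    return 0;
}

/* Refuse to "probe" a blacklisted function; accept anything else. */
int demo_register_probe(demo_fn_t fn)
{
    return demo_is_blacklisted(fn) ? -EINVAL : 0;
}

static int demo_events_handled;

/* Stand-in for a function reachable from the SDEI handler: marked. */
void demo_sdei_callback(void) { demo_events_handled++; }
DEMO_NOKPROBE_SYMBOL(demo_sdei_callback);

/* Stand-in for an ordinary function: left probeable. */
void demo_reset_stats(void) { demo_events_handled = 0; }
```

With this, demo_register_probe(demo_sdei_callback) returns -EINVAL while
demo_register_probe(demo_reset_stats) returns 0, which is the behaviour we want
for every function called in the SDEI path.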

Thanks,
Xiongfeng

>
>
> Thanks!
>
> James