Re: [BUG RT] dump-capture kernel not executed for panic in interrupt context

From: peterz
Date: Mon Sep 07 2020 - 12:24:16 EST


On Mon, Sep 07, 2020 at 02:03:09PM +0200, Joerg Vehlow wrote:
>
>
> On 9/7/2020 1:46 PM, peterz@xxxxxxxxxxxxx wrote:
> > I think it's too complicated for what is needed, did you see my
> > suggestion from a year ago? Did I miss something obvious?
> >
> This one? https://lore.kernel.org/linux-fsdevel/20191219090535.GV2844@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx/
>
> I think it may be a bit incorrect?
> According to the original comment in __crash_kexec, the mutex was used to
> prevent a sys_kexec_load, while crash_kexec is executed. Your proposed patch
> does not lock the mutex in crash_kexec.

Sure, but any mutex taker will (spin) wait for panic_cpu==CPU_INVALID.
And if the mutex is already held, we'll not run __crash_kexec() just
like the trylock() would do today.
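
As a rough userspace sketch of those semantics (a model using C11 atomics,
not the patch itself; kexec_held and the *_model() names are hypothetical
stand-ins, and a plain flag replaces the real kexec_mutex):

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* The mail writes CPU_INVALID; the kernel constant is PANIC_CPU_INVALID. */
#define CPU_INVALID (-1)

static atomic_int panic_cpu = CPU_INVALID;  /* which CPU is in the crash path */
static atomic_bool kexec_held;              /* hypothetical stand-in for kexec_mutex */

/* Sleepable path (think sys_kexec_load): take the lock, then wait out a panic. */
static void kexec_lock_model(void)
{
	bool expected = false;

	/* Stand-in for mutex_lock(); a real mutex would sleep, not spin. */
	while (!atomic_compare_exchange_weak(&kexec_held, &expected, true))
		expected = false;

	/* The "(spin) wait": do not proceed while some CPU is panicking. */
	while (atomic_load(&panic_cpu) != CPU_INVALID)
		;
}

static void kexec_unlock_model(void)
{
	atomic_store(&kexec_held, false);
}

/* Crash path: returns true if the dump-capture kernel would be started. */
static bool crash_kexec_model(int cpu)
{
	int old = CPU_INVALID;
	bool run;

	assert(cpu != CPU_INVALID);

	/* Only one CPU may claim the crash path. */
	if (!atomic_compare_exchange_strong(&panic_cpu, &old, cpu))
		return false;

	/* Never block here: if the lock is held, skip, as a failed trylock() would. */
	run = !atomic_load(&kexec_held);

	/* A real crash kexec does not return; the model releases so it can be re-run. */
	atomic_store(&panic_cpu, CPU_INVALID);
	return run;
}
```

Called from a single thread, crash_kexec_model(0) returns true while the
lock is free and false between kexec_lock_model() and kexec_unlock_model().
The default sequentially consistent atomics stand in for the barriers: either
the lock taker observes panic_cpu being claimed, or the crash path observes
the lock being held.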

> This does not cover the original use
> case anymore. The only thing that is protected now are two panicking cores at
> the same time.

I'm not following. AFAICT it does exactly what the old code did.
Although maybe I didn't replace all kexec_mutex users, I now see that
thing isn't static.

> Actually, this implementation feels even more hacky to me....

It's more minimal ;-) It's simpler in that it only provides the required
semantics (as I understand them) and does not attempt to implement a
more general trylock()-like primitive that isn't needed.

Also, read the kexec_lock() implementation you posted and explain to me
what happens when kexec_busy is elevated. Also note the lack of
confusing loops in my code.