Re: pstore dump inside an nmi handler

From: Don Zickus
Date: Mon Jul 11 2011 - 17:55:48 EST


On Fri, Jul 08, 2011 at 02:40:13PM -0700, Luck, Tony wrote:
> > Inside pstore_dump(), the first thing it tries to grab is a mutex_lock()
> > (inside an NMI handler). This seems to be the root cause of my problems.
>
> Someone else pointed out that mutex_lock() is a problem here too. They
> wondered whether spin_lock_irqsave() would work - or whether pstore
> backends were allowed to sleep - to which I said I hoped they didn't,
> but wasn't really sure what the future will hold.
>
> So ... ideas (and patches) are most welcome.

I tested replacing the mutex with spin_lock_irqsave() on the one box
where it was failing, and got past my initial problem and into kdump. So
that is a positive, and I can post the patch for that. Though it probably
isn't a complete solution, it is better than a mutex.
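
One reason a spinning lock is probably not complete: irqsave does not
mask NMIs, so an NMI that lands while the same CPU already holds the
lock would spin forever. Below is a minimal user-space sketch of the
non-blocking alternative (try the lock, give up if it is busy). It is
not kernel code and not the patch mentioned above; every sketch_* name
is hypothetical, and a C11 atomic_flag stands in for a kernel spinlock.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical user-space stand-in for a pstore-like record buffer. */
struct sketch_store {
    atomic_flag lock;   /* busy-wait style lock: acquiring never sleeps */
    char buf[128];      /* last record written */
    size_t len;         /* length of the record in buf */
};

static void sketch_store_init(struct sketch_store *s)
{
    atomic_flag_clear(&s->lock);
    s->len = 0;
    s->buf[0] = '\0';
}

/* Non-blocking acquire: true if the caller now owns the lock. */
static bool sketch_trylock(struct sketch_store *s)
{
    return !atomic_flag_test_and_set_explicit(&s->lock,
                                              memory_order_acquire);
}

static void sketch_unlock(struct sketch_store *s)
{
    atomic_flag_clear_explicit(&s->lock, memory_order_release);
}

/*
 * Dump path for a context that must neither sleep nor wait.
 * Returns false (record dropped) if the lock is already held,
 * instead of blocking or spinning on it.
 */
static bool sketch_dump(struct sketch_store *s, const char *msg)
{
    size_t n;

    assert(s != NULL && msg != NULL);

    if (!sketch_trylock(s))
        return false;

    n = strlen(msg);
    if (n >= sizeof(s->buf))
        n = sizeof(s->buf) - 1;   /* truncate oversized records */
    memcpy(s->buf, msg, n);
    s->buf[n] = '\0';
    s->len = n;

    sketch_unlock(s);
    return true;
}
```

The trade-off in this sketch is that a record can be lost when the lock
is busy, which may or may not be acceptable for a real backend.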

However, I have been scratching my head over a follow-up problem: when I
inject an error that produces an NMI->GHES->panic, the error record
doesn't get stored under pstore (and maybe not in ERST either). I do see
the ERST code follow all the correct steps in storing the kmsg_dump logs
into the ERST table. But after the reboot, when I mount pstore, the
record isn't there.

When I perform an 'echo c > /proc/sysrq-trigger' instead, the record
does show up after the reboot. Not sure what can be going wrong. I cc'd
Ying in the hope that he might have some thoughts.

Cheers,
Don