Re: d_lookup: Unable to handle kernel paging request

From: Vicente Bergas
Date: Sat Jun 22 2019 - 14:02:26 EST


Hi Al,
I think I have a hint of what is going on.
With the last kernel built with your sentinels at hlist_bl_*lock
it is very easy to reproduce the issue.
In fact it is so unstable that I had to connect a serial port
in order to save the kernel trace.
Unfortunately all the traces are at different addresses and
your sentinel did not trigger.

Now i am writing this email from that same buggy kernel, which is
v5.2-rc5-224-gbed3c0d84e7e.

The difference is that I changed the bootloader.
Before, I was booting 5.1.12 and then kexec'ing into this one.
Now I am booting from u-boot directly into this one.
I will continue booting with u-boot for some time to be sure it is
stable and confirm this is the cause.

If it is, which is the more likely offender:
the kernel before kexec or the kernel after?

The original report was sent to you because you are listed as the maintainer
of fs/dcache.c, which appeared in the trace. Should this be redirected
somewhere else now?

Regards,
Vicenç.

On Wednesday, June 19, 2019 7:09:40 PM CEST, Al Viro wrote:
On Wed, Jun 19, 2019 at 06:51:51PM +0200, Vicente Bergas wrote:

What's your config, BTW? SMP and DEBUG_SPINLOCK, specifically...

Hi Al,
here it is:
https://paste.debian.net/1088517

Aha... So LIST_BL_LOCKMASK is 1 there (same as on distro builds)...

Hell knows - how about
static inline void hlist_bl_lock(struct hlist_bl_head *b)
{
	BUG_ON(((u32)READ_ONCE(*b)&~LIST_BL_LOCKMASK) == 0x01000000);
	bit_spin_lock(0, (unsigned long *)b);
}

and

static inline void hlist_bl_unlock(struct hlist_bl_head *b)
{
	__bit_spin_unlock(0, (unsigned long *)b);
	BUG_ON(((u32)READ_ONCE(*b)&~LIST_BL_LOCKMASK) == 0x01000000);
}

to see if we can narrow down where that happens?
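
For readers following the proposed check, here is a minimal user-space
sketch in C of the idea: bit 0 of the list head doubles as the lock bit,
and the head is compared against 0x01000000 with that bit masked off,
before taking and after releasing the lock. This is a single-threaded
model, not kernel code; every demo_ name is hypothetical, and assert()
stands in for BUG_ON().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical user-space model; none of these names exist in the kernel. */
#define DEMO_LOCKMASK ((uintptr_t)1)         /* bit 0 of the head is the lock bit */
#define DEMO_SENTINEL ((uint32_t)0x01000000) /* pattern tested by the proposed check */

struct demo_bl_head {
	uintptr_t first; /* pointer to the first node, lock bit folded into bit 0 */
};

/* True if the low 32 bits of the head, ignoring the lock bit, match the pattern. */
static bool demo_head_is_suspicious(const struct demo_bl_head *b)
{
	uint32_t low = (uint32_t)(b->first & ~DEMO_LOCKMASK);

	return low == DEMO_SENTINEL;
}

/* Check the head first, then set the lock bit (no real spinning in this model). */
static void demo_lock(struct demo_bl_head *b)
{
	assert(!demo_head_is_suspicious(b));
	assert(!(b->first & DEMO_LOCKMASK)); /* single-threaded: must not be held */
	b->first |= DEMO_LOCKMASK;
}

/* Clear the lock bit, then check the head again on the way out. */
static void demo_unlock(struct demo_bl_head *b)
{
	assert(b->first & DEMO_LOCKMASK); /* must be held */
	b->first &= ~DEMO_LOCKMASK;
	assert(!demo_head_is_suspicious(b));
}
```

In this model a head holding 0x01000001 (the pattern with the lock bit
set) is flagged, while an empty head or an ordinary aligned pointer is
not. As with the real check, it only fires if the bad value is present
at lock or unlock time, so corruption that lands elsewhere would go
unnoticed.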