Re: x60: warnings on boot and resume, arch/x86/mm/tlb.c:257 initialize_ ... was Re: [PATCH 0/2] Fix resume failure due to PCID

From: Ingo Molnar
Date: Fri Sep 15 2017 - 04:39:10 EST



* Pavel Machek <pavel@xxxxxx> wrote:

> On Wed 2017-09-06 20:25:10, Linus Torvalds wrote:
> > On Wed, Sep 6, 2017 at 7:54 PM, Andy Lutomirski <luto@xxxxxxxxxx> wrote:
> > > Patch 1 is the fix. Patch 2 is a comment that would have kept me from
> > > chasing down a false lead.
> >
> > Yes, this seems to fix things for me. Thanks.
> >
> > Of course, right now that laptop has no working wifi with tip-of-tree
> > due to some issues with the networking tree, but that's an independent
> > thing and I could suspend and resume with this. So applied and pushed
> > out,
>
> Ok, seems this is still not completely right, I'm now getting WARN_ON
> during boot and on every resume... but machine works.
>
> 4.14-rc0, 32-bit.

Which SHA1, just to make sure? (Please enable CONFIG_LOCALVERSION_AUTO=y.)

> [ 0.004000] Initializing CPU#1
> [ 0.004000] ------------[ cut here ]------------
> [ 0.004000] WARNING: CPU: 1 PID: 0 at arch/x86/mm/tlb.c:257 initialize_tlbstate_and_flush+0x27/0xcf
> [ 0.004000] Modules linked in:
> [ 0.004000] CPU: 1 PID: 0 Comm: swapper/1 Not tainted 4.13.0+ #429
> [ 0.004000] Hardware name: LENOVO 17097HU/17097HU, BIOS 7BETD8WW (2.19 ) 03/31/2011
> [ 0.004000] task: f5ca2080 task.stack: f5cc4000
> [ 0.004000] EIP: initialize_tlbstate_and_flush+0x27/0xcf
> [ 0.004000] EFLAGS: 00210087 CPU: 1
> [ 0.004000] EAX: 00000000 EBX: c506d540 ECX: 051b2000 EDX: 00000000
> [ 0.004000] ESI: 0503f000 EDI: c51b2000 EBP: f5cc5f54 ESP: f5cc5f48
> [ 0.004000] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> [ 0.004000] CR0: 80050033 CR2: 00000000 CR3: 0503f000 CR4: 000006b0
> [ 0.004000] Call Trace:
> [ 0.004000] cpu_init+0xdc/0x2f0
> [ 0.004000] start_secondary+0x34/0x1c6
> [ 0.004000] startup_32_smp+0x164/0x166
> [ 0.004000] ? startup_32_smp+0x164/0x166

Could you please try the debug patch below, so that we get a bit more info?

Thanks,

Ingo

===============>

arch/x86/mm/tlb.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
index 1ab3821f9e26..f98feb4b39a7 100644
--- a/arch/x86/mm/tlb.c
+++ b/arch/x86/mm/tlb.c
@@ -254,7 +254,8 @@ void initialize_tlbstate_and_flush(void)
 	unsigned long cr3 = __read_cr3();

 	/* Assert that CR3 already references the right mm. */
-	WARN_ON((cr3 & CR3_ADDR_MASK) != __pa(mm->pgd));
+	if (WARN_ON((cr3 & CR3_ADDR_MASK) != __pa(mm->pgd)))
+		printk("# CR3: %016lx, __pa(mm->pgd): %016lx\n", cr3, __pa(mm->pgd));

 	/*
 	 * Assert that CR4.PCIDE is set if needed. (CR4.PCIDE initialization