Re: kexec/kdump kernel fails to start

From: Flavio Leitner
Date: Thu Oct 18 2012 - 12:27:55 EST


On Thu, 18 Oct 2012 14:33:23 +0800
Dave Young <dyoung@xxxxxxxxxx> wrote:
[...]
> Just see Yinghai's comments: the later init_memory_mapping cleanup
> will also address the 4k pages in the first 2/4M, so reverting them should be better.
> https://lkml.org/lkml/2012/9/4/533
>
> Here is a patch for the reverting:
> ---
> x86 mm: Revert find_early_table_space fix
>
> Commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8 tried to address the issue that
> the first 2/4M should use 4k pages if PSE is enabled, but the extra counts
> should only be valid for x86_32. That commit causes a kdump regression: the
> kdump kernel hangs with it.
>
> As Yinghai Lu said, they should be reverted. See the post below:
> https://lkml.org/lkml/2012/9/4/533
>
> There is a later fix to the above fix, bd2753b2dda7bb43c7468826de75f49c6a7e8965,
> so we need to revert both commits.
>
> Tested kdump on physical and virtual machines.
>
> Reverted commits:
> commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8
> Author: WANG Cong <xiyou.wangcong@xxxxxxxxx>
> Date: Mon Mar 5 15:05:13 2012 -0800
>
> x86/mm: Fix the size calculation of mapping tables
>
> For machines that enable PSE, the first 2/4M memory region still uses
> 4K pages, so needs more PTEs in this case, but
> find_early_table_space() doesn't count this.
>
> This patch fixes it.
>
> The bug was found via code review, no misbehavior of the kernel
> was observed.
>
> Signed-off-by: WANG Cong <xiyou.wangcong@xxxxxxxxx>
> Cc: Yinghai Lu <yinghai@xxxxxxxxxx>
> Cc: Tejun Heo <tj@xxxxxxxxxx>
> Cc: <ianfang.cn@xxxxxxxxx>
> Signed-off-by: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Link: http://lkml.kernel.org/n/tip-kq6a00qe33h7c7ais2xsywnh@xxxxxxxxxxxxxx
> Signed-off-by: Ingo Molnar <mingo@xxxxxxx>
>
> commit bd2753b2dda7bb43c7468826de75f49c6a7e8965
> Author: Yinghai Lu <yinghai@xxxxxxxxxx>
> Date: Wed Jun 6 10:55:40 2012 -0700
>
> x86/mm: Only add extra pages count for the first memory range during pre-allocatio
>
> Robin found this regression:
>
> | I just tried to boot an 8TB system. It fails very early in boot with:
> | Kernel panic - not syncing: Cannot find space for the kernel page tables
>
> git bisect commit 722bc6b16771ed80871e1fd81c86d3627dda2ac8.
>
> A git revert of that commit does boot past that point on the 8TB
> configuration.
>
> That commit will add up extra pages for all memory range even
> above 4g.
>
> Try to limit that extra page count adding to first entry only.
>
> Bisected-by: Robin Holt <holt@xxxxxxx>
> Tested-by: Robin Holt <holt@xxxxxxx>
> Signed-off-by: Yinghai Lu <yinghai@xxxxxxxxxx>
> Cc: WANG Cong <xiyou.wangcong@xxxxxxxxx>
> Cc: Linus Torvalds <torvalds@xxxxxxxxxxxxxxxxxxxx>
> Cc: Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx>
> Cc: Peter Zijlstra <a.p.zijlstra@xxxxxxxxx>
> Link: http://lkml.kernel.org/r/CAE9FiQUj3wyzQxtq9yzBNc9u220p8JZ1FYHG7t%3DMOzJ%3D9B
> Signed-off-by: Ingo Molnar <mingo@xxxxxxxxxx>
>
>
> Signed-off-by: Dave Young <dyoung@xxxxxxxxxx>
> ---
> arch/x86/mm/init.c | 22 +++++++++-------------
> 1 file changed, 9 insertions(+), 13 deletions(-)
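
For context, the accounting that the reverted commits dealt with comes
down to simple arithmetic. The sketch below is a standalone illustration,
not the code in arch/x86/mm/init.c: the helper name pte_table_bytes() and
the PAGE_SIZE_4K macro are made up for this example, and it assumes 8-byte
PTEs with a 2M large page and 4-byte PTEs with a 4M large page.

```c
/* Illustration only: hypothetical names, not taken from the kernel source. */
#define PAGE_SIZE_4K 4096UL

/*
 * Bytes of page-table pages needed to map `range` bytes with 4K pages,
 * given the size of one PTE in bytes. The result is rounded up to whole
 * 4K pages, since page tables are allocated a page at a time.
 */
static unsigned long pte_table_bytes(unsigned long range,
                                     unsigned long pte_size)
{
    /* Number of 4K pages (and therefore PTEs) covering the range. */
    unsigned long ptes = (range + PAGE_SIZE_4K - 1) / PAGE_SIZE_4K;
    unsigned long bytes = ptes * pte_size;

    /* Round up to a multiple of the 4K page size. */
    return (bytes + PAGE_SIZE_4K - 1) / PAGE_SIZE_4K * PAGE_SIZE_4K;
}
```

Under those assumptions, pte_table_bytes(2UL << 20, 8) and
pte_table_bytes(4UL << 20, 4) both come to 4096, i.e. one extra 4K page
of PTEs for the first large-page-sized range.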

The patch looks good.

I reproduced the issue with the latest upstream
commit 43c422eda99b894f18d1cca17bcd2401efaf7bd0
and confirmed that it works with the patch applied.

thanks a lot!

Acked-by: Flavio Leitner <fbl@xxxxxxxxxx>
Tested-by: Flavio Leitner <fbl@xxxxxxxxxx>

fbl