Re: [3.9-rc1 x86] Bug in ioremap code?

From: Dave Hansen
Date: Wed Mar 06 2013 - 12:52:07 EST


On 03/06/2013 06:58 AM, Tetsuo Handa wrote:
> Borislav Petkov wrote:
>> Ok, before we continue guessing stuff, Tetsuo, can you please explain
>> how exactly you're triggering this. More specifically, we need .config,
>> hypervisor version, I'm assuming kernel is 3.9-rc1, Linux is guest/host
>> etc, etc.
>
> I'm using CentOS 6.3 x86_32 guest running on VMware Workstation 6.5.5 for
> Windows XP x86_32 host and VMware Player 4.0.5 for Windows 7 x86_64 host.
>
> Kernel version is 3.9-rc1 x86_32. This bug can be triggered only when the
> guest has little RAM such that /proc/meminfo reports that HighTotal == 0.
> Config is at http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc1-acpi .
>
> I don't know why but changing kernel config to CONFIG_ACPI=n
> ( http://I-love.SAKURA.ne.jp/tmp/config-3.9-rc1-noacpi ) solves this bug.
> Well, should I run bisection on ACPI code?

I was able to reproduce this and got some better debugging output:

[ 0.193170] __cpa_process_fault(c1673e90, e4afa000, 1)
[ 0.208752] max_pfn_mapped: 150528
[ 0.218886] PAGE_OFFSET: c0000000
[ 0.228597] PAGE_OFFSET + (max_pfn_mapped << PAGE_SHIFT)): e4c00000
[ 0.247837] slow_virt_to_phys(e4afa000): 0
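
For anyone wanting to double-check the numbers above, here is a minimal
userspace sketch of the same arithmetic. It assumes 4 KiB pages
(PAGE_SHIFT == 12) and the PAGE_OFFSET printed above; the helper names
linear_map_end() and in_linear_map() are made up for illustration and are
not kernel functions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed 32-bit x86 values, taken from the debug output above. */
#define PAGE_SHIFT_ASSUMED   12u           /* 4 KiB pages */
#define PAGE_OFFSET_ASSUMED  0xc0000000u   /* start of the kernel linear map */

/* Hypothetical helper: first virtual address past the linear mapping. */
static uint32_t linear_map_end(uint32_t max_pfn_mapped)
{
    return PAGE_OFFSET_ASSUMED + (max_pfn_mapped << PAGE_SHIFT_ASSUMED);
}

/* Hypothetical helper: does vaddr fall inside the linear mapping? */
static bool in_linear_map(uint32_t vaddr, uint32_t max_pfn_mapped)
{
    return vaddr >= PAGE_OFFSET_ASSUMED &&
           vaddr < linear_map_end(max_pfn_mapped);
}
```

With max_pfn_mapped = 150528, linear_map_end() gives 0xe4c00000, and
in_linear_map(0xe4afa000, 150528) is true, so the faulting address sits
inside the range that should be mapped.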

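The address is below the end of the range that max_pfn_mapped says is
mapped (e4afa000 < e4c00000), yet the pte looks to actually _be_ empty: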

[ 44.038145] slow_virt_to_phys() pte: 0000000000000000 level: 1

Not sure what's going on in the end, but it does appear this is another
win for the new BUG_ON(). There really does look to be a real bug here.

BTW, the BUG_ON() is proving to be woefully inadequate. We need some
better diagnostic messages out of there, and probably a nice dump of the
pagetable walk too.
