kvm: WARNING in mmu_spte_clear_track_bits

From: Dmitry Vyukov
Date: Mon Jan 23 2017 - 09:19:28 EST


Hello,

I've started seeing the following WARNING while running the syzkaller fuzzer:

Out of memory: Kill process 30627 (syz-execprog) score 57 or sacrifice child
Killed process 30962 (syz-executor) total-vm:20996kB, anon-rss:64kB, file-rss:0kB, shmem-rss:0kB
oom_reaper: reaped process 3916 (syz-executor), now anon-rss:0kB, file-rss:0kB, shmem-rss:4kB
------------[ cut here ]------------
WARNING: CPU: 8 PID: 3916 at arch/x86/kvm/mmu.c:614 mmu_spte_clear_track_bits+0x32d/0x3a0 arch/x86/kvm/mmu.c:614
Kernel panic - not syncing: panic_on_warn set ...

CPU: 8 PID: 3916 Comm: syz-executor Not tainted 4.10.0-rc5+ #186
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:15 [inline]
dump_stack+0x2ee/0x3ef lib/dump_stack.c:51
panic+0x1fb/0x412 kernel/panic.c:179
__warn+0x1c4/0x1e0 kernel/panic.c:539
warn_slowpath_null+0x2c/0x40 kernel/panic.c:582
mmu_spte_clear_track_bits+0x32d/0x3a0 arch/x86/kvm/mmu.c:614
drop_spte+0x24/0x280 arch/x86/kvm/mmu.c:1182
mmu_page_zap_pte+0x204/0x300 arch/x86/kvm/mmu.c:2298
kvm_mmu_page_unlink_children arch/x86/kvm/mmu.c:2320 [inline]
kvm_mmu_prepare_zap_page+0x1b6/0x1320 arch/x86/kvm/mmu.c:2364
kvm_zap_obsolete_pages arch/x86/kvm/mmu.c:4932 [inline]
kvm_mmu_invalidate_zap_all_pages+0x488/0x660 arch/x86/kvm/mmu.c:4973
kvm_arch_flush_shadow_all+0x15/0x20 arch/x86/kvm/x86.c:8264
kvm_mmu_notifier_release+0x71/0xb0 arch/x86/kvm/../../../virt/kvm/kvm_main.c:470
__mmu_notifier_release+0x1f9/0x6c0 mm/mmu_notifier.c:74
mmu_notifier_release include/linux/mmu_notifier.h:235 [inline]
exit_mmap+0x3cc/0x490 mm/mmap.c:2918
__mmput kernel/fork.c:873 [inline]
mmput+0x22b/0x6e0 kernel/fork.c:895
exit_mm kernel/exit.c:521 [inline]
do_exit+0x9cf/0x28a0 kernel/exit.c:826
do_group_exit+0x149/0x420 kernel/exit.c:943
get_signal+0x7e0/0x1820 kernel/signal.c:2313
do_signal+0xd2/0x2190 arch/x86/kernel/signal.c:807
exit_to_usermode_loop+0x200/0x2a0 arch/x86/entry/common.c:156
syscall_return_slowpath arch/x86/entry/common.c:190 [inline]
do_syscall_64+0x6fc/0x930 arch/x86/entry/common.c:285
entry_SYSCALL64_slow_path+0x25/0x25
RIP: 0033:0x447859
RSP: 002b:0000000001a2fbd0 EFLAGS: 00000202 ORIG_RAX: 0000000000000038
RAX: fffffffffffffdff RBX: 00007fbd700c9700 RCX: 0000000000447859
RDX: 00007fbd700c99d0 RSI: 00007fbd700c8db0 RDI: 00000000003d0f00
RBP: 0000000001a2fcd8 R08: 00007fbd700c9700 R09: 00007fbd700c9700
R10: 00007fbd700c99d0 R11: 0000000000000202 R12: 0000000000000000
R13: 0000000000000000 R14: 00007fbd700c99c0 R15: 00007fbd700c9700


These warnings are always preceded by the oom_reaper reaping the same
process (PID 3916 in the log above), so the two are probably related.


This is on commit 7a308bb3016f57e5be11a677d15b821536419d36 (Jan 22).
I also have some local kvm-related changes on top to fix/silence known
kvm bugs, but they should be unrelated:
https://gist.githubusercontent.com/dvyukov/fab99e9750bd93eefdab90418f04ea3d/raw/86a4e1475d73e8447782da6dc1824b2b7b31fa9e/gistfile1.txt


FTR, this was reproduced by running 150 of the following programs in
parallel in a VM with 512MB of memory:
https://gist.githubusercontent.com/dvyukov/cdf6dd9f7dbf9b207f836eac088dc11d/raw/686d73ec312492428bedd9f6438f28a7f559a916/gistfile1.txt
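
For anyone trying to recreate the load, here is a minimal sketch of one way
to start many copies of a program at once. It assumes the reproducer has
already been built into an executable; the helper name run_parallel and the
./repro path are made up for illustration and are not part of the original
setup. Only the count of 150 comes from the report.

```python
import subprocess


def run_parallel(cmd, copies=150):
    """Start `copies` instances of `cmd` at once and wait for all of them.

    `cmd` is an argv-style list, e.g. ["./repro"] (a hypothetical path to
    the compiled reproducer). Returns the exit status of each instance;
    a negative value means the instance was killed by that signal number
    (for example -9 if the OOM killer sent SIGKILL).
    """
    # Launch everything first so the instances really overlap in time.
    procs = [subprocess.Popen(cmd) for _ in range(copies)]
    # Then collect the exit statuses in launch order.
    return [p.wait() for p in procs]
```

Calling run_parallel(["./repro"]) inside the 512MB guest would start 150
instances; whether that is enough memory pressure to wake the oom_reaper
will depend on the setup.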