Re: PANIC: double fault, error_code: 0x0 in 4.0.0-rc3-2, kvm related?

From: Stefan Seyfried
Date: Wed Mar 18 2015 - 15:03:31 EST


Hi all,

first, I'm kind of happy that I'm not the only one seeing this, and
thus my beloved Thinkpad can stay for a bit longer... :-)

Second, I'm mostly an amateur when it comes to kernel debugging, so bear
with me when I'm stumbling through the code...

On 18.03.2015 at 19:03, Andy Lutomirski wrote:
> On Wed, Mar 18, 2015 at 10:46 AM, Takashi Iwai <tiwai@xxxxxxx> wrote:
>> At Wed, 18 Mar 2015 18:43:52 +0100,
>> Takashi Iwai wrote:
>>>
>>> At Wed, 18 Mar 2015 15:16:42 +0100,
>>> Takashi Iwai wrote:
>>>>
>>>> At Sun, 15 Mar 2015 09:17:15 +0100,
>>>> Stefan Seyfried wrote:
>>>>>
>>>>> Hi all,
>>>>>
>>>>> in 4.0-rc I have recently seen a few crashes, always when running
>>>>> KVM guests (IIRC). Today I was able to capture a crash dump, this
>>>>> is the backtrace from dmesg.txt:
>>>>>
>>>>> [242060.604870] PANIC: double fault, error_code: 0x0
>
> OK, we double faulted. Too bad that x86 CPUs don't tell us why.
>
>>>>> [242060.604878] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G W 4.0.0-rc3-2.gd5c547f-desktop #1
>>>>> [242060.604880] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
>>>>> [242060.604883] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
>>>>> [242060.604885] RIP: 0010:[<ffffffff816834ad>] [<ffffffff816834ad>] page_fault+0xd/0x30
>
> The double fault happened during page fault processing. Could you
> disassemble your page_fault function to find the offending
> instruction?

This one is easy:

crash> disassemble page_fault
Dump of assembler code for function page_fault:
0xffffffff816834a0 <+0>: data32 xchg %ax,%ax
0xffffffff816834a3 <+3>: data32 xchg %ax,%ax
0xffffffff816834a6 <+6>: data32 xchg %ax,%ax
0xffffffff816834a9 <+9>: sub $0x78,%rsp
0xffffffff816834ad <+13>: callq 0xffffffff81683620 <error_entry>
0xffffffff816834b2 <+18>: mov %rsp,%rdi
0xffffffff816834b5 <+21>: mov 0x78(%rsp),%rsi
0xffffffff816834ba <+26>: movq $0xffffffffffffffff,0x78(%rsp)
0xffffffff816834c3 <+35>: callq 0xffffffff810504e0 <do_page_fault>
0xffffffff816834c8 <+40>: jmpq 0xffffffff816836d0 <error_exit>
End of assembler dump.
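If I'm reading this right (amateur caveat applies), the RIP reported in
the panic is exactly the callq into error_entry at +13. A call pushes its
8-byte return address onto the stack, and RSP at that point was a
user-space address, so the push itself may be what faulted. A little
arithmetic sketch of that interpretation (mine, not authoritative):

```python
# addresses copied from the panic message and the disassembly above
page_fault_start = 0xffffffff816834a0   # page_fault+0
rip = 0xffffffff816834ad                # RIP from the double fault report

# the faulting instruction sits at page_fault+13, i.e. the callq <error_entry>
offset = rip - page_fault_start
print(hex(offset))  # 0xd

# a callq writes its return address at RSP-8; RSP here is a user-space
# address, so that write lands on the (possibly unmapped) user stack
rsp = 0x00007fffa55eafb8
push_target = rsp - 8
print(hex(push_target))  # 0x7fffa55eafb0
```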


>>>>> [242060.604893] RSP: 0018:00007fffa55eafb8 EFLAGS: 00010016
>
> Uh, what? That RSP is a user address.
>
>>>>> [242060.604895] RAX: 000000000000aa40 RBX: 0000000000000001 RCX: ffffffff81682237
>>>>> [242060.604896] RDX: 000000000000aa40 RSI: 0000000000000000 RDI: 00007fffa55eb078
>>>>> [242060.604898] RBP: 00007fffa55f1c1c R08: 0000000000000008 R09: 0000000000000000
>>>>> [242060.604900] R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000004a
>>>>> [242060.604902] R13: 00007ffa356b5d60 R14: 000000000000000f R15: 00007ffa3556cf20
>>>>> [242060.604904] FS: 00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
>>>>> [242060.604906] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [242060.604908] CR2: 00007fffa55eafa8 CR3: 0000000002d7e000 CR4: 00000000000427e0
>>>>> [242060.604909] Stack:
>>>>> [242060.604942] BUG: unable to handle kernel paging request at 00007fffa55eafb8
>>>>> [242060.604995] IP: [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>
> This is suspicious. We need to have died, again, of a fatal page
> fault while dumping the stack.

I posted the same problem to the opensuse kernel list shortly before turning
to LKML. There, Michal Kubecek noted:

"I encountered a similar problem recently. The thing is, x86
specification says that on a double fault, RIP and RSP registers are
undefined, i.e. you not only can't expect them to contain values
corresponding to the first or second fault but you can't even expect
them to have any usable values at all. Unfortunately the kernel double
fault handler doesn't take this into account and does try to display
usual crash related information so that it itself does usually crash
when trying to show stack content (that's the show_stack_log_lvl()
crash).

The result is a double fault (which itself would be very hard to debug)
followed by a crash in its handler so that analysing the outcome is
extremely difficult."

I cannot judge whether this is true, but it sounded relevant to the
problem at hand.
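Andy's observation that RSP is a user address does check out, at least by
the rule of thumb I know: on x86-64 the user half of the address space
sits below 0x0000800000000000, and kernel addresses live in the upper
half. A trivial check (helper name is mine, nothing kernel-official):

```python
# top of the lower (user) canonical half of the x86-64 address space
USER_TOP = 0x0000800000000000

def is_user_address(addr):
    """True if addr falls in the user half of the x86-64 address space."""
    return addr < USER_TOP

print(is_user_address(0x00007fffa55eafb8))  # RSP from the double fault: True
print(is_user_address(0xffff88023bc84e88))  # RSP from the later oops: False
```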

>>>>> [242060.605036] PGD 4779a067 PUD 40e3e067 PMD 4769e067 PTE 0
>>>>> [242060.605078] Oops: 0000 [#1] PREEMPT SMP
>>>>> [242060.605106] Modules linked in: vhost_net vhost macvtap macvlan nfsv3 nfs_acl rpcsec_gss_krb5 auth_rpcgss nfsv4 dns_resolver nfs lockd grace sunrpc fscache nls_iso8859_1 nls_cp437 vfat fat ppp_deflate bsd_comp ppp_async crc_ccitt ppp_generic slhc ses enclosure uas usb_storage cmac algif_hash ctr ccm rfcomm fuse xt_CHECKSUM iptable_mangle ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_nat_ipv4 nf_nat nf_conntrack_ipv4 nf_defrag_ipv4 xt_conntrack nf_conntrack ipt_REJECT xt_tcpudp tun bridge stp llc ebtable_filter ebtables ip6table_filter ip6_tables iptable_filter ip_tables x_tables af_packet bnep dm_crypt ecb cbc algif_skcipher af_alg xfs libcrc32c snd_hda_codec_conexant snd_hda_codec_generic iTCO_wdt iTCO_vendor_support snd_hda_intel snd_hda_controller snd_hda_codec snd_hwdep snd_pcm_oss snd_pcm
>>>>> [242060.605396] dm_mod snd_seq snd_seq_device snd_timer coretemp kvm_intel kvm snd_mixer_oss cdc_ether cdc_wdm cdc_acm usbnet mii arc4 uvcvideo videobuf2_vmalloc videobuf2_memops thinkpad_acpi videobuf2_core btusb v4l2_common videodev i2c_i801 iwldvm bluetooth serio_raw mac80211 pcspkr e1000e iwlwifi snd lpc_ich mei_me ptp mfd_core pps_core mei cfg80211 shpchp wmi soundcore rfkill battery ac tpm_tis tpm acpi_cpufreq i915 xhci_pci xhci_hcd i2c_algo_bit drm_kms_helper drm thermal video button processor sg loop
>>>>> [242060.605396] CPU: 1 PID: 2132 Comm: qemu-system-x86 Tainted: G W 4.0.0-rc3-2.gd5c547f-desktop #1
>>>>> [242060.605396] Hardware name: LENOVO 74665EG/74665EG, BIOS 6DET71WW (3.21 ) 12/13/2011
>>>>> [242060.605396] task: ffff880103f46150 ti: ffff8801013d4000 task.ti: ffff8801013d4000
>>>>> [242060.605396] RIP: 0010:[<ffffffff81005b44>] [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>>>>> [242060.605396] RSP: 0018:ffff88023bc84e88 EFLAGS: 00010046
>>>>> [242060.605396] RAX: 00007fffa55eafc0 RBX: 00007fffa55eafb8 RCX: ffff88023bc7ffc0
>>>>> [242060.605396] RDX: 0000000000000000 RSI: ffff88023bc84f58 RDI: 0000000000000000
>>>>> [242060.605396] RBP: ffff88023bc83fc0 R08: ffffffff81a2fe15 R09: 0000000000000020
>>>>> [242060.605396] R10: 0000000000000afb R11: ffff88023bc84bee R12: ffff88023bc84f58
>>>>> [242060.605396] R13: 0000000000000000 R14: ffffffff81a2fe15 R15: 0000000000000000
>>>>> [242060.605396] FS: 00007ffa33dbfa80(0000) GS:ffff88023bc80000(0000) knlGS:0000000000000000
>>>>> [242060.605396] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>>> [242060.605396] CR2: 00007fffa55eafb8 CR3: 0000000002d7e000 CR4: 00000000000427e0
>>>>> [242060.605396] Stack:
>>>>> [242060.605396] 0000000002d7e000 0000000000000008 ffff88023bc84ee8 00007fffa55eafb8
>>>>> [242060.605396] 0000000000000000 ffff88023bc84f58 00007fffa55eafb8 0000000000000040
>>>>> [242060.605396] 00007ffa356b5d60 000000000000000f 00007ffa3556cf20 ffffffff81005c36
>>>>> [242060.605396] Call Trace:
>>>>> [242060.605396] [<ffffffff81005c36>] show_regs+0x86/0x210
>>>>> [242060.605396] [<ffffffff8104636f>] df_debug+0x1f/0x30
>>>>> [242060.605396] [<ffffffff810041a4>] do_double_fault+0x84/0x100
>>>>> [242060.605396] [<ffffffff81683088>] double_fault+0x28/0x30
>>>>> [242060.605396] [<ffffffff816834ad>] page_fault+0xd/0x30
>>>>> [242060.605396] Code: fe a2 81 31 c0 89 54 24 08 48 89 0c 24 48 8b 5b f8 e8 cc 06 67 00 48 8b 0c 24 8b 54 24 08 85 d2 74 05 f6 c2 03 74 48 48 8d 43 08 <48> 8b 33 48 c7 c7 0d fe a2 81 89 54 24 14 48 89 4c 24 08 48 89
>>>>> [242060.605396] RIP [<ffffffff81005b44>] show_stack_log_lvl+0x124/0x190
>>>>> [242060.605396] RSP <ffff88023bc84e88>
>>>>> [242060.605396] CR2: 00007fffa55eafb8
>>>>>
>>>>> I would not totally rule out a hardware problem, since this machine had
>>>>> another weird crash where it crashed and the bios beeper was constant
>>>>> on until I hit the power button for 5 seconds.
>>>>>
>>>>> Unfortunately, I cannot load the crashdump with the crash version in
>>>>> openSUSE Tumbleweed, so the backtrace is all I have for now.
>>>>
>>>> Just "me too", I'm getting the very same crash out of sudden with the
>>>> recent 4.0-rc. Judging from the very same pattern (usually crash
>>>> happened while using KVM (-smp 4) and kernel builds with -j8), I don't
>>>> think it's a hardware problem.
>>>
>>> The git bisection pointed to the commit:
>>> commit b926e6f61a26036ee9eabe6761483954d481ad25
>>> x86, traps: Fix ist_enter from userspace
>>>
>>> And reverting this on top of the latest Linus tree seems working.
>>> Seife, could you verify on your machine, too?
>>
>> Argh, false positive. Right after I wrote this mail, I got the very
>> same crash. I seem to need running the test much longer than I
>> thought.
>>
>> But somehow the commits around the above smell suspicious...
>>
>
> Those commits shouldn't really have affected page fault or double
> fault behavior. They made big changes to MCE, breakpoints, and debug
> exceptions.
>
> Something's very wrong here. I'm guessing that we somehow ended up in
> page_fault in a completely invalid context.
>
> One hairy code path that could plausibly do this is:
>
> 1. syscall
>
> 2. vmalloc fault accessing old_rsp aka rsp_scratch (or kernel_stack --
> same issue)
>
> 3. page fault. Now we're on the user stack and we do awful things.
> If we run off the end of the presently writable portion of the stack,
> we double fault.

Maybe Michal's idea from above points in the right direction?

Now that I have a crash dump and the corresponding debuginfo at hand,
could this help to find out where the problem originated?

I mean -- it's only 508 processes to look at :-) but if I knew what to
look for in their backtraces, I would try to do it.
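For instance, I could script something over the output of crash's
"foreach bt" and flag every task whose saved RSP falls in user space,
roughly along these lines (the sample text below is made up, just to
show the filter; the real output would come from the dump):

```python
import re

# hypothetical excerpt of "crash> foreach bt" output
sample = """\
PID: 2132  TASK: ffff880103f46150  COMMAND: "qemu-system-x86"
    RSP: 00007fffa55eafb8
PID: 1     TASK: ffff880236a90000  COMMAND: "systemd"
    RSP: ffff880236a97e88
"""

USER_TOP = 0x0000800000000000  # lower (user) canonical half on x86-64

suspects = []
pid = None
for line in sample.splitlines():
    m = re.match(r'PID:\s+(\d+)', line)
    if m:
        pid = m.group(1)
    m = re.search(r'RSP:\s+([0-9a-f]+)', line)
    if m and int(m.group(1), 16) < USER_TOP:
        suspects.append(pid)  # this task's saved RSP is a user address

print(suspects)
```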

Best regards,

Stefan
--
Stefan Seyfried
Linux Consultant & Developer -- GPG Key: 0x731B665B

B1 Systems GmbH
Osterfeldstraße 7 / 85088 Vohburg / http://www.b1-systems.de
GF: Ralph Dehner / Unternehmenssitz: Vohburg / AG: Ingolstadt,HRB 3537