ktime_get_ts64() splat during resume

From: Borislav Petkov
Date: Fri Jun 17 2016 - 06:54:41 EST


Hi guys,

look what I've found this morning during resume:

[ 45.732934] PM: Image restored successfully.
[ 45.738064] PM: Basic memory bitmaps freed
[ 45.742914] Restarting tasks ...
[ 45.746236] BUG: unable to handle kernel done.
[ 45.752542] NULL pointer dereference at 0000000000000001
[ 45.752544] IP: [<0000000000000001>] 0x1
[ 45.752547] PGD 37922067 PUD 3791a067 PMD 0
[ 45.752548] Oops: 0010 [#1] PREEMPT SMP
[ 45.752557] Modules linked in: binfmt_misc ipv6 vfat fat amd64_edac_mod edac_mce_amd fuse dm_crypt dm_mod kvm_amd kvm amdkfd amd_iommu_v2 irqbypass crc32_pclmul radeon aesni_intel aes_x86_64 lrw gf128mul glue_helper ablk_helper cryptd fam15h_power k10temp acpi_cpufreq
[ 45.752559] CPU: 2 PID: 1 Comm: init Not tainted 4.7.0-rc3+ #1
[ 45.752560] Hardware name: To be filled by O.E.M. To be filled by O.E.M./M5A97 EVO R2.0, BIOS 1503 01/16/2013
[ 45.752560] task: ffff88042b958000 ti: ffff88042b954000 task.ti: ffff88042b954000
[ 45.752562] RIP: 0010:[<0000000000000001>] [<0000000000000001>] 0x1
[ 45.752562] RSP: 0018:ffff88042b957e50 EFLAGS: 00010246
[ 45.752563] RAX: 0000000000000000 RBX: ffffffff81181e1e RCX: fffffffffffffdfe
[ 45.752564] RDX: ffff88042b958000 RSI: ffff88042b954000 RDI: ffffffff8168aeac
[ 45.752564] RBP: 0000000000000430 R08: 0000000000000000 R09: 0000000000000002
[ 45.752565] R10: 0000000000000000 R11: 0000000000000001 R12: ffffff9c00000002
[ 45.752565] R13: ffff88042b09a300 R14: ffffffff811782bf R15: 0000000000000011
[ 45.752566] FS: 00007f9d470ab800(0000) GS:ffff88043dc80000(0000) knlGS:0000000000000000
[ 45.752567] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 45.752567] CR2: 0000000000000001 CR3: 0000000037923000 CR4: 00000000000406e0
[ 45.752568] Stack:
[ 45.752569] 0000000000001180 0000000000000000 ffff88042b957e88 ffffffff810bb33a
[ 45.752570] 00000000fffffdfe ffff88042b957f00 00007ffcfa91b1f0 ffff88042b957ee8
[ 45.752571] ffffffff81185fc7 0000000000000003 0000000021b8e1cd 0000000000000003
[ 45.752571] Call Trace:
[ 45.752576] [<ffffffff810bb33a>] ? ktime_get_ts64+0x4a/0xf0
[ 45.752578] [<ffffffff81185fc7>] ? poll_select_copy_remaining+0xe7/0x130
[ 45.752581] [<ffffffff8100263a>] ? exit_to_usermode_loop+0x8a/0xb0
[ 45.752582] [<ffffffff81002a6b>] ? syscall_return_slowpath+0x5b/0x70
[ 45.752584] [<ffffffff8168b372>] ? entry_SYSCALL_64_fastpath+0xa5/0xa7
[ 45.752587] Code: Bad RIP value.
[ 45.752588] RIP [<0000000000000001>] 0x1
[ 45.752589] RSP <ffff88042b957e50>
[ 45.752589] CR2: 0000000000000001
[ 45.752597] ---[ end trace 5334fe9eec2bfca9 ]---
[ 45.752737] Kernel panic - not syncing: Attempted to kill init! exitcode=0x00000009
[ 45.752737]
[ 45.758773] Kernel Offset: disabled

rIP is gone.

That's rc3+tip/master from Wed:

commit 043ae67b6fdd9db7fb54564124d2f2fa833c5ee6 (refs/remotes/tip/master)
Merge: 0bbd6fbba8c7 00688272157d
Author: Ingo Molnar <mingo@xxxxxxxxxx>
Date: Wed Jun 15 12:53:02 2016 +0200

Merge branch 'x86/platform'

The top function in the stacktrace is:

[ 45.752576] [<ffffffff810bb33a>] ? ktime_get_ts64+0x4a/0xf0

and that address is:

ffffffff810bb2f0 <ktime_get_ts64>:
ffffffff810bb2f0: e8 6b 21 5d 00 callq ffffffff8168d460 <__fentry__>
ffffffff810bb2f5: 55 push %rbp
ffffffff810bb2f6: 8b 05 14 52 bf 00 mov 0xbf5214(%rip),%eax # ffffffff81cb0510 <timekeeping_suspended>
ffffffff810bb2fc: 48 89 e5 mov %rsp,%rbp
ffffffff810bb2ff: 41 55 push %r13
ffffffff810bb301: 85 c0 test %eax,%eax
ffffffff810bb303: 41 54 push %r12
ffffffff810bb305: 53 push %rbx
ffffffff810bb306: 48 89 fb mov %rdi,%rbx
ffffffff810bb309: 0f 85 b6 00 00 00 jne ffffffff810bb3c5 <ktime_get_ts64+0xd5>
ffffffff810bb30f: 45 31 e4 xor %r12d,%r12d
ffffffff810bb312: 44 8b 2d a7 a9 f5 00 mov 0xf5a9a7(%rip),%r13d # ffffffff82015cc0 <tk_core>
ffffffff810bb319: 41 f6 c5 01 test $0x1,%r13b
ffffffff810bb31d: 0f 85 9b 00 00 00 jne ffffffff810bb3be <ktime_get_ts64+0xce>
ffffffff810bb323: 48 8b 05 0e aa f5 00 mov 0xf5aa0e(%rip),%rax # ffffffff82015d38 <tk_core+0x78>
ffffffff810bb32a: 48 89 03 mov %rax,(%rbx)
ffffffff810bb32d: 48 8b 3d 94 a9 f5 00 mov 0xf5a994(%rip),%rdi # ffffffff82015cc8 <tk_core+0x8>
ffffffff810bb334: ff 15 96 a9 f5 00 callq *0xf5a996(%rip) # ffffffff82015cd0 <tk_core+0x10>
ffffffff810bb33a: 48 2b 05 9f a9 f5 00 sub 0xf5a99f(%rip),%rax # ffffffff82015ce0 <tk_core+0x20> <--- HERE
ffffffff810bb341: 48 8b 15 90 a9 f5 00 mov 0xf5a990(%rip),%rdx # ffffffff82015cd8 <tk_core+0x18>
ffffffff810bb348: 44 8b 05 99 a9 f5 00 mov 0xf5a999(%rip),%r8d # ffffffff82015ce8 <tk_core+0x28>
ffffffff810bb34f: 48 8b 35 9a a9 f5 00 mov 0xf5a99a(%rip),%rsi # ffffffff82015cf0 <tk_core+0x30>
ffffffff810bb356: 8b 0d 90 a9 f5 00 mov 0xf5a990(%rip),%ecx # ffffffff82015cec <tk_core+0x2c>
ffffffff810bb35c: 48 8b 3d e5 a9 f5 00 mov 0xf5a9e5(%rip),%rdi # ffffffff82015d48 <tk_core+0x88>

i.e., right after the call to tkr->read():

static inline cycle_t timekeeping_get_delta(struct tk_read_base *tkr)
{
cycle_t cycle_now, delta;

/* read clocksource */
cycle_now = tkr->read(tkr->clock);
^^^^^^^^^^

Ring any bells about something corrupting tk_core.timekeeper.tkr_mono or
it being uninitialized after suspend?

Thanks.

--
Regards/Gruss,
Boris.

ECO tip #101: Trim your mails when you reply.