Re: [BUG] 3.4.109 - unable to handle kernel NULL pointer dereference at (null)

From: Cal Peake
Date: Sun Oct 04 2015 - 00:39:18 EST


On Thu, 1 Oct 2015, Steven Rostedt wrote:

>
> I merged 3.4.109 into 3.4-rt, and it bugged. I then booted 3.4.109
> vanilla and it bugged too. 3.4.108 is fine.
>

I'm seeing a similar bug here. I've bisected it down to this commit:

commit 961bd13539b9e7ca5d2e667668141496b7a1d6bc
Author: Michel Dänzer <michel.daenzer@xxxxxxx>
Date: Thu Apr 16 11:17:27 2015 +0900

    drm/radeon: Use drm_calloc_ab for CS relocs

    commit b421ed15d2c3039eb724680e4de1e4b2bd196a9a upstream.

    The number of relocs is passed in by userspace and can be large. It has
    been observed to cause kcalloc failures in the wild.


Backing it out of vanilla 3.4.109 has so far eliminated the problem.

Steven, you look to be using i915 graphics instead of radeon, so it seems
unlikely to me that we're hitting the same problem. Here's my oops for
comparison though:


BUG: unable to handle kernel NULL pointer dereference at 00000000000002f1
IP: [<ffffffffa016202a>] evdev_poll+0x2a/0x70 [evdev]
PGD 211441067 PUD 213771067 PMD 0
Oops: 0000 [#1] SMP
CPU 0
Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss ipv6 iptable_nat nf_nat bridge stp llc nfs auth_rpcgss lockd ntfs cifs udf crypto_hash sunrpc crypto_algapi isofs crc_itu_t vfat msdos fat nls_cp437 nls_utf8 nls_iso8859_1 nf_conntrack_ftp nf_conntrack_ipv4 xt_state nf_conntrack_tftp nf_defrag_ipv4 nf_conntrack xt_LOG ipt_REJECT xt_tcpudp iptable_filter kvm_amd ip_tables kvm x_tables f71882fg af_packet edac_core msr pcspkr cpuid edac_mce_amd mousedev usbhid hid snd_hda_codec_hdmi snd_hda_codec_realtek usb_storage snd_hda_intel snd_hda_codec radeon sr_mod cdrom ttm drm_kms_helper ohci_hcd powernow_k8 freq_table psmouse evdev snd_pcm mperf snd_timer k10temp e1000 sata_sil drm snd 8250_pnp serio_raw 8250 serial_core floppy ehci_hcd microcode soundcore snd_page_alloc pata_atiixp backlight i2c_piix4 usbcore i2c_algo_bit processor i2c_core sg thermal_sys usb_common button power_supply r8169 hwmon firmware_class mii loop ext4 jbd2 crc16 raid1 dm_mod raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx md_mod ahci libahci libata sd_mod scsi_mod

Pid: 1862, comm: X Not tainted 3.4.108-00059-g961bd13 #7 MICRO-STAR INTERNATIONAL CO.,LTD MS-7551/KA780G (MS-7551)
RIP: 0010:[<ffffffffa016202a>] [<ffffffffa016202a>] evdev_poll+0x2a/0x70 [evdev]
RSP: 0018:ffff880211dd79f8 EFLAGS: 00010246
RAX: 0000000000000000 RBX: ffff8802106b6800 RCX: 0000000000000069
RDX: ffff88021184f800 RSI: ffff880211dd7ad8 RDI: ffff88021184f800
RBP: 0000000000000019 R08: ffff880211dd7f48 R09: ffff880211dd7de0
R10: 0000000000000000 R11: 0000000000003246 R12: 0000000000010000
R13: 000000000007e000 R14: ffff88021184f800 R15: 0000000000000040
FS: 00007f649ddd08a0(0000) GS:ffff88021fc00000(0000) knlGS:00000000f70486d0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000002f1 CR3: 0000000212ac6000 CR4: 00000000000007f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process X (pid: 1862, threadinfo ffff880211dd6000, task ffff8802135e62d0)
Stack:
0000000000000001 0000000000000013 0000000000000010 ffffffff810d4b93
ffff880211f9c700 ffffffffa03d90b2 ffff880211dd7cc8 0000000000000000
000000000007e000 0000000000000000 0000000000000000 0000000111dd7cc8
Call Trace:
[<ffffffff810d4b93>] ? do_select+0x333/0x5f0
[<ffffffffa03d90b2>] ? r600_cs_packet_parse+0x42/0x140 [radeon]
[<ffffffff810d4500>] ? __pollwait+0x110/0x110
Oct 3 23:24:38 lancer last message repeated 7 times
[<ffffffff810ba216>] ? kmem_cache_free+0x86/0x90
[<ffffffff81038d92>] ? __dequeue_signal+0x102/0x190
[<ffffffff810d505c>] ? core_sys_select+0x20c/0x380
[<ffffffff8103b608>] ? set_current_blocked+0x38/0x60
[<ffffffff8103b6ec>] ? block_sigmask+0x3c/0x50
[<ffffffff81001c84>] ? do_signal+0x1d4/0x620
[<ffffffff8105b1ad>] ? ktime_get_ts+0x6d/0xe0
[<ffffffff810d5212>] ? sys_select+0x42/0x110
[<ffffffff81288982>] ? system_call_fastpath+0x16/0x1b
Code: <80> bd d8 02 00 00 01 8b 4b 04 48 8b 6c 24 10 19 c0 24 14 05 04 01
RIP [<ffffffffa016202a>] evdev_poll+0x2a/0x70 [evdev]
RSP <ffff880211dd79f8>
CR2: 00000000000002f1
---[ end trace 369d4585fbe82a04 ]---
BUG: unable to handle kernel NULL pointer dereference at 00000000000000a1
IP: [<ffffffff81285f4e>] mutex_lock_interruptible+0xe/0x40
PGD 0
Oops: 0002 [#2] SMP
CPU 0
Modules linked in: snd_seq_dummy snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_mixer_oss ipv6 iptable_nat nf_nat bridge stp llc nfs auth_rpcgss lockd ntfs cifs udf crypto_hash sunrpc crypto_algapi isofs crc_itu_t vfat msdos fat nls_cp437 nls_utf8 nls_iso8859_1 nf_conntrack_ftp nf_conntrack_ipv4 xt_state nf_conntrack_tftp nf_defrag_ipv4 nf_conntrack xt_LOG ipt_REJECT xt_tcpudp iptable_filter kvm_amd ip_tables kvm x_tables f71882fg af_packet edac_core msr pcspkr cpuid edac_mce_amd mousedev usbhid hid snd_hda_codec_hdmi snd_hda_codec_realtek usb_storage snd_hda_intel snd_hda_codec radeon sr_mod cdrom ttm drm_kms_helper ohci_hcd powernow_k8 freq_table psmouse evdev snd_pcm mperf snd_timer k10temp e1000 sata_sil drm snd 8250_pnp serio_raw 8250 serial_core floppy ehci_hcd microcode soundcore snd_page_alloc pata_atiixp backlight i2c_piix4 usbcore i2c_algo_bit processor i2c_core sg thermal_sys usb_common button power_supply r8169 hwmon firmware_class mii loop ext4 jbd2 crc16 raid1 dm_mod raid456 async_pq async_xor xor async_memcpy async_raid6_recov raid6_pq async_tx md_mod ahci libahci libata sd_mod scsi_mod

Pid: 1862, comm: X Tainted: G D 3.4.108-00059-g961bd13 #7 MICRO-STAR INTERNATIONAL CO.,LTD MS-7551/KA780G (MS-7551)
RIP: 0010:[<ffffffff81285f4e>] [<ffffffff81285f4e>] mutex_lock_interruptible+0xe/0x40
RSP: 0018:ffff880211dd7698 EFLAGS: 00010296
RAX: 00000000ffffffff RBX: 00000000000000a1 RCX: 00000000000000e9
RDX: ffff880211dd7fd8 RSI: ffff8802105d1e40 RDI: 00000000000000a1
RBP: 0000000000000019 R08: 00000000000128c0 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000000 R12: ffff88021184f800
R13: ffff8802105d1e50 R14: 0000000000000010 R15: ffff8802105d1e40
FS: 00007f649ddd08a0(0000) GS:ffff88021fc00000(0000) knlGS:00000000f70486d0
CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
CR2: 00000000000000a1 CR3: 000000000160b000 CR4: 00000000000007f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Process X (pid: 1862, threadinfo ffff880211dd6000, task ffff8802135e62d0)
Stack:
00000000000000a1 ffffffffa01621e0 ffff8802105d1e40 ffff88021184ea00
ffff88021184f800 ffff8802105d1e40 0000000000000000 ffffffff810c0de8
0000000000000000 0000000000000001 000000000000001f ffffffff8102ef1d
Call Trace:
[<ffffffffa01621e0>] ? evdev_flush+0x30/0x80 [evdev]
[<ffffffff810c0de8>] ? filp_close+0x38/0x90
[<ffffffff8102ef1d>] ? put_files_struct+0x8d/0x100
[<ffffffff8102f69d>] ? do_exit+0x63d/0x870
[<ffffffff812857b1>] ? printk+0x40/0x45
[<ffffffff81005433>] ? oops_end+0x73/0xa0
[<ffffffff810210b2>] ? no_context+0x122/0x2d0
[<ffffffff81021b75>] ? do_page_fault+0x3c5/0x420
[<ffffffff8100aaad>] ? __switch_to_xtra+0xcd/0x130
[<ffffffff8100109f>] ? __switch_to+0x34f/0x3b0
[<ffffffff81287206>] ? wait_for_common+0xe6/0x190
[<ffffffff81052760>] ? try_to_wake_up+0x280/0x280
[<ffffffff8128849f>] ? page_fault+0x1f/0x30
[<ffffffffa016202a>] ? evdev_poll+0x2a/0x70 [evdev]
[<ffffffff810d4b93>] ? do_select+0x333/0x5f0
[<ffffffffa03d90b2>] ? r600_cs_packet_parse+0x42/0x140 [radeon]
[<ffffffff810d4500>] ? __pollwait+0x110/0x110
Oct 3 23:24:40 lancer last message repeated 7 times
[<ffffffff810ba216>] ? kmem_cache_free+0x86/0x90
[<ffffffff81038d92>] ? __dequeue_signal+0x102/0x190
[<ffffffff810d505c>] ? core_sys_select+0x20c/0x380
[<ffffffff8103b608>] ? set_current_blocked+0x38/0x60
[<ffffffff8103b6ec>] ? block_sigmask+0x3c/0x50
[<ffffffff81001c84>] ? do_signal+0x1d4/0x620
[<ffffffff8105b1ad>] ? ktime_get_ts+0x6d/0xe0
[<ffffffff810d5212>] ? sys_select+0x42/0x110
[<ffffffff81288982>] ? system_call_fastpath+0x16/0x1b
Code: 8b 44 24 08 48 89 42 08 48 89 10 80 43 04 01 b8 fc ff ff ff eb 96 0f 1f 80 00 00 00 00 53 48 89 fb e8 97 11 00 00 b8 ff ff ff ff <f0> 0f c1 03 ff c8 78 11 65 48 8b 04 25 c0 a7 00 00 48 89 43 18
RIP [<ffffffff81285f4e>] mutex_lock_interruptible+0xe/0x40
RSP <ffff880211dd7698>
CR2: 00000000000000a1
---[ end trace 369d4585fbe82a05 ]---
Fixing recursive fault but reboot is needed!
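
For anyone comparing their own oops against the ones above, here is a
minimal userspace sketch of how the faulting address in CR2 can be
cross-checked against the "Code:" line and the register dump. The helper
names (modrm_mod, modrm_reg, modrm_rm, le32, fault_address) are made up
for illustration and are not kernel APIs. It assumes the bytes at the
<80> marker in the first oops, 80 bd d8 02 00 00 01, decode as
cmpb $0x1,0x2d8(%rbp): opcode 0x80 with ModRM 0xbd, followed by a 32-bit
displacement and an 8-bit immediate.

```c
#include <stdint.h>

/* Split an x86 ModRM byte into its three fields. */
unsigned modrm_mod(uint8_t modrm) { return (modrm >> 6) & 0x3; }
unsigned modrm_reg(uint8_t modrm) { return (modrm >> 3) & 0x7; }
unsigned modrm_rm(uint8_t modrm)  { return modrm & 0x7; }

/* Read a little-endian 32-bit displacement from the instruction bytes. */
uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/*
 * Address touched by a base+disp32 memory operand. The displacement is
 * sign-extended to 64 bits before it is added to the base register.
 */
uint64_t fault_address(uint64_t base, uint32_t disp)
{
    return base + (uint64_t)(int64_t)(int32_t)disp;
}
```

With the first oops, ModRM 0xbd splits into mod=2 (disp32 follows), reg=7
(the CMP form of opcode 0x80) and rm=5 (RBP). The displacement bytes
d8 02 00 00 give 0x2d8, and fault_address(0x19, 0x2d8) is 0x2f1, which
matches both RBP: 0000000000000019 and CR2: 00000000000002f1. In the
second oops the marked bytes f0 0f c1 03 read as lock xadd %eax,(%rbx)
with no displacement, so the address is simply RBX = 0xa1, again matching
CR2. Both faults therefore look like dereferences of small bogus pointer
values rather than of a clean NULL, which would fit memory corruption,
though that alone does not prove it.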

--
Cal Peake