Re: v4.14.9 BUG, regression

From: Greg KH
Date: Wed Dec 27 2017 - 11:51:27 EST


On Tue, Dec 26, 2017 at 03:00:55AM +0100, Petr Janecek wrote:
> Hi,
> the machine started dying reliably shortly after boot after upgrading
> to 4.14.9 from 4.14.8. Debian stretch, xfs on md raid10, btrfs.
>
>
> [ 230.855352] BUG: unable to handle kernel paging request at 0000000100000001
> [ 230.862449] IP: free_block+0x135/0x1f0
> [ 230.866301] PGD 0 P4D 0
> [ 230.868939] Oops: 0002 [#1] SMP
> [ 230.872178] Modules linked in: xfs x86_pkg_temp_thermal coretemp kvm_intel kvm irqbypass crc32_pclmul ghash_clmulni_intel pcbc aesni_intel mei_me iTCO_wdt aes_x86_64 crypto_simd cryptd glue_helper pcspkr iTCO_vendor_support ipmi_si tpm_tis mei evdev tpm_tis_core ipmi_devintf ipmi_msghandler battery sg video tpm acpi_power_meter button ie31200_edac shpchp nfsd auth_rpcgss oid_registry nfs_acl lockd grace sunrpc loop ip_tables x_tables autofs4 btrfs zstd_decompress zstd_compress xxhash raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c crc32c_generic raid1 raid0 multipath linear raid10 md_mod uas usb_storage hid_generic usbhid hid sd_mod igb ahci xhci_pci i2c_algo_bit libahci mpt3sas xhci_hcd dca raid_class ptp libata i2c_i801 scsi_transport_sas crc32c_intel usbcore
> [ 230.943379] i2c_core pps_core usb_common scsi_mod fan thermal
> [ 230.949317] CPU: 0 PID: 57 Comm: kworker/0:1 Not tainted 4.14.9 #23
> [ 230.955688] Hardware name: Supermicro Super Server/X11SSL-CF, BIOS 1.0a 01/29/2016
> [ 230.963357] Workqueue: events cache_reap
> [ 230.967380] task: ffff8864b362c000 task.stack: ffff9f0d43380000
> [ 230.973420] RIP: 0010:free_block+0x135/0x1f0
> [ 230.977806] RSP: 0018:ffff9f0d43383d88 EFLAGS: 00010006
> [ 230.983146] RAX: ffffe8625d2e3908 RBX: 0000000080000000 RCX: 0000000000000003
> [ 230.990398] RDX: 00000000fffffffe RSI: ffff8864b7822df0 RDI: ffff8864b6c00480
> [ 230.997654] RBP: dead000000000200 R08: ffff8864b6c01558 R09: ffff8864b6c01540
> [ 231.004909] R10: 00000000006fe8f1 R11: ffff886496597e00 R12: dead000000000100
> [ 231.012162] R13: ffffffff80000000 R14: ffff8864b7822e28 R15: ffffe8625d2e3928
> [ 231.019416] FS: 0000000000000000(0000) GS:ffff8864b7800000(0000) knlGS:0000000000000000
> [ 231.027643] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 231.033509] CR2: 0000000100000001 CR3: 00000005e680b006 CR4: 00000000003606f0
> [ 231.040763] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 231.048017] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 231.055272] Call Trace:
> [ 231.057840] drain_array_locked+0x5a/0x90
> [ 231.061963] drain_array+0x60/0x80
> [ 231.065484] cache_reap+0x67/0x1d0
> [ 231.069004] process_one_work+0x1c0/0x3e0
> [ 231.073128] worker_thread+0x42/0x3e0
> [ 231.076907] kthread+0xf7/0x130
> [ 231.080167] ? create_worker+0x180/0x180
> [ 231.084206] ? kthread_create_on_node+0x40/0x40
> [ 231.088850] ret_from_fork+0x1f/0x30
> [ 231.092542] Code: 4f 1c 49 c1 ea 20 44 29 d2 d3 ea 0f b6 4f 1d 41 01 d2 41 d3 ea 8b 48 18 8d 51 ff 48 8b 48 10 89 50 18 48 85 c9 0f 84 a3 00 00 00 <44> 88 14 11 8b 50 18 85 d2 0f 84 26 ff ff ff 49 8b 51 20 48 83
> [ 231.111583] RIP: free_block+0x135/0x1f0 RSP: ffff9f0d43383d88
> [ 231.117443] CR2: 0000000100000001
> [ 231.120875] ---[ end trace e4cdf71e69fa010e ]---

Any chance you can run 'git bisect' to find the offending patch here?
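
For reference, a bisect between the two releases might look roughly like
the sketch below. It assumes a clone of the linux-stable tree that has the
v4.14.8 and v4.14.9 tags; the function name and its directory argument are
made up for illustration, and the build/boot step depends on your setup.

```shell
#!/bin/sh
# Sketch only: assumes a linux-stable clone containing the v4.14.8 and
# v4.14.9 tags. bisect_start and its argument are hypothetical names.
bisect_start() {
    cd "$1" || return 1
    git bisect start || return 1
    git bisect bad v4.14.9      # first release that oopses
    git bisect good v4.14.8     # last release known to work
}

# After each step: build and boot the kernel that git checked out, wait
# past the point where the oops showed up (about 230 seconds after boot in
# the report above), then mark the result:
#   git bisect good    # no oops
#   git bisect bad     # oops reproduced
# Repeat until git prints the first bad commit. "git bisect log" records
# the session and "git bisect reset" returns to the original checkout.
```

Calling bisect_start with the path to the tree starts the session; the
good/bad marking afterwards is done by hand after every test boot.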

thanks,

greg k-h