tasks hung forever (khugepaged blocked) with 4.3 kernel

From: David Madore
Date: Wed Nov 04 2015 - 05:05:52 EST


With a 4.3 kernel I compiled two days ago, I found various processes
stuck in 'D' state this morning; trying to unmount filesystems made
things worse and froze everything. In case this means anything:
reading from the hard drives with dd (e.g., dd if=/dev/sda bs=4096k
count=1) worked (the data was not in cache), but not for large
amounts of data (count=64 hung forever). If this is of any use, I did
an alt-sysrq-t after an emergency sync; the full kernel log is at the
link below, and here are the beginning and end of it:

Nov 4 08:10:10 vega kernel: [141900.078018] INFO: task khugepaged:173 blocked for more than 300 seconds.
Nov 4 08:10:10 vega kernel: [141900.078024] Not tainted 4.3.0-vega #1
Nov 4 08:10:10 vega kernel: [141900.078026] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Nov 4 08:10:10 vega kernel: [141900.078029] khugepaged D ffff88023fc94f00 0 173 2 0x00000000
Nov 4 08:10:10 vega kernel: [141900.078035] ffff88023eb97748 0000000000000046 ffff88023e9a6d00 ffff880236e38d40
Nov 4 08:10:10 vega kernel: [141900.078040] ffff88023fc1ec88 ffff88023eb97728 ffff88023eb98000 ffff8800bba5e800
Nov 4 08:10:10 vega kernel: [141900.078044] ffff880234607518 0000000000000000 ffff880236e38d40 ffff88023eb97760
Nov 4 08:10:10 vega kernel: [141900.078048] Call Trace:
Nov 4 08:10:10 vega kernel: [141900.078057] [<ffffffff8153cdae>] schedule+0x2e/0x70
Nov 4 08:10:10 vega kernel: [141900.078086] [<ffffffffa0a58675>] _xfs_log_force_lsn+0x155/0x2b0 [xfs]
Nov 4 08:10:10 vega kernel: [141900.078091] [<ffffffff81070250>] ? wake_up_q+0x70/0x70
Nov 4 08:10:10 vega kernel: [141900.078106] [<ffffffffa0a587f9>] xfs_log_force_lsn+0x29/0x80 [xfs]
Nov 4 08:10:10 vega kernel: [141900.078123] [<ffffffffa0a4c014>] ? xfs_iunpin_wait+0x14/0x20 [xfs]
Nov 4 08:10:10 vega kernel: [141900.078140] [<ffffffffa0a48b88>] __xfs_iunpin_wait+0x88/0x120 [xfs]
Nov 4 08:10:10 vega kernel: [141900.078145] [<ffffffff81081cd0>] ? autoremove_wake_function+0x30/0x30
Nov 4 08:10:10 vega kernel: [141900.078161] [<ffffffffa0a4c014>] xfs_iunpin_wait+0x14/0x20 [xfs]
Nov 4 08:10:10 vega kernel: [141900.078178] [<ffffffffa0a41b3d>] xfs_reclaim_inode+0x5d/0x320 [xfs]
Nov 4 08:10:10 vega kernel: [141900.078196] [<ffffffffa0a42043>] xfs_reclaim_inodes_ag+0x243/0x360 [xfs]
Nov 4 08:10:10 vega kernel: [141900.078214] [<ffffffffa0a42afe>] xfs_reclaim_inodes_nr+0x2e/0x40 [xfs]
Nov 4 08:10:10 vega kernel: [141900.078229] [<ffffffffa0a50394>] xfs_fs_free_cached_objects+0x14/0x20 [xfs]
Nov 4 08:10:10 vega kernel: [141900.078233] [<ffffffff81164b89>] super_cache_scan+0x179/0x180
Nov 4 08:10:10 vega kernel: [141900.078239] [<ffffffff811267c5>] shrink_slab.part.62.constprop.72+0x1c5/0x340
Nov 4 08:10:10 vega kernel: [141900.078243] [<ffffffff81128d06>] shrink_zone+0x166/0x170
Nov 4 08:10:10 vega kernel: [141900.078246] [<ffffffff81128e83>] do_try_to_free_pages+0x173/0x350
Nov 4 08:10:10 vega kernel: [141900.078249] [<ffffffff81129100>] try_to_free_pages+0xa0/0x130
Nov 4 08:10:10 vega kernel: [141900.078253] [<ffffffff8111e01e>] __alloc_pages_nodemask+0x40e/0x790
Nov 4 08:10:10 vega kernel: [141900.078258] [<ffffffff8115a84a>] khugepaged+0x14a/0x11c0
Nov 4 08:10:10 vega kernel: [141900.078261] [<ffffffff81081ca0>] ? wait_woken+0x80/0x80
Nov 4 08:10:10 vega kernel: [141900.078265] [<ffffffff8115a700>] ? use_zero_page_show+0x30/0x30
Nov 4 08:10:10 vega kernel: [141900.078269] [<ffffffff81066c44>] kthread+0xc4/0xe0
Nov 4 08:10:10 vega kernel: [141900.078273] [<ffffffff81066b80>] ? kthread_create_on_node+0x170/0x170
Nov 4 08:10:10 vega kernel: [141900.078277] [<ffffffff8153ffef>] ret_from_fork+0x3f/0x70
Nov 4 08:10:10 vega kernel: [141900.078280] [<ffffffff81066b80>] ? kthread_create_on_node+0x170/0x170
Nov 4 08:10:10 vega kernel: [141900.078285] INFO: task kswapd0:443 blocked for more than 300 seconds.
Nov 4 08:10:10 vega kernel: [141900.078287] Not tainted 4.3.0-vega #1
<snip>
Nov 4 10:10:41 vega kernel: [149114.106817]
Nov 4 10:10:41 vega kernel: [149114.106817] Showing busy workqueues and worker pools:
Nov 4 10:10:41 vega kernel: [149114.106817] workqueue events: flags=0x0
Nov 4 10:10:41 vega kernel: [149114.106817] pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=9/256
Nov 4 10:10:41 vega kernel: [149114.106817] in-flight: 19153:do_sync_work
Nov 4 10:10:41 vega kernel: [149114.106817] pending: console_callback, dbs_timer, sysrq_reinject_alt_sysrq, cache_reap, vmstat_update, flush_to_ldisc, push_to_pool, kernfs_notify_workfn
Nov 4 10:10:41 vega kernel: [149114.106817] workqueue events_power_efficient: flags=0x80
Nov 4 10:10:41 vega kernel: [149114.106817] pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256
Nov 4 10:10:41 vega kernel: [149114.106817] pending: neigh_periodic_work
Nov 4 10:10:41 vega kernel: [149114.106817] workqueue writeback: flags=0x4e
Nov 4 10:10:41 vega kernel: [149114.106817] pwq 8: cpus=0-3 flags=0x4 nice=0 active=2/256
Nov 4 10:10:41 vega kernel: [149114.106817] in-flight: 14424:wb_workfn, 169(RESCUER):wb_workfn
Nov 4 10:10:41 vega kernel: [149114.106817] workqueue xfs-reclaim/md115: flags=0x4
Nov 4 10:10:41 vega kernel: [149114.106817] pwq 4: cpus=2 node=0 flags=0x0 nice=0 active=1/256
Nov 4 10:10:41 vega kernel: [149114.106817] pending: xfs_reclaim_worker [xfs]
Nov 4 10:10:41 vega kernel: [149114.106817] workqueue xfs-log/md115: flags=0x14
Nov 4 10:10:41 vega kernel: [149114.106817] pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=4/256
Nov 4 10:10:41 vega kernel: [149114.106817] pending: xfs_buf_ioend_work [xfs], xfs_buf_ioend_work [xfs], xfs_buf_ioend_work [xfs], xfs_buf_ioend_work [xfs]
Nov 4 10:10:41 vega kernel: [149114.106817] workqueue xfs-log/md117: flags=0x14
Nov 4 10:10:41 vega kernel: [149114.106817] pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
Nov 4 10:10:41 vega kernel: [149114.106817] pending: xfs_log_worker [xfs]
Nov 4 10:10:41 vega kernel: [149114.106817] workqueue xfs-log/md110: flags=0x14
Nov 4 10:10:41 vega kernel: [149114.106817] pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
Nov 4 10:10:41 vega kernel: [149114.106817] pending: xfs_log_worker [xfs]
Nov 4 10:10:41 vega kernel: [149114.106817] workqueue xfs-cil/md112: flags=0xc
Nov 4 10:10:41 vega kernel: [149114.106817] pwq 2: cpus=1 node=0 flags=0x0 nice=0 active=1/256
Nov 4 10:10:41 vega kernel: [149114.106817] in-flight: 19033:xlog_cil_push_work [xfs] BAR(19386) BAR(11638)
Nov 4 10:10:41 vega kernel: [149114.106817] workqueue xfs-log/md112: flags=0x14
Nov 4 10:10:41 vega kernel: [149114.106817] pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=9/256
Nov 4 10:10:41 vega kernel: [149114.106817] in-flight: 11638:xfs_log_worker [xfs]
Nov 4 10:10:41 vega kernel: [149114.106817] pending: xfs_buf_ioend_work [xfs], xfs_buf_ioend_work [xfs], xfs_buf_ioend_work [xfs], xfs_buf_ioend_work [xfs], xfs_buf_ioend_work [xfs], xfs_buf_ioend_work [xfs], xfs_buf_ioend_work [xfs], xfs_buf_ioend_work [xfs]
Nov 4 10:10:41 vega kernel: [149114.106817] workqueue xfs-log/md108: flags=0x14
Nov 4 10:10:41 vega kernel: [149114.106817] pwq 3: cpus=1 node=0 flags=0x0 nice=-20 active=1/256
Nov 4 10:10:41 vega kernel: [149114.106817] pending: xfs_log_worker [xfs]
Nov 4 10:10:41 vega kernel: [149114.106817] pool 0: cpus=0 node=0 flags=0x0 nice=0 workers=2 manager: 19219 idle: 19142
Nov 4 10:10:41 vega kernel: [149114.106817] pool 2: cpus=1 node=0 flags=0x0 nice=0 workers=3 manager: 18678 idle: 19179
Nov 4 10:10:41 vega kernel: [149114.106817] pool 3: cpus=1 node=0 flags=0x0 nice=-20 workers=2 manager: 19180
Nov 4 10:10:41 vega kernel: [149114.106817] pool 4: cpus=2 node=0 flags=0x0 nice=0 workers=2 manager: 19221
Nov 4 10:10:41 vega kernel: [149114.106817] pool 6: cpus=3 node=0 flags=0x0 nice=0 workers=2 manager: 18968 idle: 19121
Nov 4 10:10:41 vega kernel: [149114.106817] pool 8: cpus=0-3 flags=0x4 nice=0 workers=2 manager: 18518
Nov 4 10:11:37 vega kernel: [149187.409558] sysrq: SysRq : Emergency Sync
Nov 4 10:11:39 vega kernel: [149189.865554] sysrq: SysRq : Emergency Remount R/O

Kernel config:
http://www.madore.org/~david/.tmp/config.20151102.broken

Syslog:
http://www.madore.org/~david/.tmp/syslog.20151104

--
David A. Madore
( http://www.madore.org/~david/ )