NFS / FuseFS Kernel Hangs Bug

From: sascha a.
Date: Thu Oct 01 2015 - 09:06:19 EST


Hello,


I want to report a Bug with NFS / FuseFS.

Theres trouble with mounting a NFS FS with FuseFS, if the NFS Server
is slowly responding.

The problem occurs, if you mount a NFS FS with FuseFS driver for
example with this command:

mount -t nfs -o vers=3,nfsvers=3,hard,intr,tcp server /dest

Working on this nfs overlay works like a charm, as long as the NFS
Server is not under heavy load. If it gets under HEAVY load from time
to time the kernel hangs (which should in my opinion never ever
occur).

Heres a stacktrace:

[468661.923371] INFO: task sync:2828 blocked for more than 120 seconds.
[468661.924202] Not tainted 4.0.0 #2
[468661.924996] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[468661.925806] sync D 0000000000000000 0 2828 26428 0x00000000
[468661.926605] ffff880036d9cb70 ffffffff8101c356 ffff8800797a2af0
ffffffff8101c3c5
[468661.927476] ffff88006312ffd8 7fffffffffffffff ffff88006312fea8
ffff880036d9cb70
[468661.928426] ffffffff811f3440 0000000000000000 ffffffff81585e5f
ffff88006312feb0
[468661.929269] Call Trace:
[468661.930108] [<ffffffff8101c356>] ? native_sched_clock+0x26/0x90
[468661.930975] [<ffffffff8101c3c5>] ? sched_clock+0x5/0x10
[468661.931742] [<ffffffff811f3440>] ? SyS_tee+0x390/0x390
[468661.932589] [<ffffffff81585e5f>] ? schedule+0x2f/0x80
[468661.933334] [<ffffffff81588c2a>] ? schedule_timeout+0x21a/0x290
[468661.934069] [<ffffffff81081c00>] ? insert_work+0x50/0xc0
[468661.934782] [<ffffffff81081d9b>] ? __queue_work+0x12b/0x340
[468661.935508] [<ffffffff81587270>] ? wait_for_completion+0xb0/0x120
[468661.936425] [<ffffffff810965b0>] ? wake_up_state+0x20/0x20
[468661.937107] [<ffffffff811f3440>] ? SyS_tee+0x390/0x390
[468661.937745] [<ffffffff811ec563>] ? sync_inodes_sb+0xa3/0x1c0
[468661.938432] [<ffffffff811f3440>] ? SyS_tee+0x390/0x390
[468661.939093] [<ffffffff811c6a9c>] ? iterate_supers+0xac/0x100
[468661.939737] [<ffffffff811f37a3>] ? sys_sync+0x33/0xa0
[468661.940511] [<ffffffff81589d8d>] ? system_call_fastpath+0x16/0x1b

[367651.029922] INFO: task kworker/u16:1:21362 blocked for more than
120 seconds.
[367651.030722] Not tainted 4.0.0 #2
[367651.031497] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[367651.032364] kworker/u16:1 D ffff88107fcd4440 0 21362 2 0x00000000
[367651.033131] Workqueue: writeback bdi_writeback_workfn (flush-0:44)
[367651.033882] ffff880efccacc30 0000000000000000 ffff881038a553d0
7fffffffffffffff
[367651.034619] ffff88000e317fd8 ffff88107ffaf090 ffff88102c23c1a0
0000000000000000
[367651.035440] ffff88102c23c270 00000001057745ec ffffffff81585e5f
ffff880302f0a1c8
[367651.036193] Call Trace:
[367651.037021] [<ffffffff81585e5f>] ? schedule+0x2f/0x80
[367651.037745] [<ffffffff811ec210>] ? inode_sleep_on_writeback+0x80/0xa0
[367651.038473] [<ffffffff810aab70>] ? wait_woken+0x90/0x90
[367651.039176] [<ffffffff811ef36a>] ? wb_writeback+0x14a/0x2c0
[367651.039881] [<ffffffff810867be>] ? set_worker_desc+0x7e/0x90
[367651.040588] [<ffffffff811efb81>] ? bdi_writeback_workfn+0x2a1/0x420
[367651.041308] [<ffffffff81083c1d>] ? process_one_work+0x14d/0x410
[367651.042035] [<ffffffff8108466b>] ? worker_thread+0x6b/0x4a0
[367651.042716] [<ffffffff81084600>] ? rescuer_thread+0x310/0x310
[367651.043428] [<ffffffff81089413>] ? kthread+0xd3/0xf0
[367651.044718] [<ffffffff81089340>] ? kthread_create_on_node+0x180/0x180
[367651.045623] [<ffffffff81589cd8>] ? ret_from_fork+0x58/0x90
[367651.046409] [<ffffffff81089340>] ? kthread_create_on_node+0x180/0x180

I tried several kernel-versions, and all of them seems to be affected
by this issue. I guess the timeout occurs somewhere inside the fusefs
part in the kernel. Anyways even if the NFS Server is slowly/not
responding the kernel should not hang in kernel space, should it?

What i verified:
This issue occurs on two different servers (clients), even on two
diffrent NFS servers.

Heres some env. stuff:

cat /proc/version
Linux version 4.0.0 (root@xxx) (gcc version 4.9.2 (Debian 4.9.2-10) )
#2 SMP Wed Aug 26 21:00:33 UTC 2015

cat /proc/cpuinfo
processor : 0
vendor_id : GenuineIntel
cpu family : 6
model : 62
model name : Intel(R) Xeon(R) CPU E5-1620 v2 @ 3.70GHz
stepping : 4
microcode : 0x415
cpu MHz : 3700.000
cache size : 10240 KB
physical id : 0
siblings : 8
core id : 0
cpu cores : 4
apicid : 0
initial apicid : 0
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge
mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe
syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts
rep_good nopl xtopology nonstop_tsc aperfmperf eagerfpu pni pclmulqdq
dtes64 monitor ds_cpl vmx smx est tm2 ssse3 cx16 xtpr pdcm pcid dca
sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer aes xsave avx f16c
rdrand lahf_lm ida arat epb pln pts dtherm tpr_shadow vnmi
flexpriority ept vpid fsgsbase smep erms xsaveopt
bugs :
bogomips : 7399.22
clflush size : 64
cache_alignment : 64
address sizes : 46 bits physical, 48 bits virtual
power management:

cat /proc/modules
nfsv3 40960 1 - Live 0xffffffffa0a67000
veth 16384 0 - Live 0xffffffffa07d1000
xt_nat 16384 1 - Live 0xffffffffa07cc000
xt_tcpudp 16384 129 - Live 0xffffffffa07b8000
xt_addrtype 16384 4 - Live 0xffffffffa07b3000
xt_conntrack 16384 133 - Live 0xffffffffa0a9a000
ipt_MASQUERADE 16384 4 - Live 0xffffffffa0a95000
nf_nat_masquerade_ipv4 16384 1 ipt_MASQUERADE, Live 0xffffffffa0a90000
iptable_nat 16384 1 - Live 0xffffffffa0a8b000
nf_conntrack_ipv4 20480 134 - Live 0xffffffffa0a81000
nf_defrag_ipv4 16384 1 nf_conntrack_ipv4, Live 0xffffffffa0a7c000
nf_nat_ipv4 16384 1 iptable_nat, Live 0xffffffffa087b000
iptable_filter 16384 1 - Live 0xffffffffa082c000
ip_tables 28672 2 iptable_nat,iptable_filter, Live 0xffffffffa0a74000
x_tables 28672 7
xt_nat,xt_tcpudp,xt_addrtype,xt_conntrack,ipt_MASQUERADE,iptable_filter,ip_tables,
Live 0xffffffffa08c5000
nf_nat 20480 3 xt_nat,nf_nat_masquerade_ipv4,nf_nat_ipv4, Live
0xffffffffa07bd000
nf_conntrack 94208 5
xt_conntrack,nf_nat_masquerade_ipv4,nf_conntrack_ipv4,nf_nat_ipv4,nf_nat,
Live 0xffffffffa0979000
bridge 102400 0 - Live 0xffffffffa08ab000
stp 16384 1 bridge, Live 0xffffffffa0778000
llc 16384 2 bridge,stp, Live 0xffffffffa076a000
nfsd 278528 2 - Live 0xffffffffa0a22000
auth_rpcgss 53248 1 nfsd, Live 0xffffffffa0993000
oid_registry 16384 1 auth_rpcgss, Live 0xffffffffa0714000
nfs_acl 16384 2 nfsv3,nfsd, Live 0xffffffffa06e2000
nfs 204800 2 nfsv3, Live 0xffffffffa09ef000
lockd 90112 3 nfsv3,nfsd,nfs, Live 0xffffffffa0962000
grace 16384 2 nfsd,lockd, Live 0xffffffffa06b0000
fscache 49152 1 nfs, Live 0xffffffffa0754000
sunrpc 270336 18 nfsv3,nfsd,auth_rpcgss,nfs_acl,nfs,lockd, Live
0xffffffffa09ac000
x86_pkg_temp_thermal 16384 0 - Live 0xffffffffa06dd000
intel_powerclamp 20480 0 - Live 0xffffffffa07d8000
coretemp 16384 0 - Live 0xffffffffa0765000
kvm_intel 151552 0 - Live 0xffffffffa0885000
kvm 425984 1 kvm_intel, Live 0xffffffffa08f9000
crct10dif_pclmul 16384 0 - Live 0xffffffffa0880000
crc32_pclmul 16384 0 - Live 0xffffffffa0773000
ghash_clmulni_intel 16384 0 - Live 0xffffffffa073e000
mgag200 40960 1 - Live 0xffffffffa09a1000
aesni_intel 167936 0 - Live 0xffffffffa08cf000
snd_pcm 90112 0 - Live 0xffffffffa0864000
aes_x86_64 20480 1 aesni_intel, Live 0xffffffffa0738000
lrw 16384 1 aesni_intel, Live 0xffffffffa074f000
gf128mul 16384 1 lrw, Live 0xffffffffa074a000
iTCO_wdt 16384 0 - Live 0xffffffffa06ca000
ttm 77824 1 mgag200, Live 0xffffffffa0850000
snd_timer 28672 1 snd_pcm, Live 0xffffffffa0730000
glue_helper 16384 1 aesni_intel, Live 0xffffffffa04cd000
iTCO_vendor_support 16384 1 iTCO_wdt, Live 0xffffffffa0831000
ablk_helper 16384 1 aesni_intel, Live 0xffffffffa07c7000
drm_kms_helper 102400 1 mgag200, Live 0xffffffffa0836000
psmouse 110592 0 - Live 0xffffffffa077d000
snd 69632 2 snd_pcm,snd_timer, Live 0xffffffffa07a1000
cryptd 20480 3 ghash_clmulni_intel,aesni_intel,ablk_helper, Live
0xffffffffa0744000
drm 274432 4 mgag200,ttm,drm_kms_helper, Live 0xffffffffa07e8000
soundcore 16384 1 snd, Live 0xffffffffa0719000
serio_raw 16384 0 - Live 0xffffffffa070f000
pcspkr 16384 0 - Live 0xffffffffa070a000
sb_edac 28672 0 - Live 0xffffffffa06fe000
edac_core 45056 1 sb_edac, Live 0xffffffffa06f2000
lpc_ich 24576 0 - Live 0xffffffffa06d2000
i2c_i801 20480 0 - Live 0xffffffffa06a4000
mei_me 20480 0 - Live 0xffffffffa06c4000
mfd_core 16384 1 lpc_ich, Live 0xffffffffa06bf000
evdev 20480 5 - Live 0xffffffffa06aa000
tpm_tis 20480 0 - Live 0xffffffffa07de000
joydev 20480 0 - Live 0xffffffffa0343000
ipmi_watchdog 24576 0 - Live 0xffffffffa079a000
mei 77824 1 mei_me, Live 0xffffffffa0e70000
tpm 32768 1 tpm_tis, Live 0xffffffffa0e67000
wmi 20480 0 - Live 0xffffffffa0e61000
processor 28672 0 - Live 0xffffffffa047c000
shpchp 32768 0 - Live 0xffffffffa06e9000
ioatdma 61440 16 - Live 0xffffffffa0720000
thermal_sys 32768 3 x86_pkg_temp_thermal,intel_powerclamp,processor,
Live 0xffffffffa06b6000
button 16384 0 - Live 0xffffffffa04f5000
ipmi_si 49152 2 - Live 0xffffffffa0697000
ipmi_poweroff 16384 0 - Live 0xffffffffa0485000
ipmi_devintf 20480 0 - Live 0xffffffffa0431000
ipmi_msghandler 40960 4
ipmi_watchdog,ipmi_si,ipmi_poweroff,ipmi_devintf, Live
0xffffffffa067f000
fuse 86016 1 - Live 0xffffffffa0669000
autofs4 36864 2 - Live 0xffffffffa068d000
ext4 499712 3 - Live 0xffffffffa05ee000
crc16 16384 1 ext4, Live 0xffffffffa03fc000
jbd2 86016 1 ext4, Live 0xffffffffa05d8000
mbcache 20480 1 ext4, Live 0xffffffffa02f7000
btrfs 905216 0 - Live 0xffffffffa04fa000
ohci_hcd 45056 0 - Live 0xffffffffa04e9000
uhci_hcd 45056 0 - Live 0xffffffffa04dd000
pata_via 16384 0 - Live 0xffffffffa04d4000
netxen_nic 110592 0 - Live 0xffffffffa04b1000
3w_9xxx 45056 0 - Live 0xffffffffa04a1000
qlge 90112 0 - Live 0xffffffffa048a000
ixgbe 245760 0 - Live 0xffffffffa043f000
mdio 16384 1 ixgbe, Live 0xffffffffa03e4000
sata_nv 28672 0 - Live 0xffffffffa0437000
forcedeth 65536 0 - Live 0xffffffffa0420000
via686a 20480 0 - Live 0xffffffffa041a000
mptctl 36864 0 - Live 0xffffffffa0410000
mptsas 49152 0 - Live 0xffffffffa0403000
mptspi 24576 0 - Live 0xffffffffa03dd000
mptscsih 28672 2 mptsas,mptspi, Live 0xffffffffa02ef000
mptbase 73728 4 mptctl,mptsas,mptspi,mptscsih, Live 0xffffffffa03e9000
dm_crypt 24576 0 - Live 0xffffffffa039b000
dm_mirror 24576 0 - Live 0xffffffffa036a000
dm_region_hash 16384 1 dm_mirror, Live 0xffffffffa030a000
dm_log 20480 2 dm_mirror,dm_region_hash, Live 0xffffffffa0226000
dm_mod 94208 3 dm_crypt,dm_mirror,dm_log, Live 0xffffffffa03c5000
sata_via 20480 0 - Live 0xffffffffa0323000
ata_piix 36864 0 - Live 0xffffffffa03b7000
sata_sis 16384 0 - Live 0xffffffffa028b000
pata_sis 20480 1 sata_sis, Live 0xffffffffa0231000
sym53c8xx 81920 0 - Live 0xffffffffa03a2000
megaraid_sas 114688 0 - Live 0xffffffffa037e000
megaraid 45056 0 - Live 0xffffffffa0372000
aic7xxx 131072 0 - Live 0xffffffffa0349000
scsi_transport_spi 28672 3 mptspi,sym53c8xx,aic7xxx, Live 0xffffffffa02b1000
3w_xxxx 36864 0 - Live 0xffffffffa0339000
sky2 57344 0 - Live 0xffffffffa032a000
r8169 77824 0 - Live 0xffffffffa030f000
skge 49152 0 - Live 0xffffffffa02fd000
e1000e 212992 0 - Live 0xffffffffa02ba000
e1000 126976 0 - Live 0xffffffffa0291000
via_rhine 28672 0 - Live 0xffffffffa0283000
sis900 28672 0 - Live 0xffffffffa0277000
8139too 32768 0 - Live 0xffffffffa026e000
e100 36864 0 - Live 0xffffffffa0264000
mii 16384 5 r8169,via_rhine,sis900,8139too,e100, Live 0xffffffffa022c000
raid456 77824 1 - Live 0xffffffffa0208000
async_raid6_recov 20480 1 raid456, Live 0xffffffffa0202000
async_memcpy 16384 2 raid456,async_raid6_recov, Live 0xffffffffa01fd000
async_pq 16384 2 raid456,async_raid6_recov, Live 0xffffffffa01d6000
async_xor 16384 3 raid456,async_raid6_recov,async_pq, Live 0xffffffffa018e000
async_tx 16384 5
raid456,async_raid6_recov,async_memcpy,async_pq,async_xor, Live
0xffffffffa0105000
xor 24576 2 btrfs,async_xor, Live 0xffffffffa00fa000
raid6_pq 98304 3 btrfs,async_raid6_recov,async_pq, Live 0xffffffffa0113000
raid1 36864 1 - Live 0xffffffffa00f0000
raid0 20480 1 - Live 0xffffffffa003e000
md_mod 118784 6 raid456,raid1,raid0, Live 0xffffffffa01df000
sg 32768 0 - Live 0xffffffffa01cd000
sd_mod 40960 15 - Live 0xffffffffa0091000
hid_generic 16384 0 - Live 0xffffffffa000f000
usbhid 45056 0 - Live 0xffffffffa00e4000
hid 102400 2 hid_generic,usbhid, Live 0xffffffffa0077000
crc32c_intel 24576 1 - Live 0xffffffffa017f000
isci 122880 0 - Live 0xffffffffa01ae000
ahci 32768 0 - Live 0xffffffffa01a5000
mpt2sas 151552 12 - Live 0xffffffffa0159000
libsas 69632 1 isci, Live 0xffffffffa0193000
libahci 28672 1 ahci, Live 0xffffffffa0186000
raid_class 16384 1 mpt2sas, Live 0xffffffffa010e000
ehci_pci 16384 0 - Live 0xffffffffa002c000
ehci_hcd 73728 1 ehci_pci, Live 0xffffffffa00d1000
libata 180224 9
pata_via,sata_nv,sata_via,ata_piix,sata_sis,pata_sis,ahci,libsas,libahci,
Live 0xffffffffa0237000
scsi_transport_sas 36864 4 mptsas,isci,mpt2sas,libsas, Live 0xffffffffa021c000
igb 176128 0 - Live 0xffffffffa012d000
i2c_algo_bit 16384 2 mgag200,igb, Live 0xffffffffa0025000
usbcore 196608 5 ohci_hcd,uhci_hcd,usbhid,ehci_pci,ehci_hcd, Live
0xffffffffa00a0000
scsi_mod 196608 19
3w_9xxx,mptctl,mptsas,mptspi,mptscsih,sym53c8xx,megaraid_sas,megaraid,aic7xxx,scsi_transport_spi,3w_xxxx,sg,sd_mod,isci,mpt2sas,libsas,raid_class,libata,scsi_transport_sas,
Live 0xffffffffa0046000
i2c_core 49152 6 mgag200,drm_kms_helper,drm,i2c_i801,igb,i2c_algo_bit,
Live 0xffffffffa0031000
usb_common 16384 1 usbcore, Live 0xffffffffa001e000
dca 16384 3 ioatdma,ixgbe,igb, Live 0xffffffffa000a000
ptp 20480 3 ixgbe,e1000e,igb, Live 0xffffffffa0014000
pps_core 20480 1 ptp, Live 0xffffffffa0000000

Greetings

Sascha
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/