Re: nfsd deadlock, 2.6.36-rc3

From: Maciej Rutecki
Date: Fri Sep 03 2010 - 15:12:56 EST


On Wednesday, 1 September 2010 at 17:39:55 Tim Gardner wrote:
> I've been pursuing a simple reproducer for an NFS lockup that shows up
> under stress. There is a bunch of info (some of it extraneous) in
> http://bugs.launchpad.net/bugs/561210. I can reproduce it by writing to
> loopback-mounted NFS exports:
>
> /etc/fstab: 127.0.0.1:/srv /mnt/srv nfs rw 0 2
> /etc/exports: /srv 127.0.0.1(rw,insecure,no_subtree_check)
>
> See the attached scripts test_master.sh and test_client.sh. I simply
> repeat './test_master.sh wait' until nfsd locks up, typically within 1-3
> cycles, e.g.,
>
> cd /mnt/srv
> while true; do ./test_master.sh wait; done
>
> Note that this test will run indefinitely if invoked from /srv, e.g.,
>
> cd /srv
> while true; do ./test_master.sh wait; done
>
> This issue, or something like it, appears to exist as far back as I've
> tested (Ubuntu Lucid 2.6.32.21). For now I'm assuming that, since the
> symptoms are similar, any lockup bug found in -rc3 is the likely culprit.
>
> See attached dmesg and config. Debug options of interest that I've
> enabled are CONFIG_DEBUG_SLAB, CONFIG_DEBUG_SLAB_LEAK,
> CONFIG_DEBUG_SPINLOCK, CONFIG_DEBUG_MUTEXES.
>
> dmesg.txt contains the initial 'INFO: task nfsd:1263 blocked for more
> than 120 seconds.' complaints as well as information dumped from
>
> echo d | sudo tee /proc/sysrq-trigger
> echo w | sudo tee /proc/sysrq-trigger
>
> Anything else I can provide?
>
> rtg

I created a Bugzilla entry for your bug report at
https://bugzilla.kernel.org/show_bug.cgi?id=17762
Please add your address to the CC list there, thanks!
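
For anyone checking a saved log against this report, here is a minimal
sketch of a helper that lists the tasks named in hung-task messages. It
assumes the message format quoted above ('INFO: task nfsd:1263 blocked for
more than 120 seconds.'). The function name hung_tasks is hypothetical and
is not one of the attached scripts.

```shell
#!/bin/sh
# Hypothetical helper (not part of test_master.sh or test_client.sh):
# print each distinct task named in a hung-task report, one per line.
# Assumed message format, taken from the report quoted above:
#   INFO: task nfsd:1263 blocked for more than 120 seconds.
hung_tasks() {
    # $1: path to a saved kernel log, e.g. dmesg.txt
    # The leading .* tolerates an optional timestamp prefix on each line.
    sed -n 's/.*INFO: task \(.*\) blocked for more than [0-9][0-9]* seconds.*/\1/p' "$1" |
        sort -u
}
```

Running 'hung_tasks dmesg.txt' on the attached log should then print
name:pid pairs such as nfsd:1263, which makes it easier to see whether
only nfsd threads are stuck or other tasks are blocked as well.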

--
Maciej Rutecki
http://www.maciek.unixy.pl