Re: 2.6.xx: NFS: directory motion/cam2 contains a readdir loop

From: Justin Piszcz
Date: Wed Jul 27 2011 - 17:24:17 EST

On Wed, 27 Jul 2011, Trond Myklebust wrote:

On Wed, 2011-07-27 at 16:54 -0400, Trond Myklebust wrote:
On Wed, 2011-07-27 at 16:37 -0400, Trond Myklebust wrote:
On Wed, 2011-07-27 at 15:47 -0400, Christoph Hellwig wrote:
On Wed, Jul 27, 2011 at 03:44:20PM -0400, Justin Piszcz wrote:


On Wed, 27 Jul 2011, Christoph Hellwig wrote:

On Wed, Jul 27, 2011 at 03:35:01PM -0400, Justin Piszcz wrote:
Currently I do not see any dupes, however I have a script that moves
images out of the directory once an hour:
0 * * * * /usr/local/bin/move_to_old2.sh > /dev/null 2>&1

Do you keep adding files to the directory while you move files out?
Yes. Otherwise there are too many files in the directory, and viewers such
as geeqie (a picture viewer) will each use 4-6GB of memory or more, so I try
to keep it to around 5,000 pictures or fewer.

What's the rate of additions/removals to the directory?
Additions vary: around 5,000 over a 12hr period, or roughly 416/hr. Current counts:

atom:/d1/motion# find cam1|wc
5215 5215 166853
atom:/d1/motion# find cam2|wc
5069 5069 162181
atom:/d1/motion# find cam3|wc
5594 5594 178981
atom:/d1/motion#

This sounds a lot like xfs simply filling up the directory index slots
of files that you just moved out with new files, and nfs falsely
claiming that this is a problem.
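
To make the slot-reuse idea concrete, here is a toy sketch in Python. It is
only an illustration under stated assumptions: the SlotDirectory class, its
"lowest free slot first" policy, and the cookies_reused helper are all
hypothetical, and none of this is XFS's real allocation logic or the NFS
client's actual loop detection. It only shows how a position cookie can end
up naming a different file after files are moved out and new ones arrive.

```python
import heapq


class SlotDirectory:
    """Toy directory: each entry sits in a numbered slot (its 'cookie').

    Assumption for illustration only: freed slots are reused lowest-first.
    """

    def __init__(self):
        self.slots = {}     # slot number -> file name
        self.free = []      # min-heap of freed slot numbers
        self.next_slot = 0  # next never-used slot

    def add(self, name):
        if self.free:
            slot = heapq.heappop(self.free)  # reuse a freed slot
        else:
            slot = self.next_slot            # otherwise grow the directory
            self.next_slot += 1
        self.slots[slot] = name
        return slot

    def remove(self, name):
        for slot, entry in self.slots.items():
            if entry == name:
                del self.slots[slot]
                heapq.heappush(self.free, slot)
                return slot  # return right away; the dict was just modified
        raise KeyError(name)

    def listing(self):
        """Return (cookie, name) pairs in cookie order."""
        return sorted(self.slots.items())


def cookies_reused(before, after):
    """Cookies that name a different file in `after` than in `before`."""
    old = dict(before)
    return sorted(c for c, name in after if c in old and old[c] != name)


d = SlotDirectory()
for i in range(5):
    d.add(f"img{i}.jpg")
before = d.listing()

# The hourly job moves two images out; the camera then writes two new ones.
d.remove("img1.jpg")
d.remove("img3.jpg")
d.add("img5.jpg")
d.add("img6.jpg")
after = d.listing()

reused = cookies_reused(before, after)
print(reused)  # cookies 1 and 3 now point at img5.jpg and img6.jpg
```

A check that keys only on cookies would see cookies 1 and 3 "again" even
though the names behind them are new, which is the kind of false alarm
described above.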

Yep. There is an existing bugzilla report for this bug at

https://bugzilla.kernel.org/show_bug.cgi?id=38572

I have a preliminary patch there that attempts to turn off the loop
detection when the directory is seen to change. However, that patch still
appears to have a bug in it, and I haven't had time to figure out what
is wrong yet.

Can you perhaps take a look, Bryan?

Actually, Justin, can you test the following slight variant on the patch
in the bugzilla?

Doh! This one will actually compile....

Hi,

Should I try 3.0 first or retry 2.6.38 w/ this patch?

Justin.
