Re: Many open/close on same files yeilds "No such file or directory".

From: Jesper Krogh
Date: Fri May 09 2008 - 02:09:50 EST


Andrew Morton wrote:
My feeling is that the script below may reveal the bug on any "busy" volume, where busy is lots of activity in the OS-cache of the volume,
not on the actual drives.

By this do you mean that there has to be a lot of other activity on the
system to reproduce it? Stuff which is turning over memory?

Yes, something like that. (sorry for not being able to be more
concrete). The applications has "high activity" on a few files, not
spread activity throughout the volume.

Because one possiblity is that the cached dentry got reclaimed by memory
pressure and we have some race bug which causes us to think that the file
doesn't exist.

What can i do to explore this theory? Can I disable caching of dentries
and see it go away? Does it fit the pattern that it is only the "open"-syscall that is hit (not read for example)?

(That still shouldn't happen because the dentry should be marked
recently-accessed, but perhaps the underlying inode gets reclaimed or
something. Grasping at straws here)

When I disabled the NFS-server and rand my "real-world" program on a
single processor (make -j 1). It ran through fine. It basically
gets around 20 million chunks out of differnet file and assemble the
chuncks in a few other files. This processes more or less 5 individual
sections, so make can run effectively with a concurrency of 5.

I dont know if there can be any technical reasons for not seeing it on
internal attached disks? (other than I just hadn't been able to
reproduce the same error conditions there.

Jesper
--
Jesper
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/