VFS nr_files accounting

From: David S. Miller
Date: Sat Mar 04 2006 - 05:24:32 EST



I just wanted to report that I am hitting the "VFS: file-max limit xxx
reached" problem quite easily on my 32-cpu Niagara machine with 16GB
of RAM, running current 2.6.x GIT.

It seems far too easy to get a box into this state due to SLAB
fragmentation and the RCU-deferred freeing of file objects. And once
you get a machine into this state it is totally unusable.
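
To make the accounting problem concrete, here is a minimal userspace
sketch of the pattern as I understand it, not the real fs/file_table.c
code: the open() path refuses new files once the counter hits the
limit, while the close() path only queues the object for a deferred
free, so the counter doesn't drop until the grace period / slab reap
actually runs. All the names below (FILE_MAX, alloc_filp, and so on)
are illustrative stand-ins.

/*
 * Userspace sketch of the failure mode, NOT actual kernel code.
 * It only illustrates "decrement deferred until the object is
 * really freed" colliding with a hard file-max style limit.
 */
#include <stdio.h>
#include <errno.h>

#define FILE_MAX	64	/* stands in for /proc/sys/fs/file-max */

static int nr_files;		/* objects the allocator still owns      */
static int pending_free;	/* closed by the VFS, waiting on RCU/slab */

/* open() path: fails with ENFILE while nr_files sits at the limit */
static int alloc_filp(void)
{
	if (nr_files >= FILE_MAX) {
		fprintf(stderr, "VFS: file-max limit %d reached\n", FILE_MAX);
		return -ENFILE;
	}
	nr_files++;
	return 0;
}

/* close() path: the object is only queued; nr_files stays up */
static void release_filp(void)
{
	pending_free++;
}

/* RCU callback / slab reap running: only now does nr_files drop */
static void grace_period_ran(void)
{
	nr_files -= pending_free;
	pending_free = 0;
}

int main(void)
{
	int i;

	/* open and close FILE_MAX files: every close is deferred */
	for (i = 0; i < FILE_MAX; i++) {
		alloc_filp();
		release_filp();
	}

	/* zero files are really in use, yet the next open fails ... */
	if (alloc_filp() == -ENFILE)
		printf("box looks out of files with 0 in use\n");

	/* ... until the deferred frees finally get processed */
	grace_period_ran();
	printf("after grace period: nr_files=%d\n", nr_files);
	return 0;
}

The point is just that the limit check looks at a counter which can
stay pinned at file-max long after the files are closed, whenever
the deferred-free machinery falls behind.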

Our test case is usually a "make -j8192" kernel build along with a
parallel bootstrap of gcc. That puts about 256 processes on each
CPU's runqueue, so I doubt ksoftirqd gets to run much at all, which
means the pending RCU callbacks (and the file frees behind them) just
keep piling up.

I think part of what helps trigger it might be ccache, which we are
using on this machine. ccache seems to open a ton of files on each
build invocation.

Usually within an hour of that load you hit the nr_files limit, you
can't run anything, and you have to power-cycle the box.

I think we need to think seriously about this problem.