Re: aggressive random read/write on large files == oops (page allocationfailure)

From: Steve Lord
Date: Tue Jun 29 2004 - 15:03:50 EST


Xavier Roche wrote:
Hi,

We (exalead.com) are encountering oops'es and then a partial filesystem hang (ls /proc freezes, ls in other directories also freezes randomly, the machine is "half dead") when agressively accessing random data through large mapped (mmap) memory areas. The system apparently oops'ed while failing to allocate memory somewhere in xfs.

The kernel first message is:
"kswapd0: page allocation failure. order:5, mode:0x50"
(see complete dump below)

The only notable running process was a single process mapping ~100 GB of data, doing aggressively:
- random read(2) i/o on a 5 GB file
- random read/write accesses in the mapped data
- all on the large xfs filesystem.

Could it be a VM problem (no more available pages due to aggressive access to mmap'ed memory ?) or a synchronization problem ? Any hint/suggestion is welcome - and we can issue more tests with symbols-enabled kernel.


Looks like you are getting some fragmented files in xfs, and the
in memory copy of the extents is getting too large for your system to
find memory for. There are some recent changes to the memory
interface in xfs which should make it try a lot harder to get
this memory. You need Linus's bitkeeper tree or the sgi's xfs cvs
tree to get this code at the moment.

Steve
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/