Re: [PATCH][BUGFIX][RFC] fix soft lock up at NFS mount by making limitation of dentry_unused

From: David Chinner
Date: Thu Mar 06 2008 - 00:55:14 EST


On Thu, Mar 06, 2008 at 01:41:29PM +0900, Kentaro Makita wrote:
> [Summary]
> Make a limitation of dentry_unused to avoid soft lock up at NFS mounts
> and remounting any filesystem.
>
> [Descriptions]
> - background
> dentry_unused is a list of dentries which is not in use. This works
> as a cache against not-exisiting files. dentry_unused grows up when
> directories or files are removed. This list can be *verrry* long
> if there is no memory pressure, because there is no limit.
>
> - what's problem
> When prune_dcache() is called, it scans *all* dentry_unused linearly
> under spin_lock(). This scan costs very much if there are many entries.
> For example, prune_dcache() is called at mounting NFS.
> In our test, when there are 100,000,000 of unused dentries, mounting
> NFS took 1 minutes and almost all user programs hang during it.
>
> 100,000,000 is possible number on large systems.
>
> This problem already happend on our system.
> Therefore, we need a limitation of dentry_unused.

No, we need a smarter free list structure. There have been several attempts
at this in the past. Two that I can recall off the top of my head:

- per node unused LRUs
- per superblock unusued LRUs

I guess we need to revisit this again, because limiting the size of
the cache like this is not an option.

> I feel we need more tests to determine resonable value to any system.
> So, please test.
.....
> Tested on Intel Itanium 2 9050 (dualcore) x12 MEM 24GB , kernel-2.6.25-rc4
> I found no peformance regression in my tests.

Try something that relies on leaving the working set on the unused
list, like NFS server benchmarks that have a working set of tens of
million of files....

> Signed-off-by: Kentaro Makita <k-makita@xxxxxxxxxxxxxxxxxx>
> ---
> fs/dcache.c | 7 +++++++
> 1 files changed, 7 insertions(+)
> diff -rupN -X linux-2.6.25-rc4/Documentation/dontdiff linux-2.6.25-rc4/fs/dcache.c linux-2.6.25-rc4mod/fs/dcache.c
> --- linux-2.6.25-rc4/fs/dcache.c 2008-03-05 13:33:54.000000000 +0900
> +++ linux-2.6.25-rc4mod/fs/dcache.c 2008-03-05 16:47:18.000000000 +0900
> @@ -42,6 +42,8 @@ __cacheline_aligned_in_smp DEFINE_SEQLOC
>
> EXPORT_SYMBOL(dcache_lock);
>
> +/* threshold to limit dentry_unused */
> +unsigned int dentry_unused_ratio = 10000;
> static struct kmem_cache *dentry_cache __read_mostly;
>
> #define DNAME_INLINE_LEN (sizeof(struct dentry)-offsetof(struct dentry,d_iname))
> @@ -61,6 +63,7 @@ static unsigned int d_hash_mask __read_m
> static unsigned int d_hash_shift __read_mostly;
> static struct hlist_head *dentry_hashtable __read_mostly;
> static LIST_HEAD(dentry_unused);
> +static void prune_dcache(int count, struct super_block *sb);
>
> /* Statistics gathering. */
> struct dentry_stat_t dentry_stat = {
> @@ -214,6 +217,10 @@ repeat:
> }
> spin_unlock(&dentry->d_lock);
> spin_unlock(&dcache_lock);
> + /* Prune unused dentry over threshold level */
> + int nr_in_use = (dentry_stat.nr_dentry - dentry_stat.nr_unused);
> + if (dentry_stat.nr_dentry > nr_in_use * dentry_unused_ratio / 100)
> + prune_dcache(dentry_stat.nr_unused * 5 / 100 , NULL);

nr_in_use is going to overflow 32 bits with this calculation.

Cheers,

Dave.
--
Dave Chinner
Principal Engineer
SGI Australian Software Group
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/