Re: [PATCH] mnt: Silence lockdep warning about i_mutex, mount and reclaim

From: Dmitry Torokhov
Date: Tue Sep 29 2015 - 19:39:50 EST


On Tue, Sep 08, 2015 at 04:40:43PM -0700, Dmitry Torokhov wrote:
> We are holding i_mutex over GFP_KERNEL allocation and that causes lockdep
> complain when we are using ashmem, because it's shrinker descends into
> shmem fallocate code which also tries to take i_mutex (usually different
> one):
>
> <4>[18464.436926] =================================
> <4>[18464.441273] [ INFO: inconsistent lock state ]
> <4>[18464.445624] 3.18.0-06812-ga3d8c04 #1 Tainted: G U W
> <4>[18464.451359] ---------------------------------
> <4>[18464.455706] inconsistent {RECLAIM_FS-ON-W} -> {IN-RECLAIM_FS-W} usage.
> <4>[18464.462224] kswapd0/66 [HC0[0]:SC0[0]:HE1:SE1] takes:
> <4>[18464.467265] (&sb->s_type->i_mutex_key#9){+.+.?.}, at: [<ffffffc0002ddb00>] shmem_fallocate+0x50/0x328
> <4>[18464.476610] {RECLAIM_FS-ON-W} state was registered at:
> <4>[18464.481737] [<ffffffc000257d1c>] mark_lock+0x2dc/0x6a8
> <4>[18464.487053] [<ffffffc000258184>] mark_held_locks+0x9c/0xc4
> <4>[18464.492712] [<ffffffc000258808>] lockdep_trace_alloc+0xc4/0xd8
> <4>[18464.498718] [<ffffffc00030473c>] kmem_cache_alloc_trace+0x38/0x168
> <4>[18464.505073] [<ffffffc000329cc4>] lock_mount+0x9c/0x184
> <4>[18464.510386] [<ffffffc00032a604>] do_add_mount+0x20/0xe4
> <4>[18464.515784] [<ffffffc00032bf34>] do_mount+0xa10/0xa4c
> <4>[18464.521009] [<ffffffc00032c1cc>] SyS_mount+0x84/0xc0
> <4>[18464.526146] [<ffffffc0002033ec>] el0_svc_naked+0x20/0x28
> <4>[18464.531634] irq event stamp: 8877137
> <4>[18464.535202] hardirqs last enabled at (8877137): [<ffffffc000a85fe4>] mutex_trylock+0x190/0x1d0
> <4>[18464.543900] hardirqs last disabled at (8877136): [<ffffffc000a85ee0>] mutex_trylock+0x8c/0x1d0
> <4>[18464.552508] softirqs last enabled at (8874674): [<ffffffc000220ad0>] __do_softirq+0x2b4/0x330
> <4>[18464.561117] softirqs last disabled at (8874651): [<ffffffc000220df0>] irq_exit+0x74/0xd0
> <4>[18464.569205]
> <4>[18464.569205] other info that might help us debug this:
> <4>[18464.575723] Possible unsafe locking scenario:
> <4>[18464.575723]
> <4>[18464.581632] CPU0
> <4>[18464.584070] ----
> <4>[18464.586509] lock(&sb->s_type->i_mutex_key#9);
> <4>[18464.591056] <Interrupt>
> <4>[18464.593667] lock(&sb->s_type->i_mutex_key#9);
> <4>[18464.598387]
> <4>[18464.598387] *** DEADLOCK ***
> <4>[18464.598387]
> <4>[18464.604298] 2 locks held by kswapd0/66:
> <4>[18464.608124] #0: (shrinker_rwsem){++++..}, at: [<ffffffc0002d7508>] shrink_slab+0x40/0x104
> <4>[18464.616506] #1: (ashmem_mutex){+.+.+.}, at: [<ffffffc00086a69c>] ashmem_shrink_scan+0x3c/0x128
> <4>[18464.625324]
> <4>[18464.625324] stack backtrace:
> <4>[18464.629675] CPU: 1 PID: 66 Comm: kswapd0 Tainted: G U W 3.18.0-06812-ga3d8c04 #1
> <4>[18464.637926] Hardware name: Google Tegra210 Smaug Rev 1+ (DT)
> <0>[18464.643575] Call trace:
> <4>[18464.646016] [<ffffffc000207264>] dump_backtrace+0x0/0x10c
> <4>[18464.651408] [<ffffffc000207380>] show_stack+0x10/0x1c
> <4>[18464.656452] [<ffffffc000a7fd98>] dump_stack+0x74/0x94
> <4>[18464.661494] [<ffffffc000a7f298>] print_usage_bug.part.35+0x270/0x28c
> <4>[18464.667838] [<ffffffc000257e8c>] mark_lock+0x44c/0x6a8
> <4>[18464.672969] [<ffffffc000259ee4>] __lock_acquire+0x98c/0x191c
> <4>[18464.678620] [<ffffffc00025b5f8>] lock_acquire+0xec/0x128
> <4>[18464.683923] [<ffffffc000a8422c>] mutex_lock_nested+0x58/0x354
> <4>[18464.689659] [<ffffffc0002ddafc>] shmem_fallocate+0x4c/0x328
> <4>[18464.695223] [<ffffffc00086a6ec>] ashmem_shrink_scan+0x8c/0x128
> <4>[18464.701046] [<ffffffc0002d6c1c>] shrink_slab_node+0x178/0x260
> <4>[18464.706783] [<ffffffc0002d7548>] shrink_slab+0x80/0x104
> <4>[18464.712000] [<ffffffc0002d9c54>] balance_pgdat+0x310/0x4e4
> <4>[18464.717477] [<ffffffc0002da1c8>] kswapd+0x3a0/0x410
> <4>[18464.722347] [<ffffffc000237cf4>] kthread+0xdc/0xe8
>
> To silence it let's switch to using GFP_NOFS allocation when allocating new
> mount point.
>
> Signed-off-by: Dmitry Torokhov <dmitry.torokhov@xxxxxxxxx>

Al, does this make any sense to you or I was barking at a wrong tree?

Thanks.

> ---
> fs/namespace.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/fs/namespace.c b/fs/namespace.c
> index 2b8aa15..25f72d0 100644
> --- a/fs/namespace.c
> +++ b/fs/namespace.c
> @@ -749,7 +749,12 @@ static struct mountpoint *new_mountpoint(struct dentry *dentry)
> struct mountpoint *mp;
> int ret;
>
> - mp = kmalloc(sizeof(struct mountpoint), GFP_KERNEL);
> + /*
> + * We are allocating as GFP_NOFS to appease lockdep:
> + * since we are holding i_mutex we should not try to
> + * recurse into filesystem code.
> + */
> + mp = kmalloc(sizeof(struct mountpoint), GFP_NOFS);
> if (!mp)
> return ERR_PTR(-ENOMEM);
>
> --
> 2.6.0.rc0.131.gf624c3d
>
>
> --
> Dmitry

--
Dmitry
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/