Re: fs/dcache.c - BUG: soft lockup - CPU#5 stuck for 22s! [systemd-udevd:1667]

From: Linus Torvalds
Date: Mon May 26 2014 - 14:17:52 EST


On Mon, May 26, 2014 at 8:27 AM, Al Viro <viro@xxxxxxxxxxxxxxxxxx> wrote:
>
> That's the livelock. OK.

Hmm. Is there any reason we don't have some exclusion around
"check_submounts_and_drop()"?

That would seem to be the simplest way to avoid any livelock: just
don't allow concurrent calls (we could make the lock per-filesystem or
whatever). This whole path should only be hit in exceptional cases
anyway.

We already sleep in that thing (well, "cond_resched()"), so taking a
mutex should be fine.
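
To make the idea concrete, here is a rough user-space sketch of "one
walker at a time". It is not kernel code: it uses pthreads instead of
the kernel mutex API, and walk_once(), worker(), run_demo() and the
counters are hypothetical names made up for the illustration, with
sched_yield() standing in for cond_resched(). All it shows is that a
single static mutex around the walk-and-dispose pass keeps concurrent
callers from being inside that pass together.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/* One global lock, like the function-local static mutex in the patch. */
static pthread_mutex_t walk_mutex = PTHREAD_MUTEX_INITIALIZER;
static int in_walk;      /* callers currently inside the locked pass */
static int max_in_walk;  /* highest value in_walk ever reached */

/* Hypothetical stand-in for one d_walk() + shrink_dentry_list() pass. */
static int walk_once(int *remaining)
{
	int found;

	pthread_mutex_lock(&walk_mutex);
	in_walk++;
	if (in_walk > max_in_walk)
		max_in_walk = in_walk;
	sched_yield();		/* give other threads a chance to barge in */
	found = *remaining > 0;	/* pretend the walk still found work */
	if (found)
		(*remaining)--;
	in_walk--;
	pthread_mutex_unlock(&walk_mutex);
	return found;
}

/* Retry loop shaped like the for (;;) loop in the patched function. */
static void *worker(void *arg)
{
	int remaining = 3;	/* arbitrary: three passes find something */

	(void)arg;
	while (walk_once(&remaining) > 0)
		sched_yield();	/* stands in for cond_resched() */
	return NULL;
}

/* Start up to 64 workers; return the peak number inside the pass. */
int run_demo(int nthreads)
{
	pthread_t tid[64];
	int i, started = 0;

	if (nthreads > 64)
		nthreads = 64;
	in_walk = max_in_walk = 0;
	for (i = 0; i < nthreads; i++)
		if (pthread_create(&tid[started], NULL, worker, NULL) == 0)
			started++;
	for (i = 0; i < started; i++)
		pthread_join(tid[i], NULL);
	assert(in_walk == 0);	/* every walker has left the pass */
	return max_in_walk;
}
```

Because the counters are only touched with the lock held, the peak
that run_demo() reports cannot exceed 1 however many threads retry.
The cost is that unrelated callers also wait on each other, which is
why a per-filesystem lock is the obvious refinement.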

The attached (TOTALLY UNTESTED!) patch is an example. Avert your eyes.
Mika, does this make any difference?

Linus

 fs/dcache.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/dcache.c b/fs/dcache.c
index 42ae01eefc07..663fd04614cc 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -1333,16 +1333,19 @@ int check_submounts_and_drop(struct dentry *dentry)

 	for (;;) {
 		struct select_data data;
+		static DEFINE_MUTEX(mutex);

 		INIT_LIST_HEAD(&data.dispose);
 		data.start = dentry;
 		data.found = 0;

+		mutex_lock(&mutex);
 		d_walk(dentry, &data, check_and_collect, check_and_drop);
 		ret = data.found;

 		if (!list_empty(&data.dispose))
 			shrink_dentry_list(&data.dispose);
+		mutex_unlock(&mutex);

 		if (ret <= 0)
 			break;