[PATCH][RESEND] 2.4 devfs deadlock on concurrent lookups on non-existent entry

From: by way of Andrey Borzenkov (arvidjaar@mail.ru)
Date: Sun Jun 22 2003 - 13:06:32 EST

I am resending this patch at the request of Pavel Roskin. Unfortunately, I do
not know who the current active devfs maintainer is.

So far no negative effects have been observed with this patch; it is also
used in the unofficial Mandrake Club kernel. The problem is real: recently
there has been an increased number of complaints on a.o.l.m.

Original message follows:


This problem was first reported over two years ago. Usually it happened
during system boot using RH initscripts, with two minilogds hanging on
access to /dev/log and blocking rc.sysinit; this condition was triggered by
an unrelated bug in minilogd.

Pavel Roskin provided detailed debug info for this problem; details, including
the stack, are available at
<https://bugzilla.redhat.com/bugzilla/show_bug.cgi?id=85621>. It turned out to
be a deadlock between devfs_lookup and devfs_d_revalidate_wait in cases where
->d_revalidate is called under parent->i_sem (nearly all places where
lookup_hash is used). The deadlock looks like this:

                                      path_lookup("dev/log", LOOKUP_PARENT, &nd);
                                         -> yields dentry for </dev>

path_lookup("dev/log", LOOKUP_PARENT, &nd);
   -> yields dentry for </dev>
down(</dev>->i_sem) holds i_sem

                                      down(</dev>->i_sem) - sleeps

lookup_hash("log", </dev>)            .
devfs_lookup(</dev>, "log")           .
   MISS                               .
set "log"->d_op to &devfs_wait_dops;  .
init "log"->wait_queue                .
up(</dev>->i_sem)                     .
                                      obtains i_sem
                                      lookup_hash("log", </dev>);
                                      cached_lookup(</dev>, "log", 0)
                                      devfs_d_revalidate_wait("log", 0)
                                      wait on "log"->wait_queue
                                      ... waits to be waked up by devfs_lookup

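At this point devfs_lookup tries to re-acquire i_sem before it calls
wake_up(&lookup_info.wait_queue), so each task ends up waiting on the other.
The patch fixes this by moving the i_sem re-acquire after
wake_up(&lookup_info.wait_queue). It does not look like it adds any
additional races. Please check.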
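
For anyone who wants to see the ordering outside the kernel, here is a minimal
userspace sketch using POSIX threads. It is not the patch and not devfs code:
the mutex standing in for i_sem, the condition variable standing in for the
wait queue, and the names second_lookup, first_lookup_patched, lookup_done and
waiter_asleep are all made up for illustration. The first lookup probes i_sem
with a trylock at the point where the old code would have blocked, then wakes
the waiter first and only afterwards re-acquires the lock.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-ins; every name here is illustrative, not kernel code. */
static pthread_mutex_t i_sem = PTHREAD_MUTEX_INITIALIZER;    /* </dev>->i_sem */
static pthread_mutex_t wq_lock = PTHREAD_MUTEX_INITIALIZER;  /* guards the flags */
static pthread_cond_t wait_queue = PTHREAD_COND_INITIALIZER; /* "log"->wait_queue */
static bool lookup_done;   /* set by the first lookup just before the wake-up */
static bool waiter_asleep; /* second lookup is waiting while holding i_sem */

/* Second lookup: takes i_sem, then sleeps until the first lookup finishes,
 * in the way devfs_d_revalidate_wait does in the trace above. */
static void *second_lookup(void *unused)
{
    (void)unused;
    pthread_mutex_lock(&i_sem);
    pthread_mutex_lock(&wq_lock);
    waiter_asleep = true;
    pthread_cond_broadcast(&wait_queue);
    while (!lookup_done)
        pthread_cond_wait(&wait_queue, &wq_lock);
    pthread_mutex_unlock(&wq_lock);
    pthread_mutex_unlock(&i_sem);
    return NULL;
}

/* First lookup with the patched ordering. Returns 1 if i_sem was busy at the
 * point where the unpatched ordering would have blocked, 0 if it was free,
 * and -1 if the second thread could not be created. */
int first_lookup_patched(void)
{
    pthread_t second;
    int rc, would_have_blocked;

    lookup_done = false;
    waiter_asleep = false;

    pthread_mutex_lock(&i_sem);            /* caller holds the parent's i_sem */
    if (pthread_create(&second, NULL, second_lookup, NULL) != 0) {
        pthread_mutex_unlock(&i_sem);
        return -1;
    }
    pthread_mutex_unlock(&i_sem);          /* up(i_sem) after the MISS */

    /* Stand-in for the wait window: block until the second lookup sleeps. */
    pthread_mutex_lock(&wq_lock);
    while (!waiter_asleep)
        pthread_cond_wait(&wait_queue, &wq_lock);
    pthread_mutex_unlock(&wq_lock);

    /* The old ordering would block on i_sem here; probe it without blocking. */
    rc = pthread_mutex_trylock(&i_sem);
    would_have_blocked = (rc == EBUSY);
    if (rc == 0)
        pthread_mutex_unlock(&i_sem);

    /* Patched ordering: wake the waiter first ... */
    pthread_mutex_lock(&wq_lock);
    lookup_done = true;
    pthread_cond_broadcast(&wait_queue);
    pthread_mutex_unlock(&wq_lock);

    /* ... and only then re-acquire i_sem, which the waiter now releases. */
    pthread_mutex_lock(&i_sem);
    pthread_mutex_unlock(&i_sem);

    rc = pthread_join(second, NULL);
    assert(rc == 0);
    return would_have_blocked;
}
```

Calling first_lookup_patched() from a small main() and building with -pthread
should return 1: the lock was busy where the old code would have slept, yet
the call still completes because the wake-up comes before the re-acquire.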

Pavel said it applies unchanged to the 2.5 tree, which may indicate that 2.5
has the same race condition. I do not have 2.5 available.

Please consider for 2.4.21.


