Re: [PATCH/RFC] Futex mmap_sem deadlock

From: Andrew Morton
Date: Tue Feb 22 2005 - 14:58:32 EST


olof@xxxxxxxxxxxxxx (Olof Johansson) wrote:
>
> Hi,
>
> Consider a small testcase that spawns off two threads, either thread
> doing a loop of:
>
> buf = mmap /dev/zero MAP_SHARED for 0x100000 bytes
> call sys_futex (buf+page, FUTEX_WAIT, 1, NULL, NULL) for each page in said mmap
> munmap(buf)
> repeat
>
> This will quickly lock up, since the futex_wait code dows a
> down_read(mmap_sem), then a get_user().
>
> The do_page_fault code on ppc64 (as well as other architectures) needs
> to take the same semaphore for reading. This is all good until the
> second thread comes into play: Its mmap call tries to take the same
> semaphore for writing which causes in the do_page_fault down_read()
> to get stuck. Classic deadlock.

Yup. Jamie says that the futex code _has_ to hold mmap_sem across the
get_user(). I forget (but could probably locate) the details.

>
> One attempt to fix this is included below. It works, but I'm not entirely
> happy with the fact that it's a bit messy solution. If anyone has a
> better idea for how to solve it I'd be all ears.

It's fairly sane. Style-wise I'd be inclined to turn this:

down_read(&current->mm->mmap_sem);
while (!check_user_page_readable(current->mm, uaddr1)) {
up_read(&current->mm->mmap_sem);
/* Fault in the page through get_user() but discard result */
if (get_user(curval, (int __user *)uaddr1) != 0)
return -EFAULT;
down_read(&current->mm->mmap_sem);
}

into a standalone helper function.

> --- linux-2.5.orig/mm/mempolicy.c 2005-02-04 00:27:40.000000000 -0600
> +++ linux-2.5/mm/mempolicy.c 2005-02-21 16:43:08.000000000 -0600
> @@ -486,6 +486,7 @@
> struct mm_struct *mm = current->mm;
> struct vm_area_struct *vma = NULL;
> struct mempolicy *pol = current->mempolicy;
> + DECLARE_BITMAP(nodes, MAX_NUMNODES);
>
> if (flags & ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR))
> return -EINVAL;
> @@ -524,16 +525,21 @@
> } else
> pval = pol->policy;
>
> + if (nmask)
> + get_zonemask(pol, nodes);
> +
> + if (vma) {
> + up_read(&current->mm->mmap_sem);
> + vma = NULL;
> + }
> +

OK.

> err = -EFAULT;
> if (policy && put_user(pval, policy))
> goto out;
>
> err = 0;
> - if (nmask) {
> - DECLARE_BITMAP(nodes, MAX_NUMNODES);
> - get_zonemask(pol, nodes);
> + if (nmask)
> err = copy_nodes_to_user(nmask, maxnode, nodes, sizeof(nodes));
> - }
>
> out:
> if (vma)

I don't think we need to hold mmap_sem while running get_zonemask(). `pol'
is a copy of current->mempolicy, and it won't be going away.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/