Re: [shm][hugetlb] Fix get_policy for stacked shared memory files

From: Adam Litke
Date: Tue Jun 12 2007 - 10:33:18 EST


On 6/12/07, Eric W. Biederman <ebiederm@xxxxxxxxxxxx> wrote:
Adam Litke <agl@xxxxxxxxxx> writes:

> Here's another breakage as a result of shared memory stacked files :(
>
> The NUMA policy for a VMA is determined by checking the following (in the order
> given):
>
> 1) vma->vm_ops->get_policy() (if defined)
> 2) vma->vm_policy (if defined)
> 3) task->mempolicy (if defined)
> 4) Fall back to default_policy
>
> By switching to stacked files for shared memory, get_policy() is now always set
> to shm_get_policy which is a wrapper function. This causes us to stop at step
> 1, which yields NULL for hugetlb instead of task->mempolicy which was the
> previous (and correct) result.
>
> This patch modifies the shm_get_policy() wrapper to maintain steps 1-3 for the
> wrapped vm_ops. Andi and Christoph, does this look right to you?

I'm confused.

I agree that the behavior you describe is correct.
However, I only see two code paths where get_policy is called, and
both of them take a NULL result and change it to task->mempolicy:

The coffee hasn't quite absorbed yet, but don't those two code paths
take a NULL result from get_policy() and turn it into default_policy,
not task->mempolicy?

From mm/mempolicy.c

> long do_get_mempolicy(int *policy, nodemask_t *nmask,
> 			unsigned long addr, unsigned long flags)
> {
> 	int err;
> 	struct mm_struct *mm = current->mm;
> 	struct vm_area_struct *vma = NULL;
> 	struct mempolicy *pol = current->mempolicy;
>
> 	cpuset_update_task_memory_state();
> 	if (flags & ~(unsigned long)(MPOL_F_NODE|MPOL_F_ADDR))
> 		return -EINVAL;
> 	if (flags & MPOL_F_ADDR) {
> 		down_read(&mm->mmap_sem);
> 		vma = find_vma_intersection(mm, addr, addr+1);
> 		if (!vma) {
> 			up_read(&mm->mmap_sem);
> 			return -EFAULT;
> 		}
> 		if (vma->vm_ops && vma->vm_ops->get_policy)
> 			pol = vma->vm_ops->get_policy(vma, addr);
> 		else
> 			pol = vma->vm_policy;
> 	} else if (addr)
> 		return -EINVAL;
>
> 	if (!pol)
> 		pol = &default_policy;



> /* Return effective policy for a VMA */
> static struct mempolicy * get_vma_policy(struct task_struct *task,
> 		struct vm_area_struct *vma, unsigned long addr)
> {
> 	struct mempolicy *pol = task->mempolicy;
>
> 	if (vma) {
> 		if (vma->vm_ops && vma->vm_ops->get_policy)
> 			pol = vma->vm_ops->get_policy(vma, addr);
> 		else if (vma->vm_policy &&
> 				vma->vm_policy->policy != MPOL_DEFAULT)
> 			pol = vma->vm_policy;
> 	}
> 	if (!pol)
> 		pol = &default_policy;
> 	return pol;
> }
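
To make the NULL case concrete, here is a minimal userspace sketch that copies the control flow of the quoted get_vma_policy(). The struct definitions are cut-down stand-ins holding only the fields that function reads, and null_get_policy() plus the two check helpers are hypothetical names added for illustration; none of this is kernel code.

```c
#include <stddef.h>

/* Model-only policy modes for this sketch. */
#define MPOL_DEFAULT   0
#define MPOL_PREFERRED 1

/* Cut-down stand-ins: only the fields the quoted function touches. */
struct mempolicy { int policy; };
struct vm_area_struct;
struct vm_operations_struct {
	struct mempolicy *(*get_policy)(struct vm_area_struct *vma,
					unsigned long addr);
};
struct vm_area_struct {
	const struct vm_operations_struct *vm_ops;
	struct mempolicy *vm_policy;
};
struct task_struct { struct mempolicy *mempolicy; };

static struct mempolicy default_policy = { MPOL_DEFAULT };

/* Same control flow as the quoted get_vma_policy(). */
static struct mempolicy *get_vma_policy(struct task_struct *task,
		struct vm_area_struct *vma, unsigned long addr)
{
	struct mempolicy *pol = task->mempolicy;

	if (vma) {
		if (vma->vm_ops && vma->vm_ops->get_policy)
			pol = vma->vm_ops->get_policy(vma, addr);
		else if (vma->vm_policy &&
				vma->vm_policy->policy != MPOL_DEFAULT)
			pol = vma->vm_policy;
	}
	if (!pol)
		pol = &default_policy;
	return pol;
}

/* Hypothetical hook: a get_policy() that is defined but finds nothing. */
static struct mempolicy *null_get_policy(struct vm_area_struct *vma,
					 unsigned long addr)
{
	(void)vma;
	(void)addr;
	return NULL;
}

/* Returns 1 if the task policy is the result when the hook returns NULL. */
int task_policy_survives_null_hook(void)
{
	struct mempolicy task_pol = { MPOL_PREFERRED };
	struct task_struct task = { &task_pol };
	struct vm_operations_struct ops = { null_get_policy };
	struct vm_area_struct vma = { &ops, NULL };

	return get_vma_policy(&task, &vma, 0) == &task_pol;
}

/* Returns 1 if the task policy is the result when no hook is installed. */
int task_policy_used_without_hook(void)
{
	struct mempolicy task_pol = { MPOL_PREFERRED };
	struct task_struct task = { &task_pol };
	struct vm_operations_struct ops = { NULL };
	struct vm_area_struct vma = { &ops, NULL };

	return get_vma_policy(&task, &vma, 0) == &task_pol;
}
```

In this model the first helper returns 0 and the second returns 1: once a hook is installed, its NULL result overwrites the initial task->mempolicy value and the `!pol` check then selects default_policy.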


Does this perhaps need to be:
> Signed-off-by: Adam Litke <agl@xxxxxxxxxx>
>
> diff --git a/ipc/shm.c b/ipc/shm.c
> index 4fefbad..8d2672d 100644
> --- a/ipc/shm.c
> +++ b/ipc/shm.c
> @@ -254,8 +254,10 @@ struct mempolicy *shm_get_policy(struct vm_area_struct *vma, unsigned long addr)

+ pol = NULL;
>
>  	if (sfd->vm_ops->get_policy)
>  		pol = sfd->vm_ops->get_policy(vma, addr);
> -	else
> +	else if (vma->vm_policy && vma->vm_policy->policy != MPOL_DEFAULT)
>  		pol = vma->vm_policy;
>  	return pol;
>  }
> #endif

afaict this would provide no way for pol to be set to task->mempolicy
for hugetlb per my comment above.
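
The difference can be sketched in userspace as follows. Variant A follows the quoted hunk with pol initialised to NULL. Variant B is a hypothetical alternative that starts from the task policy, shown only for contrast and not taken from any posted patch. The struct names, the inner_ops field (standing in for sfd->vm_ops), and the explicit task_pol parameter (standing in for the current task's mempolicy) are all assumptions of this model.

```c
#include <stddef.h>

/* Model-only policy modes for this sketch. */
#define MPOL_DEFAULT   0
#define MPOL_PREFERRED 1

struct mempolicy { int policy; };
struct vma_model;
struct vm_ops_model {
	struct mempolicy *(*get_policy)(struct vma_model *vma,
					unsigned long addr);
};
struct vma_model {
	const struct vm_ops_model *inner_ops;	/* stands in for sfd->vm_ops */
	struct mempolicy *vm_policy;
};

/* Variant A: the quoted hunk, with pol starting out as NULL. */
struct mempolicy *wrapper_null_init(struct vma_model *vma, unsigned long addr)
{
	struct mempolicy *pol = NULL;

	if (vma->inner_ops->get_policy)
		pol = vma->inner_ops->get_policy(vma, addr);
	else if (vma->vm_policy && vma->vm_policy->policy != MPOL_DEFAULT)
		pol = vma->vm_policy;
	return pol;
}

/* Variant B (hypothetical): same branches, but pol starts as the task policy. */
struct mempolicy *wrapper_task_init(struct mempolicy *task_pol,
				    struct vma_model *vma, unsigned long addr)
{
	struct mempolicy *pol = task_pol;

	if (vma->inner_ops->get_policy)
		pol = vma->inner_ops->get_policy(vma, addr);
	else if (vma->vm_policy && vma->vm_policy->policy != MPOL_DEFAULT)
		pol = vma->vm_policy;
	return pol;
}

/*
 * hugetlb-like case in this model: the wrapped ops have no get_policy()
 * and the VMA has no policy of its own.
 * Returns 1 if variant A yields NULL for that case.
 */
int variant_a_returns_null(void)
{
	struct vm_ops_model ops = { NULL };
	struct vma_model vma = { &ops, NULL };

	return wrapper_null_init(&vma, 0) == NULL;
}

/* Returns 1 if variant B yields the task policy for the same case. */
int variant_b_returns_task_policy(void)
{
	struct mempolicy task_pol = { MPOL_PREFERRED };
	struct vm_ops_model ops = { NULL };
	struct vma_model vma = { &ops, NULL };

	return wrapper_task_init(&task_pol, &vma, 0) == &task_pol;
}
```

Both helpers return 1 here. With variant A the caller receives NULL and falls through to default_policy; only a wrapper that itself consults the task policy can hand task->mempolicy back in the hugetlb-like case.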

--
Adam Litke ( agl at us.ibm.com )
IBM Linux Technology Center