Re: mm/mempolicy: fix !vma in new_vma_page()

From: Bob Liu
Date: Tue Jan 07 2014 - 19:56:58 EST


On Wed, Jan 8, 2014 at 1:30 AM, Michal Hocko <mhocko@xxxxxxx> wrote:
> On Tue 07-01-14 11:22:12, Michal Hocko wrote:
>> On Tue 07-01-14 13:29:31, Bob Liu wrote:
>> > On Mon, Jan 6, 2014 at 10:18 PM, Michal Hocko <mhocko@xxxxxxx> wrote:
>> > > On Mon 06-01-14 20:45:54, Bob Liu wrote:
>> > > [...]
>> > >> 544 if (PageAnon(page)) {
>> > >> 545 struct anon_vma *page__anon_vma = page_anon_vma(page);
>> > >> 546 /*
>> > >> 547 * Note: swapoff's unuse_vma() is more efficient with this
>> > >> 548 * check, and needs it to match anon_vma when KSM is active.
>> > >> 549 */
>> > >> 550 if (!vma->anon_vma || !page__anon_vma ||
>> > >> 551 vma->anon_vma->root != page__anon_vma->root)
>> > >> 552 return -EFAULT;
>> > >> 553 } else if (page->mapping && !(vma->vm_flags & VM_NONLINEAR)) {
>> > >> 554 if (!vma->vm_file ||
>> > >> 555 vma->vm_file->f_mapping != page->mapping)
>> > >> 556 return -EFAULT;
>> > >> 557 } else
>> > >> 558 return -EFAULT;
>> > >>
>> >> Those are the "other conditions" and the reason why we can't use
>> >> BUG_ON(!vma) in new_vma_page().
>> > >
>> > > Sorry, I wasn't clear with my question. I was interested in which of
>> > > these triggered and why only for hugetlb pages?
>> > >
>> >
>> > Sorry, I didn't analyse the root cause. There are several checks in
>> > page_address_in_vma(), so I think it might not be difficult to hit one
>> > of them.
>>
>> I would be really curious when anon_vma or f_mapping would get out of
>> sync; that's why I asked in the first place.
>>
>> > For example, what if the page was mapped into the vma by a non-linear
>> > mapping?
>>
>> Hmm, ok !private shmem/hugetlbfs might be remapped as non-linear.
>
> OK, it wouldn't get out of my head, so I had to check. hugetlbfs
> cannot be remapped as non-linear because its vm_ops is missing a
> remap_pages implementation, so this case is impossible for those
> pages. At least the PageHuge part of the patch is therefore bogus AFAICS.
>
> We still have shmem, and even then I am curious whether we are doing the
> right thing. The loop is intended to handle a range spanning multiple VMAs
> (as per 3ad33b2436b54 (Migration: find correct vma in new_vma_page()))
> and it doesn't seem to be VM_NONLINEAR aware. It will always fail for
> shared shmem, so we always fall back to the task/system default mempolicy.
> Whether anybody uses mempolicy on VM_NONLINEAR mappings is hard to
> tell. I am not very familiar with this feature.
>
> That being said, the BUG_ON(!vma) was bogus for VM_NONLINEAR cases.
> The changed code could keep it for the hugetlbfs path because we shouldn't
> see a NULL vma there AFAICS.
>

Sounds reasonable, but as you said we'd better find out the root
cause before making any changes.
Do you think the debug patch below is enough? If so, we can ask Sasha
to help us run a test.

diff --git a/mm/mempolicy.c b/mm/mempolicy.c
index 12733f5..86c5cc0 100644
--- a/mm/mempolicy.c
+++ b/mm/mempolicy.c
@@ -1189,11 +1189,21 @@ static struct page *new_vma_page(struct page *page, unsigned long private, int *
 {
 	struct vm_area_struct *vma = (struct vm_area_struct *)private;
 	unsigned long uninitialized_var(address);
+	unsigned long uninitialized_var(address2);
 
 	while (vma) {
 		address = page_address_in_vma(page, vma);
 		if (address != -EFAULT)
 			break;
+#if 1
+		address2 = vma_address(page, vma);
+		if (address2 >= vma->vm_start && address2 < vma->vm_end) {
+			printk("other condition happened\n");
+			if (vma->vm_flags & VM_NONLINEAR)
+				printk("non linear map\n");
+			dump_page(page);
+		}
+#endif
 		vma = vma->vm_next;
 	}
 	/*
diff --git a/mm/rmap.c b/mm/rmap.c
index d792e71..4d35d5c 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -529,7 +529,7 @@ vma_address(struct page *page, struct vm_area_struct *vma)
 	unsigned long address = __vma_address(page, vma);
 
 	/* page should be within @vma mapping range */
-	VM_BUG_ON(address < vma->vm_start || address >= vma->vm_end);
+	//VM_BUG_ON(address < vma->vm_start || address >= vma->vm_end);
 
 	return address;
 }
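
For reference, here is a rough userspace sketch (untested, my own
illustration, not part of the debug patch) of the kind of mapping that
should walk into the VM_NONLINEAR case the printk above is meant to
catch: a shared shmem-backed area made non-linear via remap_file_pages()
and then migrated via mbind(). It assumes a kernel of this era where
shmem still implements ->remap_pages; it only uses the standard
mmap()/remap_file_pages()/mbind() interfaces (mbind needs -lnuma).

/*
 * Rough reproducer sketch, untested: make a shared (shmem-backed)
 * mapping non-linear with remap_file_pages(), then ask mbind() to
 * migrate its pages so migrate_pages() ends up in new_vma_page()
 * with a VM_NONLINEAR vma.
 */
#define _GNU_SOURCE
#include <numaif.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
	long psize = sysconf(_SC_PAGESIZE);
	size_t len = 4 * psize;
	unsigned long nodemask = 1UL;		/* migrate to node 0 */
	char *p;

	/* MAP_SHARED|MAP_ANONYMOUS is backed by a shmem object */
	p = mmap(NULL, len, PROT_READ | PROT_WRITE,
		 MAP_SHARED | MAP_ANONYMOUS, -1, 0);
	if (p == MAP_FAILED)
		return 1;
	memset(p, 0xaa, len);			/* fault all pages in */

	/* swap the first two pages in place -> vma becomes VM_NONLINEAR */
	if (remap_file_pages(p, psize, 0, 1, 0) ||
	    remap_file_pages(p + psize, psize, 0, 0, 0))
		perror("remap_file_pages");

	/* force migration; new_vma_page() is the allocation callback */
	if (mbind(p, len, MPOL_BIND, &nodemask, sizeof(nodemask) * 8,
		  MPOL_MF_MOVE | MPOL_MF_STRICT))
		perror("mbind");

	return 0;
}

Running something like that with the debug patch applied should tell us
whether the "other condition" path really fires for non-linear shmem,
which is the case Michal suspects.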

--
Regards,
--Bob