RE: [PATCH v8 05/22] Add vm_replace_mixed()

From: Zhang, Tianfei
Date: Mon Jul 28 2014 - 21:56:20 EST




> -----Original Message-----
> From: owner-linux-mm@xxxxxxxxx [mailto:owner-linux-mm@xxxxxxxxx] On
> Behalf Of Kirill A. Shutemov
> Sent: Monday, July 28, 2014 9:26 PM
> To: Matthew Wilcox
> Cc: Wilcox, Matthew R; linux-fsdevel@xxxxxxxxxxxxxxx; linux-mm@xxxxxxxxx;
> linux-kernel@xxxxxxxxxxxxxxx
> Subject: Re: [PATCH v8 05/22] Add vm_replace_mixed()
>
> On Fri, Jul 25, 2014 at 03:44:50PM -0400, Matthew Wilcox wrote:
> > On Wed, Jul 23, 2014 at 06:55:00PM +0300, Kirill A. Shutemov wrote:
> > > > update_hiwater_rss(mm);
> > >
> > > No: you cannot end up with lower rss after replace, iiuc.
> >
> > Actually, you can ... when we replace a real page with a PFN, our rss
> > decreases.
>
> Okay.
>
> > > Do you mean you pointed to new file all the time? O_CREAT doesn't
> > > truncate file if it exists, iirc.
> >
> > It was pointing to a new file. Still not sure why that one failed to
> > trigger the problem. The slightly modified version attached triggered
> > the problem *just fine* :-)
> >
> > I've attached all the patches in my tree so far. For the v9 patch
> > kit, I'll keep patch 3 as a separate patch, but roll patches 1, 2 and
> > 4 into other patches.
> >
> > I am seeing something odd though. When I run double-map with
> > debugging printks inserted in strategic spots in the kernel, I see
> > four calls to do_dax_fault(). The first two, as expected, are the
> > loads from the two mapped addresses. The third is via mkwrite, but
> > then the fourth time I get a regular page fault for write, and I don't
> understand why I get it.
> >
> > Any ideas?
>
> unmap_mapping_range() clears pte you've just set by vm_replace_mixed() on
> third fault.
>
> And locking looks wrong: it seems you need to hold i_mmap_mutex while
> replacing hole page with pfn. Your VM_BUG_ON() in zap_pte_single() triggers
> on my setup.
>
> > +static void zap_pte_single(struct vm_area_struct *vma, pte_t *pte,
> > + unsigned long addr)
> > +{
> > + struct mm_struct *mm = vma->vm_mm;
> > + int force_flush = 0;
> > + int rss[NR_MM_COUNTERS];
> > +
> > +
> VM_BUG_ON(!mutex_is_locked(&vma->vm_file->f_mapping->i_mmap_m
> utex));
>
> It's wrong place for VM_BUG_ON(): zap_pte_single() on anon mapping should
> work fine)

Hi Shutemov:
1. I am confuse that why insert_page() drop the PageAnon, insert_pfn() donot drop PageAnon?

2.
remap_vmalloc_range() -> vm_insert_page()->insert_page() -> inc_mm_counter_fast(mm, MM_FILEPAGES);

so, in this scenario, this vmalloc page maybe not sure be a page cache, why increase the MM_FILEPAGES ?


>
> > +
> > + init_rss_vec(rss);
>
> Vector to commit single update to mm counters? What about inline counters
> update for rss == NULL case?
>
> > + update_hiwater_rss(mm);
> > + flush_cache_page(vma, addr, pte_pfn(*pte));
> > + zap_pte(NULL, vma, pte, addr, NULL, rss, &force_flush);
> > + flush_tlb_page(vma, addr);
> > + add_mm_rss_vec(mm, rss);
> > +}
> > +
>
> --
> Kirill A. Shutemov
>
> --
> To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to
> majordomo@xxxxxxxxxx For more info on Linux MM,
> see: http://www.linux-mm.org/ .
> Don't email: <a href=mailto:"dont@xxxxxxxxx";> email@xxxxxxxxx </a>
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/