Re: [PATCH, RFC] x86/mm/pat: Restore large pages after fragmentation

From: Kirill A. Shutemov
Date: Fri Apr 17 2020 - 12:54:58 EST


On Fri, Apr 17, 2020 at 05:47:14PM +0200, Peter Zijlstra wrote:
> On Fri, Apr 17, 2020 at 12:32:29AM +0300, Kirill A. Shutemov wrote:
> > +static void cpa_restore_large_pages(struct cpa_data *cpa,
> > +		struct list_head *pgtables)
> > +{
> > +	unsigned long start, addr, end;
> > +	int i;
> > +
>
> > +	start = __cpa_addr(cpa, 0);
> > +	end = start + PAGE_SIZE * cpa->numpages;
> > +
> > +	for (addr = start; addr >= start && addr < end; addr += PUD_SIZE)
> > +		restore_large_pages(addr, pgtables);
>
> Isn't that loop slightly broken?
>
> Consider:
>
>      s                           e
> |---------|---------|---------|---------|
>      a0        a1        a2        a3
>
> Where s,e are @start,@end resp. and a# are the consecutive values of
> @addr with PUD sized steps.
>
> Then, since a3 is >= @end, we'll not take that iteration and we'll not
> try and merge that last PUD, even though we possibly could. One fix is
> to truncate @start (and with that @addr) to the beginning of the PUD.

... or round_up() end. I'll fix it.
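
For illustration, a userspace sketch of both variants, not the actual fix.
PUD_SIZE is assumed to be 1 GiB here, and the sketch_* helpers are made up
for the example; they only count how many iterations the loop would take.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 1 GiB PUDs, as on x86-64 with 4-level paging. */
#define SKETCH_PUD_SIZE (1ULL << 30)

/* Hypothetical helpers: align to a power-of-two boundary. */
static uint64_t sketch_round_down(uint64_t x, uint64_t a)
{
	assert(a && (a & (a - 1)) == 0);
	return x & ~(a - 1);
}

static uint64_t sketch_round_up(uint64_t x, uint64_t a)
{
	assert(a && (a & (a - 1)) == 0);
	return (x + a - 1) & ~(a - 1);
}

/* Loop shape from the patch: steps from a possibly unaligned start. */
static int sketch_visits_original(uint64_t start, uint64_t end)
{
	uint64_t addr;
	int n = 0;

	for (addr = start; addr >= start && addr < end; addr += SKETCH_PUD_SIZE)
		n++;
	return n;
}

/* Variant 1: truncate start to the beginning of its PUD. */
static int sketch_visits_start_down(uint64_t start, uint64_t end)
{
	uint64_t first = sketch_round_down(start, SKETCH_PUD_SIZE);
	uint64_t addr;
	int n = 0;

	for (addr = first; addr >= first && addr < end; addr += SKETCH_PUD_SIZE)
		n++;
	return n;
}

/* Variant 2: keep start as is, round end up to the next PUD boundary. */
static int sketch_visits_end_up(uint64_t start, uint64_t end)
{
	uint64_t last = sketch_round_up(end, SKETCH_PUD_SIZE);
	uint64_t addr;
	int n = 0;

	for (addr = start; addr >= start && addr < last; addr += SKETCH_PUD_SIZE)
		n++;
	return n;
}
```

With start half-way into the first PUD and end a quarter into the fourth,
the original loop takes 3 iterations while either variant takes 4. Note
that rounding end up can wrap near the top of the address space, which
rounding start down avoids.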

> Also, I'm afraid that with my proposal this loop needs to do PMD size
> steps. In that regard your version does make some sense. But it is
> indeed less efficient for small ranges.
>
> One possible fix is to pass @start,@end into the
> restore/reconstruct/collapse such that we can iterate the minimal set of
> page-tables for each level.

Yeah, I'll rework it.

I just realized I missed a TLB flush: we need to flush the TLB twice here.
First to get rid of all TLB entries for the changes we've made (before
reconstruction), and then a second time to get rid of the small-page TLB
entries. That's unfortunate.
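
The ordering I have in mind, as a toy userspace sketch. All names are
hypothetical stubs that just record the order of the steps; none of them
are real kernel functions.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical step log: each stub appends one letter. */
static char sketch_log[8];

static void sketch_step(char c)
{
	size_t n = strlen(sketch_log);

	assert(n + 1 < sizeof(sketch_log));
	sketch_log[n] = c;
	sketch_log[n + 1] = '\0';
}

/* Stand-ins for the real work; each only records that it ran. */
static void sketch_change_attrs(void)   { sketch_step('C'); }
static void sketch_flush_tlb(void)      { sketch_step('F'); }
static void sketch_restore_large(void)  { sketch_step('R'); }

static const char *sketch_cpa_sequence(void)
{
	sketch_log[0] = '\0';

	sketch_change_attrs();   /* the attribute change itself */
	sketch_flush_tlb();      /* 1st flush: drop entries for what we changed */
	sketch_restore_large();  /* reconstruct the large pages */
	sketch_flush_tlb();      /* 2nd flush: drop the small-page entries */

	return sketch_log;
}
```

Calling sketch_cpa_sequence() leaves "CFRF" in the log: change, flush,
restore, flush.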

--
Kirill A. Shutemov