Re: [PATCH -mm -v10 1/3] mm, THP, swap: Delay splitting THP during swap out

From: Johannes Weiner
Date: Thu Apr 27 2017 - 09:37:35 EST


On Thu, Apr 27, 2017 at 03:12:34PM +0800, Huang, Ying wrote:
> Minchan Kim <minchan@xxxxxxxxxx> writes:
> > On Tue, Apr 25, 2017 at 08:56:56PM +0800, Huang, Ying wrote:
> >> @@ -178,20 +192,12 @@ int add_to_swap(struct page *page, struct list_head *list)
> >>  	VM_BUG_ON_PAGE(!PageLocked(page), page);
> >>  	VM_BUG_ON_PAGE(!PageUptodate(page), page);
> >>
> >> -	entry = get_swap_page();
> >> +retry:
> >> +	entry = get_swap_page(page);
> >>  	if (!entry.val)
> >> -		return 0;
> >> -
> >> -	if (mem_cgroup_try_charge_swap(page, entry)) {
> >> -		swapcache_free(entry);
> >> -		return 0;
> >> -	}
> >> -
> >> -	if (unlikely(PageTransHuge(page)))
> >> -		if (unlikely(split_huge_page_to_list(page, list))) {
> >> -			swapcache_free(entry);
> >> -			return 0;
> >> -		}
> >> +		goto fail;
> >
> > So, with non-SSD swap, a THP page will *always* fail to get a
> > swp_entry_t and retry after splitting the page. However, that makes an
> > unnecessary get_swap_pages call, which is not trivial. If there is no
> > SSD swap, THP swap-out should be a no-op without adding any performance
> > overhead.
> > Hmm, but I have no good idea how to do it simply. :(
>
> For HDD swap, the raw device throughput is so low (typically < 100 MB/s)
> that the added overhead here will not be a big issue. Do you agree?

I fully agree. If you swap to spinning rust, an extra function call
here is the least of your concerns.
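
To make the cost under discussion concrete, here is a minimal user-space
sketch of the retry-after-split flow, not the kernel code itself. Every
name prefixed with model_ is hypothetical, and the sketch assumes that a
huge page can only get an entry when the swap device is an SSD, which is
the case Minchan describes above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model; none of these names are kernel APIs. */
struct model_page {
	bool huge;          /* true while the page is still an unsplit THP */
};

struct model_swap {
	bool has_ssd;       /* assumption: only SSD swap can serve a huge page */
	unsigned int calls; /* number of allocation attempts, for illustration */
};

/* Returns a nonzero value on success and 0 on failure (like entry.val). */
static unsigned long model_get_swap_page(struct model_swap *swap,
					 const struct model_page *page)
{
	swap->calls++;
	if (page->huge && !swap->has_ssd)
		return 0;   /* huge page cannot be served by non-SSD swap */
	return 1;           /* placeholder for a valid swap entry */
}

/* Returns 1 if the page got a swap entry, 0 otherwise. */
static int model_add_to_swap(struct model_swap *swap, struct model_page *page)
{
	unsigned long entry;

	assert(swap != NULL && page != NULL);
retry:
	entry = model_get_swap_page(swap, page);
	if (!entry) {
		if (page->huge) {
			/* Stands in for splitting the THP, then trying again. */
			page->huge = false;
			goto retry;
		}
		return 0;
	}
	return 1;
}
```

In this model a huge page on non-SSD swap costs two allocation attempts
(the first one is wasted), while on SSD swap it costs one and the page
stays unsplit. The wasted attempt is the "extra function call" above.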