Re: [PATCH 4/4] thp: rewrite freeze_page()/unfreeze_page() with generic rmap walkers

From: Kirill A. Shutemov
Date: Thu Feb 04 2016 - 18:59:08 EST


On Wed, Feb 03, 2016 at 07:42:01AM -0800, Dave Hansen wrote:
> On 02/03/2016 07:14 AM, Kirill A. Shutemov wrote:
> > But the new variant is somewhat slower. The current helpers iterate over
> > the VMAs the compound page is mapped to, and then over the ptes within
> > each VMA. The new helpers iterate over the small pages, then over the VMAs
> > each small page is mapped to, and only then find the relevant pte.
>
> The code simplification here is really attractive. Can you quantify
> what the slowdown is? Is it noticeable, or would it be in the noise
> during all the other stuff that happens under memory pressure?

Okay, here's a more realistic scenario: migrating 8GiB worth of THP.

Testcase:
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <unistd.h>
#include <sys/mman.h>
#include <linux/mempolicy.h>
#include <numaif.h>

#define MB (1024UL * 1024)
#define SIZE (4 * 1024 * 2 * MB)	/* 8GiB */
#define BASE ((void *)0x400000000000)

/* result = stop - start, borrowing one second if tv_nsec underflows */
void timespec_diff(struct timespec *start, struct timespec *stop,
		struct timespec *result)
{
	if ((stop->tv_nsec - start->tv_nsec) < 0) {
		result->tv_sec = stop->tv_sec - start->tv_sec - 1;
		result->tv_nsec = stop->tv_nsec - start->tv_nsec + 1000000000;
	} else {
		result->tv_sec = stop->tv_sec - start->tv_sec;
		result->tv_nsec = stop->tv_nsec - start->tv_nsec;
	}
}

int main()
{
	char *p;
	long ret;
	unsigned long node_mask;
	struct timespec start, stop, result;

	/* Allocate and populate the whole range on node 0 */
	node_mask = 0b01;
	ret = set_mempolicy(MPOL_BIND, &node_mask, 64);
	if (ret == -1)
		perror("set_mempolicy"), exit(1);
	p = mmap(BASE, SIZE, PROT_READ | PROT_WRITE,
			MAP_FIXED | MAP_PRIVATE | MAP_ANONYMOUS | MAP_POPULATE,
			-1, 0);
	if (p == MAP_FAILED)
		perror("mmap"), exit(1);

	/* Time the migration of the range to node 1 */
	system("grep thp /proc/vmstat");
	clock_gettime(CLOCK_MONOTONIC, &start);
	node_mask = 0b10;
	ret = mbind(p, SIZE, MPOL_BIND, &node_mask, 64, MPOL_MF_MOVE);
	if (ret == -1)
		perror("mbind"), exit(1);
	clock_gettime(CLOCK_MONOTONIC, &stop);
	system("grep thp /proc/vmstat");

	timespec_diff(&start, &stop, &result);
	printf("--------------------------\n");
	printf("%ld.%09lds\n", result.tv_sec, result.tv_nsec);

	return 0;
}

Baseline: 25.146s ± 0.141s
Patched:  28.684s ± 0.298s
Slowdown: 1.14x

Can we tolerate this?

--
Kirill A. Shutemov