Re: [PATCH 16/19] mm: numa: Add pte updates, hinting and migration stats

From: Rik van Riel
Date: Tue Nov 06 2012 - 14:52:35 EST

In reply to: Mel Gorman: "[PATCH 16/19] mm: numa: Add pte updates, hinting and migration stats"

On 11/06/2012 04:14 AM, Mel Gorman wrote:

It is tricky to quantify the basic cost of automatic NUMA placement in a
meaningful manner. This patch adds some vmstats that can be used as part
of a basic costing model.
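
As a rough illustration of how these counters could be consumed from userspace, the sketch below pulls them out of /proc/vmstat-style text. The four counter names are the ones used in this changelog; the function name and the sample values are made up for the example, and the counters only appear on a kernel that carries this series.

```python
# Counter names as they appear in the changelog; everything else here
# (function name, sample values) is hypothetical.
NUMA_COUNTERS = (
    "numa_pte_updates",
    "numa_hint_faults",
    "numa_hint_faults_local",
    "numa_pages_migrated",
)


def parse_numa_vmstat(text):
    """Return {counter: value} for the NUMA balancing counters found in text.

    The input is expected to look like /proc/vmstat: one "name value"
    pair per line. Counters that are absent are simply left out.
    """
    found = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2 and parts[0] in NUMA_COUNTERS:
            found[parts[0]] = int(parts[1])
    return found


# Invented sample, not real measurements.
SAMPLE = """\
nr_free_pages 12345
numa_pte_updates 1000
numa_hint_faults 400
numa_hint_faults_local 300
numa_pages_migrated 50
"""

if __name__ == "__main__":
    print(parse_numa_vmstat(SAMPLE))
```

On a live system the same function could be fed the contents of /proc/vmstat instead of SAMPLE.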

    u = basic unit = sizeof(void *)
    Ca = cost of struct page access = sizeof(struct page) / u
    Cpte = cost of PTE access = Ca
    Cupdate = cost of PTE update = (2 * Cpte) + (2 * Wlock)
        where Cpte is incurred twice for a read and a write and Wlock
        is a constant representing the cost of taking or releasing a
        lock
    Cnumahint = cost of a minor page fault = some high constant e.g. 1000
    Cpagerw = cost to read or write a full page = Ca + PAGE_SIZE/u
    Ci = cost of page isolation = Ca + Wi
        where Wi is a constant that should reflect the approximate cost
        of the locking operation
    Cpagecopy = Cpagerw + (Cpagerw * Wnuma) + Ci + (Ci * Wnuma)
        where Wnuma is the approximate NUMA factor. 1 is local. 1.2
        would imply that remote accesses are 20% more expensive

    Balancing cost = Cpte * numa_pte_updates +
                     Cnumahint * numa_hint_faults +
                     Ci * numa_pages_migrated +
                     Cpagecopy * numa_pages_migrated

Note that numa_pages_migrated is used as a measure of how many pages
were isolated even though it would miss pages that failed to migrate. A
vmstat counter could have been added for it but the isolation cost is
pretty marginal in comparison to the overall cost so it seemed overkill.
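
The model above can be sketched as a small calculator. This is only an illustration under stated assumptions: the default sizes (8-byte pointers, a 64-byte struct page, 4096-byte pages) and the weights Wlock = 1 and Wi = 1 are placeholders chosen for the example, not values given in the changelog, and the class and attribute names are hypothetical. The sum follows the formula exactly as written, so Cupdate is computed for reference but does not enter the balancing cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NumaCostModel:
    # All defaults below are assumptions for illustration only.
    ptr_size: int = 8            # u = sizeof(void *)
    struct_page_size: int = 64   # sizeof(struct page)
    page_size: int = 4096        # PAGE_SIZE
    w_lock: float = 1.0          # Wlock, placeholder
    w_isolate: float = 1.0       # Wi, placeholder
    w_numa: float = 1.2          # Wnuma, the example factor from the text
    c_numahint: float = 1000.0   # Cnumahint, "some high constant"

    @property
    def ca(self):
        return self.struct_page_size / self.ptr_size

    @property
    def cpte(self):
        return self.ca

    @property
    def cupdate(self):
        # Defined by the model but not used in the balancing cost as written.
        return 2 * self.cpte + 2 * self.w_lock

    @property
    def cpagerw(self):
        return self.ca + self.page_size / self.ptr_size

    @property
    def ci(self):
        return self.ca + self.w_isolate

    @property
    def cpagecopy(self):
        return (self.cpagerw + self.cpagerw * self.w_numa
                + self.ci + self.ci * self.w_numa)

    def balancing_cost(self, pte_updates, hint_faults, pages_migrated):
        return (self.cpte * pte_updates
                + self.c_numahint * hint_faults
                + self.ci * pages_migrated
                + self.cpagecopy * pages_migrated)


if __name__ == "__main__":
    model = NumaCostModel()
    # Counter values are invented for the example.
    print(model.balancing_cost(pte_updates=1000, hint_faults=400,
                               pages_migrated=50))
```

The result is a unitless figure in multiples of u, so it is mainly useful for comparing two runs of the same workload rather than as an absolute cost.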

The ideal way to measure automatic placement benefit would be to count
the number of remote accesses versus local accesses and do something like

    benefit = (remote_accesses_before - remote_accesses_after) * Wnuma

but the information is not readily available. As a workload converges, the
expectation would be that the number of remote numa hints would reduce to 0.

    convergence = numa_hint_faults_local / numa_hint_faults
        where this is measured for the last N number of
        numa hints recorded. When the workload is fully
        converged the value is 1.

This can measure if the placement policy is converging and how fast it is
doing it.
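
One way the convergence ratio could be tracked from userspace is sketched below. It is an approximation, not what the patch implements: the vmstat counters are cumulative, so the sketch takes differences between successive samples and averages over the last N sampling intervals rather than the last N individual hinting faults. The class name and the sample numbers are hypothetical.

```python
from collections import deque


class ConvergenceTracker:
    """Windowed numa_hint_faults_local / numa_hint_faults ratio.

    Fed with successive cumulative counter readings; keeps the deltas
    for the last `window` sampling intervals.
    """

    def __init__(self, window):
        self._deltas = deque(maxlen=window)  # (local_delta, total_delta)
        self._prev = None

    def update(self, hint_faults, hint_faults_local):
        if self._prev is not None:
            total_delta = hint_faults - self._prev[0]
            local_delta = hint_faults_local - self._prev[1]
            self._deltas.append((local_delta, total_delta))
        self._prev = (hint_faults, hint_faults_local)

    def convergence(self):
        """Return the ratio over the window, or None if no faults were seen."""
        total = sum(t for _, t in self._deltas)
        if total == 0:
            return None
        return sum(l for l, _ in self._deltas) / total


if __name__ == "__main__":
    tracker = ConvergenceTracker(window=2)
    # Invented cumulative readings: (numa_hint_faults, numa_hint_faults_local)
    for faults, local in [(0, 0), (100, 50), (200, 150), (300, 250)]:
        tracker.update(faults, local)
        print(tracker.convergence())
```

A ratio that climbs towards 1 over successive windows suggests the placement is converging, and the number of windows it takes gives a rough idea of how fast.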

Signed-off-by: Mel Gorman <mgorman@xxxxxxx>

I'm skipping the ACKing of the policy patches, which
appear to be meant to be placeholders for a "real"
policy. However, you have a few more mechanism patches
left in the series, which would be required regardless
of what policy gets merged, so ...

Reviewed-by: Rik van Riel <riel@xxxxxxxxxx>

