On 07/05/2024 09:25, Kefeng Wang wrote:
Hi Ryan, Yang and all,
We see another regression on arm64 (no issue on x86) when testing memory
latency with lmbench:
./lat_mem_rd -P 1 512M 128
Do you know exactly what this test is doing?
memory latency (smaller is better)
MiB 6.9-rc7 6.9-rc7+revert
And what exactly have you reverted? I'm guessing just commit efa7df3e3bb5 ("mm:
align larger anonymous mappings on THP boundaries")?
0.00049 1.539 1.539
0.00098 1.539 1.539
0.00195 1.539 1.539
0.00293 1.539 1.539
0.00391 1.539 1.539
0.00586 1.539 1.539
0.00781 1.539 1.539
0.01172 1.539 1.539
0.01562 1.539 1.539
0.02344 1.539 1.539
0.03125 1.539 1.539
0.04688 1.539 1.539
0.0625 1.540 1.540
0.09375 3.634 3.086
So the first regression is for 96K - I'm guessing that's the mmap size? That
size shouldn't even be affected by this patch, apart from a few adds and a
compare which determines the size is too small to do PMD alignment for.
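For illustration, the "few adds and a compare" can be sketched roughly as below. This is a Python paraphrase loosely modelled on the early size check in the kernel's THP unmapped-area helper, not the kernel code itself: the function names are hypothetical, and PMD_SIZE = 2 MiB assumes 4K base pages.

```python
# Sketch only: names are illustrative, not kernel symbols.
PMD_SIZE = 2 * 1024 * 1024  # assumption: 4K base pages, so one PMD maps 2 MiB


def round_up(x, align):
    """Round x up to the next multiple of align (align is a power of two)."""
    return (x + align - 1) & ~(align - 1)


def wants_pmd_alignment(off, length, size=PMD_SIZE):
    """Return True if the mapping is large enough to be worth PMD-aligning.

    off is the byte offset of the mapping (0 for an anonymous mapping here),
    length is the requested mmap length in bytes.
    """
    off_end = off + length
    off_align = round_up(off, size)
    # Too small to contain even one aligned PMD-sized range: leave it alone.
    if off_end <= off_align or (off_end - off_align) < size:
        return False
    return True
```

Under these assumptions a 96K anonymous mapping (`wants_pmd_alignment(0, 96 * 1024)`) takes the early exit, which is why that size would not be expected to change placement.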
0.125 3.874 3.175
0.1875 3.544 3.288
0.25 3.556 3.461
0.375 3.641 3.644
0.5 4.125 3.851
0.75 4.968 4.323
1 5.143 4.686
1.5 5.309 4.957
2 5.370 5.116
3 5.430 5.471
4 5.457 5.671
6 6.100 6.170
8 6.496 6.468
-----------------------
* L1 cache = 8M; there is no big change below 8M, *
* but from L2 onward the latency drops a lot when this patch is reverted *
12 6.917 6.840
16 7.268 7.077
24 7.536 7.345
32 10.723 9.421
48 14.220 11.350
64 16.253 12.189
96 14.494 12.507
128 14.630 12.560
192 15.402 12.967
256 16.178 12.957
384 15.177 13.346
512 15.235 13.233
I quickly checked smaps but didn't find any clues. Any suggestions?
Without knowing exactly what the test does, it's difficult to know what to
suggest. If you want to try something semi-randomly, it might be useful to rule
out the arm64 contpte feature. I don't see how that would be interacting here if
mTHP is disabled (is it?). But it's new for 6.9 and arm64 only. Disable it via
ARM64_CONTPTE (needs EXPERT) at compile time.
I haven't enabled mTHP, so it should not be related to ARM64_CONTPTE, but I will give it a try.
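To double-check the mTHP state at runtime rather than from memory, the per-size sysfs knobs can be read. A minimal sketch, assuming the standard layout under /sys/kernel/mm/transparent_hugepage (per-size hugepages-<size>kB/enabled files, with the active choice shown in square brackets); the helper names are hypothetical:

```python
import glob
import os
import re

THP_SYSFS = "/sys/kernel/mm/transparent_hugepage"


def selected(text):
    """Return the bracketed (active) choice, e.g. 'never' from 'always [never]'."""
    m = re.search(r"\[(\w+)\]", text)
    return m.group(1) if m else None


def mthp_settings(base=THP_SYSFS):
    """Map each hugepages-<size>kB directory name to its active 'enabled' mode."""
    settings = {}
    pattern = os.path.join(base, "hugepages-*kB", "enabled")
    for path in sorted(glob.glob(pattern)):
        size = os.path.basename(os.path.dirname(path))
        with open(path) as f:
            settings[size] = selected(f.read())
    return settings


if __name__ == "__main__":
    for size, mode in mthp_settings().items():
        print(size, mode)
```

A size reported as "inherit" follows the top-level enabled file, so that file needs checking as well before concluding that a given size is off.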