Re: [RFC v2 00/11] dm-pcache – persistent-memory cache for block devices

From: Dongsheng Yang
Date: Mon Jun 23 2025 - 00:18:31 EST



On 6/23/2025 11:13 AM, Dongsheng Yang wrote:

Hi Mikulas:

     I will send dm-pcache V1 soon; below are my responses to your comments.

On 6/13/2025 12:57 AM, Mikulas Patocka wrote:
Hi


On Thu, 5 Jun 2025, Dongsheng Yang wrote:

Hi Mikulas and all,

This is *RFC v2* of the *pcache* series, a persistent-memory backed cache.


----------------------------------------------------------------------
1. pmem access layer
----------------------------------------------------------------------

* All reads use *copy_mc_to_kernel()* so that uncorrectable media
  errors are detected and reported.
* All writes go through *memcpy_flushcache()* to guarantee durability
  on real persistent memory.
You could also try a normal write followed by clflushopt for big writes. I 
found that it is better for larger regions; see the function 
memcpy_flushcache_optimized() in dm-writecache. Test which way is better.
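
For readers who have not looked at that function, the idea of choosing the write path by size can be sketched in plain userspace C as below. This is only an illustration and not the dm-writecache code: the names pick_strategy() and copy_then_flush(), the 768-byte cutoff, and the flush callback (a stand-in for a per-line flush such as clflushopt) are all assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define CACHE_LINE 64
/* Hypothetical cutoff for this sketch; a real value has to be measured. */
#define CLFLUSHOPT_CUTOFF 768

enum flush_strategy { USE_FLUSHCACHE, USE_CLFLUSHOPT };

/* Hypothetical helper: pick a write strategy from the copy size. */
static enum flush_strategy pick_strategy(size_t size, int cpu_has_clflushopt)
{
	if (cpu_has_clflushopt && size >= CLFLUSHOPT_CUTOFF)
		return USE_CLFLUSHOPT;
	return USE_FLUSHCACHE;
}

/* Stand-in for a per-line flush instruction such as clflushopt. */
typedef void (*flush_line_fn)(const void *line);

static void noop_flush(const void *line)
{
	(void)line; /* a real implementation would flush this cache line */
}

/*
 * Copy with ordinary stores and flush each cache line once it is filled.
 * Assumes dst is cache-line aligned. Returns the number of lines flushed.
 */
static size_t copy_then_flush(void *dst, const void *src, size_t size,
			      flush_line_fn flush)
{
	unsigned char *d = dst;
	const unsigned char *s = src;
	size_t flushed = 0;

	assert(dst && src && flush);
	while (size > 0) {
		size_t chunk = size < CACHE_LINE ? size : CACHE_LINE;

		memcpy(d, s, chunk);
		flush(d);
		d += chunk;
		s += chunk;
		size -= chunk;
		flushed++;
	}
	return flushed;
}
```

With these names, a 4096-byte write on a CPU that has clflushopt would take the copy_then_flush() path, while a 512-byte write would stay on the flushcache-style path.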

I ran a test with fio on /dev/pmem0, with the attached patch applied to nd_pmem.ko:

With a memmap pmem device, I got results similar to the comment in memcpy_flushcache_optimized() (clflushopt is faster for larger block sizes):

Test (memmap pmem)      clflushopt    flushcache
------------------------------------------------
test_randwrite_512       200 MiB/s     228 MiB/s
test_randwrite_1024      378 MiB/s     431 MiB/s
test_randwrite_2K        773 MiB/s     769 MiB/s
test_randwrite_4K       1364 MiB/s    1272 MiB/s
test_randwrite_8K       2078 MiB/s    1817 MiB/s
test_randwrite_16K      2745 MiB/s    2098 MiB/s
test_randwrite_32K      3232 MiB/s    2231 MiB/s
test_randwrite_64K      3660 MiB/s    2411 MiB/s
test_randwrite_128K     3922 MiB/s    2513 MiB/s
test_randwrite_1M       3824 MiB/s    2537 MiB/s
test_write_512           228 MiB/s     228 MiB/s
test_write_1024          439 MiB/s     423 MiB/s
test_write_2K            841 MiB/s     800 MiB/s
test_write_4K           1364 MiB/s    1308 MiB/s
test_write_8K           2107 MiB/s    1838 MiB/s
test_write_16K          2752 MiB/s    2166 MiB/s
test_write_32K          3213 MiB/s    2247 MiB/s
test_write_64K          3661 MiB/s    2415 MiB/s
test_write_128K         3902 MiB/s    2514 MiB/s
test_write_1M           3808 MiB/s    2529 MiB/s

But I got a different result on Optane pmem100, where flushcache is faster in almost every case:

Test (Optane pmem100)   clflushopt    flushcache
------------------------------------------------
test_randwrite_512       167 MiB/s     226 MiB/s
test_randwrite_1024      301 MiB/s     420 MiB/s
test_randwrite_2K        615 MiB/s     639 MiB/s
test_randwrite_4K        967 MiB/s    1024 MiB/s
test_randwrite_8K       1047 MiB/s    1314 MiB/s
test_randwrite_16K      1096 MiB/s    1377 MiB/s
test_randwrite_32K      1155 MiB/s    1382 MiB/s
test_randwrite_64K      1184 MiB/s    1452 MiB/s
test_randwrite_128K     1199 MiB/s    1488 MiB/s
test_randwrite_1M       1178 MiB/s    1499 MiB/s
test_write_512           233 MiB/s     233 MiB/s
test_write_1024          424 MiB/s     391 MiB/s
test_write_2K            706 MiB/s     760 MiB/s
test_write_4K            978 MiB/s    1076 MiB/s
test_write_8K           1059 MiB/s    1296 MiB/s
test_write_16K          1119 MiB/s    1380 MiB/s
test_write_32K          1158 MiB/s    1387 MiB/s
test_write_64K          1184 MiB/s    1448 MiB/s
test_write_128K         1198 MiB/s    1481 MiB/s
test_write_1M           1178 MiB/s    1486 MiB/s


So for now I’d rather keep using memcpy_flushcache() in pcache. In the future, once we have a general-purpose optimization, we can switch to that.
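
To make the comparison concrete, here is a small sketch in C that counts in how many of the 20 test cases flushcache is faster on each device. The struct and function names are made up for this example; the numbers are copied from the two tables above.

```c
#include <assert.h>
#include <stddef.h>

/* One benchmark row, both columns in MiB/s. */
struct bw_row {
	int clflushopt;
	int flushcache;
};

/* Rows in table order: randwrite 512..1M, then write 512..1M. */
static const struct bw_row memmap_pmem[] = {
	{200, 228},   {378, 431},   {773, 769},   {1364, 1272}, {2078, 1817},
	{2745, 2098}, {3232, 2231}, {3660, 2411}, {3922, 2513}, {3824, 2537},
	{228, 228},   {439, 423},   {841, 800},   {1364, 1308}, {2107, 1838},
	{2752, 2166}, {3213, 2247}, {3661, 2415}, {3902, 2514}, {3808, 2529},
};

static const struct bw_row optane_pmem100[] = {
	{167, 226},   {301, 420},   {615, 639},   {967, 1024},  {1047, 1314},
	{1096, 1377}, {1155, 1382}, {1184, 1452}, {1199, 1488}, {1178, 1499},
	{233, 233},   {424, 391},   {706, 760},   {978, 1076},  {1059, 1296},
	{1119, 1380}, {1158, 1387}, {1184, 1448}, {1198, 1481}, {1178, 1486},
};

#define NROWS(a) (sizeof(a) / sizeof((a)[0]))

/* Count rows where flushcache is strictly faster (ties do not count). */
static size_t flushcache_wins(const struct bw_row *rows, size_t n)
{
	size_t i, wins = 0;

	assert(rows != NULL);
	for (i = 0; i < n; i++)
		if (rows[i].flushcache > rows[i].clflushopt)
			wins++;
	return wins;
}
```

Calling flushcache_wins() on each array gives 2 of 20 cases for memmap pmem and 18 of 20 cases for Optane pmem100, which is why I prefer flushcache for real persistent memory for now.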

Sorry for any formatting issues; the tables can also be checked in the attachment <pmem_test_result>.

Thanx

Dongsheng

    

Test (memmap pmem) clflushopt flushcache
-------------------------------------------------
test_randwrite_512 200 MiB/s 228 MiB/s
test_randwrite_1024 378 MiB/s 431 MiB/s
test_randwrite_2K 773 MiB/s 769 MiB/s
test_randwrite_4K 1364 MiB/s 1272 MiB/s
test_randwrite_8K 2078 MiB/s 1817 MiB/s
test_randwrite_16K 2745 MiB/s 2098 MiB/s
test_randwrite_32K 3232 MiB/s 2231 MiB/s
test_randwrite_64K 3660 MiB/s 2411 MiB/s
test_randwrite_128K 3922 MiB/s 2513 MiB/s
test_randwrite_1M 3824 MiB/s 2537 MiB/s
test_write_512 228 MiB/s 228 MiB/s
test_write_1024 439 MiB/s 423 MiB/s
test_write_2K 841 MiB/s 800 MiB/s
test_write_4K 1364 MiB/s 1308 MiB/s
test_write_8K 2107 MiB/s 1838 MiB/s
test_write_16K 2752 MiB/s 2166 MiB/s
test_write_32K 3213 MiB/s 2247 MiB/s
test_write_64K 3661 MiB/s 2415 MiB/s
test_write_128K 3902 MiB/s 2514 MiB/s
test_write_1M 3808 MiB/s 2529 MiB/s


Test (Optane pmem100) clflushopt flushcache
-------------------------------------------------
test_randwrite_512 167 MiB/s 226 MiB/s
test_randwrite_1024 301 MiB/s 420 MiB/s
test_randwrite_2K 615 MiB/s 639 MiB/s
test_randwrite_4K 967 MiB/s 1024 MiB/s
test_randwrite_8K 1047 MiB/s 1314 MiB/s
test_randwrite_16K 1096 MiB/s 1377 MiB/s
test_randwrite_32K 1155 MiB/s 1382 MiB/s
test_randwrite_64K 1184 MiB/s 1452 MiB/s
test_randwrite_128K 1199 MiB/s 1488 MiB/s
test_randwrite_1M 1178 MiB/s 1499 MiB/s
test_write_512 233 MiB/s 233 MiB/s
test_write_1024 424 MiB/s 391 MiB/s
test_write_2K 706 MiB/s 760 MiB/s
test_write_4K 978 MiB/s 1076 MiB/s
test_write_8K 1059 MiB/s 1296 MiB/s
test_write_16K 1119 MiB/s 1380 MiB/s
test_write_32K 1158 MiB/s 1387 MiB/s
test_write_64K 1184 MiB/s 1448 MiB/s
test_write_128K 1198 MiB/s 1481 MiB/s
test_write_1M 1178 MiB/s 1486 MiB/s