Re: [mm/filemap] cbd59c48ae: fxmark.hdd_ext4_no_jnl_DRBM_9_bufferedio.works/sec -7.6% regression

From: Matthew Wilcox
Date: Tue Mar 09 2021 - 08:07:22 EST


On Tue, Mar 09, 2021 at 03:57:06PM +0800, kernel test robot wrote:
> FYI, we noticed a -7.6% regression of fxmark.hdd_ext4_no_jnl_DRBM_9_bufferedio.works/sec due to commit:
>
> commit: cbd59c48ae2bcadc4a7599c29cf32fd3f9b78251 ("mm/filemap: use head pages in generic_file_buffered_read")
> https://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git master
>
> in testcase: fxmark
> on test machine: 288 threads Intel(R) Xeon Phi(TM) CPU 7295 @ 1.50GHz with 80G memory

Can you send me one of those to test on? ;-)

>          %stddev     %change         %stddev
>              \          |                \
>       0.05 ± 5%      -10.1%       0.05 ± 3%   fxmark.hdd_ext4_no_jnl_DRBM_18_bufferedio.softirq_util
>    4168491            -7.6%    3849925        fxmark.hdd_ext4_no_jnl_DRBM_9_bufferedio.works/sec
>     300.00            +2.1%     306.16        fxmark.time.system_time
>      87.53            -6.7%      81.69        fxmark.time.user_time
>     784.83 ± 5%      +23.6%     970.33 ± 7%   perf-sched.wait_and_delay.count.preempt_schedule_common.__cond_resched.copy_page_to_iter.generic_file_buffered_read.new_sync_read

23% more delay while preempted copying to user? That seems bad, but I
don't see anything in this commit that would cause that.

> 7.59 -7.6 0.00 perf-profile.calltrace.cycles-pp.find_get_pages_contig.filemap_get_pages.generic_file_buffered_read.new_sync_read.vfs_read

That makes sense; we don't call find_get_pages_contig() any more, instead
we call ...

> 0.00 +11.9 11.90 perf-profile.calltrace.cycles-pp.filemap_get_read_batch.filemap_get_pages.generic_file_buffered_read.new_sync_read.vfs_read

filemap_get_read_batch() ... which is more expensive ;-(

 		if (PageReadahead(head))
 			break;
+		if (!PageHead(head))
+			continue;
 		xas.xa_index = head->index + thp_nr_pages(head) - 1;
 		xas.xa_offset = (xas.xa_index >> xas.xa_shift) & XA_CHUNK_MASK;

might be worth a try, but I have a medical appointment to get to.
I'll test it out later.