Re: [PATCH -mm -v4 3/5] mm, swap: VMA based swap readahead

From: Minchan Kim
Date: Thu Sep 14 2017 - 04:15:58 EST


On Thu, Sep 14, 2017 at 08:53:04AM +0800, Huang, Ying wrote:
> Hi, Andrew,
>
> Andrew Morton <akpm@xxxxxxxxxxxxxxxxxxxx> writes:
>
> > On Wed, 13 Sep 2017 10:40:19 +0900 Minchan Kim <minchan@xxxxxxxxxx> wrote:
> >
> >> Every zram users like low-end android device has used 0 page-cluster
> >> to disable swap readahead because it has no seek cost and works as
> >> synchronous IO operation so if we do readahead multiple pages,
> >> swap falut latency would be (4K * readahead window size). IOW,
> >> readahead is meaningful only if it doesn't bother faulted page's
> >> latency.
> >>
> >> However, this patch introduces additional knob /sys/kernel/mm/swap/
> >> vma_ra_max_order as well as page-cluster. It means existing users
> >> has used disabled swap readahead doesn't work until they should be
> >> aware of new knob and modification of their script/code to disable
> >> vma_ra_max_order as well as page-cluster.
> >>
> >> I say it's a *regression* and wanted to fix it but Huang's opinion
> >> is that it's not a functional regression so userspace should be fixed
> >> by themselves.
> >> Please look into detail of discussion in
> >> http://lkml.kernel.org/r/%3C1505183833-4739-4-git-send-email-minchan@xxxxxxxxxx%3E
> >
> > hm, tricky problem. I do agree that linking the physical and virtual
> > readahead schemes in the proposed fashion is unfortunate. I also agree
> > that breaking existing setups (a bit) is also unfortunate.
> >
> > Would it help if, when page-cluster is written to zero, we do
> >
> > printk_once("physical readahead disabled, virtual readahead still
> > enabled. Disable virtual readhead via
> > /sys/kernel/mm/swap/vma_ra_max_order").
> >
> > Or something like that. It's pretty lame, but it should help alert the
> > zram-readahead-disabling people to the issue?
>
> This sounds good for me.
>
> Hi, Minchan, what do you think about this? I think for low-end android
> device, the end-user may have no opportunity to upgrade to the latest
> kernel, the device vendor should care about this. For desktop users,
> the warning proposed by Andrew may help to remind them for the new knob.

Yes, it would be option. At least, we should alert to the user to make
a chance to fix. However, can't we make vma-based readahead new config
option? Please look at the detail in my reply of andrew.

With that, there is no regression with current users and as a bonus,
user can measure both algorithm with their real workload with both
algorithm rather than artificial benchmark. I think recency vs spartial
locality would have each pros and cons so that kind soft landing would
be safer option rather than sudden replacing.
After a while, we can set new algorithm as default.