Re: [patch 2/2] MM: allow per-cpu vmstat_threshold and vmstat_worker configuration

From: Marcelo Tosatti
Date: Thu May 25 2017 - 15:35:43 EST


On Fri, May 19, 2017 at 01:49:34PM -0400, Luiz Capitulino wrote:
> On Fri, 19 May 2017 12:13:26 -0500 (CDT)
> Christoph Lameter <cl@xxxxxxxxx> wrote:
>
> > > So why are you against integrating this simple, isolated patch which
> > > does not affect how current logic works?
> >
> > Frankly the argument does not make sense. Vmstat updates occur very
> > infrequently (probably even less than your IPIs and the other OS stuff that
> > also causes additional latencies that you seem to be willing to tolerate).
>
> Infrequently is not good enough. It only has to happen once to
> cause a problem.
>
> Also, IPIs take a few us, usually less. That's not a problem. In our
> testing we see the preemption caused by the kworker take 10us or
> even more. I've never seen it take 3us. I'm not saying this is not
> true; I'm saying that if this is causing a problem for us, it will
> cause a problem for other people too.

Christoph,

Some data:

qemu-system-x86-12902 [003] ....1.. 6517.621557: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x4004f1 info 0 800000fc
qemu-system-x86-12902 [003] d...2.. 6517.621557: kvm_entry: vcpu 2
qemu-system-x86-12902 [003] ....1.. 6517.621560: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x4004f1 info 0 800000fc
qemu-system-x86-12902 [003] d...2.. 6517.621561: kvm_entry: vcpu 2
qemu-system-x86-12902 [003] ....1.. 6517.621563: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x4004f1 info 0 800000fc
qemu-system-x86-12902 [003] d...2.. 6517.621564: kvm_entry: vcpu 2
qemu-system-x86-12902 [003] d..h1.. 6517.622037: empty_smp_call_func: empty_smp_call_func ran
qemu-system-x86-12902 [003] ....1.. 6517.622040: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x4004f1 info 0 800000fb
qemu-system-x86-12902 [003] d...2.. 6517.622041: kvm_entry: vcpu 2

empty_smp_call_func: 3us.

qemu-system-x86-12902 [003] ....1.. 6517.702739: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x4004f1 info 0 800000ef
qemu-system-x86-12902 [003] d...2.. 6517.702741: kvm_entry: vcpu 2
qemu-system-x86-12902 [003] d..h1.. 6517.702758: scheduler_tick <-update_process_times
qemu-system-x86-12902 [003] ....1.. 6517.702760: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x4004f1 info 0 800000ef
qemu-system-x86-12902 [003] d...2.. 6517.702760: kvm_entry: vcpu 2

scheduler_tick: 2us.

qemu-system-x86-12902 [003] ....1.. 6518.194570: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x4004f1 info 0 800000ef
qemu-system-x86-12902 [003] d...2.. 6518.194571: kvm_entry: vcpu 2
qemu-system-x86-12902 [003] ....1.. 6518.194591: kvm_exit: reason EXTERNAL_INTERRUPT rip 0x4004f1 info 0 800000ef
qemu-system-x86-12902 [003] d...2.. 6518.194593: kvm_entry: vcpu 2
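
The per-event figures quoted in this mail can be re-derived from the trace
timestamps. The following is a minimal sketch, not part of the patch. It
assumes the ftrace text layout pasted in this mail (a "seconds.microseconds:"
timestamp followed by the event name) and reports the gap from each event to
the next one. The helper names parse_events and gaps_us are made up for
illustration, and SAMPLE is an abbreviated excerpt of the traces.

```python
import re

# Matches "<secs>.<usecs>: <event>" as in the trace excerpts in this mail.
# Wrapped continuation lines, if any, carry no timestamp and are skipped.
EVENT_RE = re.compile(r"\s(\d+)\.(\d{6}):\s+(\w+)")


def parse_events(text):
    """Return a list of (timestamp_in_us, event_name) tuples."""
    events = []
    for line in text.splitlines():
        m = EVENT_RE.search(line)
        if m:
            secs, usecs, name = m.groups()
            # Integer microseconds avoid floating-point rounding.
            events.append((int(secs) * 1_000_000 + int(usecs), name))
    return events


def gaps_us(events):
    """Return (prev_event, next_event, delta_us) for consecutive events."""
    return [(a[1], b[1], b[0] - a[0]) for a, b in zip(events, events[1:])]


SAMPLE = """\
qemu-system-x86-12902 [003] d..h1.. 6517.622037: empty_smp_call_func:
empty_smp_call_func ran
qemu-system-x86-12902 [003] ....1.. 6517.622040: kvm_exit: reason
EXTERNAL_INTERRUPT rip 0x4004f1 info 0 800000fb
qemu-system-x86-12902 [003] d..h1.. 6517.702758: scheduler_tick
qemu-system-x86-12902 [003] ....1.. 6517.702760: kvm_exit: reason
"""

for prev, nxt, delta in gaps_us(parse_events(SAMPLE)):
    print(f"{prev} -> {nxt}: {delta}us")
```

Only the gaps that start at a handler event (empty_smp_call_func,
scheduler_tick) are of interest here; the gap between unrelated events is
printed as well and can be ignored.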

Does that, together with the 10us number for the kworker mentioned above,
change your point of view on your
"Frankly the argument does not make sense. Vmstat updates occur very
infrequently (probably even less than your IPIs and the other OS stuff that
also causes additional latencies that you seem to be willing to tolerate).
And you can configure the interval of vmstat updates freely.... Set
the vmstat_interval to 60 seconds instead of 2 for a try? Is that rare
enough?"

Argument? We're showing you data that demonstrates this is causing a
latency problem for us.

Is there anything you'd like to see improved in the patch?
Is there anything you dislike about it?

> No, we'd have to set it high enough to disable it and this will
> affect all CPUs.
>
> Something that crossed my mind was to add a new tunable to set
> the vmstat_interval for each CPU, this way we could essentially
> disable it on the CPUs where DPDK is running. What are the implications
> of doing this besides not getting up-to-date stats in /proc/vmstat
> (which I still have to confirm would be OK)? Can this break anything
> in the kernel for example?

Well, you get incorrect statistics.
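
To make that concrete: the vmstat counters are kept as small per-CPU deltas
that are folded into the global counters either when a delta exceeds the
per-CPU threshold or when the periodic vmstat worker runs. The toy model below
is only a sketch of that idea in Python, not kernel code, and the class and
method names are invented for illustration. It shows how, with the worker
disabled on one CPU, events below the threshold stay invisible in the global
value.

```python
class ToyVmstat:
    """Toy model of per-CPU differential counters (illustrative names only)."""

    def __init__(self, ncpus, threshold):
        self.threshold = threshold
        self.global_count = 0
        self.cpu_diff = [0] * ncpus  # per-CPU deltas not yet folded

    def mod(self, cpu, delta):
        """Account an event on `cpu`; fold only once the delta exceeds the threshold."""
        x = self.cpu_diff[cpu] + delta
        if abs(x) > self.threshold:
            self.global_count += x
            x = 0
        self.cpu_diff[cpu] = x

    def worker(self, cpu):
        """What a periodic worker would do: fold the pending delta of `cpu`."""
        self.global_count += self.cpu_diff[cpu]
        self.cpu_diff[cpu] = 0

    def true_count(self):
        """Global value plus everything still pending on the CPUs."""
        return self.global_count + sum(self.cpu_diff)


stats = ToyVmstat(ncpus=2, threshold=32)
for _ in range(20):
    stats.mod(cpu=0, delta=1)  # CPU 0: worker disabled, delta stays local
    stats.mod(cpu=1, delta=1)
stats.worker(cpu=1)            # CPU 1: worker still runs

print("global:", stats.global_count, "true:", stats.true_count())
```

In this model the global value lags the true value by whatever is still
pending on CPU 0 until something folds it.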