Re: [PATCH] x86/irq: do not substract irq_tlb_count from irq_call_count

From: Huang\, Ying
Date: Thu Aug 11 2016 - 11:13:26 EST


Aaron Lu <aaron.lu@xxxxxxxxx> writes:

> Since commit 52aec3308db8 ("x86/tlb: replace INVALIDATE_TLB_VECTOR by
> CALL_FUNCTION_VECTOR"), the tlb remote shootdown is done through call
> function vector. That commit didn't take care of irq_tlb_count so later
> commit fd0f5869724f ("x86: Distinguish TLB shootdown interrupts from
> other functions call interrupts") tried to fix it.
>
> The fix assumes every increase of irq_tlb_count has a corresponding
> increase of irq_call_count. So the irq_call_count is always bigger than
> irq_tlb_count and we could substract irq_tlb_count from irq_call_count.
>
> Unfortunately this is not true for the smp_call_function_single case.
> The IPI is only sent if the target CPU's call_single_queue is empty when
> adding a csd into it in generic_exec_single. That means if two threads
> are both adding flush tlb csds to the same CPU's call_single_queue, only
> one IPI is sent. In other words, the irq_call_count is incremented by 1
> but irq_tlb_count is incremented by 2. Over time, irq_tlb_count will be
> bigger than irq_call_count and the substract will produce a very large
> irq_call_count value due to overflow.
>
> Considering that:
> 1 it's not worth to send more IPIs for the sake of accurate counting of
> irq_call_count in generic_exec_single;
> 2 it's not easy to tell if the call function interrupt is for TLB
> shootdown in __smp_call_function_single_interrupt.
> Not to exclude TLB shootdown from call function count seems to be the
> simplest fix and this patch just did that.
>
> This is found by LKP's cyclic performance regression tracking recently
> with the vm-scalability test suite. I have bisected to commit
> 0a7ce4b5a632 ("mm/rmap: share the i_mmap_rwsem"). This commit didn't do
> anything wrong but revealed the irq_call_count problem. IIUC, the commit
> makes rwc->remap_one in rmap_walk_file concurrent with multiple threads.
> When remap_one is try_to_unmap_one, then multiple threads could queue
> flush tlb to the same CPU but only one IPI will be sent.
>
> Since the commit enter Linux v3.19, the counting problem only shows up
> from v3.19. Considering this is a behaviour change, I'm not sure if I
> should add the stable tag here.
>
> Signed-off-by: Aaron Lu <aaron.lu@xxxxxxxxx>

Thanks for fix. You forget to add :)

Reported-by: "Huang, Ying" <ying.huang@xxxxxxxxx>

Best Regards,
Huang, Ying