Re: [PATCH 40/41] mm: separate vma->lock from vm_area_struct

From: Jann Horn
Date: Tue Jan 17 2023 - 14:35:38 EST

Next message: Rob Herring: "Re: [PATCH] dt-bindings: clock: qcom,a53pll: drop operating-points-v2"
Previous message: Linus Torvalds: "Re: [PATCHv14 08/17] x86/mm: Reduce untagged_addr() overhead until the first LAM user"
Next in thread: Suren Baghdasaryan: "Re: [PATCH 40/41] mm: separate vma->lock from vm_area_struct"

On Mon, Jan 9, 2023 at 9:55 PM Suren Baghdasaryan <surenb@xxxxxxxxxx> wrote:
> vma->lock being part of the vm_area_struct causes performance regression
> during page faults because during contention its count and owner fields
> are constantly updated and having other parts of vm_area_struct used
> during page fault handling next to them causes constant cache line
> bouncing. Fix that by moving the lock outside of the vm_area_struct.
> All attempts to keep vma->lock inside vm_area_struct in a separate
> cache line still produce performance regression especially on NUMA
> machines. Smallest regression was achieved when lock is placed in the
> fourth cache line but that bloats vm_area_struct to 256 bytes.

Just checking: When you tested putting the lock in different cache
lines, did you force the slab allocator to actually store the
vm_area_struct with cacheline alignment (by setting SLAB_HWCACHE_ALIGN
on the slab or with a ____cacheline_aligned_in_smp on the struct
definition)?
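
To illustrate why that matters, here is a rough userspace sketch (not
kernel code, and not taken from the patch). It only does address
arithmetic: given the base address an allocator hands back, it checks
whether the "hot" fields and the lock end up touching a common cache
line. The struct demo_vma, the 64-byte line size, and the 40-byte lock
size are all assumptions made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE 64 /* assumed line size; real hardware may differ */

/* Hypothetical stand-in for vm_area_struct: one line's worth of fields
 * read during page faults, then a lock at offset 64. The sizes are
 * illustrative, not the real layout. */
struct demo_vma {
	char hot[CACHELINE]; /* fields read in the fault path */
	char lock[40];       /* contended lock, assumed 40 bytes */
};

/* The layout "intends" the lock to start on its own cache line. */
static_assert(offsetof(struct demo_vma, lock) == CACHELINE,
	      "lock is expected at the start of the second line");

/* Return 1 if byte ranges [a_off, a_off+a_len) and [b_off, b_off+b_len)
 * of an object placed at address base touch at least one common cache
 * line, 0 otherwise. */
static int ranges_share_line(uintptr_t base, size_t a_off, size_t a_len,
			     size_t b_off, size_t b_len)
{
	uintptr_t a_first = (base + a_off) / CACHELINE;
	uintptr_t a_last = (base + a_off + a_len - 1) / CACHELINE;
	uintptr_t b_first = (base + b_off) / CACHELINE;
	uintptr_t b_last = (base + b_off + b_len - 1) / CACHELINE;

	return a_first <= b_last && b_first <= a_last;
}

/* Same check, specialised to the hot fields and lock of demo_vma. */
static int demo_shares_line(uintptr_t base)
{
	return ranges_share_line(base,
				 offsetof(struct demo_vma, hot),
				 sizeof(((struct demo_vma *)0)->hot),
				 offsetof(struct demo_vma, lock),
				 sizeof(((struct demo_vma *)0)->lock));
}
```

With a base address that is a multiple of 64, demo_shares_line()
returns 0: the lock really is alone on its line. With a base that is
only 8-byte aligned, for example 4096 + 8, the tail of the hot fields
and the lock land on the same line and it returns 1, even though the
offsets inside the struct did not change. So a test that moves the lock
to "a different cache line" by offset alone only measures what it
claims to if the objects themselves are cacheline-aligned.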
