Re: [PATCH 00/15] Swap-over-NBD without deadlocking V8

From: Hillf Danton
Date: Tue Feb 07 2012 - 07:45:17 EST


On Tue, Feb 7, 2012 at 6:56 AM, Mel Gorman <mgorman@xxxxxxx> wrote:
>
> The core issue is that network block devices do not use mempools like normal
> block devices do. As the host cannot control where they receive packets from,
> they cannot reliably work out in advance how much memory they might need.
>
>
> Patch 1 serialises access to min_free_kbytes. It's not strictly needed
>        by this series but as the series cares about watermarks in
>        general, it's a harmless fix. It could be merged independently.
>
>
Could you shed any light on tuning min_free_kbytes for everyday workloads?


> Patch 2 adds knowledge of the PFMEMALLOC reserves to SLAB and SLUB to
>        preserve access to pages allocated under low memory situations
>        to callers that are freeing memory.
>
> Patch 3 introduces __GFP_MEMALLOC to allow access to the PFMEMALLOC
>        reserves without setting PFMEMALLOC.
>
> Patch 4 opens the possibility for softirqs to use PFMEMALLOC reserves
>        for later use by network packet processing.
>
> Patch 5 ignores memory policies when ALLOC_NO_WATERMARKS is set.
>
> Patches 6-11 allow network processing to use PFMEMALLOC reserves when
>        the socket has been marked as being used by the VM to clean
>        pages. If packets are received and stored in pages that were
>        allocated under low-memory situations and are unrelated to
>        the VM, the packets are dropped.
>
> Patch 12 is a micro-optimisation to avoid a function call in the
>        common case.
>
> Patch 13 tags NBD sockets as being SOCK_MEMALLOC so they can use
>        PFMEMALLOC if necessary.
>
If it is feasible to avoid the hang by tuning min_free_kbytes, things may
become simpler if NICs are also tagged. For tagged NICs, socket buffers,
pre-allocated if necessary just after the NIC is brought up, would not be
handed back to the kmem cache but queued on local lists maintained by the
NIC driver, sized according to min_free_kbytes or similar.
The upside is no changes to the VM core. What are the downsides?
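
To make the idea concrete, here is a rough userspace toy model in Python,
not kernel code. All names (RecyclingBufferPool, buffers_for_reserve) are
hypothetical and exist only for illustration; a real driver would manage
sk_buffs, not bytearrays.

```python
from collections import deque


class RecyclingBufferPool:
    """Toy model of a per-NIC list of pre-allocated receive buffers."""

    def __init__(self, count, size):
        # Pre-allocate everything up front, e.g. when the NIC is brought up.
        self._free = deque(bytearray(size) for _ in range(count))
        self.size = size

    def get(self):
        # Hand out a reserved buffer, or None when the local list is empty
        # (a real driver would then have to drop the packet or fall back).
        return self._free.popleft() if self._free else None

    def put(self, buf):
        # Recycle the buffer onto the local list instead of freeing it.
        self._free.append(buf)

    def available(self):
        return len(self._free)


def buffers_for_reserve(reserve_kbytes, buf_size_bytes):
    # Number of buffers of buf_size_bytes that fit in a reserve given in KiB.
    return (reserve_kbytes * 1024) // buf_size_bytes
```

The point of the sketch is only that get() never allocates: once the local
list is empty, the caller has to cope without a buffer.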


> Patch 14 notes that it is still possible for the PFMEMALLOC reserve
>        to be depleted. To prevent this, direct reclaimers get
>        throttled on a waitqueue if 50% of the PFMEMALLOC reserves are
>        depleted. It is expected that kswapd and the direct reclaimers
>        already running will clean enough pages for the low watermark
>        to be reached and the throttled processes are woken up.
>
> Patch 15 adds a statistic to track how often processes get throttled
>
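
As I read the Patch 14 description, the throttle decision boils down to a
comparison against half of the reserve. A toy model in Python follows; the
function and parameter names are my own and are not taken from the patch.

```python
def reserves_ok(remaining_pages, reserve_pages):
    """True while more than half of the reserve is still available."""
    return remaining_pages > reserve_pages // 2


def should_throttle(remaining_pages, reserve_pages):
    # Direct reclaimers would sleep on the waitqueue while this is True
    # and be woken once enough pages have been cleaned.
    return not reserves_ok(remaining_pages, reserve_pages)
```

With a 100-page reserve, this model throttles once 50 or fewer pages remain.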
>
> For testing swap-over-NBD, a machine was booted with 2G of RAM with a
> swapfile backed by NBD. 8*NUM_CPU processes were started that create
> anonymous memory mappings and read them linearly in a loop. The total
> size of the mappings was 4*PHYSICAL_MEMORY to use swap heavily under
> memory pressure. Without the patches, the machine locks up within
> minutes; with them applied, it runs to completion.
>
>
While testing, what happens if the network cable is unplugged for more
than three minutes?
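
For anyone wanting to try this, the quoted workload could be approximated
in userspace along these lines. This is only a sketch under my own
assumptions: the helper names are hypothetical, the actual test program
was not posted in this mail, and Python's mmap.mmap(-1, n) creates a
shared anonymous mapping by default on Unix, which may differ from what
the original test used.

```python
import mmap


def touch_mapping_linearly(size_bytes, passes=1):
    """Map anonymous memory and read one byte per page, front to back."""
    checksum = 0
    with mmap.mmap(-1, size_bytes) as m:  # fileno -1 => anonymous mapping
        for _ in range(passes):
            for off in range(0, size_bytes, mmap.PAGESIZE):
                checksum += m[off]  # indexing returns an int
    return checksum


def plan_workload(num_cpus, physical_bytes):
    # 8*NUM_CPU processes sharing 4*PHYSICAL_MEMORY of mappings in total.
    nprocs = 8 * num_cpus
    per_proc_bytes = (4 * physical_bytes) // nprocs
    return nprocs, per_proc_bytes
```

Each of the nprocs worker processes would then call
touch_mapping_linearly(per_proc_bytes, passes=...) in a loop.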

Thanks
Hillf