Re: [PATCH 0/9] Reduce latencies and improve overall reclaim efficiency v1

From: Minchan Kim
Date: Mon Sep 13 2010 - 19:10:27 EST


On Mon, Sep 6, 2010 at 7:47 PM, Mel Gorman <mel@xxxxxxxxx> wrote:

<snip>

>
> These are just the raw figures taken from /proc/vmstat. It's a rough measure
> of reclaim activity. Note that allocstall counts are higher because we
> are entering direct reclaim more often as a result of not sleeping in
> congestion. In itself, it's not necessarily a bad thing. It's easier to
> get a view of what happened from the vmscan tracepoint report.
>
> FTrace Reclaim Statistics: vmscan
>            micro-traceonly-v1r5-micro  micro-nocongest-v1r5-micro  micro-lowlumpy-v1r5-micro  micro-nodirect-v1r5-micro
>                traceonly-v1r5    nocongest-v1r5     lowlumpy-v1r5     nodirect-v1r5
> Direct reclaims                                152        941        967        729
> Direct reclaim pages scanned                507377    1404350    1332420    1450213
> Direct reclaim pages reclaimed               10968      72042      77186      41097
> Direct reclaim write file async I/O              0          0          0          0
> Direct reclaim write anon async I/O              0          0          0          0
> Direct reclaim write file sync I/O               0          0          0          0
> Direct reclaim write anon sync I/O               0          0          0          0
> Wake kswapd requests                        127195     241025     254825     188846
> Kswapd wakeups                                   6          1          1          1
> Kswapd pages scanned                       4210101    3345122    3427915    3306356
> Kswapd pages reclaimed                     2228073    2165721    2143876    2194611
> Kswapd reclaim write file async I/O              0          0          0          0
> Kswapd reclaim write anon async I/O              0          0          0          0
> Kswapd reclaim write file sync I/O               0          0          0          0
> Kswapd reclaim write anon sync I/O               0          0          0          0
> Time stalled direct reclaim (seconds)         7.60       3.03       3.24       3.43
> Time kswapd awake (seconds)                  12.46       9.46       9.56       9.40
>
> Total pages scanned                        4717478   4749472   4760335   4756569
> Total pages reclaimed                      2239041   2237763   2221062   2235708
> %age total pages scanned/reclaimed          47.46%    47.12%    46.66%    47.00%
> %age total pages scanned/written             0.00%     0.00%     0.00%     0.00%
> %age  file pages scanned/written             0.00%     0.00%     0.00%     0.00%
> Percentage Time Spent Direct Reclaim        43.80%    21.38%    22.34%    23.46%
> Percentage Time kswapd Awake                79.92%    79.56%    79.20%    80.48%


I have a nitpick about the stalled reclaim time.
For example, in direct reclaim:

===
trace_mm_vmscan_direct_reclaim_begin(order,
                                     sc.may_writepage,
                                     gfp_mask);

nr_reclaimed = do_try_to_free_pages(zonelist, &sc);

trace_mm_vmscan_direct_reclaim_end(nr_reclaimed);
===

In this case, isn't the measured time an accumulated value?
My point is as follows.

Process A                       Process B
direct reclaim begin
do_try_to_free_pages
cond_resched
                                direct reclaim begin
                                do_try_to_free_pages
                                direct reclaim end
direct reclaim end


So A's result includes B's time, and the total stall time would be larger
than the real stall time.
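
To make the overlap concrete, here is a minimal userspace sketch (not kernel
code) that compares two ways of accounting for begin/end pairs: summing each
pair's duration, and measuring only the wall-clock time covered by at least
one pair. The struct and function names are hypothetical, the timestamps are
in arbitrary units, and treating the covered time as the "real" stall is just
one possible reading of the point above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* One direct reclaim begin/end pair. Timestamps are in arbitrary units;
 * this struct is hypothetical and not a kernel type. */
struct reclaim_span {
    long long begin;
    long long end;
};

/* Naive accounting: add up (end - begin) for every span. */
long long summed_stall(const struct reclaim_span *s, size_t n)
{
    long long total = 0;

    for (size_t i = 0; i < n; i++) {
        assert(s[i].end >= s[i].begin);
        total += s[i].end - s[i].begin;
    }
    return total;
}

/* qsort comparator: order spans by begin timestamp. */
static int cmp_begin(const void *a, const void *b)
{
    const struct reclaim_span *x = a;
    const struct reclaim_span *y = b;

    return (x->begin > y->begin) - (x->begin < y->begin);
}

/* Wall-clock time covered by at least one span (union of the intervals).
 * Sorts the array in place. */
long long covered_stall(struct reclaim_span *s, size_t n)
{
    long long total = 0, cur_begin, cur_end;

    if (n == 0)
        return 0;

    qsort(s, n, sizeof(*s), cmp_begin);
    cur_begin = s[0].begin;
    cur_end = s[0].end;

    for (size_t i = 1; i < n; i++) {
        if (s[i].begin <= cur_end) {
            /* Overlaps the current run: extend it if needed. */
            if (s[i].end > cur_end)
                cur_end = s[i].end;
        } else {
            /* Gap: close the current run and start a new one. */
            total += cur_end - cur_begin;
            cur_begin = s[i].begin;
            cur_end = s[i].end;
        }
    }
    return total + (cur_end - cur_begin);
}
```

If A's span is 0 to 10 and B's span is 4 to 7, summed_stall() gives 13 while
covered_stall() gives 10; the difference of 3 is B's time counted a second
time inside A's span.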


--
Kind regards,
Minchan Kim