Re: [PATCH] mm/vmscan: fix data races at kswapd_classzone_idx

From: Andrew Morton
Date: Tue Feb 25 2020 - 21:11:28 EST


On Tue, 25 Feb 2020 11:55:26 -0500 Qian Cai <cai@xxxxxx> wrote:

> pgdat->kswapd_classzone_idx could be accessed concurrently in
> wakeup_kswapd(). Plain writes and reads without any lock protection
> result in data races. Fix them by adding a pair of READ|WRITE_ONCE() as
> well as saving a branch (compilers might well optimize the original code
> in an unintentional way anyway). The data races were reported by KCSAN,
>
> ...
>
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -3961,11 +3961,10 @@ void wakeup_kswapd(struct zone *zone, gfp_t gfp_flags, int order,
> return;
> pgdat = zone->zone_pgdat;
>
> - if (pgdat->kswapd_classzone_idx == MAX_NR_ZONES)
> - pgdat->kswapd_classzone_idx = classzone_idx;
> - else
> - pgdat->kswapd_classzone_idx = max(pgdat->kswapd_classzone_idx,
> - classzone_idx);
> + if (READ_ONCE(pgdat->kswapd_classzone_idx) == MAX_NR_ZONES ||
> + READ_ONCE(pgdat->kswapd_classzone_idx) < classzone_idx)
> + WRITE_ONCE(pgdat->kswapd_classzone_idx, classzone_idx);
> +
> pgdat->kswapd_order = max(pgdat->kswapd_order, order);
> if (!waitqueue_active(&pgdat->kswapd_wait))
> return;

This is very partial, isn't it? The above code itself is racy against
other code which manipulates ->kswapd_classzone_idx and the
manipulation in allow_direct_reclaim() is performed by threads other
than kswapd and so need the READ_ONCE treatment and is still racy with
that?

I guess occasional races here don't really matter, but a grossly wrong
read from load tearing might matter. In which case shouldn't we be
defending against them in all cases where non-kswapd threads read this
field?