Re: [PATCH -next] mm/page_counter: mark intentional data races

From: Michal Hocko
Date: Wed Jan 29 2020 - 03:51:42 EST


On Tue 28-01-20 23:20:19, Qian Cai wrote:
> Commit 3e32cb2e0a12 ("mm: memcontrol: lockless page counters") left
> memcg->memsw->failcnt and ->watermark open to concurrent access, as
> reported by KCSAN:
>
> Reported by Kernel Concurrency Sanitizer on:
> BUG: KCSAN: data-race in page_counter_try_charge / page_counter_try_charge
>
> read to 0xffff8fb18c4cd190 of 8 bytes by task 1081 on cpu 59:
> page_counter_try_charge+0x4d/0x150 mm/page_counter.c:138
> try_charge+0x131/0xd50
> __memcg_kmem_charge_memcg+0x58/0x140
> __memcg_kmem_charge+0xcc/0x280
> __alloc_pages_nodemask+0x1e1/0x450
> alloc_pages_current+0xa6/0x120
> pte_alloc_one+0x17/0xd0
> __pte_alloc+0x3a/0x1f0
> copy_p4d_range+0xc36/0x1990
> copy_page_range+0x21d/0x360
> dup_mmap+0x5f5/0x7a0
> dup_mm+0xa2/0x240
> copy_process+0x1b3f/0x3460
> _do_fork+0xaa/0xa20
> __x64_sys_clone+0x13b/0x170
> do_syscall_64+0x91/0xb47
> entry_SYSCALL_64_after_hwframe+0x49/0xbe
>
> write to 0xffff8fb18c4cd190 of 8 bytes by task 1153 on cpu 120:
> page_counter_try_charge+0x5b/0x150 mm/page_counter.c:139
> try_charge+0x131/0xd50
> mem_cgroup_try_charge+0x159/0x460
> mem_cgroup_try_charge_delay+0x3d/0xa0
> wp_page_copy+0x14d/0x930
> do_wp_page+0x107/0x7b0
> __handle_mm_fault+0xce6/0xd40
> handle_mm_fault+0xfc/0x2f0
> do_page_fault+0x263/0x6f9
> page_fault+0x34/0x40
>
> Since the failcnt and watermark are tolerant of some inaccuracy, a data
> race will not be harmful, thus mark them as intentional data races with
> the data_race() macro.

I am not familiar with KCSAN and git grep for data_race on the current
linux-next doesn't really show any users of this macro. Is there a
general consensus that data_race is going to be used to silence all
KCSAN false positives?
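
For readers who want the pattern under discussion in isolation, here is a
minimal userspace sketch. It is not the kernel code: counter_sketch,
sketch_init() and sketch_try_charge() are hypothetical names, there is no
parent hierarchy, and data_race() is reduced to a stand-in that only
evaluates its argument (in the kernel it is the annotation that marks the
access as an intentional race for KCSAN). The point is only that usage is
updated atomically while failcnt and watermark are plain fields that may
lose updates under contention.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Stand-in (assumption): the real kernel macro marks the access as an
 * intentional data race for KCSAN. Here it only evaluates the expression.
 */
#define data_race(expr) (expr)

/* Hypothetical, simplified counter: a single level, no parent chain. */
struct counter_sketch {
	atomic_long usage;	/* exact: only touched with atomic ops */
	long max;		/* hard limit on usage */
	long watermark;		/* approximate: highest usage observed */
	unsigned long failcnt;	/* approximate: number of failed charges */
};

void sketch_init(struct counter_sketch *c, long max)
{
	assert(c != NULL);
	atomic_init(&c->usage, 0);
	c->max = max;
	c->watermark = 0;
	c->failcnt = 0;
}

/* Try to charge nr_pages; returns false if the limit would be exceeded. */
bool sketch_try_charge(struct counter_sketch *c, long nr_pages)
{
	long new_usage;

	assert(c != NULL);

	/* atomic_fetch_add() returns the old value, so add nr_pages back. */
	new_usage = atomic_fetch_add(&c->usage, nr_pages) + nr_pages;
	if (new_usage > c->max) {
		/* Undo the speculative charge. */
		atomic_fetch_sub(&c->usage, nr_pages);
		/* Racy on purpose: a lost increment is tolerated. */
		data_race(c->failcnt++);
		return false;
	}

	/*
	 * Racy on purpose: two chargers can both pass the check and the
	 * smaller value may be stored last. The watermark is only a hint.
	 */
	if (data_race(new_usage > c->watermark))
		data_race(c->watermark = new_usage);

	return true;
}
```

With a limit of 10, charging 4 and then 5 succeeds and leaves the watermark
at 9. A further charge of 2 fails, bumps failcnt to 1 and leaves usage at 9.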

> Signed-off-by: Qian Cai <cai@xxxxxx>
> ---
> mm/page_counter.c | 10 +++++-----
> 1 file changed, 5 insertions(+), 5 deletions(-)
>
> diff --git a/mm/page_counter.c b/mm/page_counter.c
> index de31470655f6..13934636eafd 100644
> --- a/mm/page_counter.c
> +++ b/mm/page_counter.c
> @@ -82,8 +82,8 @@ void page_counter_charge(struct page_counter *counter, unsigned long nr_pages)
>  		 * This is indeed racy, but we can live with some
>  		 * inaccuracy in the watermark.
>  		 */
> -		if (new > c->watermark)
> -			c->watermark = new;
> +		if (data_race(new > c->watermark))
> +			data_race(c->watermark = new);
>  	}
>  }
>
> @@ -126,7 +126,7 @@ bool page_counter_try_charge(struct page_counter *counter,
>  			 * This is racy, but we can live with some
>  			 * inaccuracy in the failcnt.
>  			 */
> -			c->failcnt++;
> +			data_race(c->failcnt++);
>  			*fail = c;
>  			goto failed;
>  		}
> @@ -135,8 +135,8 @@ bool page_counter_try_charge(struct page_counter *counter,
>  		 * Just like with failcnt, we can live with some
>  		 * inaccuracy in the watermark.
>  		 */
> -		if (new > c->watermark)
> -			c->watermark = new;
> +		if (data_race(new > c->watermark))
> +			data_race(c->watermark = new);
>  	}
>  	return true;
>
> --
> 2.21.0 (Apple Git-122.2)

--
Michal Hocko
SUSE Labs