[PATCH] mm, swap: Remove unnecessary smp_rmb() in swap_type_to_swap_info()

From: Huang Ying
Date: Thu May 13 2021 - 02:49:03 EST


Before commit c10d38cc8d3e ("mm, swap: bounds check swap_info array
accesses to avoid NULL derefs"), the typical code to reference the
swap_info[] is as follows,

type = swp_type(swp_entry);
if (type >= nr_swapfiles)
/* handle invalid swp_entry */;
p = swap_info[type];
/* access fields of *p. OOPS! p may be NULL! */

Because the ordering isn't guaranteed, it's possible that "p" is read
before checking "type". And that may result in NULL pointer
dereference.

So in commit c10d38cc8d3e, the code becomes,

struct swap_info_struct *swap_type_to_swap_info(int type)
{
if (type >= READ_ONCE(nr_swapfiles))
return NULL;
smp_rmb();
return READ_ONCE(swap_info[type]);
}

/* users */
type = swp_type(swp_entry);
p = swap_type_to_swap_info(type);
if (!p)
/* handle invalid swp_entry */;
/* access fields of *p */

Because "p" is checked to be non-zero before dereference, smp_rmb()
isn't needed anymore.

We still need to guarantee swap_info[type] is read before dereference.
That can be satisfied via the data dependency ordering of
READ_ONCE(swap_info[type]). The corresponding smp_wmb() is adjusted
in alloc_swap_info() too.

And, we don't need to read "nr_swapfiles" too. Because if
"type >= nr_swapfiles", swap_info[type] will be NULL. We just need
to make sure we will not access out of the boundary of the array.
With that change, nr_swapfiles will only be accessed with swap_lock
held, except in swapcache_free_entries(). Where the absolute
correctness of the value isn't needed, as described in the comments.

Signed-off-by: "Huang, Ying" <ying.huang@xxxxxxxxx>
Cc: Daniel Jordan <daniel.m.jordan@xxxxxxxxxx>
Cc: Dan Carpenter <dan.carpenter@xxxxxxxxxx>
Cc: Andrea Parri <andrea.parri@xxxxxxxxxxxxxxxxxxxx>
Cc: Peter Zijlstra (Intel) <peterz@xxxxxxxxxxxxx>
Cc: Andi Kleen <ak@xxxxxxxxxxxxxxx>
Cc: Dave Hansen <dave.hansen@xxxxxxxxxxxxxxx>
Cc: Omar Sandoval <osandov@xxxxxx>
Cc: Paul McKenney <paulmck@xxxxxxxxxx>
Cc: Tejun Heo <tj@xxxxxxxxxx>
Cc: Will Deacon <will.deacon@xxxxxxx>
Cc: Miaohe Lin <linmiaohe@xxxxxxxxxx>
---
mm/swapfile.c | 18 +++++++++---------
1 file changed, 9 insertions(+), 9 deletions(-)

diff --git a/mm/swapfile.c b/mm/swapfile.c
index 2aad85751991..4c1fb28bbe0e 100644
--- a/mm/swapfile.c
+++ b/mm/swapfile.c
@@ -100,10 +100,14 @@ atomic_t nr_rotate_swap = ATOMIC_INIT(0);

static struct swap_info_struct *swap_type_to_swap_info(int type)
{
- if (type >= READ_ONCE(nr_swapfiles))
+ if (type >= MAX_SWAPFILES)
return NULL;

- smp_rmb(); /* Pairs with smp_wmb in alloc_swap_info. */
+ /*
+ * The data dependency ordering from the READ_ONCE() pairs
+ * with smp_wmb() in alloc_swap_info() to guarantee the
+ * swap_info_struct fields are read after swap_info[type].
+ */
return READ_ONCE(swap_info[type]);
}

@@ -2884,14 +2888,10 @@ static struct swap_info_struct *alloc_swap_info(void)
}
if (type >= nr_swapfiles) {
p->type = type;
- WRITE_ONCE(swap_info[type], p);
- /*
- * Write swap_info[type] before nr_swapfiles, in case a
- * racing procfs swap_start() or swap_next() is reading them.
- * (We never shrink nr_swapfiles, we never free this entry.)
- */
+ /* Paired with READ_ONCE() in swap_type_to_swap_info() */
smp_wmb();
- WRITE_ONCE(nr_swapfiles, nr_swapfiles + 1);
+ WRITE_ONCE(swap_info[type], p);
+ nr_swapfiles++;
} else {
defer = p;
p = swap_info[type];
--
2.30.2