[PATCH sl-b 5/6] percpu_ref: Print allocator upon reference-count underflow

From: paulmck
Date: Fri Dec 04 2020 - 19:42:01 EST


From: "Paul E. McKenney" <paulmck@xxxxxxxxxx>

Reference-count underflow for percpu_ref is detected in the RCU callback
percpu_ref_switch_to_atomic_rcu(), and the resulting warning does not
print anything allowing easy identification of which percpu_ref use
case is underflowing. This is not normally a problem when developing
a new percpu_ref use case, because the bug most likely resides in that
new use case. However, when deploying a new kernel to a large set of
servers, the underflow might well be a new corner case in any of the
existing percpu_ref use cases.

This commit therefore prints the percpu_ref allocation site using the
new kmem_last_alloc() and kmem_last_alloc_errstring() functions in order
to provide a bit more information for the kernel-deployment case.

Cc: Ming Lei <ming.lei@xxxxxxxxxx>
Cc: Jens Axboe <axboe@xxxxxxxxx>
Reported-by: Andrii Nakryiko <andrii@xxxxxxxxxx>
Signed-off-by: Paul E. McKenney <paulmck@xxxxxxxxxx>
---
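Not for the commit log: a minimal userspace sketch of how the patched
callback picks between its two warning formats. Everything in it is a
stand-in for illustration only. struct alloc_info and
format_underflow_warning() are hypothetical names, plain strings replace
the %ps/%pS pointer formats, and the real kmem_last_alloc() and
kmem_last_alloc_errstring() are not modeled beyond "error string or
allocation site".

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical stand-in for what the patch obtains from kmem_last_alloc()
 * and kmem_last_alloc_errstring(): either an error string, or a symbolic
 * allocation site of the kind %pS would print.
 */
struct alloc_info {
	const char *errstring;	/* non-NULL if no allocation site is available */
	const char *caller;	/* used only when errstring is NULL */
};

/*
 * Mirror the branch structure of the patched warning: nothing is reported
 * while the count is positive; otherwise report either the error string or
 * the allocation site. Returns 1 if a warning was written to buf, else 0.
 */
static int format_underflow_warning(char *buf, size_t len, const char *release,
				    long count, const struct alloc_info *ai)
{
	assert(buf && len > 0 && release && ai);

	buf[0] = '\0';
	if (count > 0)
		return 0;

	if (ai->errstring)
		snprintf(buf, len,
			 "percpu ref (%s) <= 0 (%ld) after switching to atomic (%s)",
			 release, count, ai->errstring);
	else
		snprintf(buf, len,
			 "percpu ref (%s) <= 0 (%ld) after switching to atomic (allocated at %s)",
			 release, count, ai->caller);
	return 1;
}
```

The sketch keeps the two formats separate, as the patch does, so that the
error case never pretends to name an allocation site.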
lib/percpu-refcount.c | 15 ++++++++++++---
1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/lib/percpu-refcount.c b/lib/percpu-refcount.c
index e59eda0..8c7b21a0 100644
--- a/lib/percpu-refcount.c
+++ b/lib/percpu-refcount.c
@@ -169,6 +169,8 @@ static void percpu_ref_switch_to_atomic_rcu(struct rcu_head *rcu)
 	struct percpu_ref *ref = data->ref;
 	unsigned long __percpu *percpu_count = percpu_count_ptr(ref);
 	unsigned long count = 0;
+	void *allocaddr;
+	const char *allocerr;
 	int cpu;
 
 	for_each_possible_cpu(cpu)
@@ -191,9 +193,16 @@ static void percpu_ref_switch_to_atomic_rcu(struct rcu_head *rcu)
 	 */
 	atomic_long_add((long)count - PERCPU_COUNT_BIAS, &data->count);
 
-	WARN_ONCE(atomic_long_read(&data->count) <= 0,
-		  "percpu ref (%ps) <= 0 (%ld) after switching to atomic",
-		  data->release, atomic_long_read(&data->count));
+	if (atomic_long_read(&data->count) <= 0) {
+		allocaddr = kmem_last_alloc(data);
+		allocerr = kmem_last_alloc_errstring(allocaddr);
+		if (allocerr)
+			WARN_ONCE(1, "percpu ref (%ps) <= 0 (%ld) after switching to atomic (%s)",
+				  data->release, atomic_long_read(&data->count), allocerr);
+		else
+			WARN_ONCE(1, "percpu ref (%ps) <= 0 (%ld) after switching to atomic (allocated at %pS)",
+				  data->release, atomic_long_read(&data->count), allocaddr);
+	}
 
 	/* @ref is viewed as dead on all CPUs, send out switch confirmation */
 	percpu_ref_call_confirm_rcu(rcu);
--
2.9.5