[PATCH] perf/core: fix mlock accounting in perf_mmap()

From: Song Liu
Date: Fri Jan 17 2020 - 18:45:20 EST


sysctl_perf_event_mlock and user->locked_vm can change value
independently, so we can't guarantee:

user->locked_vm <= user_lock_limit

When user->locked_vm is larger than user_lock_limit, we cannot simply
update extra and user_extra as:

extra = user_locked - user_lock_limit;
user_extra -= extra;

Otherwise, extra exceeds user_extra and user_extra becomes negative. In
extreme cases, this may lead to a negative user->locked_vm (until this
perf-mmap is closed), which breaks locked_vm accounting badly.
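
To make the arithmetic concrete, here is a minimal user-space model of the
old split. It is only a sketch: struct split and old_split() are
hypothetical names, plain longs stand in for the atomic counters, and
user_locked is taken as locked_vm plus the new request, as in perf_mmap().

```c
#include <assert.h>

/* Hypothetical model of the two charges computed by perf_mmap(). */
struct split {
	long user_extra;	/* pages charged to user->locked_vm */
	long extra;		/* pages charged to mm->pinned_vm */
};

/* Old logic: assumes locked_vm never exceeded the limit beforehand. */
struct split old_split(long locked_vm, long user_extra, long user_lock_limit)
{
	struct split s = { user_extra, 0 };
	long user_locked = locked_vm + user_extra;

	assert(user_extra >= 0);	/* a request is never negative */

	if (user_locked > user_lock_limit) {
		s.extra = user_locked - user_lock_limit;
		s.user_extra -= s.extra;	/* may drop below zero */
	}
	return s;
}
```

With locked_vm = 200 pages, a limit lowered to 128 pages and a 16-page
request, old_split(200, 16, 128) gives extra = 88 and user_extra = -72, so
adding user_extra to locked_vm would shrink the counter instead of growing
it.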

Fix this with two separate conditions, which make sure user_extra is
never negative.
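
The two conditions can be sketched outside the kernel as follows. This is
an illustrative model under stated assumptions, not the kernel code itself:
struct split and fixed_split() are hypothetical names, and plain longs
replace the atomic counters.

```c
#include <assert.h>

/* Hypothetical model of the two charges computed by perf_mmap(). */
struct split {
	long user_extra;	/* pages charged to user->locked_vm */
	long extra;		/* pages charged to mm->pinned_vm */
};

/* Fixed logic: user_extra stays >= 0 for any locked_vm and limit. */
struct split fixed_split(long locked_vm, long user_extra, long user_lock_limit)
{
	struct split s = { user_extra, 0 };
	long user_locked = locked_vm + user_extra;

	assert(user_extra >= 0);	/* a request is never negative */

	if (user_locked > user_lock_limit) {
		if (user_locked - user_extra >= user_lock_limit) {
			/* limit already used up: charge all to pinned_vm */
			s.extra = user_extra;
			s.user_extra = 0;
		} else {
			/* fill locked_vm up to the limit, pin the rest */
			s.extra = user_locked - user_lock_limit;
			s.user_extra -= s.extra;
		}
	}
	return s;
}
```

For a 16-page request against a 128-page limit, fixed_split(200, 16, 128)
charges all 16 pages to pinned_vm, while fixed_split(120, 16, 128) charges
8 pages to locked_vm and 8 to pinned_vm.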

Fixes: c4b75479741c ("perf/core: Make the mlock accounting simple again")
Signed-off-by: Song Liu <songliubraving@xxxxxx>
Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
Cc: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
Cc: Peter Zijlstra <peterz@xxxxxxxxxxxxx>
---
kernel/events/core.c | 28 ++++++++++++++++++++++++----
1 file changed, 24 insertions(+), 4 deletions(-)

diff --git a/kernel/events/core.c b/kernel/events/core.c
index a1f8bde19b56..89acdd1574ef 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -5920,11 +5920,31 @@ static int perf_mmap(struct file *file, struct vm_area_struct *vma)

 	if (user_locked > user_lock_limit) {
 		/*
-		 * charge locked_vm until it hits user_lock_limit;
-		 * charge the rest from pinned_vm
+		 * sysctl_perf_event_mlock and user->locked_vm can change
+		 * value independently, so we can't guarantee:
+		 *
+		 *     user->locked_vm <= user_lock_limit
+		 *
+		 * We need to be careful to make sure user_extra >= 0.
+		 *
+		 * Using "user_locked - user_extra" to avoid calling
+		 * atomic_long_read() again.
 		 */
-		extra = user_locked - user_lock_limit;
-		user_extra -= extra;
+		if (user_locked - user_extra >= user_lock_limit) {
+			/*
+			 * already used all user_lock_limit, charge all
+			 * to pinned_vm
+			 */
+			extra = user_extra;
+			user_extra = 0;
+		} else {
+			/*
+			 * charge locked_vm until it hits user_lock_limit;
+			 * charge the rest from pinned_vm
+			 */
+			extra = user_locked - user_lock_limit;
+			user_extra -= extra;
+		}
 	}

 	lock_limit = rlimit(RLIMIT_MEMLOCK);
--
2.17.1