RE: [PATCH v3 3/4] perf: Use CAP_SYSLOG with kptr_restrict checks

From: Lubashev, Igor
Date: Thu Aug 15 2019 - 18:29:05 EST


On Thu, August 15, 2019 at 4:17 PM Mathieu Poirier <mathieu.poirier@xxxxxxxxxx> wrote:
> On Wed, 14 Aug 2019 at 14:02, Lubashev, Igor <ilubashe@xxxxxxxxxx>
> wrote:
> >
> > > On Wed, August 14, 2019 at 2:52 PM Arnaldo Carvalho de Melo
> <arnaldo.melo@xxxxxxxxx> wrote:
> > > Em Wed, Aug 14, 2019 at 03:48:14PM -0300, Arnaldo Carvalho de Melo
> > > escreveu:
> > > > Em Wed, Aug 14, 2019 at 12:04:33PM -0600, Mathieu Poirier escreveu:
> > > > > # echo 0 > /proc/sys/kernel/kptr_restrict # ./tools/perf/perf
> > > > > record -e instructions:k uname
> > > > > perf: Segmentation fault
> > > > > Obtained 10 stack frames.
> > > > > ./tools/perf/perf(sighandler_dump_stack+0x44) [0x55af9e5da5d4]
> > > > > /lib/x86_64-linux-gnu/libc.so.6(+0x3ef20) [0x7fd31efb6f20]
> > > > > ./tools/perf/perf(perf_event__synthesize_kernel_mmap+0xa7)
> > > > > [0x55af9e590337]
> > > > > ./tools/perf/perf(+0x1cf5be) [0x55af9e50c5be]
> > > > > ./tools/perf/perf(cmd_record+0x1022) [0x55af9e50dff2]
> > > > > ./tools/perf/perf(+0x23f98d) [0x55af9e57c98d]
> > > > > ./tools/perf/perf(+0x23fc9e) [0x55af9e57cc9e]
> > > > > ./tools/perf/perf(main+0x369) [0x55af9e4f6bc9]
> > > > > /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0xe7)
> > > > > [0x7fd31ef99b97]
> > > > > ./tools/perf/perf(_start+0x2a) [0x55af9e4f704a] Segmentation
> > > > > fault
> > > > >
> > > > > I can reproduce this on both x86 and ARM64.
> > > >
> > > > I don't see this with these two csets removed:
> > > >
> > > > 7ff5b5911144 perf symbols: Use CAP_SYSLOG with kptr_restrict
> > > > checks d7604b66102e perf tools: Use CAP_SYS_ADMIN with
> > > > perf_event_paranoid checks
> > > >
> > > > Which were the ones I guessed were related to the problem you
> > > > reported, so they are out of my ongoing perf/core pull request to
> > > > Ingo/Thomas, now trying with these applied and your instructions...

SNIP

> I isolated the problem to libcap-dev - if it is not installed a segmentation fault
> will occur. Since this set is about using capabilities it is obvious that not
> having it on a system should fail a trace session, but it should not crash it.
>
> If libcap-dev is not installed function symbol__restricted_filename() will
> return true, which in turn will prevent symbol_name to be set in
> machine__get_running_kernel_start(). That prevents function
> map__set_kallsyms_ref_reloc_sym() from being called in
> machine__create_kernel_maps(), resulting in kmap->ref_reloc_sym being
> NULL in _perf_event__synthesize_kernel_mmap() and a segmentation fault.

Thank you, great find.

> I am not sure how this can be fixed. I counted a total of 19 instances where
> kmap->ref_reloc_sym->XYZ is called, only 2 of wich care to check if kmap-
> >ref_reloc_sym is valid before proceeding. As such I must hope that in the
> 17 other cases, kmap->ref_reloc_sym is guaranteed to be valid. If I am
> correct then all we need is to check for a valid pointer in
> _perf_event__synthesize_kernel_mmap().
> Otherwise it will be a little harder.

I also see 19 instances in 5 files, but by my inspection all cases but one are ok (the code checks for NULL earlier in the function or in a helper function).

The not ok case is __perf_event__synthesize_kermel_mmap(), which simply checks symbol_conf.kptr_restrict.

==== Option 1 =====
Fix __perf_event__synthesize_kermel_mmap(). This probably should be done regardless.

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index f440fdc3e953..b84f5f3c9651 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -913,11 +913,13 @@ static int __perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
int err;
union perf_event *event;

- if (symbol_conf.kptr_restrict)
- return -1;
if (map == NULL)
return -1;

+ kmap = map__kmap(map);
+ if (!kmap->ref_reloc_sym)
+ return -1;
+
/*
* We should get this from /sys/kernel/sections/.text, but till that is
* available use this, and after it is use this as a fallback for older
@@ -940,7 +942,6 @@ static int __perf_event__synthesize_kernel_mmap(struct perf_tool *tool,
event->header.misc = PERF_RECORD_MISC_GUEST_KERNEL;
}

- kmap = map__kmap(map);
size = snprintf(event->mmap.filename, sizeof(event->mmap.filename),
"%s%s", machine->mmap_name, kmap->ref_reloc_sym->name) + 1;
size = PERF_ALIGN(size, sizeof(u64));
--

==== Option 2 =====
Move the new perf_event_paranoid() check from symbol__restricted_filename() to symbol__read_kptr_restrict().
Other than the use above, kptr_restrict is only used for printing warnings.

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 7409d2facd5b..035f2e75728c 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -898,11 +898,7 @@ bool symbol__restricted_filename(const char *filename,
{
bool restricted = false;

- /* Per kernel/kallsyms.c:
- * we also restrict when perf_event_paranoid > 1 w/o CAP_SYSLOG
- */
- if (symbol_conf.kptr_restrict ||
- (perf_event_paranoid() > 1 && !perf_cap__capable(CAP_SYSLOG))) {
+ if (symbol_conf.kptr_restrict) {
char *r = realpath(filename, NULL);

if (r != NULL) {
@@ -2209,6 +2205,12 @@ static bool symbol__read_kptr_restrict(void)
fclose(fp);
}

+ /* Per kernel/kallsyms.c:
+ * we also restrict when perf_event_paranoid > 1 w/o CAP_SYSLOG
+ */
+ if (perf_event_paranoid() > 1 && !perf_cap__capable(CAP_SYSLOG))
+ value = true;
+
return value;
}

--------- And then update the warnings. -----------

diff --git a/tools/perf/builtin-record.c b/tools/perf/builtin-record.c
index f71631f2bcb5..18505d92ff69 100644
--- a/tools/perf/builtin-record.c
+++ b/tools/perf/builtin-record.c
@@ -2372,7 +2372,7 @@ int cmd_record(int argc, const char **argv)
if (symbol_conf.kptr_restrict && !perf_evlist__exclude_kernel(rec->evlist))
pr_warning(
"WARNING: Kernel address maps (/proc/{kallsyms,modules}) are restricted,\n"
-"check /proc/sys/kernel/kptr_restrict.\n\n"
+"check /proc/sys/kernel/kptr_restrict and /proc/sys/kernel/perf_event_paranoid.\n\n"
"Samples in kernel functions may not be resolved if a suitable vmlinux\n"
"file is not found in the buildid cache or in the vmlinux path.\n\n"
"Samples in kernel modules won't be resolved at all.\n\n"
diff --git a/tools/perf/builtin-top.c b/tools/perf/builtin-top.c
index 5970723cd55a..29e910fb2d9a 100644
--- a/tools/perf/builtin-top.c
+++ b/tools/perf/builtin-top.c
@@ -770,7 +770,7 @@ static void perf_event__process_sample(struct perf_tool *tool,
if (!perf_evlist__exclude_kernel(top->session->evlist)) {
ui__warning(
"Kernel address maps (/proc/{kallsyms,modules}) are restricted.\n\n"
-"Check /proc/sys/kernel/kptr_restrict.\n\n"
+"Check /proc/sys/kernel/kptr_restrict and /proc/sys/kernel/perf_event_paranoid.\n\n"
"Kernel%s samples will not be resolved.\n",
al.map && map__has_symbols(al.map) ?
" modules" : "");
diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
index bc44ed29e05a..9443b8e05810 100644
--- a/tools/perf/builtin-trace.c
+++ b/tools/perf/builtin-trace.c
@@ -1381,7 +1381,7 @@ static char *trace__machine__resolve_kernel_addr(void *vmachine, unsigned long l

if (symbol_conf.kptr_restrict) {
pr_warning("Kernel address maps (/proc/{kallsyms,modules}) are restricted.\n\n"
- "Check /proc/sys/kernel/kptr_restrict.\n\n"
+ "Check /proc/sys/kernel/kptr_restrict and /proc/sys/kernel/perf_event_paranoid.\n\n"
"Kernel samples will not be resolved.\n");
machine->kptr_restrict_warned = true;
return NULL;


- Igor