Re: [PATCH v2 3/4] perf augmented_raw_syscalls: Support arm64 raw syscalls

From: Leo Yan
Date: Tue Jun 11 2019 - 00:23:16 EST


On Mon, Jun 10, 2019 at 03:47:54PM -0300, Arnaldo Carvalho de Melo wrote:

[...]

> > > I tested with the latest perf/core branch which contains the patch:
> > > 'perf augmented_raw_syscalls: Tell which args are filenames and how
> > > many bytes to copy' and got the error as below:
> > >
> > > # perf trace -e string -e /mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c
> > > Error: Invalid syscall access, chmod, chown, creat, futimesat, lchown, link, lstat, mkdir, mknod, newfstatat, open, readlink, rename,
> > > rmdir, stat, statfs, symlink, truncate, unlink
>
> Humm, I think that we can just make the code that parses the
> tools/perf/trace/strace/groups/string file to ignore syscalls it can't
> find in the syscall_tbl, i.e. trace those if they exist in the arch.

Agree.

> > > Hint: try 'perf list syscalls:sys_enter_*'
> > > Hint: and: 'man syscalls'
> > >
> > > So it seems mksyscalltbl does not generate the complete set of
> > > syscalls: the syscalltbl_arm64[] array I generated doesn't
> > > include the entries for access, chmod, chown, etc ...
>
> So, we need to investigate why these are missing, good thing we
> have this 'string' group :-)
>
> > > You could refer the generated syscalltbl_arm64 in:
> > > http://paste.ubuntu.com/p/8Bj7Jkm2mP/
> >
> > After digging into this issue on Arm64, below is summary info:
> >
> > - arm64 uses the header include/uapi/linux/unistd.h to define system
> > call numbers; some system calls are not defined in this header (I
> > think because they are obsolete), so the
> > corresponding strings are missing from the array syscalltbl_native;
> > for arm64 the array is defined in the file:
> > tools/perf/arch/arm64/include/generated/asm/syscalls.c.
>
> Yeah, I looked at the 'access' case and indeed it is not present in
> include/uapi/asm-generic/unistd.h, which is the place
> include/uapi/linux/unistd.h ends up.
>
> Ok please take a look at the patch at the end of this message, should be ok?
>
> I tested it by changing the strace/groups/string file to have a few
> unknown syscalls, running it with -v we see:
>
> [root@quaco perf]# perf trace -v -e string ls
> Skipping unknown syscalls: access99, acct99, add_key99
> <SNIP other verbose messages>
> normal operation not considering those unknown syscalls.

I tested the patch, but it failed once I added the eBPF event with the
command below; I even saw a segmentation fault. Please see my inline
comments below.

perf --debug verbose=10 trace -e string -e \
/mnt/linux-kernel/linux-cs-dev/tools/perf/examples/bpf/augmented_raw_syscalls.c

[...]

> commit e0b34a78c4ed0a6422f5b2dafa0c8936e537ee41
> Author: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
> Date: Mon Jun 10 15:37:45 2019 -0300
>
> perf trace: Skip unknown syscalls when expanding strace like syscall groups
>
> We have $INSTALL_DIR/share/perf-core/strace/groups/string files with
> syscalls that should be selected when 'string' is used, meaning, in this
> case, syscalls that receive as one of its arguments a string, like a
> pathname.
>
> But those were first selected and tested on x86_64, and end up failing
> in architectures where some of those syscalls are not available, like
> the 'access' syscall on arm64, which makes using 'perf trace -e string'
> in such archs to fail.
>
> Since the routine doing the validation is used only when reading
> such files, do not fail when some syscall is not found in the
> syscalltbl, instead just use pr_debug() to register that in case people
> are suspicious of problems.
>
> Now using 'perf trace -e string' should work on arm64, selecting only
> the syscalls that have a string and are available on that architecture.
>
> Reported-by: Leo Yan <leo.yan@xxxxxxxxxx>
> Cc: Adrian Hunter <adrian.hunter@xxxxxxxxx>
> Cc: Alexander Shishkin <alexander.shishkin@xxxxxxxxxxxxxxx>
> Cc: Alexei Starovoitov <ast@xxxxxxxxxx>
> Cc: Daniel Borkmann <daniel@xxxxxxxxxxxxx>
> Cc: Jiri Olsa <jolsa@xxxxxxxxxx>
> Cc: Martin KaFai Lau <kafai@xxxxxx>
> Cc: Mathieu Poirier <mathieu.poirier@xxxxxxxxxx>
> Cc: Mike Leach <mike.leach@xxxxxxxxxx>
> Cc: Namhyung Kim <namhyung@xxxxxxxxxx>
> Cc: Song Liu <songliubraving@xxxxxx>
> Cc: Suzuki K Poulose <suzuki.poulose@xxxxxxx>
> Cc: Yonghong Song <yhs@xxxxxx>
> Link: https://lkml.kernel.org/n/tip-oa4c2x8p3587jme0g89fyg18@xxxxxxxxxxxxxx
> Signed-off-by: Arnaldo Carvalho de Melo <acme@xxxxxxxxxx>
>
> diff --git a/tools/perf/builtin-trace.c b/tools/perf/builtin-trace.c
> index 1a2a605cf068..eb70a4b71755 100644
> --- a/tools/perf/builtin-trace.c
> +++ b/tools/perf/builtin-trace.c
> @@ -1529,6 +1529,7 @@ static int trace__read_syscall_info(struct trace *trace, int id)
> static int trace__validate_ev_qualifier(struct trace *trace)
> {
> int err = 0, i;
> + bool printed_invalid_prefix = false;
> size_t nr_allocated;
> struct str_node *pos;
>
> @@ -1555,14 +1556,15 @@ static int trace__validate_ev_qualifier(struct trace *trace)
> if (id >= 0)
> goto matches;
>
> - if (err == 0) {
> - fputs("Error:\tInvalid syscall ", trace->output);
> - err = -EINVAL;
> + if (!printed_invalid_prefix) {
> + pr_debug("Skipping unknown syscalls: ");
> + printed_invalid_prefix = true;
> } else {
> - fputs(", ", trace->output);
> + pr_debug(", ");
> }
>
> - fputs(sc, trace->output);
> + pr_debug("%s", sc);
> + continue;

The 'continue' added here is meant to make ev_qualifier_ids.entries
store only valid system call ids. But this is not sufficient, because
ev_qualifier_ids.nr has already been initialized at the beginning of
the function:

trace->ev_qualifier_ids.nr = strlist__nr_entries(trace->ev_qualifier);

This statement sets the number of ids to the length of the string
list, but some of those strings are not supported on this
architecture, so some items in trace->ev_qualifier_ids.entries[] are
never initialized even though they are counted in nr.

To keep the entries and the entry count consistent, I suggest that at
the beginning of the function we use the variable 'nr_allocated' to
store the string list length and use it to allocate the entries:

nr_allocated = strlist__nr_entries(trace->ev_qualifier);
trace->ev_qualifier_ids.entries = malloc(nr_allocated *
sizeof(trace->ev_qualifier_ids.entries[0]));

Then, with ev_qualifier_ids.nr starting from zero, increment the nr
field under the 'matches' label whenever a string matches:

matches:
trace->ev_qualifier_ids.nr++;
trace->ev_qualifier_ids.entries[i++] = id;

This ensures that entries[0..nr-1] hold only valid ids and that
ev_qualifier_ids.nr counts only the valid system calls.
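
To make the suggestion concrete, here is a minimal standalone sketch of
the same idea outside of perf. All names in it (demo_tbl, demo_lookup,
demo_ids, demo_validate) are hypothetical stand-ins and not the real
builtin-trace.c code; the table contents are only illustrative, and the
glob handling in the real function, which can grow the array, is left
out. The point is only that nr starts at zero and grows on a match, so
it never counts a slot that was not written:

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the per-arch syscall table (not perf's syscalltbl). */
struct demo_syscall {
	const char *name;
	int id;
};

/* Illustrative entries only. */
static const struct demo_syscall demo_tbl[] = {
	{ "unlinkat", 35 },
	{ "renameat", 38 },
	{ "openat",   56 },
};

/* Return the id for 'name', or -1 if the table does not know it. */
static int demo_lookup(const char *name)
{
	size_t i;

	for (i = 0; i < sizeof(demo_tbl) / sizeof(demo_tbl[0]); i++) {
		if (strcmp(demo_tbl[i].name, name) == 0)
			return demo_tbl[i].id;
	}
	return -1;
}

/* Hypothetical counterpart of trace->ev_qualifier_ids. */
struct demo_ids {
	size_t nr;
	int *entries;
};

/*
 * Allocate room for every requested name, but count only the names that
 * resolve to an id, so entries[0..nr-1] are always initialized.
 */
static int demo_validate(const char *const *names, size_t nr_names,
			 struct demo_ids *ids)
{
	size_t nr_allocated = nr_names ? nr_names : 1; /* avoid malloc(0) */
	size_t i;

	ids->nr = 0;
	ids->entries = malloc(nr_allocated * sizeof(ids->entries[0]));
	if (ids->entries == NULL)
		return -1;

	for (i = 0; i < nr_names; i++) {
		int id = demo_lookup(names[i]);

		if (id < 0)
			continue; /* unknown on this arch: skip, do not count */

		ids->entries[ids->nr++] = id;
	}
	return 0;
}
```

Calling demo_validate() with a list such as { "access", "openat",
"chmod", "unlinkat" } leaves two entries and nr == 2 in this sketch; the
caller is expected to free ids->entries when done.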

> }
> matches:
> trace->ev_qualifier_ids.entries[i++] = id;
> @@ -1591,15 +1593,14 @@ static int trace__validate_ev_qualifier(struct trace *trace)
> }
> }
>
> - if (err < 0) {
> - fputs("\nHint:\ttry 'perf list syscalls:sys_enter_*'"
> - "\nHint:\tand: 'man syscalls'\n", trace->output);
> -out_free:
> - zfree(&trace->ev_qualifier_ids.entries);
> - trace->ev_qualifier_ids.nr = 0;
> - }
> out:
> + if (printed_invalid_prefix)
> + pr_debug("\n");
> return err;
> +out_free:
> + zfree(&trace->ev_qualifier_ids.entries);
> + trace->ev_qualifier_ids.nr = 0;
> + goto out;

Nitpick: this path can return err directly; the 'goto out' is not necessary.

Thanks,
Leo Yan

> }
>
> /*