[PATCH] perf maps: Process kcore maps in order

From: Leo Yan
Date: Sun May 05 2024 - 16:29:03 EST


On Arm64, after enabling the 'DEBUG=1' build option, the tool exits
abnormally:

# ./perf report --itrace=Zi10ibT
perf: util/maps.c:42: check_invariants: Assertion `map__end(prev) <= map__start(map) || map__start(prev) == map__start(map)' failed.
Aborted

The details of how this error is triggered are described below.

First, machine__get_running_kernel_start() calculates the delta
between the '_stext' and '_edata' symbols for the kernel map; the
resulting layout, alongside the eBPF maps, is:

DSO | Start address | End address
-----------------+--------------------+--------------------
kernel.kallsyms 0xffff800080000000 0xffff800082229200
bpf_prog 0xffff800082545aac 0xffff800082545c94
...

Then, the perf tool retrieves kcore maps:

Kcore maps | Start address | End address
-----------------+--------------------+--------------------
kcore_text 0xffff800080000000 0xffff8000822f0000
vmalloc 0xffff800080000000 0xfffffdffbf800000
...

Finally, dso__load_kcore() extends the kernel maps based on the
retrieved kcore info. Since it processes the kcore maps in reverse
order, it extends the maps for the vmalloc region before handling the
'kcore_text' map:

DSO | Start address | End address
-----------------+--------------------+--------------------
kernel.kallsyms 0xffff800080000000 0xffff800082229200
kernel.kallsyms 0xffff800082229200 0xffff800082545aac -> Extended for vmalloc region
bpf_prog 0xffff800082545aac 0xffff800082545c94
...

DSO | Start address | End address
-----------------+--------------------+--------------------
kernel.kallsyms 0xffff800080000000 0xffff8000822f0000 -> Updated for kcore_text map
kernel.kallsyms 0xffff800082229200 0xffff800082545aac
bpf_prog 0xffff800082545aac 0xffff800082545c94
...

As a result, the two maps [0xffff800080000000..0xffff8000822f0000) and
[0xffff800082229200..0xffff800082545aac) overlap, which triggers the
assertion failure.
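
To illustrate the failure with the addresses above, below is a minimal
standalone sketch (not perf code; the struct and helper names are made
up for illustration) that applies the same condition asserted by
check_invariants() in util/maps.c to the two overlapping maps:

  #include <stdint.h>
  #include <stdio.h>

  struct range { uint64_t start, end; };

  /* Same condition as the assertion: maps must either not overlap or
   * share the same start address. */
  static int invariant_holds(struct range prev, struct range cur)
  {
          return prev.end <= cur.start || prev.start == cur.start;
  }

  int main(void)
  {
          /* Addresses taken from the extended map layout above. */
          struct range kcore_text  = { 0xffff800080000000ULL, 0xffff8000822f0000ULL };
          struct range vmalloc_ext = { 0xffff800082229200ULL, 0xffff800082545aacULL };

          /* Prints 0: the first map's end is beyond the second map's
           * start and their start addresses differ, which is exactly
           * the assertion that fires in the report above. */
          printf("invariant holds: %d\n", invariant_holds(kcore_text, vmalloc_ext));
          return 0;
  }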

The current code processes kcore maps in reverse order. To fix this,
add kcore maps to the tail of the list so that they are processed in
their original order. The kernel text section is then handled before
the vmalloc region, which avoids using an inaccurate kernel text size
when extending maps with the vmalloc region.
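
For illustration only, the ordering difference between the two insert
helpers can be reproduced with a minimal stand-in for the list API
(not the actual tools/include implementation; names and types are
simplified):

  #include <stdio.h>

  struct list_head { struct list_head *prev, *next; };

  static void list_init(struct list_head *h) { h->prev = h->next = h; }

  static void __list_insert(struct list_head *n, struct list_head *prev,
                            struct list_head *next)
  {
          next->prev = n;
          n->next = next;
          n->prev = prev;
          prev->next = n;
  }

  /* list_add(): insert right after the head, i.e. prepend; a later
   * walk from the head visits this node first. */
  static void list_add(struct list_head *n, struct list_head *h)
  {
          __list_insert(n, h, h->next);
  }

  /* list_add_tail(): insert right before the head, i.e. append; a
   * later walk from the head visits this node last. */
  static void list_add_tail(struct list_head *n, struct list_head *h)
  {
          __list_insert(n, h->prev, h);
  }

  struct named_node { struct list_head node; const char *name; };

  int main(void)
  {
          struct list_head head_first, tail_first;
          /* Separate nodes per list, since a node can only be on one list. */
          struct named_node a = { .name = "kcore_text" }, b = { .name = "vmalloc" };
          struct named_node c = { .name = "kcore_text" }, d = { .name = "vmalloc" };
          struct list_head *p;

          list_init(&head_first);
          list_init(&tail_first);

          /* Old behaviour: prepending reverses the walk order. */
          list_add(&a.node, &head_first);
          list_add(&b.node, &head_first);

          /* Fixed behaviour: appending keeps the kcore order. */
          list_add_tail(&c.node, &tail_first);
          list_add_tail(&d.node, &tail_first);

          printf("list_add:      ");
          for (p = head_first.next; p != &head_first; p = p->next)
                  printf("%s ", ((struct named_node *)p)->name);
          printf("\nlist_add_tail: ");
          for (p = tail_first.next; p != &tail_first; p = p->next)
                  printf("%s ", ((struct named_node *)p)->name);
          printf("\n");
          /* Output: "vmalloc kcore_text" vs "kcore_text vmalloc". */
          return 0;
  }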

Signed-off-by: Leo Yan <leo.yan@xxxxxxx>
---
tools/perf/util/symbol.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/tools/perf/util/symbol.c b/tools/perf/util/symbol.c
index 9ebdb8e13c0b..e15d70845488 100644
--- a/tools/perf/util/symbol.c
+++ b/tools/perf/util/symbol.c
@@ -1266,7 +1266,24 @@ static int kcore_mapfn(u64 start, u64 len, u64 pgoff, void *data)
map__set_end(list_node->map, map__start(list_node->map) + len);
map__set_pgoff(list_node->map, pgoff);

- list_add(&list_node->node, &md->maps);
+ /*
+ * Kcore maps are ordered with:
+ * [_text.._end): Kernel text section
+ * [VMALLOC_START..VMALLOC_END): vmalloc
+ * ...
+ *
+ * On Arm64, '_text' and 'VMALLOC_START' have the same value, but
+ * VMALLOC_END (~124TiB) is much bigger than the text end address,
+ * so the '_text' region is a subset of the vmalloc region.
+ *
+ * When dso__load_kcore() later adjusts the kernel maps, the kernel
+ * text section must be processed prior to handling the vmalloc
+ * region; otherwise an inaccurate kernel text size is used when
+ * extending maps with the vmalloc region. For this reason, always
+ * add kcore maps to the tail of the list so that they are handled
+ * in their original order.
+ */
+ list_add_tail(&list_node->node, &md->maps);

return 0;
}
--
2.43.0