[PATCH] Extend core dump note section to contain file names of mapped files

From: Denys Vlasenko
Date: Wed Jul 11 2012 - 06:36:08 EST


Hi,

Resending the patch after a while.
Jonathan, developer of CERT Triage Tools, expressed the need
to have this information, so I am CCing him.

But before looking at the attached patch, we need a ruling.

In the last review it was proposed to generate this information
in the form of ASCII text, a la /proc/PID/maps.

This actually is a good idea, but regretfully, it comes a few
decades too late: the rest of the core file's auxiliary information
is traditionally encoded in binary structures.

Please, can someone with authority in this area decide whether
we want to be unorthodox and use ASCII encoding for the whole thing,
or not?

If the decision is to use ASCII, I will need to rework the patch.

Otherwise, please take a look at the attached patch, which implements
creation of a new note in binary format, and let me know what you think of it.

Original patch and description follow.

* * * * * * * * * * * * * * * * * * * *

While working with core dump analysis, it struck me how much
PITA is caused merely by the fact that the names of the loaded binary
and libraries are not known.

gdb retrieves loaded library names by examining the dynamic loader's
data stored in the core dump's data segments. It uses intimate
knowledge of how and where the dynamic loader keeps the list of loaded
libraries. (Meaning that it will break if a non-standard loader
is used.)

And, as Jan explained to me, it depends on knowing where
the linked list of libraries starts, which requires knowing which
binary was running. IIRC there is no easy and reasonably foolproof
way to determine the binary's name. (Looking at argv[0] on the stack
is not reasonably foolproof.)

Which is *ridiculous*. We *know* the list of mapped files
at core dump generation time.

I propose to save this information in the core dump, as a new note
in the note segment.

This note has the following format:

    long count     // how many files are mapped
    long page_size // units for file_ofs
    array of [COUNT] elements of
        long start
        long end
        long file_ofs
    followed by COUNT filenames in ASCII: "FILE1" NUL "FILE2" NUL...

The attached patch implements this.
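
To make the layout above concrete, here is a minimal user-space reader
sketch. It is not part of the patch. The function name parse_file_note is
hypothetical, and the sketch assumes little-endian data and that a "long"
is 4 bytes (32-bit dump) unless told otherwise. The sample payload is a
cut-down, two-entry example built by hand.

```python
import struct

def parse_file_note(desc, word="I"):
    """Decode the note payload described above.

    word is the struct code for one target 'long' (assumption):
    "I" = 4 bytes (32-bit dump), "Q" = 8 bytes (64-bit dump).
    Little-endian byte order is assumed.
    """
    header = struct.Struct("<2" + word)   # count, page_size
    entry = struct.Struct("<3" + word)    # start, end, file_ofs
    count, page_size = header.unpack_from(desc, 0)
    table_end = header.size + count * entry.size
    if table_end > len(desc):
        raise ValueError("payload too short for declared count")
    # Filenames follow the table, each terminated by NUL.
    names = desc[table_end:].split(b"\0")[:count]
    if len(names) < count:
        raise ValueError("fewer filenames than entries")
    mappings = []
    for i in range(count):
        start, end, file_ofs = entry.unpack_from(desc, header.size + i * entry.size)
        # file_ofs is stored in units of page_size; convert to bytes.
        mappings.append((start, end, file_ofs * page_size,
                         names[i].decode("ascii", "replace")))
    return mappings

# Hand-built two-entry payload, for illustration only.
sample = (struct.pack("<2I", 2, 0x1000)
          + struct.pack("<3I", 0x08048000, 0x08050000, 0)
          + struct.pack("<3I", 0x08050000, 0x08051000, 7)
          + b"/usr/bin/md5sum\0/usr/bin/md5sum\0")
for start, end, offset, name in parse_file_note(sample):
    print("%08x-%08x %08x %s" % (start, end, offset, name))
```

For a 64-bit dump the same function would be called with word="Q".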

Since the list of mapped files can be large (/proc/`pidof firefox`/maps
on my machine right now is 38k), I allocate the space for the note
via vmalloc, and also have a sanity limit of 4 megabytes.
(Maybe we should make it smaller?)
Oleg suggested using a linked list of smaller structures instead of
a potentially large contiguous block, and I tried it,
but the resulting code was significantly uglier (for my taste).
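
As a rough guide to how big the note gets, the size follows directly from
the format: two header longs, three longs per mapping, and each filename
plus its NUL. The sketch below is illustrative only. The helper name
file_note_size and the constant SANITY_LIMIT are hypothetical, and the
filename list is the one from the md5sum test dump further down in this
mail, with 4-byte longs assumed.

```python
def file_note_size(filenames, word_size=4):
    """Payload size in bytes for the given mapped-file names."""
    header = 2 * word_size                      # count, page_size
    table = len(filenames) * 3 * word_size      # start, end, file_ofs
    strings = sum(len(name) + 1 for name in filenames)  # name + NUL
    return header + table + strings

SANITY_LIMIT = 4 * 1024 * 1024  # the 4-megabyte cap mentioned above

# The eleven mappings from the md5sum test dump.
names = (["/lib/libc-2.14.90.so"] * 4
         + ["/lib/ld-2.14.90.so"] * 3
         + ["/usr/bin/md5sum"] * 3
         + ["/usr/lib/locale/locale-archive"])

size = file_note_size(names)
print(hex(size), size <= SANITY_LIMIT)  # 0x168 True
```

The result, 0x168, matches the data size readelf reports for the new note.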

The patch is run-tested.

For testing, I sent an ABRT signal to a running /usr/bin/md5sum.

"readelf -aW core" shows the new note as:

Notes at offset 0x00000274 with length 0x00000990:
  Owner         Data size       Description
  CORE          0x00000090      NT_PRSTATUS (prstatus structure)
  CORE          0x0000007c      NT_PRPSINFO (prpsinfo structure)
  CORE          0x000000a0      NT_AUXV (auxiliary vector)
  CORE          0x00000168      Unknown note type: (0x46494c45)
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ new note ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
In hex format:
00000450                                     05 00 00 00 |................|
00000460 68 01 00 00 45 4c 49 46 43 4f 52 45 00 00 00 00 |h...ELIFCORE....|
00000470 0b 00 00 00 00 10 00 00 00 80 17 00 00 f0 31 00 |..............1.|
00000480 00 00 00 00 00 f0 31 00 00 00 32 00 a7 01 00 00 |......1...2.....|
00000490 00 00 32 00 00 20 32 00 a7 01 00 00 00 20 32 00 |..2.. 2...... 2.|
000004a0 00 30 32 00 a9 01 00 00 00 50 69 00 00 60 6b 00 |.02......Pi..`k.|
000004b0 00 00 00 00 00 60 6b 00 00 70 6b 00 20 00 00 00 |.....`k..pk. ...|
000004c0 00 70 6b 00 00 80 6b 00 21 00 00 00 00 80 04 08 |.pk...k.!.......|
000004d0 00 00 05 08 00 00 00 00 00 00 05 08 00 10 05 08 |................|
000004e0 07 00 00 00 00 10 05 08 00 20 05 08 08 00 00 00 |......... ......|
000004f0 00 20 52 b7 00 20 72 b7 00 00 00 00 2f 6c 69 62 |. R.. r...../lib|
00000500 2f 6c 69 62 63 2d 32 2e 31 34 2e 39 30 2e 73 6f |/libc-2.14.90.so|
00000510 00 2f 6c 69 62 2f 6c 69 62 63 2d 32 2e 31 34 2e |./lib/libc-2.14.|
00000520 39 30 2e 73 6f 00 2f 6c 69 62 2f 6c 69 62 63 2d |90.so./lib/libc-|
00000530 32 2e 31 34 2e 39 30 2e 73 6f 00 2f 6c 69 62 2f |2.14.90.so./lib/|
00000540 6c 69 62 63 2d 32 2e 31 34 2e 39 30 2e 73 6f 00 |libc-2.14.90.so.|
00000550 2f 6c 69 62 2f 6c 64 2d 32 2e 31 34 2e 39 30 2e |/lib/ld-2.14.90.|
00000560 73 6f 00 2f 6c 69 62 2f 6c 64 2d 32 2e 31 34 2e |so./lib/ld-2.14.|
00000570 39 30 2e 73 6f 00 2f 6c 69 62 2f 6c 64 2d 32 2e |90.so./lib/ld-2.|
00000580 31 34 2e 39 30 2e 73 6f 00 2f 75 73 72 2f 62 69 |14.90.so./usr/bi|
00000590 6e 2f 6d 64 35 73 75 6d 00 2f 75 73 72 2f 62 69 |n/md5sum./usr/bi|
000005a0 6e 2f 6d 64 35 73 75 6d 00 2f 75 73 72 2f 62 69 |n/md5sum./usr/bi|
000005b0 6e 2f 6d 64 35 73 75 6d 00 2f 75 73 72 2f 6c 69 |n/md5sum./usr/li|
000005c0 62 2f 6c 6f 63 61 6c 65 2f 6c 6f 63 61 6c 65 2d |b/locale/locale-|
000005d0 61 72 63 68 69 76 65 00                         |archive.|
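
For anyone reading the dump by hand, the following sketch decodes the
12-byte ELF note header that starts the new note: the "05 00 00 00" just
before offset 0x460, then the first eight bytes of the 0x460 row. It
assumes a 32-bit little-endian dump and that the header starts at 0x45c,
which is inferred from the row offsets rather than stated anywhere. The
helper align4 is hypothetical.

```python
import struct

# namesz, descsz, type: copied from the hex dump above.
header = bytes.fromhex("05000000 68010000 454c4946")
namesz, descsz, n_type = struct.unpack("<3I", header)

# Read most-significant byte first, the type value spells an ASCII tag.
tag = n_type.to_bytes(4, "big").decode("ascii")

def align4(n):
    """Round n up to a multiple of 4 (note fields are 4-byte padded)."""
    return (n + 3) & ~3

note_start = 0x45c  # assumption: inferred from the dump's row offsets
desc_start = note_start + len(header) + align4(namesz)
desc_end = desc_start + descsz

print(namesz, hex(descsz), hex(n_type), tag)
print(hex(desc_start), hex(desc_end))
```

The type 0x46494c45 reads as "FILE", which is why it shows up as "ELIF" in
the little-endian dump. The payload spans 0x470 to 0x5d8, so it begins with
the count field (0b 00 00 00) and ends with the NUL after "locale-archive".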

--
vda

Attachment: file_note.patch
Description: Binary data