Re: [PATCH v4 00/10] Function Granular KASLR

From: Fangrui Song
Date: Sat Jan 23 2021 - 18:00:43 EST


On 2020-08-28, Josh Poimboeuf wrote:
On Fri, Aug 28, 2020 at 12:21:13PM +0200, Miroslav Benes wrote:
> Hi there! I was trying to find a super easy way to address this, so I
> thought the best thing would be if there were a compiler or linker
> switch to just eliminate any duplicate symbols at compile time for
> vmlinux. I filed this question on the binutils bugzilla looking to see
> if there were existing flags that might do this, but H.J. Lu went ahead
> and created a new one "-z unique", that seems to do what we would need
> it to do.
>
> https://sourceware.org/bugzilla/show_bug.cgi?id=26391
>
> When I use this option, it renames any duplicate symbols with an
> extension - for example duplicatefunc.1 or duplicatefunc.2. You could
> either match on the full unique name of the specific binary you are
> trying to patch, or you match the base name and use the extension to
> determine original position. Do you think this solution would work?

Yes, I think so (thanks, Joe, for testing!).

It looks cleaner to me than the options above, but it may just be a matter
of taste. Anyway, I'd go with full name matching, because -z unique-symbol
would allow us to remove sympos altogether, which is appealing.

> If
> so, I can modify livepatch to refuse to patch on duplicated symbols if
> CONFIG_FG_KASLR and when this option is merged into the tool chain I
> can add it to KBUILD_LDFLAGS when CONFIG_FG_KASLR and livepatching
> should work in all cases.

Ok.

Josh, Petr, would this work for you too?

Sounds good to me. Kristen, thanks for finding a solution!

(I am not subscribed. I came here via https://sourceware.org/bugzilla/show_bug.cgi?id=26391 (ld -z unique-symbol))

This works great after randomization because it always receives the
current address at runtime rather than relying on any kind of
buildtime address. The issue with with the live-patching code's
algorithm for resolving duplicate symbol names. If they request a
symbol by name from the kernel and there are 3 symbols with the same
name, they use the symbol's position in the built binary image to
select the correct symbol.

If a.o, b.o and c.o define local symbol 'foo'.
By position, do you mean that

* the live-patching code uses something like (findall("foo")[0], findall("foo")[1], findall("foo")[2]) ?
* shuffling a.o/b.o/c.o will make the returned triple different

Local symbols are not required to be unique. Instead of patching the toolchain,
have you thought about making the live-patching code smarter?
(Depend on the duplicates, such a linker option can increase the link time/binary size considerably
AND I don't know in what other cases such an option will be useful)

For the following example,
https://sourceware.org/bugzilla/show_bug.cgi?id=26822

# RUN: split-file %s %t
# RUN: gcc -c %t/a.s -o %t/a.o
# RUN: gcc -c %t/b.s -o %t/b.o
# RUN: gcc -c %t/c.s -o %t/c.o
# RUN: ld-new %t/a.o %t/b.o %t/c.o -z unique-symbol -o %t.exe
#--- a.s
a: a.1: a.2: nop
#--- b.s
a: nop
#--- c.s
a: nop

readelf -Ws output:

Symbol table '.symtab' contains 13 entries:
Num: Value Size Type Bind Vis Ndx Name
0: 0000000000000000 0 NOTYPE LOCAL DEFAULT UND 1: 0000000000000000 0 FILE LOCAL DEFAULT ABS a.o
2: 0000000000401000 0 NOTYPE LOCAL DEFAULT 1 a
3: 0000000000401000 0 NOTYPE LOCAL DEFAULT 1 a.1
4: 0000000000401000 0 NOTYPE LOCAL DEFAULT 1 a.2
5: 0000000000000000 0 FILE LOCAL DEFAULT ABS b.o
6: 0000000000401001 0 NOTYPE LOCAL DEFAULT 1 a.1
7: 0000000000000000 0 FILE LOCAL DEFAULT ABS c.o
8: 0000000000401002 0 NOTYPE LOCAL DEFAULT 1 a.2
9: 0000000000000000 0 NOTYPE GLOBAL DEFAULT UND _start
10: 0000000000402000 0 NOTYPE GLOBAL DEFAULT 1 __bss_start
11: 0000000000402000 0 NOTYPE GLOBAL DEFAULT 1 _edata
12: 0000000000402000 0 NOTYPE GLOBAL DEFAULT 1 _end

Note that you have STT_FILE SHN_ABS symbols.
If the compiler does not produce them, they will be synthesized by GNU ld.

https://sourceware.org/bugzilla/show_bug.cgi?id=26822
ld.bfd copies non-STT_SECTION local symbols from input object files. If an
object file does not have STT_FILE symbols (no .file directive) but has
non-STT_SECTION local symbols, ld.bfd synthesizes a STT_FILE symbol

The filenames are usually base names, so "a.o" and "a.o" in two directories will
be indistinguishable. The live-patching code can possibly work around this by
not changing the relative order of the two "a.o".