Re: [PATCH 1/2] binfmt_elf: FatELF support in the binary loader.

From: Anton D. Kachalov
Date: Sat Oct 24 2009 - 05:00:21 EST


Hello, Ryan.

Ryan C. Gordon wrote:
Wow, competing ideas! :)

Here are my notes on your idea. Ego compels me to prefer my approach, but I strove to be objective here, as there is a tradeoff of benefits in each of our approaches.
Thanks :) It was born out of Apple's concept, utilizing the unused space in ELF headers.
It should work with "setarch" too, to force the selection of a binary.

How does setarch work? Does it reorder the file before launching or copy out one of the ELF records?

$ man setarch
NAME
setarch - change reported architecture in new program environment and
set personality flags

$ dpkg -S /usr/bin/setarch
util-linux: /usr/bin/setarch

"setarch" just sets the "personality" of the program it runs.

[...]
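As a rough illustration of the setarch behaviour described above, here is a minimal C sketch built on the personality(2) call from glibc's <sys/personality.h>. The function names persona_domain and run_as_linux32 are made up for this example; it is not the util-linux source, only an approximation of the idea.

```c
#include <stdio.h>
#include <sys/personality.h>
#include <unistd.h>

/* The low byte of a personality value is the execution domain
 * (PER_LINUX, PER_LINUX32, ...); the upper bits are flag bits
 * such as ADDR_NO_RANDOMIZE. */
unsigned long persona_domain(unsigned long persona)
{
    return persona & PER_MASK;
}

/* Roughly what "setarch linux32 cmd args..." amounts to: switch the
 * execution domain, then exec the command. The personality is kept
 * across execve(), so the new program (and the binfmt loader that
 * maps it) sees the changed value. */
int run_as_linux32(char *const argv[])
{
    if (personality(PER_LINUX32) == -1) {
        perror("personality");
        return -1;
    }
    execvp(argv[0], argv);
    perror("execvp");   /* only reached if the exec failed */
    return -1;
}
```

Nothing in the file is reordered or copied out; the only state that changes is the per-process personality value.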
The most compelling feature of this approach is that a "truearch" binary (is that the correct name?) could work with any existing Linux system, on the condition that the architecture you want is the first one in the file.
Nope, you may put binary files in any order. It's just a linked list of binaries.
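To make the "linked list of binaries" idea concrete, here is a small C sketch that walks such a chain in memory. The position and encoding of the link (HYP_NEXT_OFF, a 32-bit little-endian absolute offset in the e_ident padding, with 0 ending the chain) are assumptions made for this example only; the actual patch may store it differently. The e_machine position (byte 18) comes from the standard ELF header layout.

```c
#include <elf.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* ASSUMPTION: hypothetical location of the "next image" offset inside
 * the e_ident padding bytes; not taken from the patch. */
#define HYP_NEXT_OFF 12

static uint32_t rd_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* e_machine follows e_ident (16 bytes) and e_type (2 bytes) and is
 * stored in the byte order named by e_ident[EI_DATA]. */
static uint16_t rd_machine(const unsigned char *hdr)
{
    if (hdr[EI_DATA] == ELFDATA2MSB)
        return (uint16_t)(hdr[18] << 8 | hdr[19]);
    return (uint16_t)(hdr[18] | hdr[19] << 8);
}

/* Follow the chain from offset 0 and return the file offset of the
 * first image built for `machine`, or -1 if there is none. */
long find_image(const unsigned char *file, size_t len, uint16_t machine)
{
    size_t off = 0;

    for (;;) {
        if (off > len || len - off < 20)
            return -1;                  /* header would be truncated */
        const unsigned char *hdr = file + off;
        if (memcmp(hdr, ELFMAG, SELFMAG) != 0)
            return -1;                  /* not an ELF image */
        if (rd_machine(hdr) == machine)
            return (long)off;
        uint32_t next = rd_le32(hdr + HYP_NEXT_OFF);
        if (next == 0 || next <= off)
            return -1;                  /* end of chain; forward links only */
        off = next;
    }
}
```

Because the loader follows links instead of assuming a position, the order of the images in the file does not matter to a patched kernel; an unpatched one still only looks at the first header.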
If you put, say, x86 first in the file and you want to run it on an x86_64 system, you're either out of luck or going to be running the 32-bit version.
As stated previously, you will be able to run the x86_64 one. But you need to change the order of binfmt and compat_binfmt in built-in.o by changing the Makefile. Just swap two lines. I don't know why an x86_64 system tries compat mode first and native second, when a simple run of a native app then takes more CPU cycles on x86_64 than on x86.
In this same scenario, if you put x86_64 first, it just won't run at all on an unpatched x86 box. So, it's a cool trick, but it's not all that beneficial. We have to assume that either approach requires kernel patches to be truly useful. For unpatched boxes, FatELF provides a simple command line app, fatelf-extract, which can be used to get the original ELF binary you want out of the FatELF file, both for stripping unwanted bits and as a measure of last resort if the kernel and dynamic loader can't handle FatELF. I assume setarch works somewhat the same.
Which arch will "fatelf-extract" be built for? Say I'm running Linux on PowerPC? x86? =) Only if it is a shell script will it be beneficial for any arch. I can inject the "offset" portion into a script file too...
I'm concerned about using the padding bits in e_ident, too. A lot of manpower went into the ELF specification and I felt it was presumptuous for me to personally change the format. A container around them, like FatELF, was a safer, more future-proof choice. I'd rather those that control the ELF spec decide what those padding bits should be used for in the future.

The truearch method requires the kernel to seek throughout the whole file
Nope, it just reads the "offset" field and seeks if needed. So, if the file is single-arch, it will read 128 bytes only.
to decide if it can use it at all. FatELF uses the 128 bytes at the front of the file, which binfmt_elf reads anyhow, and then seeks to the right record from there, so disk bandwidth overhead is extremely small (one extra read of 128 bytes if we can use the file, zero extra reads if not).
In my approach, it's just a few more seeks. A few additional reads are not much compared to the overall reads from that file.
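For comparison, the FatELF-style lookup described above (a table at the front of the file, one seek to the chosen record) might look roughly like the following C sketch. The field layout and magic value used here are assumptions for illustration and should be checked against the FatELF specification; fat_lookup and struct fat_hit are names invented for this example, and the sketch matches on the machine type only.

```c
#include <stddef.h>
#include <stdint.h>

#define HDR_BYTES 128            /* the block binfmt_elf reads anyway */
#define HYP_MAGIC 0x1F0E70FAu    /* ASSUMPTION: container magic */
#define HYP_REC_START 8          /* ASSUMPTION: records follow an 8-byte header */
#define HYP_REC_SIZE 24          /* ASSUMPTION: bytes per record */

struct fat_hit {
    uint64_t offset;             /* where the chosen ELF image starts */
    uint64_t size;               /* its length in bytes */
};

/* Read an n-byte little-endian integer. */
static uint64_t rd_le(const unsigned char *p, int n)
{
    uint64_t v = 0;
    for (int i = n - 1; i >= 0; i--)
        v = v << 8 | p[i];
    return v;
}

/* Scan the record table held in the first HDR_BYTES of the file.
 * Returns 0 and fills *out on a match, -1 otherwise. No further
 * reads are needed to decide whether the file is usable. */
int fat_lookup(const unsigned char hdr[HDR_BYTES], uint16_t machine,
               struct fat_hit *out)
{
    if (rd_le(hdr, 4) != HYP_MAGIC)
        return -1;
    unsigned n = hdr[6];         /* ASSUMPTION: record count at byte 6 */
    if (HYP_REC_START + n * HYP_REC_SIZE > HDR_BYTES)
        return -1;               /* table must fit in the first block */
    for (unsigned i = 0; i < n; i++) {
        const unsigned char *r = hdr + HYP_REC_START + i * HYP_REC_SIZE;
        if (rd_le(r, 2) == machine) {
            out->offset = rd_le(r + 8, 8);
            out->size = rd_le(r + 16, 8);
            return 0;
        }
    }
    return -1;
}
```

The difference from the chained layout is where the decision is made: here every candidate is visible in the first block, while a chain needs one extra header read per image it skips.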

[...]
Both approaches have zero disk overhead if a normal ELF file is loaded, which is good.


In terms of this patch itself, I'd be concerned about using gotos for the retry_* blocks when a loop would be easy enough to incorporate. I saw you have a test for personality() that I didn't do; I might have to check into that, but the binfmt_elf_compat code is definitely catching x86 binaries on x86_64 here, so I'm not sure it's necessary.
It's necessary if you would like to use setarch to choose binaries on biarch systems.
Anyhow, I hope this was useful commentary, and not seen as a battle of egos. I'm glad to see other approaches, though, as it suggests there really is a genuine desire for this sort of functionality!
:) Agreed

Rgds,
Anton