Re: [RFC PATCH 4/6] mm: provide generic compat_sys_readahead() implementation

From: Dominik Brodowski
Date: Tue Mar 20 2018 - 04:57:10 EST


On Mon, Mar 19, 2018 at 11:23:42PM +0000, Al Viro wrote:
> static inline long C_S_moron(int, loff_t, size_t);
> long compat_SyS_moron(long a0, long a1, long a2, long a3, long a4, long a5, long a6)
> {
> return C_S_moron((__force int)a0,
> (__force loff_t)(((u64)a2 << 32)|a1),
> (__force size_t)a3);
> }
> static inline long C_S_moron(int fd, loff_t offset, size_t count)
> {
> whatever body you had for it
> }
>
> That - from
> COMPAT_SYSCALL_DEFINE3(moron, int, fd, loff_t, offset, size_t, count)
> {
> whatever body you had for it
> }
>
> We can use similar machinery for SYSCALL_DEFINE itself, so that
> SyS_moron() would be defined with (long, long, long, long, long, long)
> as arguments and not (long, long long, long) as we have now.

That would be great, as it would allow using a struct pt_regs * based
syscall calling convention on i386 as well, and not only on x86-64, right?
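
To make concrete what I mean by a pt_regs-based convention, here is a rough
userspace sketch, not the actual patch. struct fake_pt_regs, sys_call_ptr,
do_example() and wrap_example() are made-up names for illustration, and only
the six x86-64 argument registers are modelled; i386 would pull its arguments
from a different set of registers. The point is just that every table entry
ends up with one and the same type.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's struct pt_regs; only the
 * six x86-64 syscall argument registers are modelled here. */
struct fake_pt_regs {
	unsigned long di, si, dx, r10, r8, r9;
};

/* With a regs-based convention every table entry has this one type. */
typedef long (*sys_call_ptr)(const struct fake_pt_regs *);

/* The typed body (illustrative only, it just sums its arguments). */
static long do_example(int fd, int64_t offset, unsigned long count)
{
	return (long)(fd + offset + (int64_t)count);
}

/* Wrapper: pulls the arguments out of the saved registers and casts
 * them to the types the body expects. */
static long wrap_example(const struct fake_pt_regs *regs)
{
	return do_example((int)regs->di, (int64_t)regs->si,
			  (unsigned long)regs->dx);
}

/* A one-entry table; all entries share the sys_call_ptr type. */
static const sys_call_ptr example_table[] = { wrap_example };

static long dispatch(unsigned int nr, const struct fake_pt_regs *regs)
{
	if (nr >= sizeof(example_table) / sizeof(example_table[0]))
		return -38;	/* -ENOSYS on Linux */
	return example_table[nr](regs);
}
```

Calling dispatch(0, &regs) with di=3, si=100, dx=7 reaches do_example(3, 100, 7);
an out-of-range number never touches the table.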

> It's not impossible to do. It won't be pretty, but that use of local
> enums allows to avoid unbearably long expansions.
>
> Benefits:
> * all SyS... wrappers (i.e. the thing that really ought to
> go into syscall tables) have the same type.
> * we could have SYSCALL_DEFINE produce a trivial compat
> wrapper, have explicit COMPAT_SYSCALL_DEFINE discard that thing
> and populate the compat syscall table *entirely* with compat_SyS_...,
> letting the linker sort it out. That way we don't need to keep
> track of what can use native and what needs compat in each compat
> table on biarch.
> * s390 compat wrappers would disappear with that approach.
> * we could even stop generating sys_... aliases - if
> syscall table is generated by slapping SyS_... or compat_SyS_...
> on the name given there, we don't need to _have_ those sys_...
> things at all. All SyS_... would have the same type, so the pile
> in syscalls.h would not be needed - we could generate the externs
> at the same time we generate the syscall table.
>
> And yes, it's a high-squick approach. I know and I'm not saying
> it's a good idea. OTOH, to quote the motto of philosophers and
> shell game operators, "there's something in it"...

... and getting rid of all in-kernel calls to sys_*() is needed as
groundwork for that. So I'll continue to do that "mindless" conversion
first. On top of that, three things (which are mostly orthogonal to each
other) can be done:

1) ptregs system call conversion for x86-64

An original implementation by Linus exists; it needs a bit of tweaking
but should be doable soon. I need to double-check that it does the right
thing for IA32_EMULATION, though.

2) re-work initramfs etc. code to not use in-kernel equivalents of
syscalls, but operate on the VFS level instead.

3) re-work SYSCALL_DEFINEx() / COMPAT_SYSCALL_DEFINEx() based on
your suggestions.
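
For 3), the kind of wrapper the reworked macros would have to emit can be
sketched in plain userspace C as below. This is only an illustration under
assumptions: body_example(), compat_wrap_example() and the "seen" record are
hypothetical names, the argument slots are modelled as zero-extended 32-bit
values held in uint64_t, and the low/high ordering of the offset halves
(a1 low, a2 high) simply follows the quoted example; the real ordering is
ABI-specific.

```c
#include <stdint.h>

/* Records what the typed body saw, so the wrapper can be checked. */
static struct {
	int fd;
	int64_t offset;
	uint32_t count;
} seen;

/* The typed body, standing in for "whatever body you had for it". */
static long body_example(int fd, int64_t offset, uint32_t count)
{
	seen.fd = fd;
	seen.offset = offset;
	seen.count = count;
	return 0;
}

/* Uniform six-slot wrapper: every generated wrapper would have this
 * same signature.  Slots a1/a2 carry the low and high halves of the
 * 64-bit offset and are glued back together before the call. */
static long compat_wrap_example(uint64_t a0, uint64_t a1, uint64_t a2,
				uint64_t a3, uint64_t a4, uint64_t a5)
{
	(void)a4;	/* unused slots */
	(void)a5;
	return body_example((int)(uint32_t)a0,
			    (int64_t)((a2 << 32) | (a1 & 0xffffffffu)),
			    (uint32_t)a3);
}
```

Because the signature no longer depends on the syscall's real argument
types, such wrappers could all sit in one table without per-syscall
prototypes.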

Does that sound sensible?

Thanks,
Dominik