Re: in_compat_syscall() returns from kernel thread for X86_32.

From: NeilBrown
Date: Tue Oct 23 2018 - 23:03:42 EST


On Thu, Oct 18 2018, Andy Lutomirski wrote:

> On Wed, Oct 17, 2018 at 9:36 PM NeilBrown <neilb@xxxxxxxx> wrote:
>>
>> On Wed, Oct 17 2018, Andy Lutomirski wrote:
>>
>> > On Wed, Oct 17, 2018 at 6:48 PM NeilBrown <neilb@xxxxxxxx> wrote:
>> >>
>> >>
>> >> Was: Re: [tip:x86/asm] x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
>> >> On Tue, Apr 19 2016, tip-bot for Dmitry Safonov wrote:
>> >>
>> >> > Commit-ID: abfb9498ee1327f534df92a7ecaea81a85913bae
>> >> > Gitweb: http://git.kernel.org/tip/abfb9498ee1327f534df92a7ecaea81a85913bae
>> >> > Author: Dmitry Safonov <dsafonov@xxxxxxxxxxxxx>
>> >> > AuthorDate: Mon, 18 Apr 2016 16:43:43 +0300
>> >> > Committer: Ingo Molnar <mingo@xxxxxxxxxx>
>> >> > CommitDate: Tue, 19 Apr 2016 10:44:52 +0200
>> >> >
>> >> > x86/entry: Rename is_{ia32,x32}_task() to in_{ia32,x32}_syscall()
>> >> >
>> >> ...
>> >> > @@ -318,7 +318,7 @@ static inline bool is_x32_task(void)
>> >> >
>> >> > static inline bool in_compat_syscall(void)
>> >> > {
>> >> > - return is_ia32_task() || is_x32_task();
>> >> > + return in_ia32_syscall() || in_x32_syscall();
>> >> > }
>> >>
>> >> Hi,
>> >> I'm reply to this patch largely to make sure I get the right people
>> >> .....
>> >>
>> >> This test is always true when CONFIG_X86_32 is set, as that forces
>> >> in_ia32_syscall() to true.
>> >> However we might not be in a syscall at all - we might be running a
>> >> kernel thread which is always in 64 mode.
>> >> Every other implementation of in_compat_syscall() that I found is
>> >> dependant on a thread flag or syscall register flag, and so returns
>> >> "false" in a kernel thread.
>> >>
>> >> Might something like this be appropriate?
>> >>
>> >> diff --git a/arch/x86/include/asm/thread_info.h b/arch/x86/include/asm/thread_info.h
>> >> index 2ff2a30a264f..c265b40a78f2 100644
>> >> --- a/arch/x86/include/asm/thread_info.h
>> >> +++ b/arch/x86/include/asm/thread_info.h
>> >> @@ -219,7 +219,7 @@ static inline int arch_within_stack_frames(const void * const stack,
>> >> #ifndef __ASSEMBLY__
>> >>
>> >> #ifdef CONFIG_X86_32
>> >> -#define in_ia32_syscall() true
>> >> +#define in_ia32_syscall() (!(current->flags & PF_KTHREAD))
>> >> #else
>> >> #define in_ia32_syscall() (IS_ENABLED(CONFIG_IA32_EMULATION) && \
>> >> current_thread_info()->status & TS_COMPAT)
>> >>
>> >> This came up in the (no out-of-tree) lustre filesystem where some code
>> >> needs to assume 32-bit mode in X86_32 syscalls, and 64-bit mode in kernel
>> >> threads.
>> >>
>> >
>> > I could get on board with:
>> >
>> > ({WARN_ON_ONCE(current->flags & PF_KTHREAD); true})
>> >
>> > The point of these accessors is to be used *in a syscall*.
>> >
>> > What on Earth is Lustre doing that makes it have this problem?
>>
>> Lustre uses it in the ->getattr method to make sure ->ino, ->dev and
>> ->rdev are appropriately sized. This isn't very different from the
>> usage in ext4 to ensure the seek offset for directories is suitable.
>>
>> These interfaces can be used both from systemcalls and from kernel
>> threads, such as via nfsd.
>>
>> I don't *know* if nfsd is the particular kthread that causes problems
>> for lustre. All I know is that ->getattr returns 32bit squashed inode
>> numbers in kthread context where 64 bit numbers would be expected.
>>
>
> Well, that looks like Lustre is copying an ext4 bug.

I doubt it was copied - more likely independent evolution.
But on reflection, I see that it is probably reasonable that it
shouldn't be used this way - or at all in this context.
I'll try to understand what the issues really are and see if I can
find a solution that doesn't depend on this interface.
Thanks for your help.

NeilBrown


>
> Hi ext4 people-
>
> ext4's is_32bit_api() function is bogus. You can't use
> in_compat_syscall() unless you know you're in a syscall
>
> The buggy code was introduced in:
>
> commit d1f5273e9adb40724a85272f248f210dc4ce919a
> Author: Fan Yong <yong.fan@xxxxxxxxxxxxx>
> Date: Sun Mar 18 22:44:40 2012 -0400
>
> ext4: return 32/64-bit dir name hash according to usage type
>
> I don't know what the right solution is. Al, is it legit at all for
> fops->llseek to care about the caller's bitness? If what ext4 is
> doing is legit, then ISTM the VFS needs to gain a new API to tell
> ->llseek what to do. But I'm wondering why FMODE_64BITHASH by itself
> isn't sufficient,
>
> I'm quite tempted to add a warning to the x86 arch code to try to
> catch this type of bug. Fortunately, a bit of grepping suggests that
> ext4 is the only filesystem with this problem.
>
> --Andy

Attachment: signature.asc
Description: PGP signature