Re: [tip:x86/asm] x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() and fix it on x86_32

From: Andy Lutomirski
Date: Mon Mar 09 2015 - 09:15:55 EST


On Mon, Mar 9, 2015 at 6:04 AM, Denys Vlasenko <vda.linux@xxxxxxxxxxxxxx> wrote:
> On Sat, Mar 7, 2015 at 9:37 AM, tip-bot for Andy Lutomirski
> <tipbot@xxxxxxxxx> wrote:
>> Commit-ID: a7fcf28d431ef70afaa91496e64e16dc51dccec4
>> Gitweb: http://git.kernel.org/tip/a7fcf28d431ef70afaa91496e64e16dc51dccec4
>> Author: Andy Lutomirski <luto@xxxxxxxxxxxxxx>
>> AuthorDate: Fri, 6 Mar 2015 17:50:19 -0800
>> Committer: Ingo Molnar <mingo@xxxxxxxxxx>
>> CommitDate: Sat, 7 Mar 2015 09:34:03 +0100
>>
>> x86/asm/entry: Replace this_cpu_sp0() with current_top_of_stack() and fix it on x86_32
>>
>> I broke 32-bit kernels. The implementation of sp0 was correct
>> as far as I can tell, but sp0 was much weirder on x86_32 than I
>> realized. It has the following issues:
>>
>> - Init's sp0 is inconsistent with everything else's: non-init tasks
>> are offset by 8 bytes. (I have no idea why, and the comment is unhelpful.)
>>
>> - vm86 does crazy things to sp0.
>>
>> Fix it up by replacing this_cpu_sp0() with
>> current_top_of_stack() and using a new percpu variable to track
>> the top of the stack on x86_32.
>
> Looks like the hope that tss.sp0 is a reliable variable
> which points to top of stack didn't really play out :(
>
> Recent relevant commits in x86/entry were:
>
> x86/asm/entry: Add this_cpu_sp0() to read sp0 for the current cpu
> - added accessor to tss.sp0
> "We currently store references to the top of the kernel stack in
> multiple places: kernel_stack (with an offset) and
> init_tss.x86_tss.sp0 (no offset). The latter is defined by
> hardware and is a clean canonical way to find the top of the
> stack. Add an accessor so we can start using it."
>
> x86/asm/entry: Switch all C consumers of kernel_stack to this_cpu_sp0()
> - equivalent change, no win/no loss
>
> x86/asm/entry/64/compat: Change the 32-bit sysenter code to use sp0
> - Even though it did remove one insn, we can get the same
> if KERNEL_STACK_OFFSET is eliminated
>
> x86: Delay loading sp0 slightly on task switch
> - simple fix, nothing needed to be added
>
> x86: Replace this_cpu_sp0 with current_top_of_stack and fix it on x86_32
> - added a percpu var cpu_current_top_of_stack
> - needs to set it in do_boot_cpu()
> - added ifdef forest:
> +#ifdef CONFIG_X86_64
> return this_cpu_read_stable(cpu_tss.x86_tss.sp0);
> +#else
> + /* sp0 on x86_32 is special in and around vm86 mode. */
> + return this_cpu_read_stable(cpu_current_top_of_stack);
> +#endif
>
>
>
> End result: the 32-bit kernel now has two per-cpu variables,
> cpu_current_top_of_stack and kernel_stack.
>
> cpu_current_top_of_stack is essentially "real top of stack",
> and kernel_stack is "real top of stack - KERNEL_STACK_OFFSET".
>
> When/if we get rid of KERNEL_STACK_OFFSET,
> we can also get rid of kernel_stack, since it will be the same as
> cpu_current_top_of_stack (which is a better name anyway).

Exactly.

I think the next step might be to decouple GET_THREAD_INFO and friends
from kernel_stack. I think that might be enough to get rid of
kernel_stack on 32-bit. 64-bit has two other remaining users: the syscall
entries.
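
To make the relationship between the two per-cpu variables concrete, here is a rough userspace model, not kernel code. The variable names mirror the ones discussed above, but the helper functions, the constant values, and the assumption that thread_info sits at the base of the stack are illustrative only.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative values only; the real constants depend on the kernel config. */
#define THREAD_SIZE          8192UL
#define KERNEL_STACK_OFFSET  40UL

/* Plain globals standing in for the two per-cpu variables. */
static uintptr_t cpu_current_top_of_stack;  /* "real top of stack" */
static uintptr_t kernel_stack;              /* top minus KERNEL_STACK_OFFSET */

/* Hypothetical helper: what a task switch would have to update in this model. */
static void model_switch_to(uintptr_t stack_base)
{
    /* The model assumes stacks are THREAD_SIZE-aligned. */
    assert(stack_base % THREAD_SIZE == 0);

    cpu_current_top_of_stack = stack_base + THREAD_SIZE;
    kernel_stack = cpu_current_top_of_stack - KERNEL_STACK_OFFSET;
}

/* thread_info lookup that goes through kernel_stack and must undo the offset. */
static uintptr_t thread_info_from_kernel_stack(void)
{
    return kernel_stack + KERNEL_STACK_OFFSET - THREAD_SIZE;
}

/* The same lookup expressed against the real top of the stack. */
static uintptr_t thread_info_from_top_of_stack(void)
{
    return cpu_current_top_of_stack - THREAD_SIZE;
}
```

In this model both lookups return the same address after any model_switch_to() call, which is the sense in which kernel_stack becomes redundant once its users are moved over or KERNEL_STACK_OFFSET goes away.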

--Andy