Re: objtool warning for next-20221118

From: Juergen Gross
Date: Thu Nov 24 2022 - 02:48:04 EST


On 24.11.22 06:28, Juergen Gross wrote:
On 24.11.22 03:39, Josh Poimboeuf wrote:
On Wed, Nov 23, 2022 at 09:03:40AM -0800, Josh Poimboeuf wrote:
On Wed, Nov 23, 2022 at 10:52:09AM +0000, Andrew Cooper wrote:
Well, if you return from arch_cpu_idle_dead() you're back in the idle
loop -- exactly where you would be if you were to bootstrap the whole
CPU -- provided you have it remember the whole state (easier with a
vCPU).

play_dead() really needs sane semantics.  Not only does it introduce a
surprise to the offlining code in do_idle(), it also skips the entire
hotplug state machine.  Not sure if that introduces any bugs, but at the
very least it's subtle and surprising.

But maybe I'm missing something, lets add Xen folks on.

Calling VCPUOP_down on oneself always succeeds, but all it does is
deschedule the vCPU.

It can be undone at a later point by a different vcpu issuing VCPUOP_up
against the previously-downed CPU, at which point the vCPU gets rescheduled.

This is why the VCPUOP_down hypercall returns normally.  All state
really is intact.

As for what Linux does, this is how xen_pv_cpu_up() currently behaves.
If you want to make Xen behave more everything else, then bug a BUG()
after VCPUOP_down, and adjust xen_pv_cpu_up() to skip its initialised
check and always use VCPUOP_initialise to bring the vCPU back online.

Or we could do what sev_es_play_dead() does and just call start_cpu0()
after the hypercall returns?

Something like so (untested).  This is only the x86 bits.

I think I convinced myself that start_cpu0() isn't buggy.  I'm looking
at other cleanups, e.g. converging cpu_bringup_and_idle() with
start_secondary().

I can pick it up again next week, post-turkey.

diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h
index b4dbb20dab1a..e6d1d2810e38 100644
--- a/arch/x86/include/asm/smp.h
+++ b/arch/x86/include/asm/smp.h
@@ -93,9 +93,10 @@ static inline void __cpu_die(unsigned int cpu)
      smp_ops.cpu_die(cpu);
  }
-static inline void play_dead(void)
+static inline void __noreturn play_dead(void)
  {
      smp_ops.play_dead();
+    BUG();
  }
  static inline void smp_send_reschedule(int cpu)
diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c
index 26e8f57c75ad..8e2841deb1eb 100644
--- a/arch/x86/kernel/process.c
+++ b/arch/x86/kernel/process.c
@@ -700,7 +700,7 @@ EXPORT_SYMBOL(boot_option_idle_override);
  static void (*x86_idle)(void);
  #ifndef CONFIG_SMP
-static inline void play_dead(void)
+static inline void __noreturn play_dead(void)
  {
      BUG();
  }
diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index 55cad72715d9..d8b12ac1a7c5 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1833,9 +1833,12 @@ void native_play_dead(void)
      play_dead_common();
      tboot_shutdown(TB_SHUTDOWN_WFS);
-    mwait_play_dead();    /* Only returns on failure */
+    mwait_play_dead();    /* Only returns if mwait is not supported */
+
      if (cpuidle_play_dead())
          hlt_play_dead();
+
+    BUG();
  }
  #else /* ... !CONFIG_HOTPLUG_CPU */
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 480be82e9b7b..30dc904ca990 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -385,17 +385,9 @@ static void xen_pv_play_dead(void) /* used only with HOTPLUG_CPU */
  {
      play_dead_common();
      HYPERVISOR_vcpu_op(VCPUOP_down, xen_vcpu_nr(smp_processor_id()), NULL);
-    cpu_bringup();
-    /*
-     * commit 4b0c0f294 (tick: Cleanup NOHZ per cpu data on cpu down)
-     * clears certain data that the cpu_idle loop (which called us
-     * and that we return from) expects. The only way to get that
-     * data back is to call:
-     */
-    tick_nohz_idle_enter();
-    tick_nohz_idle_stop_tick_protected();
-    cpuhp_online_idle(CPUHP_AP_ONLINE_IDLE);
+    /* FIXME: converge cpu_bringup_and_idle() and start_secondary() */
+    cpu_bringup_and_idle();

I think this will leak stack memory. Multiple cpu offline/online cycles of
the same cpu will finally exhaust the idle stack.

The attached patch seems to work fine.

The __noreturn annotation seems to trigger an objtool warning, though, in
spite of the added BUG() at the end of xen_pv_play_dead():

arch/x86/xen/smp_pv.o: warning: objtool: xen_pv_play_dead() falls through to next function xen_pv_cpu_die()


Juergen

From 7ebb76ec0ed32b657ceda530b620ef563ac0f212 Mon Sep 17 00:00:00 2001
From: Juergen Gross <jgross@xxxxxxxx>
Date: Thu, 24 Nov 2022 08:09:45 +0100
Subject: [PATCH] x86/xen: don't let xen_pv_play_dead() return

A function called via the paravirt play_dead() hook should not return
to the caller.

xen_pv_play_dead() has a problem in this regard, as it currently will
return in case an offlined cpu is brought to life again. This can be
changed only by doing basically a longjmp() to cpu_bringup_and_idle(),
as the hypercall for bringing down the cpu will just return when the
cpu is coming up again. Just re-initializing the cpu isn't possible,
as the Xen hypervisor will deny that operation.

So introduce xen_cpu_bringup_again() resetting the stack and calling
cpu_bringup_and_idle(), which can be called after HYPERVISOR_vcpu_op()
in xen_pv_play_dead().

Annotate xen_pv_play_dead() and xen_cpu_bringup_again() with
"__noreturn".

Signed-off-by: Juergen Gross <jgross@xxxxxxxx>
---
arch/x86/xen/smp.h | 2 ++
arch/x86/xen/smp_pv.c | 17 ++++-------------
arch/x86/xen/xen-head.S | 7 +++++++
3 files changed, 13 insertions(+), 13 deletions(-)

diff --git a/arch/x86/xen/smp.h b/arch/x86/xen/smp.h
index bd02f9d50107..22fb982ff971 100644
--- a/arch/x86/xen/smp.h
+++ b/arch/x86/xen/smp.h
@@ -21,6 +21,8 @@ void xen_smp_send_reschedule(int cpu);
void xen_smp_send_call_function_ipi(const struct cpumask *mask);
void xen_smp_send_call_function_single_ipi(int cpu);

+void __noreturn xen_cpu_bringup_again(unsigned long stack);
+
struct xen_common_irq {
int irq;
char *name;
diff --git a/arch/x86/xen/smp_pv.c b/arch/x86/xen/smp_pv.c
index 480be82e9b7b..5801f93d567c 100644
--- a/arch/x86/xen/smp_pv.c
+++ b/arch/x86/xen/smp_pv.c
@@ -381,21 +381,12 @@ static void xen_pv_cpu_die(unsigned int cpu)
}
}

-static void xen_pv_play_dead(void) /* used only with HOTPLUG_CPU */
+static void __noreturn xen_pv_play_dead(void) /* used only with HOTPLUG_CPU */
{
play_dead_common();
HYPERVISOR_vcpu_op(VCPUOP_down, xen_vcpu_nr(smp_processor_id()), NULL);
- cpu_bringup();
- /*
- * commit 4b0c0f294 (tick: Cleanup NOHZ per cpu data on cpu down)
- * clears certain data that the cpu_idle loop (which called us
- * and that we return from) expects. The only way to get that
- * data back is to call:
- */
- tick_nohz_idle_enter();
- tick_nohz_idle_stop_tick_protected();
-
- cpuhp_online_idle(CPUHP_AP_ONLINE_IDLE);
+ xen_cpu_bringup_again((unsigned long)task_pt_regs(current));
+ BUG();
}

#else /* !CONFIG_HOTPLUG_CPU */
@@ -409,7 +400,7 @@ static void xen_pv_cpu_die(unsigned int cpu)
BUG();
}

-static void xen_pv_play_dead(void)
+static void __noreturn xen_pv_play_dead(void)
{
BUG();
}
diff --git a/arch/x86/xen/xen-head.S b/arch/x86/xen/xen-head.S
index ffaa62167f6e..e36ea4268bd2 100644
--- a/arch/x86/xen/xen-head.S
+++ b/arch/x86/xen/xen-head.S
@@ -76,6 +76,13 @@ SYM_CODE_START(asm_cpu_bringup_and_idle)

call cpu_bringup_and_idle
SYM_CODE_END(asm_cpu_bringup_and_idle)
+
+SYM_CODE_START(xen_cpu_bringup_again)
+ UNWIND_HINT_FUNC
+ mov %rdi, %rsp
+ UNWIND_HINT_REGS
+ call cpu_bringup_and_idle
+SYM_CODE_END(xen_cpu_bringup_again)
.popsection
#endif
#endif
--
2.35.3

Attachment: OpenPGP_0xB0DE9DD628BF132F.asc
Description: OpenPGP public key

Attachment: OpenPGP_signature
Description: OpenPGP digital signature