Re: Wake-up from suspend to RAM broken under `retbleed=stuff`

From: Rafael J. Wysocki
Date: Wed Jan 11 2023 - 12:56:21 EST


On Wed, Jan 11, 2023 at 12:20 PM Peter Zijlstra <peterz@xxxxxxxxxxxxx> wrote:
>
> On Mon, Jan 09, 2023 at 04:05:31AM +0000, Joan Bruguera wrote:
> > This fixes wakeup for me on both QEMU and real HW
> > (just a proof of concept, don't merge)
> >
> > diff --git a/arch/x86/kernel/callthunks.c b/arch/x86/kernel/callthunks.c
> > index ffea98f9064b..8704bcc0ce32 100644
> > --- a/arch/x86/kernel/callthunks.c
> > +++ b/arch/x86/kernel/callthunks.c
> > @@ -7,6 +7,7 @@
> >  #include <linux/memory.h>
> >  #include <linux/moduleloader.h>
> >  #include <linux/static_call.h>
> > +#include <linux/suspend.h>
> >
> >  #include <asm/alternative.h>
> >  #include <asm/asm-offsets.h>
> > @@ -150,6 +151,10 @@ static bool skip_addr(void *dest)
> >  	if (dest >= (void *)hypercall_page &&
> >  	    dest < (void*)hypercall_page + PAGE_SIZE)
> >  		return true;
> > +#endif
> > +#ifdef CONFIG_PM_SLEEP
> > +	if (dest == restore_processor_state)
> > +		return true;
> >  #endif
> >  	return false;
> >  }
> > diff --git a/arch/x86/power/cpu.c b/arch/x86/power/cpu.c
> > index 236447ee9beb..e667894936f7 100644
> > --- a/arch/x86/power/cpu.c
> > +++ b/arch/x86/power/cpu.c
> > @@ -281,6 +281,9 @@ static void notrace __restore_processor_state(struct saved_context *ctxt)
> >  /* Needed by apm.c */
> >  void notrace restore_processor_state(void)
> >  {
> > +	/* Restore GS before calling anything to avoid crash on call depth accounting */
> > +	native_wrmsrl(MSR_GS_BASE, saved_context.kernelmode_gs_base);
> > +
> >  	__restore_processor_state(&saved_context);
> >  }
>
> Yeah, I can see why, but I'm not really comfortable with this. TBH, I
> don't see how the whole resume code is correct to begin with. At the
> very least it needs a heavy dose of noinstr.
>
> Rafael, what cr3 is active when we call restore_processor_state()?

On resume from suspend-to-RAM, the one that was saved by
__save_processor_state() AFAICS.

On resume from hibernation, it looks like this is the one that was
used by the restore kernel.

> Specifically, the problem is that I don't feel comfortable doing any
> sort of weird code until all the CR and segment registers have been
> restored, however, write_cr*() are paravirt functions that result in
> CALL, which then gives us a bit of a chicken-and-egg problem.
>
> I'm also wondering how well retbleed=stuff works on Xen, if at all. If
> we can ignore Xen, things are a little easier perhaps.