Re: CONFIG_VMAP_STACK, on-stack struct, and wake_up_bit

From: Borislav Petkov
Date: Wed Oct 26 2016 - 18:46:19 EST


On Wed, Oct 26, 2016 at 02:30:49PM -0700, Linus Torvalds wrote:
> Ok, similar issue, I think - passing a non-1:1 address to __phys_addr().
>
> But the call trace has nothing to do with gfs2 or the bitlocks:
>
> > [ 2.504561] Call Trace:
> > [ 2.507005] save_microcode_in_initrd_amd+0x31/0x106
> > [ 2.513778] save_microcode_in_initrd+0x3c/0x45
> > [ 2.526110] do_one_initcall+0x50/0x180
> > [ 2.531756] ? set_debug_rodata+0x12/0x12
> > [ 2.537573] kernel_init_freeable+0x194/0x230
> > [ 2.543740] ? rest_init+0x80/0x80
> > [ 2.548952] kernel_init+0xe/0x100
> > [ 2.554164] ret_from_fork+0x25/0x30
>
> I think this might be the
>
> cont = __pa(container);
>
> line in save_microcode_in_initrd_amd().
>
> I see that Borislav is busy with some x86/microcode patches, I suspect
> he already hit this. Adding Borislav to the cc.

Hmm, I guess that fires because that container thing is a static pointer
so it is >= PAGE_OFFSET. But I might be wrong, it is too late here for
my brain to work.

In any case, looking at his Code:

0: 48 89 f8 mov %rdi,%rax
3: 72 28 jb 0x2d
5: 48 2b 05 7b a0 dc 00 sub 0xdca07b(%rip),%rax # 0xdca087
c: 48 05 00 00 00 80 add $0xffffffff80000000,%rax
12: 48 39 c7 cmp %rax,%rdi
^^^^^^^^^^^^^^^^

it could be this comparison here:

RAX: fffff39132a822fc, RDI: ffff8800b2a822fc

15: 72 14 jb 0x2b

... which sends us to the UD2.

17: 0f b6 0d 6a 75 ee 00 movzbl 0xee756a(%rip),%ecx # 0xee7588

We might end up at 0x2b from here too - that's !phys_addr_valid(x) - but
ECX is 0 while it should be 36...

1e: 48 89 c2 mov %rax,%rdx
21: 48 d3 ea shr %cl,%rdx
24: 48 85 d2 test %rdx,%rdx
27: 75 02 jne 0x2b
29: 5d pop %rbp
2a: c3 retq
2b:* 0f 0b ud2 <-- trapping instruction
2d: 48 03 05 7b 5b da 00 add 0xda5b7b(%rip),%rax # 0xda5baf
34: 48 81 ff ff ff ff 3f cmp $0x3fffffff,%rdi
3b: 76 ec jbe 0x29
3d: 0f 0b ud2
3f: 0f .byte 0xf
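
For anyone following along, here is a minimal userspace sketch of the two
branches discussed above. It only models the flag logic of the quoted
instructions using the RAX/RDI values from the oops; it is not the kernel's
__phys_addr() source. The helper names jb_taken() and jne_taken() are made
up for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Register values copied from the oops quoted above. */
#define OOPS_RAX 0xfffff39132a822fcULL
#define OOPS_RDI 0xffff8800b2a822fcULL

/*
 * Models "cmp %rax,%rdi ; jb": jb branches on the carry flag, i.e.
 * when RDI is below RAX in an unsigned comparison.
 * (Hypothetical helper, for illustration only.)
 */
static bool jb_taken(uint64_t rdi, uint64_t rax)
{
	return rdi < rax;
}

/*
 * Models "mov %rax,%rdx ; shr %cl,%rdx ; test %rdx,%rdx ; jne":
 * the branch is taken when any bit survives the right shift.
 * A 64-bit shr masks the count in CL to its low 6 bits.
 * (Hypothetical helper, for illustration only.)
 */
static bool jne_taken(uint64_t rax, unsigned int cl)
{
	return (rax >> (cl & 63)) != 0;
}
```

With the values from the oops, jb_taken(OOPS_RDI, OOPS_RAX) is true, so the
branch at 0x15 goes to the UD2. With a shift count of 0, jne_taken() is true
for any non-zero RAX, so the check at 0x27 would land there as well.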

But again, I could be already sleeping and this could be me talking in
my sleep so don't take it too seriously.

In any case, this code was flaky and fragile for many reasons and it is
why this whole wankery is gone in the microcode loader now.

> Can you re-try without the AMD microcode driver for now?

Yeah, just boot with "dis_ucode_ldr".

--
Regards/Gruss,
Boris.

SUSE Linux GmbH, GF: Felix Imendörffer, Jane Smithard, Graham Norton, HRB 21284 (AG Nürnberg)
--