Re: BUG: unable to handle kernel paging request from pty_write [was: Linux 4.4.2]

From: Peter Hurley
Date: Fri Feb 26 2016 - 12:52:24 EST


On 02/26/2016 12:56 AM, Jiri Slaby wrote:
> On 02/26/2016, 01:38 AM, Linus Torvalds wrote:
>> On Thu, Feb 25, 2016 at 1:32 PM, Jiri Slaby <jslaby@xxxxxxx> wrote:
>>>
>>> Interestingly, RBP contains address inside try_to_wake_up --
>>> ffffffff810a535a (dunno why) which is:
>>> ffffffff810a5355: e8 66 a0 ff ff callq ffffffff8109f3c0
>>> <ttwu_stat>
>>> ffffffff810a535a: e9 9d fe ff ff jmpq ffffffff810a51fc
>>> <try_to_wake_up+0x3c>
>>>
>>> ttwu_stat does in the beginning:
>>> mov $0x16e80,%r14
>>>
>>> which is what we actually still have in r14 when it crashes. The first
>>> ttwu_stat's "if" has to go through the true branch (otherwise r14 would
>>> be overwritten).
>>
>> Hmm. That does sound very much like it might be ttwu_stat() that has
>> gotten the stack frame wrong, and when it exits, it does
>>
>> popq %rbp
>> ret
>>
>> but in fact it popped the return address, and then returned to a crazy address.
>>
>> Which sounds like a corrupted stack pointer (not a corrupted stack).

So more analysis would seem to confirm that RSP was bumped by +8
while in ttwu_stat(), so when the epilog executed, the register restore
was off by 1 qword. However, there's nothing in ttwu_stat() that
would leave the stack pointer offset by +1 qword from the prolog.

Below I've highlighted the key instructions from try_to_wake_up() => ttwu_stat()
and what was presumably the resulting stack state at each instruction:


call try_to_wake_up          ffffffff810a5585   \
push rbp                     ffff8800bb2a7c90   |
push r15                     0000000000010e30   |
push r14                     0000000000000005   |
push r13                     ffff88017ed2a830   |- values from stack trace
push r12                     ffff880234e26a08   |
push rbx                     ffff88017ee19f00   |
sub 0x10, rsp                000000008146e197   /
                             ffff88023fd40000   => rip @ crash

call raw_spin_lock_irqsave
mov rax, r13
mov 0x16e80, r15

mov 1, r12d
call ttwu_stat               ffffffff810a535a   => rbp @ crash
push rbp                     ffff8800bb2a7c80   => r15 @ crash
push r15                     0000000000016e80   => r14 @ crash
push r14                     ffff8800bb37e180   => r13 @ crash
push r13                     0000000000000046   => r12 @ crash
push r12                     0000000000000001   => rbx @ crash
push rbx                     ???
sub 8, rsp                   ???


So in addition to rbp <= ret addr and r15 <= saved rbp, note also

rbx <= saved r12 (== 1)
r12 <= saved r13 (rflags == 00046)
r14 <= saved r15 (== 0x16e80)

which neatly corresponds to the ttwu_stat() epilog if rsp has
been offset by +1 qword.
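
To make the off-by-one-qword mapping easy to check, here is a small
sketch in Python that replays the epilog against the stack values listed
above. It is only an illustration: the epilog order (add 8 to rsp, then
pop rbx, r12, r13, r14, r15, rbp, then ret) is assumed to be the mirror
image of the prolog shown, and the names STACK, EPILOG_POPS and
run_epilog are made up for this example.

```python
# Sketch only: replay the presumed ttwu_stat() epilog with rsp skewed.
# Values come from the stack listing in this mail; the epilog order is an
# assumption (mirror image of the prolog), not taken from a disassembly.

# Stack slots from the highest address (index 0) to the lowest, at the
# point where ttwu_stat() has finished its prolog.
STACK = [
    ("lowest slot of try_to_wake_up frame", 0xffff88023fd40000),
    ("return address",                      0xffffffff810a535a),
    ("saved rbp",                           0xffff8800bb2a7c80),
    ("saved r15",                           0x16e80),
    ("saved r14",                           0xffff8800bb37e180),
    ("saved r13",                           0x46),
    ("saved r12",                           0x1),
    ("saved rbx",                           None),  # unknown ("???")
    ("local slot from sub 8, rsp",          None),  # unknown ("???")
]

# Assumed epilog: add 8, rsp; then these pops in order; "rip" models ret.
EPILOG_POPS = ["rbx", "r12", "r13", "r14", "r15", "rbp", "rip"]


def run_epilog(stack, skew_qwords):
    """Return {register: (slot label, value)} after the assumed epilog.

    skew_qwords is how many qwords rsp sits above where it should be
    when the epilog starts (0 means a healthy stack pointer).
    """
    idx = len(stack) - 1 - skew_qwords  # rsp at epilog entry
    idx -= 1                            # add 8, rsp (drop the local slot)
    regs = {}
    for reg in EPILOG_POPS:
        regs[reg] = stack[idx]          # pop: read the slot, move rsp up
        idx -= 1
    return regs


if __name__ == "__main__":
    for skew in (0, 1):
        print(f"rsp skew = {skew} qword(s)")
        for reg, (label, value) in run_epilog(STACK, skew).items():
            shown = "???" if value is None else f"{value:016x}"
            print(f"  {reg:3} <= {label:36} {shown}")
```

With a skew of 0 every register gets its own saved value and ret goes
back into try_to_wake_up(); with a skew of 1 each pop picks up its
neighbour's slot, which is the pattern in the crash registers.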


Regards,
Peter Hurley