Re: [RFC PATCH] sched: Invoke io_wq_worker_sleeping() with enabled preemption

From: peterz
Date: Wed Aug 19 2020 - 10:22:02 EST


On Wed, Aug 19, 2020 at 03:33:20PM +0200, Sebastian Andrzej Siewior wrote:
> On 2020-08-19 15:15:07 [+0200], peterz@xxxxxxxxxxxxx wrote:

> If you want to optimize further, we could move PF_IO_WORKER to an lower
> bit. x86 can test for both via
> (gcc-10)
> | testl $536870944, 44(%rbp) #, _11->flags
> | jne .L1635 #,
>
> (clang-9)
> | testl $536870944, 44(%rbx) # imm = 0x20000020
> | je .LBB112_6
>
>
> but ARM can't and does
> | ldr r1, [r5, #16] @ tsk_3->flags, tsk_3->flags
> | mov r2, #32 @ tmp157,
> | movt r2, 8192 @ tmp157,
> | tst r2, r1 @ tmp157, tsk_3->flags
> | beq .L998 @,
>
> same ARM64
> | ldr w0, [x20, 60] //, _11->flags
> | and w0, w0, 1073741792 // tmp117, _11->flags,
> | and w0, w0, -536870849 // tmp117, tmp117,
> | cbnz w0, .L453 // tmp117,
>
> using 0x10 for PF_IO_WORKER instead will turn this into:
> | ldr w0, [x20, 60] //, _11->flags
> | tst w0, 48 // _11->flags,
> | bne .L453 //,
>
> ARM:
> | ldr r2, [r5, #16] @ tsk_3->flags, tsk_3->flags
> | tst r2, #48 @ tsk_3->flags,
> | beq .L998 @,

Good point, AFAICT there's a number of low bits still open (and we can
shuffle if we have to), so sure put a patch in to that effect while
you're at it.