Re: wake_wide mechanism clarification

From: Joel Fernandes
Date: Mon Jul 31 2017 - 13:55:25 EST


On Mon, Jul 31, 2017 at 9:42 AM, Josef Bacik <josef@xxxxxxxxxxxxxx> wrote:
<snip>
>>
>> >
>> > So why do you care about wake_wide() anyway? Are you observing some problem
>> > that you suspect is affected by the affine wakeup stuff? Or are you just trying
>>
>> I am dealing with an affine wake up issue, yes.
>>
>> > to understand what is going on for fun? Cause if you are just doing this for
>> > fun you are a very strange person, thanks,
>>
>> Its not just for fun :) Let me give you some background about me, I
>> work in the Android team and one of the things I want to do is to take
>> an out of tree patch that's been carried for some time and post a more
>> upstreamable solution - this is related to wake ups from the binder
>> driver which does sync wake ups (WF_SYNC). I can't find the exact out
>> of tree patch publicly since it wasn't posted to a list, but the code
>> is here [1]. What's worse is I have recently found really bad issues
>> with this patch itself where runnable times are increased. I should
>> have provided this background earlier (sorry that I didn't, my plan
>> was to trigger a separate discussion about the binder sync wake up
>> thing as a part of a patch/proposal I want to post - which I plan to
>> do so). Anyway, as a part of this effort, I want to understand
>> wake_wide() better and "respect" it since it sits in the wake up path
>> and I wanted to my proposal to work well with it, especially since I
>> want to solve this problem in an upstream-friendly way.
>>
>> The other reason to trigger the discussion, is, I have seen
>> wake_wide() enough number of times and asked enough number of folks
>> how it works that it seems sensible to ask about it here (I was also
>> suggested to ask about wake_wide on LKML because since few people
>> seemingly understand how it works) and hopefully now its a bit better
>> understood.
>>
>> I agree with you that instead of spending insane amounts of time on
>> wake_wide itself, its better to directly work on a problem and collect
>> some data - which is also what I'm doing, but I still thought its
>> worth doing some digging into wake_wide() during some free time I had,
>> thanks.
>>
>
> Ok so your usecase is to _always_ wake affine if we're doing a sync wakeup. I
> _think_ for your case it's best to make wake_affine() make this decision, and
> you don't want wake_wide() to filter out your wakeup as not-affine? So perhaps
> just throw a test in there to not wake wide if WF_SYNC is set. This makes

Hmm I was actually thinking that since 'sync' is more of a hint, that
we do a wake_wide() first anyway since its already so conservative,
and for the times it does resort to wide, its probably the right
decision from a scheduling standpoint to avoid affine and avoid too
many tasks too quickly. Do you think that's a fair?

I tried a quick patch and doing wake_wide first, and then checking for
sync does seem to work well.

> logical sense to me as synchronous wakeups are probably going to want to be
> affine wakeups, and then we can rely on wake_affine() to do the load checks to
> make sure it really makes sense. How does that sound? Thanks,

Yep that sounds good and I will try that.

What I was thinking was do the regular wake_wide and wake_affine
checks, and then do something like this:

in select_task_rq_fair, before calling select_idle_sibling, do
something like this to check if only 1 task is running on the waker's
CPU (this is after doing the wake_wide and wake_affine checks)

+ idle_sync = (sync && (new_cpu == cpu) &&
+ cpu_rq(cpu)->nr_running == 1);

and then in select_idle_sibling, something like:

+ /*
+ * If the previous and target CPU share cache and a sync wakeup
+ * is requested and the CPU is about to goto idle, because it
+ * has only the waker running which requested sync, the target
+ * is a better choice for cache affinity and keeping task's
+ * previous core idle and low power state.
+ */
+ if (idle_sync && cpus_share_cache(prev, target))
+ return target;
+

I haven't tested this patch though so I'm not sure it works well yet,
but just sharing the idea. What do you think?

thanks,

-Joel