Re: [PATCH] tick: prefer a lower rating device only if it's CPU local device

From: Sudeep Holla
Date: Mon Jul 09 2018 - 11:12:54 EST




On 08/07/18 21:59, Martin Blumenstingl wrote:
> Hi Thomas,
>
> On Tue, Jul 3, 2018 at 6:48 PM Sudeep Holla <sudeep.holla@xxxxxxx> wrote:
>>
>> Hi Thomas,
>>
>> On Tue, Jul 03, 2018 at 06:08:19PM +0200, Thomas Gleixner wrote:
>>
>> [...]
>>
>>>>> / # cat /sys/devices/system/clockevents/broadcast/current_device
>>>>> meson6_tick
>>>>
>>>> OK, it can support broadcast
>>>>
>>>>> / # cat /sys/devices/system/clockevents/clockevent0/current_device
>>>>> dummy_timer
>>>>> / # cat /sys/devices/system/clockevents/clockevent1/current_device
>>>>> dummy_timer
>>>>> / # cat /sys/devices/system/clockevents/clockevent2/current_device
>>>>> dummy_timer
>>>>
>>>> But I can't understand why dummy_timer is the active event source and
>>>> not meson6_tick. And you say this is the working case? Looks suspicious.
>>>
>>> Because if it switches to broadcast mode then the meson timer can no longer
>>> be used as a per-CPU timer. It's broadcasting to all CPUs via the dummy timer.
>>
>> Thanks for the explanation. I completely misread the sysfs entry and
>> assumed clockevent_register failed for meson6, and hence regarded it as
>> suspicious, which is complete nonsense, my bad. Sorry for that.
>> I think I now understand the issue.
>>
>> 1. Juno usecase for which $subject was added as fix:
>>
>> Two system wide timers(cpumask=possible cpus) with rating 300 and 400.
>> When second one with 400 is added, timer with rating 300 is added to
>> released list and again added back to main one. In this case both were
>> chosen as preferred and that resulted in deadlock.
>>
>> 2. Meson6 usecase:
>>
>> When meson6_tick is added, it's set as preferred and dummy_timer is released.
>> When it's being added back from the released list, it will be chosen as
>> preferred as it's per_cpu resulting in deadlock.
>>
>> I am not sure how to fix this. Should the fix to my original problem
>> check both the old and the new device for per-cpu to prevent the issue
>> reported on Meson6?
> could you please answer Sudeep's question?
>

OK, I think I misunderstood my original problem. The cause is how
cpumask_equal behaves for nr_bits <= BITS_PER_LONG: it compares
nr_cpumask_bits bits, which is NR_CPUS when CPUMASK_OFFSTACK is not enabled.
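
To illustrate why the number of compared bits matters, here is a minimal
userspace sketch. It is a model only, not the kernel's cpumask code: the
helper names, the 6 possible CPUs, and the NR_CPUS value of 8 are all
assumptions chosen for the example. It compares a mask with only the
possible CPUs set against a mask with every bit set, first over the
number of possible CPUs and then over the compile-time maximum.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; names and values below are hypothetical. */
#define MODEL_BITS_PER_LONG ((unsigned int)(8 * sizeof(unsigned long)))

/* Mask of the low nbits of one word; valid for 1 <= nbits <= MODEL_BITS_PER_LONG. */
static unsigned long low_bits(unsigned int nbits)
{
	return ~0UL >> (MODEL_BITS_PER_LONG - nbits);
}

/* Single-word equality that only looks at the low nbits of each mask. */
static bool masks_equal_nbits(unsigned long a, unsigned long b,
			      unsigned int nbits)
{
	return ((a ^ b) & low_bits(nbits)) == 0;
}

static void check_example(void)
{
	const unsigned int nr_cpu_ids = 6;     /* assumed: 6 possible CPUs */
	const unsigned int nr_cpus = 8;        /* assumed: NR_CPUS config value */
	const unsigned long possible = 0x3fUL; /* bits 0..5 set */
	const unsigned long all = ~0UL;        /* every bit set */

	/* Compared over the possible CPUs only, the two masks match. */
	assert(masks_equal_nbits(possible, all, nr_cpu_ids));

	/* Compared over NR_CPUS bits, bits 6 and 7 make them differ. */
	assert(!masks_equal_nbits(possible, all, nr_cpus));
}
```

In this model, a device whose mask has bits set beyond the possible CPUs
only looks equal to the possible mask when the comparison stops at the
number of possible CPUs.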

Enabling CPUMASK_OFFSTACK fixes my original problem (with the $subject
patch reverted), so $subject should be reverted. This also pointed out
the issue with the ARM mem timer.

I am sorry for the mess. I will post the revert along with the fix
to my issue. Once again, sorry for not understanding the root cause
correctly.
--
Regards,
Sudeep