Re: [PATCH next] softirq: enable MAX_SOFTIRQ_TIME tuning with sysctl max_softirq_time_usecs

From: Zhiqiang Liu
Date: Mon Jun 24 2019 - 00:01:31 EST



在 2019/6/24 0:38, Thomas Gleixner 写道:
> Zhiqiang,
>> controlled by sysadmins to copy with hardware changes over time.
>
> So much for the theory. See below.

Thanks for your reply.
>
>> Correspondingly, the MAX_SOFTIRQ_TIME should be able to be tunned by sysadmins,
>> who knows best about hardware performance, for excepted tradeoff between latence
>> and fairness.
>>
>> Here, we add sysctl variable max_softirq_time_usecs to replace MAX_SOFTIRQ_TIME
>> with 2ms default value.
>
> ...
>
>> */
>> -#define MAX_SOFTIRQ_TIME msecs_to_jiffies(2)
>> +unsigned int __read_mostly max_softirq_time_usecs = 2000;
>> #define MAX_SOFTIRQ_RESTART 10
>>
>> #ifdef CONFIG_TRACE_IRQFLAGS
>> @@ -248,7 +249,8 @@ static inline void lockdep_softirq_end(bool in_hardirq) { }
>>
>> asmlinkage __visible void __softirq_entry __do_softirq(void)
>> {
>> - unsigned long end = jiffies + MAX_SOFTIRQ_TIME;
>> + unsigned long end = jiffies +
>> + usecs_to_jiffies(max_softirq_time_usecs);
>
> That's still jiffies based and therefore depends on CONFIG_HZ. Any budget
> value will be rounded up to the next jiffie. So in case of HZ=100 and
> time=1000us this will still result in 10ms of allowed loop time.
>
> I'm not saying that we must use a more fine grained time source, but both
> the changelog and the sysctl documentation are misleading.
>
> If we keep it jiffies based, then microseconds do not make any sense. They
> just give a false sense of controlability.
>
> Keep also in mind that with jiffies the accuracy depends also on the
> distance to the next tick when 'end' is evaluated. The next tick might be
> imminent.
>
> That's all information which needs to be in the documentation.
>

Thanks again for your detailed advice.
As your said, the max_softirq_time_usecs setting without explaining the
relationship with CONFIG_HZ will give a false sense of controlability. And
the time accuracy of jiffies will result in a certain difference between the
max_softirq_time_usecs set value and the actual value, which is in one jiffies
range.

I will add these infomation in the sysctl documentation and changelog in v2 patch.

>> + {
>> + .procname = "max_softirq_time_usecs",
>> + .data = &max_softirq_time_usecs,
>> + .maxlen = sizeof(unsigned int),
>> + .mode = 0644,
>> + .proc_handler = proc_dointvec_minmax,
>> + .extra1 = &zero,
>> + },
>
> Zero as the lower limit? That means it allows a single loop. Fine, but
> needs to be documented as well.
>
> Thanks,
>
> tglx
>
> .
>