Re: Performance regression introduced by commit b667b8673443 ("pipe: Advance tail pointer inside of wait spinlock in pipe_read()")

From: Waiman Long
Date: Fri Jan 17 2020 - 13:11:14 EST


On 1/17/20 12:29 PM, Waiman Long wrote:
> On 1/17/20 12:05 PM, Linus Torvalds wrote:
>> [ on mobile, sorry for html crud ]
>>
>> On Fri, Jan 17, 2020, 08:53 Waiman Long <longman@xxxxxxxxxx> wrote:
>>
>>
>> I found that parallel kernel builds became much slower when a
>> 5.5-based kernel is used. On a 2-socket 96-thread x86-64 system, the
>> "make -j88" time increased from less than 3 minutes with the 5.4 kernel
>> to more than double that with the 5.5 kernel.
>>
>>
>> I suspect you may have hit the same bug in the GNU make jobserver
>> that I did.
>>
>> It's timing-sensitive, and under the right circumstances the make
>> jobserver loses job tickets to other jobservers that have a child
>> that died, but they are blocked waiting for a new ticket, so they
>> aren't releasing (or re-using) the one that the child death would
>> free up.
>>
>> End result: a big lack of parallelism, and a much slower build.
>>
>> GNU make v4.2.1 is buggy. The fix was done over two years ago, but
>> there hasn't been a new release since then, so a lot of distributions
>> have the buggy version.
>>
>> The fix is commit b552b05 ("[SV 51159] Use a non-blocking read with
>> pselect to avoid hangs.") in the make git tree.
>>
>>
>>      Linus
>
> Thanks for the information.
>
> Yes, I did use make v4.2.1 which is the version that is shipped in
> RHEL8. I will build new make and try it.
>
> Thanks,
> Longman
>
I built make from the latest make git tree, and the problem was gone
with the new make. So it was a bug in make, not the kernel. Sorry for the
noise.
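
For anyone else who hits this, here is a rough sketch of the idea named in
the title of the make fix: wait for a job token with pselect() and a
non-blocking read(), so that a child's death can interrupt the wait
instead of leaving the process stuck in a blocking read. This is not the
code from the make tree. acquire_token() is a hypothetical helper, and it
assumes the caller has blocked SIGCHLD with sigprocmask() and installed a
handler for it, so the signal is only delivered while pselect() is waiting.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <signal.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from GNU make.
 *
 * Try to read one job token (a single byte) from the pipe read end `fd`.
 * Returns  1 if a token was stored in *tok,
 *          0 if the wait was interrupted by a signal (for example
 *            SIGCHLD), so the caller can reap children and reuse the
 *            slot that was just freed,
 *         -1 on EOF or any other error.
 */
static int acquire_token(int fd, char *tok)
{
    assert(tok != NULL);

    /* Make reads non-blocking so losing a race for a token cannot hang us.
     * Note: this flag lives on the open file description, so it is shared
     * with every process that inherited this end of the pipe. */
    int flags = fcntl(fd, F_GETFL);
    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;

    /* Signal mask used only while waiting: current mask minus SIGCHLD. */
    sigset_t during_wait;
    if (sigprocmask(SIG_SETMASK, NULL, &during_wait) < 0)
        return -1;
    sigdelset(&during_wait, SIGCHLD);

    for (;;) {
        fd_set readable;
        FD_ZERO(&readable);
        FD_SET(fd, &readable);

        /* Sleep until the pipe is readable or a signal arrives. */
        int n = pselect(fd + 1, &readable, NULL, NULL, NULL, &during_wait);
        if (n < 0)
            return errno == EINTR ? 0 : -1;

        ssize_t got = read(fd, tok, 1);
        if (got == 1)
            return 1;               /* got a token */
        if (got == 0)
            return -1;              /* all writers closed the pipe */
        if (errno == EAGAIN || errno == EWOULDBLOCK)
            continue;               /* another process took the token first */
        if (errno == EINTR)
            return 0;
        return -1;
    }
}
```

The point of returning 0 on EINTR is that the caller gets a chance to
notice the dead child and reuse its slot right away, rather than waiting
for a new token while holding one it could already use.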

Cheers,
Longman