Re: hackbench regression with 2.6.36-rc1

From: Eric W. Biederman
Date: Thu Aug 19 2010 - 16:25:29 EST

Next message: Borislav Petkov: "Re: [PATCH] x86, tsc: Limit CPU frequency calibration on AMD"
Previous message: Andreas Gruenbacher: "Re: [GIT PULL] notification tree - try 37!"
In reply to: Zhang, Yanmin: "Re: hackbench regression with 2.6.36-rc1"
Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]

"Zhang, Yanmin" <yanmin_zhang@xxxxxxxxxxxxxxx> writes:

> On Wed, 2010-08-18 at 03:56 -0700, Eric W. Biederman wrote:
>> "Zhang, Yanmin" <yanmin_zhang@xxxxxxxxxxxxxxx> writes:
>>
>> > Compared with 2.6.35's result, hackbench (thread mode) shows about an
>> > 80% regression on a dual-socket Nehalem machine and about a 90%
>> > regression on 4-socket Tigerton machines.
>>
>> That seems unfortunate.
>>
>> Do you only show a regression in the pthread
>> hackbench test?
> Yes.
>
>> Do you show a regression when you use pipes?
> No.
>
>>
>> Does the size of the regression vary based on the number of loop
>> iterations?
> No. I tried 1000 and got a similar regression ratio.
> I chose a large loop count of 2000 because I want to get a stable
> result.
>
> It's easy to reproduce. We see it on almost all of our machines.
>
>> I ask because it appears that on the last message the sender will
>> exit, necessitating that the receiver put the sender's pid, which
>> should be atypical.
> I don't agree with that. With hackbench, a sender sends
> loops*receiver_num_per_group messages before exiting.
> In addition, 'perf top' shows put_pid as the hottest function right
> after I start hackbench.

If increasing the number of loops does not improve the performance, the
hypothesis that it is only the last message that has the regression
is shot.
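
For scale, the per-sender message count can be worked out from the
command line quoted below. A rough sketch in Python, assuming hackbench's
usual defaults of 20 senders and 20 receivers per group (those defaults
are not stated anywhere in this thread, and the function names are made
up for illustration):

```python
# Rough message-count estimate for "./hackbench 100 thread 2000".
# ASSUMPTION: 20 senders and 20 receivers per group. These are believed to
# be hackbench's usual defaults but are not stated in this thread.

def messages_per_sender(loops, receivers_per_group):
    # Each sender writes `loops` messages to every receiver in its group.
    return loops * receivers_per_group


def total_messages(groups, senders_per_group, receivers_per_group, loops):
    # Every sender in every group sends the same number of messages.
    per_sender = messages_per_sender(loops, receivers_per_group)
    return groups * senders_per_group * per_sender


if __name__ == "__main__":
    print(messages_per_sender(2000, 20))
    print(total_messages(100, 20, 20, 2000))
```

Under those assumed defaults a sender writes tens of thousands of
messages before it exits, so a cost paid only on the final message would
be a very small share of the run.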

>> > Command to start hackbench:
>> > #./hackbench 100 thread 2000
>> >
>> > process mode has no such regression.
>> >
>> > Profiling shows:
>> > #perf top
>> >   samples   pcnt   function                  DSO
>> >   ________  _____  ________________________  ________________________
>> >
>> >   74415.00  29.9%  put_pid                   [kernel.kallsyms]
>> >   38395.00  15.4%  unix_stream_recvmsg       [kernel.kallsyms]
>> >   34877.00  14.0%  unix_stream_sendmsg       [kernel.kallsyms]
>> >   25204.00  10.1%  pid_vnr                   [kernel.kallsyms]
>> >   21864.00   8.8%  unix_scm_to_skb           [kernel.kallsyms]
>> >   13637.00   5.5%  cred_to_ucred             [kernel.kallsyms]
>> >    6520.00   2.6%  unix_destruct_scm         [kernel.kallsyms]
>> >    4731.00   1.9%  sock_alloc_send_pskb      [kernel.kallsyms]
>> >
>> >
>> > With 2.6.35, perf doesn't show put_pid/pid_vnr.
>>
>> Yes. 2.6.35 is imperfect and can report the wrong pid in some
>> circumstances. I am surprised that nothing related to the reference
>> count on struct cred shows up in your profiling traces.
>>
>>
>> You are performing statistical sampling so I don't believe the
>> percentage of hits per function is the same as the percentage of
>> time per function.
> Agreed. But from a performance tuning point of view, the percentage of hits
> is enough to help developers investigate.
>
> I provided the 'perf top' data to help you debug, not to prove that your
> patches cause the regression. We used bisect to locate them.

Sure. I was just trying to figure out how to explain why the creds
don't show a similar hit. I still don't have a complete explanation
for the profile, but the cred put and get are inline functions, so they
won't be present as distinct functions in the profile.

>> Given that we are talking about a scheduler benchmark that is
>> doing something rather artificial (inter thread communication via
>> sockets), I don't know that this case is worth worrying about.
> Good question. What about the following scenario?
> Start 2 processes, and have each process create many threads. The threads of
> process 1 communicate with the threads of process 2.

Maybe. A lot depends on the timing, and on what it takes to trigger
the cross-CPU cache line bounce.

And we still have pipes for ultimate performance. Grrr.
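
The two transports being compared can be sketched in user space as
follows. This is only an illustration in Python, not hackbench and not a
benchmark: the 100-byte message size and the helper names are assumptions
made for the example, and it will not reproduce the regression. It just
shows one sender and one receiver thread exchanging fixed-size messages
over an AF_UNIX stream socket and over a pipe.

```python
# Minimal sketch of hackbench-style traffic between two threads.
# ASSUMPTIONS: the 100-byte message size and all helper names here are
# illustrative only; this is not hackbench itself.
import os
import socket
import threading

MSG_SIZE = 100


def _drain(read_fn, expected, result):
    # Read until `expected` bytes have arrived or the writer closes its end.
    got = 0
    while got < expected:
        chunk = read_fn(min(4096, expected - got))
        if not chunk:
            break
        got += len(chunk)
    result.append(got)


def run_socket(loops):
    # AF_UNIX stream socket pair: the path that goes through
    # unix_stream_sendmsg/unix_stream_recvmsg in the kernel.
    tx, rx = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)
    result = []
    reader = threading.Thread(
        target=_drain, args=(rx.recv, loops * MSG_SIZE, result))
    reader.start()
    payload = b"x" * MSG_SIZE
    for _ in range(loops):
        tx.sendall(payload)
    tx.close()
    reader.join()
    rx.close()
    return result[0]


def run_pipe(loops):
    # Pipe: same traffic pattern, but no socket credentials are involved.
    rfd, wfd = os.pipe()
    result = []
    reader = threading.Thread(
        target=_drain,
        args=(lambda n: os.read(rfd, n), loops * MSG_SIZE, result))
    reader.start()
    payload = b"x" * MSG_SIZE
    for _ in range(loops):
        os.write(wfd, payload)
    os.close(wfd)
    reader.join()
    os.close(rfd)
    return result[0]


if __name__ == "__main__":
    print(run_socket(1000), run_pipe(1000))
```

Both helpers return the number of bytes the receiver saw, so the same
driver can be pointed at either transport.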

I will give it some thought to see if I can find a less expensive way
but I don't have any good ideas at the moment.

Eric
