Re: [RFC][PATCH 0/3] locking/mutex: Rewrite basic mutex

From: Waiman Long
Date: Thu Aug 25 2016 - 12:33:31 EST


On 08/25/2016 11:43 AM, Peter Zijlstra wrote:
On Tue, Aug 23, 2016 at 06:13:43PM -0700, Jason Low wrote:
I tested this patch on an 8 socket system with the high_systime AIM7
workload with diskfs. The patch provided big performance improvements in
terms of throughput in the highly contended cases.

-------------------------------------------------
| users       | avg throughput | avg throughput |
|             | without patch  | with patch     |
-------------------------------------------------
| 10 - 90     |     13,943 JPM |     14,432 JPM |
-------------------------------------------------
| 100 - 900   |     75,475 JPM |    102,922 JPM |
-------------------------------------------------
| 1000 - 1900 |     77,299 JPM |    115,271 JPM |
-------------------------------------------------

Unfortunately, at 2000 users, the modified kernel locked up.

# INFO: task reaim:<#> blocked for more than 120 seconds.

So something appears to be buggy.

So with the previously given changes to reaim, I get the below results
on my 4 socket Haswell with the new version of 1/3 (also below).

I still need to update 3/3..

Note that I think my reaim change wrecked the jobs/min calculation
somehow, as it keeps increasing. I do think however that the numbers are
comparable between runs, since they're wrecked the same way.

The performance data for the two kernels were roughly the same. That is
what I had expected, since there was no change in the algorithm used to
handle the slowpath. So I was surprised by Jason's result yesterday
showing such a big difference.
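
For reference, the size of that difference can be worked out from the
throughput table quoted above. The snippet below is only a minimal
sketch: the JPM figures are copied from Jason's table, while the names
RESULTS and relative_gain_percent are hypothetical helpers introduced
here for illustration and are not part of reaim or the patch series.

```python
# Throughput figures (jobs per minute) copied from the table quoted above:
# (user range, without patch, with patch)
RESULTS = [
    ("10 - 90", 13943, 14432),
    ("100 - 900", 75475, 102922),
    ("1000 - 1900", 77299, 115271),
]


def relative_gain_percent(before, after):
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


if __name__ == "__main__":
    for users, before, after in RESULTS:
        gain = relative_gain_percent(before, after)
        print(f"{users:>11} users: {gain:+.1f}%")
```

On those numbers the gain is about 3.5% in the 10 - 90 user range and
roughly 36% and 49% in the two more heavily contended ranges.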

Cheers,
Longman