Re: [Linux-v4.6-rc1] ext4: WARNING: CPU: 2 PID: 2692 at kernel/locking/lockdep.c:2017 __lock_acquire+0x180e/0x2260

From: Peter Zijlstra
Date: Wed Mar 30 2016 - 05:50:53 EST


On Wed, Mar 30, 2016 at 11:36:59AM +0200, Peter Zijlstra wrote:
> On Tue, Mar 29, 2016 at 10:47:02AM +0200, Ingo Molnar wrote:
>
> > > You are right; this is lockdep running into a hash collision; which is a new
> > > DEBUG_LOCKDEP test. See 9e4e7554e755 ("locking/lockdep: Detect chain_key
> > > collisions").
> >
> > I've Cc:-ed Alfredo Alvarez Fernandez who added that test.
>
> OK, so while the code in check_no_collision() seems sensible, it relies
> on borken bits.
>
> The whole chain_hlocks and /proc/lockdep_chains stuff appears to have
> been buggered from the start.
>
> The below patch should fix this.

Note that the patch makes no difference unless more than 65536
chain_hlocks entries have been consumed.
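
For illustration only, one way to read the 65536 threshold: if an index
into chain_hlocks[] were kept in a 16-bit field, any index at or above
65536 would wrap silently. The struct and helper below are hypothetical
stand-ins, and the 16-bit width is an assumption made for this sketch
rather than something stated in this message.

```c
#include <stdint.h>

/* Hypothetical stand-in for a chain record whose index field is 16 bits. */
struct chain_sketch {
	uint16_t base;	/* assumed width; not stated in this thread */
};

/* Store an index and return the value that can be read back afterwards. */
static unsigned int store_base(struct chain_sketch *c, unsigned int idx)
{
	c->base = (uint16_t)idx;	/* bits above bit 15 are dropped */
	return c->base;
}
```

With this sketch, indices up to 65535 round-trip unchanged, while 65536
reads back as 0 and 65537 as 1, so two distinct chains would appear to
share the same starting slot.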

> Furthermore, our hash function has definite room for improvement.

And no matter how good we make it, a u64 hash (or a hash of any size,
really) is bound to collide at some point.
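
As a rough feel for "at some point", the usual birthday estimate can be
sketched as below. It assumes an ideal, uniformly distributed hash,
which is an assumption of the sketch and not a claim about the current
chain_key function; the helper name is made up for illustration.

```c
#include <stdint.h>

/*
 * Birthday estimate: expected number of colliding pairs among n
 * uniformly random hash values that are `bits` bits wide,
 * i.e. n * (n - 1) / 2 pairs divided by 2^bits possible values.
 */
static double expected_colliding_pairs(uint64_t n, unsigned int bits)
{
	double pairs = (double)n * (double)(n - 1) / 2.0;
	double space = 1.0;

	while (bits--)
		space *= 2.0;	/* build 2^bits without needing libm */

	return pairs / space;
}
```

Under that assumption, 2^32 distinct chains hashed into 64 bits already
give roughly half an expected colliding pair, so a wider or better hash
only pushes the problem out; it does not remove it.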

Also, we could make collisions non-fatal: returning true from
lookup_chain_cache() is always correct (_very_ expensive, but correct),
so in case of doubt we could just return true.
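
A minimal sketch of that fallback, not the actual lockdep code: the
struct, the helper name and the depth limit are all hypothetical, and it
assumes that a true return means "not safely cached, run the full
validation".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_DEPTH 8	/* arbitrary limit for this sketch */

/* Hypothetical cached chain: its hash key plus the classes it covers. */
struct cached_chain {
	uint64_t key;
	int depth;
	uint16_t class_idx[SKETCH_MAX_DEPTH];
};

/*
 * Return true when the caller must run the full (expensive) validation,
 * false only when the cached entry provably describes the same chain.
 */
static bool must_validate(const struct cached_chain *hit, uint64_t key,
			  const uint16_t *classes, int depth)
{
	assert(depth >= 0 && depth <= SKETCH_MAX_DEPTH);

	if (!hit || hit->key != key)
		return true;	/* plain cache miss */

	if (hit->depth != depth ||
	    memcmp(hit->class_idx, classes, depth * sizeof(*classes)))
		return true;	/* same key, different chain: collision */

	return false;		/* genuine cache hit */
}
```

The point of the design is that a collision costs one redundant
validation pass instead of a warning, since treating a hit as a miss
never skips a check that was needed.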