Re: [PATCH v4 (resend)] lockdep: Allow tuning tracing capacity constants.

From: Dmitry Vyukov
Date: Wed Jan 20 2021 - 07:00:25 EST


On Wed, Jan 20, 2021 at 11:12 AM Tetsuo Handa
<penguin-kernel@xxxxxxxxxxxxxxxxxxx> wrote:
>
> Since syzkaller continues various test cases until the kernel crashes,
> syzkaller tends to examine more locking dependencies than normal systems.
> As a result, syzbot is reporting that the fuzz testing was terminated
> due to hitting the upper limits of what lockdep can track [1] [2] [3].
>
> Peter Zijlstra does not want to allow tuning these limits via kernel
> config options, because such a change discourages thinking. But analysis via
> /proc/lockdep* did not show any obvious culprit [4] [5]. It is possible
> that many hundreds of kn->active lock instances are to some degree
> contributing to these problems, but there is no means to verify whether
> these instances are created for protecting the same callback functions.
> Unless Peter provides a way to key these instances by "which callback
> functions the lock instance will call (identified by something like MD5
> of string representations of callback functions which each lock instance
> will protect)" rather than by a plain "serial number", I don't think that
> we can verify the culprit.
>
> [1] https://syzkaller.appspot.com/bug?id=3d97ba93fb3566000c1c59691ea427370d33ea1b
> [2] https://syzkaller.appspot.com/bug?id=381cb436fe60dc03d7fd2a092b46d7f09542a72a
> [3] https://syzkaller.appspot.com/bug?id=a588183ac34c1437fc0785e8f220e88282e5a29f
> [4] https://lkml.kernel.org/r/4b8f7a57-fa20-47bd-48a0-ae35d860f233@xxxxxxxxxxxxxxxxxxx
> [5] https://lkml.kernel.org/r/1c351187-253b-2d49-acaf-4563c63ae7d2@xxxxxxxxxxxxxxxxxxx
>
> Reported-by: syzbot <syzbot+cd0ec5211ac07c18c049@xxxxxxxxxxxxxxxxxxxxxxxxx>
> Reported-by: syzbot <syzbot+91fd909b6e62ebe06131@xxxxxxxxxxxxxxxxxxxxxxxxx>
> Reported-by: syzbot <syzbot+62ebe501c1ce9a91f68c@xxxxxxxxxxxxxxxxxxxxxxxxx>
> Signed-off-by: Tetsuo Handa <penguin-kernel@xxxxxxxxxxxxxxxxxxx>
> Acked-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>

Thanks for your persistence!
I still support this. The assessment of the lockdep stats collected on
overflow seems to confirm that it is just a very large lock graph
triggered by syzkaller.


> ---
> kernel/locking/lockdep.c | 2 +-
> kernel/locking/lockdep_internals.h | 8 +++---
> lib/Kconfig.debug | 40 ++++++++++++++++++++++++++++++
> 3 files changed, 45 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
> index c1418b47f625..c0553872668a 100644
> --- a/kernel/locking/lockdep.c
> +++ b/kernel/locking/lockdep.c
> @@ -1391,7 +1391,7 @@ static int add_lock_to_list(struct lock_class *this,
> /*
> * For good efficiency of modular, we use power of 2
> */
> -#define MAX_CIRCULAR_QUEUE_SIZE 4096UL
> +#define MAX_CIRCULAR_QUEUE_SIZE (1UL << CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS)
> #define CQ_MASK (MAX_CIRCULAR_QUEUE_SIZE-1)
>
> /*
> diff --git a/kernel/locking/lockdep_internals.h b/kernel/locking/lockdep_internals.h
> index de49f9e1c11b..ecb8662e7a4e 100644
> --- a/kernel/locking/lockdep_internals.h
> +++ b/kernel/locking/lockdep_internals.h
> @@ -99,16 +99,16 @@ static const unsigned long LOCKF_USED_IN_IRQ_READ =
> #define MAX_STACK_TRACE_ENTRIES 262144UL
> #define STACK_TRACE_HASH_SIZE 8192
> #else
> -#define MAX_LOCKDEP_ENTRIES 32768UL
> +#define MAX_LOCKDEP_ENTRIES (1UL << CONFIG_LOCKDEP_BITS)
>
> -#define MAX_LOCKDEP_CHAINS_BITS 16
> +#define MAX_LOCKDEP_CHAINS_BITS CONFIG_LOCKDEP_CHAINS_BITS
>
> /*
> * Stack-trace: tightly packed array of stack backtrace
> * addresses. Protected by the hash_lock.
> */
> -#define MAX_STACK_TRACE_ENTRIES 524288UL
> -#define STACK_TRACE_HASH_SIZE 16384
> +#define MAX_STACK_TRACE_ENTRIES (1UL << CONFIG_LOCKDEP_STACK_TRACE_BITS)
> +#define STACK_TRACE_HASH_SIZE (1 << CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS)
> #endif
>
> /*
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 7937265ef879..4cb84b499636 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -1332,6 +1332,46 @@ config LOCKDEP
>  config LOCKDEP_SMALL
>  	bool
>
> +config LOCKDEP_BITS
> +	int "Bitsize for MAX_LOCKDEP_ENTRIES"
> +	depends on LOCKDEP && !LOCKDEP_SMALL
> +	range 10 30
> +	default 15
> +	help
> +	  Try increasing this value if you hit "BUG: MAX_LOCKDEP_ENTRIES too low!" message.
> +
> +config LOCKDEP_CHAINS_BITS
> +	int "Bitsize for MAX_LOCKDEP_CHAINS"
> +	depends on LOCKDEP && !LOCKDEP_SMALL
> +	range 10 30
> +	default 16
> +	help
> +	  Try increasing this value if you hit "BUG: MAX_LOCKDEP_CHAINS too low!" message.
> +
> +config LOCKDEP_STACK_TRACE_BITS
> +	int "Bitsize for MAX_STACK_TRACE_ENTRIES"
> +	depends on LOCKDEP && !LOCKDEP_SMALL
> +	range 10 30
> +	default 19
> +	help
> +	  Try increasing this value if you hit "BUG: MAX_STACK_TRACE_ENTRIES too low!" message.
> +
> +config LOCKDEP_STACK_TRACE_HASH_BITS
> +	int "Bitsize for STACK_TRACE_HASH_SIZE"
> +	depends on LOCKDEP && !LOCKDEP_SMALL
> +	range 10 30
> +	default 14
> +	help
> +	  Try increasing this value if you need large MAX_STACK_TRACE_ENTRIES.
> +
> +config LOCKDEP_CIRCULAR_QUEUE_BITS
> +	int "Bitsize for elements in circular_queue struct"
> +	depends on LOCKDEP
> +	range 10 30
> +	default 12
> +	help
> +	  Try increasing this value if you hit "lockdep bfs error:-1" warning due to __cq_enqueue() failure.
> +
>  config DEBUG_LOCKDEP
>  	bool "Lock dependency engine debugging"
>  	depends on DEBUG_KERNEL && LOCKDEP
> --
> 2.18.4
>
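
For anyone checking the numbers: the proposed defaults should reproduce
the constants they replace exactly. Below is a minimal userspace sketch,
not kernel code. The CONFIG_* values are hard-coded from the defaults in
the Kconfig hunk above, MAX_LOCKDEP_CHAINS is derived from
MAX_LOCKDEP_CHAINS_BITS the way I recall lockdep_internals.h doing it
(an assumption, since that line is not in this diff), and cq_next() and
lockdep_capacity_report() are hypothetical helpers added only for
illustration.

```c
#include <stdio.h>
#include <assert.h>

/* Stand-ins for the Kconfig symbols; values are the defaults from the patch. */
#define CONFIG_LOCKDEP_BITS                  15
#define CONFIG_LOCKDEP_CHAINS_BITS           16
#define CONFIG_LOCKDEP_STACK_TRACE_BITS      19
#define CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS 14
#define CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS   12

/* Definitions as they would read after the patch. */
#define MAX_LOCKDEP_ENTRIES     (1UL << CONFIG_LOCKDEP_BITS)
#define MAX_LOCKDEP_CHAINS_BITS CONFIG_LOCKDEP_CHAINS_BITS
/* Assumed derivation; this line is not part of the quoted diff. */
#define MAX_LOCKDEP_CHAINS      (1UL << MAX_LOCKDEP_CHAINS_BITS)
#define MAX_STACK_TRACE_ENTRIES (1UL << CONFIG_LOCKDEP_STACK_TRACE_BITS)
#define STACK_TRACE_HASH_SIZE   (1 << CONFIG_LOCKDEP_STACK_TRACE_HASH_BITS)
#define MAX_CIRCULAR_QUEUE_SIZE (1UL << CONFIG_LOCKDEP_CIRCULAR_QUEUE_BITS)
#define CQ_MASK                 (MAX_CIRCULAR_QUEUE_SIZE - 1)

/*
 * Hypothetical helper: advance a ring-buffer index. Because the queue
 * size is a power of 2, the wrap-around is a single AND with CQ_MASK
 * instead of a modulo.
 */
static inline unsigned long cq_next(unsigned long idx)
{
	return (idx + 1) & CQ_MASK;
}

/* Hypothetical helper: print the effective capacities for these values. */
static inline void lockdep_capacity_report(void)
{
	printf("MAX_LOCKDEP_ENTRIES     = %lu\n", MAX_LOCKDEP_ENTRIES);
	printf("MAX_LOCKDEP_CHAINS      = %lu\n", MAX_LOCKDEP_CHAINS);
	printf("MAX_STACK_TRACE_ENTRIES = %lu\n", MAX_STACK_TRACE_ENTRIES);
	printf("STACK_TRACE_HASH_SIZE   = %d\n", STACK_TRACE_HASH_SIZE);
	printf("MAX_CIRCULAR_QUEUE_SIZE = %lu\n", MAX_CIRCULAR_QUEUE_SIZE);
}
```

Calling lockdep_capacity_report() from a small main() prints the
effective limits for a given set of bit values. Note that each +1 on a
*_BITS option doubles the size of the corresponding static array, so the
memory cost grows quickly toward the upper end of the allowed range.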