Re: [PATCH] lockdep: fix stack trace caching logic

From: Dmitry Vyukov
Date: Thu Feb 04 2016 - 12:51:23 EST


On Thu, Feb 4, 2016 at 6:44 PM, Peter Hurley <peter@xxxxxxxxxxxxxxxxxx> wrote:
> On 02/04/2016 05:40 AM, Dmitry Vyukov wrote:
>> check_prev_add() caches saved stack trace in static trace variable
>> to avoid duplicate save_trace() calls in dependencies involving trylocks.
>> But that caching logic contains a bug.
>
>> We may not save trace on first iteration due to early return from check_prev_add().
>
> This commit log should identify the role test instrumentation plays in triggering
> this bug: is it a recursive read lock dependency injected between existing lock
> dependencies? What test component triggered this?


I don't think that it has something to do with instrumentation.
check_prev_add() has explicitly coded early exits paths which don't
save stack trace.
For example, if first pair of locks is: try-read-lock vs read-lock, we
will not save stack trace; then next pair is: write-lock vs read-lock,
we will use bogus stack trace.




>> Then on the second iteration when we actually need the trace we don't save it
>> because we think that we've already saved it.
>>
>> Let check_prev_add() itself control when stack is saved.
>>
>> There is another bug. Trace variable is protected by graph lock.
>> But we can temporary release graph lock during printing.
>>
>> Fix this by invalidating cached stack trace when we release graph lock.
>>
>> Signed-off-by: Dmitry Vyukov <dvyukov@xxxxxxxxxx>
>> ---
>> kernel/locking/lockdep.c | 16 ++++++++++------
>> 1 file changed, 10 insertions(+), 6 deletions(-)
>>
>> diff --git a/kernel/locking/lockdep.c b/kernel/locking/lockdep.c
>> index 60ace56..c7710e4 100644
>> --- a/kernel/locking/lockdep.c
>> +++ b/kernel/locking/lockdep.c
>> @@ -1822,7 +1822,7 @@ check_deadlock(struct task_struct *curr, struct held_lock *next,
>> */
>> static int
>> check_prev_add(struct task_struct *curr, struct held_lock *prev,
>> - struct held_lock *next, int distance, int trylock_loop)
>> + struct held_lock *next, int distance, int *stack_saved)
>> {
>> struct lock_list *entry;
>> int ret;
>> @@ -1883,8 +1883,11 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
>> }
>> }
>>
>> - if (!trylock_loop && !save_trace(&trace))
>> - return 0;
>> + if (!*stack_saved) {
>> + if (!save_trace(&trace))
>> + return 0;
>> + *stack_saved = 1;
>> + }
>>
>> /*
>> * Ok, all validations passed, add the new lock
>> @@ -1907,6 +1910,8 @@ check_prev_add(struct task_struct *curr, struct held_lock *prev,
>> * Debugging printouts:
>> */
>> if (verbose(hlock_class(prev)) || verbose(hlock_class(next))) {
>> + /* We drop graph lock, so another thread can overwrite trace. */
>> + *stack_saved = 0;
>> graph_unlock();
>> printk("\n new dependency: ");
>> print_lock_name(hlock_class(prev));
>> @@ -1929,7 +1934,7 @@ static int
>> check_prevs_add(struct task_struct *curr, struct held_lock *next)
>> {
>> int depth = curr->lockdep_depth;
>> - int trylock_loop = 0;
>> + int stack_saved = 0;
>> struct held_lock *hlock;
>>
>> /*
>> @@ -1956,7 +1961,7 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
>> */
>> if (hlock->read != 2 && hlock->check) {
>> if (!check_prev_add(curr, hlock, next,
>> - distance, trylock_loop))
>> + distance, &stack_saved))
>> return 0;
>> /*
>> * Stop after the first non-trylock entry,
>> @@ -1979,7 +1984,6 @@ check_prevs_add(struct task_struct *curr, struct held_lock *next)
>> if (curr->held_locks[depth].irq_context !=
>> curr->held_locks[depth-1].irq_context)
>> break;
>> - trylock_loop = 1;
>> }
>> return 1;
>> out_bug:
>>
>