Re: [PATCH RFC v7 00/23] DEPT(Dependency Tracker)

From: Boqun Feng
Date: Tue Jan 17 2023 - 14:04:57 EST


[Cc Waiman]

On Mon, Jan 16, 2023 at 10:00:52AM -0800, Linus Torvalds wrote:
> [ Back from travel, so trying to make sense of this series.. ]
>
> On Sun, Jan 8, 2023 at 7:33 PM Byungchul Park <byungchul.park@xxxxxxx> wrote:
> >
> > I've been developing a tool for detecting deadlock possibilities by
> > tracking wait/event rather than lock(?) acquisition order to try to
> > cover all synchronization mechanisms. It's done on v6.2-rc2.
>
> Ugh. I hate how this adds random patterns like
>
>         if (timeout == MAX_SCHEDULE_TIMEOUT)
>                 sdt_might_sleep_strong(NULL);
>         else
>                 sdt_might_sleep_strong_timeout(NULL);
>         ...
>         sdt_might_sleep_finish();
>
> to various places, it seems so very odd and unmaintainable.
>
> I also recall this giving a fair amount of false positives, are they all fixed?
>

From the following part of the cover letter, I guess the answer is no?

...
6. Multiple reports are allowed.
7. Deduplication control on multiple reports.
8. Withstand false positives thanks to 6.
...

It seems to me the logic is that, since DEPT allows multiple reports,
false positives can be filtered out by users?

> Anyway, I'd really like the lockdep people to comment and be involved.

I was never Cc'ed, so I was unaware of this for a long time...

A few comments after a quick look:

* Looks like the DEPT dependency graph doesn't handle
fair/unfair readers as lockdep currently does, which brings up
the next question.

* Can DEPT pass all the selftests of lockdep in
lib/locking-selftests.c?

* Instead of introducing a brand new detector/dependency tracker,
could we first improve lockdep's dependency tracker? I think
Byungchul also agrees that DEPT and lockdep should share the
same dependency tracker, and the benefit of improving the
existing one is that we can always use the selftests to catch
any regression. Thoughts?

Actually the above suggestion is just to revert the revert of
cross-release without exposing any annotation, which I think is more
practical to review and test.

I'd suggest we 1) first improve the lockdep dependency tracker with
wait/event in mind, then 2) introduce wait-related annotations for
users, then 3) look for practical ways to resolve false
positives/multiple reports with the help of users, and, if all goes
well, 4) get all operations annotated.
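
To make step 1 concrete, here is a minimal userspace sketch of what
"a dependency tracker with wait/event in mind" could mean. It is not
lockdep or DEPT code; every name in it (dep_graph, dep_graph_add,
MAX_CLASSES) is made up for illustration. The assumption is that both
lock classes and wait/event classes become nodes in one graph, an edge
a -> b records "b was waited for while a was still held or pending", and
a new edge that closes a cycle is a potential deadlock.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 16	/* hypothetical limit, small on purpose */

/* dep[a][b]: class b was waited for while class a was held/pending. */
struct dep_graph {
	bool dep[MAX_CLASSES][MAX_CLASSES];
};

static void dep_graph_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

/* Depth-first search: is 'to' reachable from 'from' via recorded edges? */
static bool reaches(const struct dep_graph *g, int from, int to, bool *seen)
{
	if (from == to)
		return true;
	seen[from] = true;
	for (int next = 0; next < MAX_CLASSES; next++) {
		if (g->dep[from][next] && !seen[next] &&
		    reaches(g, next, to, seen))
			return true;
	}
	return false;
}

/*
 * Record the dependency a -> b. Returns true if b could already reach a,
 * i.e. the new edge closes a cycle and should be reported.
 */
static bool dep_graph_add(struct dep_graph *g, int a, int b)
{
	bool seen[MAX_CLASSES] = { false };

	assert(a >= 0 && a < MAX_CLASSES && b >= 0 && b < MAX_CLASSES);

	bool cycle = reaches(g, b, a, seen);

	g->dep[a][b] = true;
	return cycle;
}
```

For example, let class 0 be a mutex and class 1 a completion. One task
waits for the completion while holding the mutex (edge 0 -> 1), and the
task that would signal the completion has to take the mutex first
(edge 1 -> 0). The second dep_graph_add() call returns true, even though
only one lock is involved and a pure lock-order check has nothing to
compare.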

Thoughts?

Regards,
Boqun

> We did have a fairly recent case of "lockdep doesn't track page lock
> dependencies because it fundamentally cannot" issue, so DEPT might fix
> those kinds of missing dependency analysis. See
>
> https://lore.kernel.org/lkml/00000000000060d41f05f139aa44@xxxxxxxxxx/
>
> for some context to that one, but at the same time I would *really*
> want the lockdep people more involved and acking this work.
>
> Maybe I missed the email where you reported on things DEPT has found
> (and on the lack of false positives)?
>
> Linus
>