Re: Possible race in dev_coredumpm()-del_timer() path

From: Greg KH
Date: Wed Apr 13 2022 - 01:34:40 EST


On Wed, Apr 13, 2022 at 10:59:22AM +0530, Mukesh Ojha wrote:
> Hi All,
>
> We are hitting a race that leaves try_to_grab_pending() stuck.

What kernel version are you using?

> In the following scenario, while (p1) dev_coredumpm() is running, the devcd
> device is added to the framework and a uevent notification is sent to
> userspace. That results in a call to (p2) devcd_data_write(), which
> eventually tries to delete the queued timer; in the racy scenario the timer
> is not queued yet. So debug objects reports a warning, and in the meantime
> the timer is initialized and queued from the p1 path. From the p2 path it
> then gets overridden again with timer->entry.pprev=NULL, and
> try_to_grab_pending() is stuck because del_timer() always returns 0 as
> timer_pending() returns false.
>
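
The ordering described above can be sketched as a toy userspace model. This
is not kernel code: toy_timer, toy_dwork and every toy_* helper are
hypothetical stand-ins that only mimic the state changes named in the report
(timer->entry.pprev, the work item's pending flag, the debug-objects fixup).
It also assumes a simplified try_to_grab_pending() that skips the worker-pool
lookup the real function performs.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_EAGAIN 11            /* stand-in for the kernel's EAGAIN */

struct toy_timer {
	void **pprev;            /* non-NULL while queued, like timer->entry.pprev */
	bool initialized;        /* what a debug-objects tracker would know */
};

struct toy_dwork {
	struct toy_timer timer;
	bool pending;            /* models the work item's PENDING bit */
};

static void *toy_timer_list;     /* dummy list head the timer is hashed on */

/* Models do_init_timer(): marks the timer initialized and unhashed. */
static void toy_init_timer(struct toy_timer *t)
{
	t->pprev = NULL;
	t->initialized = true;
}

/* Models timer_pending(): queued means pprev is non-NULL. */
static bool toy_timer_pending(const struct toy_timer *t)
{
	return t->pprev != NULL;
}

/* P1: models INIT_DELAYED_WORK() followed by schedule_delayed_work(). */
static void toy_schedule(struct toy_dwork *dw)
{
	toy_init_timer(&dw->timer);
	dw->pending = true;
	dw->timer.pprev = &toy_timer_list;   /* timer is now queued */
}

/* P2: simplified try_to_grab_pending(), without the pool lookup. */
static int toy_try_to_grab_pending(struct toy_dwork *dw)
{
	if (toy_timer_pending(&dw->timer)) { /* del_timer() would return 1 */
		dw->timer.pprev = NULL;
		return 1;
	}
	if (!dw->pending) {                  /* idle work item: claim it */
		dw->pending = true;
		return 0;
	}
	return -TOY_EAGAIN;                  /* pending, no timer: caller retries */
}

/* Replays the reported interleaving; returns how many retries failed. */
int toy_replay_race(int tries)
{
	struct toy_dwork dw = { .timer = { NULL, false }, .pending = false };
	int stuck = 0;

	/* P2 reaches debug_assert_init() before P1 initialized the timer. */
	bool needs_fixup = !dw.timer.initialized;
	assert(needs_fixup);

	/* P1 initializes and queues the delayed work in the meantime. */
	toy_schedule(&dw);

	/* P2 resumes: the fixup re-initializes the now-queued timer. */
	if (needs_fixup)
		toy_init_timer(&dw.timer);

	for (int i = 0; i < tries; i++)
		if (toy_try_to_grab_pending(&dw) == -TOY_EAGAIN)
			stuck++;
	return stuck;
}
```

In this model every retry after the fixup fails, so toy_replay_race(5) counts
five failed attempts. If toy_schedule() runs before the first grab attempt and
no fixup happens, toy_try_to_grab_pending() returns 1 on the first call.
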
> P1                                     P2 (X)
>
> dev_coredumpm()
>   device_add() ======================> Uevent notification sent to
>                                        userspace for device addition.
>                                        Process X reads this uevent
>                                        notification and does a write
>                                        call that results in a call to:
>
>                                        devcd_data_write()
>                                        mod_delayed_work()
>                                        try_to_grab_pending()
>                                        del_timer()
>                                        debug_assert_init()
>
>   INIT_DELAYED_WORK(&devcd->del_wk,
>                     devcd_del);
>   schedule_delayed_work(&devcd->del_wk,
>                         DEVCD_TIMEOUT);
>
>                                        debug_object_fixup()
>                                        timer_fixup_assert_init()
>                                        timer_setup()
>                                        do_init_timer()
>                                          ==> reinitializes the timer to
>                                              timer->entry.pprev=NULL
>
>                                        timer_pending()
>                                        !hlist_unhashed_lockless(&timer->entry)
>                                        !h->pprev

The above is confusing and hard to follow because of the formatting
mess. Care to fix this up and resend?

thanks,

greg k-h