Re: [RFC PATCH] kthread: do not modify running work

From: Petr Mladek
Date: Mon Oct 05 2020 - 08:21:00 EST


On Mon 2020-10-05 18:21:05, Hillf Danton wrote:
> On Mon, 5 Oct 2020 10:38:29 Petr Mladek wrote:
> > On Sun 2020-10-04 10:12:13, Hillf Danton wrote:
> > > What is the difference of invoking kthread_queue_delayed_work() from the
> > > callback from kthread_mod_delayed_work()?
> >
> > kthread_queue_delayed_work() does nothing when the work is already
> > queued. kthread_mod_delayed_work() removes the work from the queue
> > if it is there and queue it again with the newly requested delay.
>
> Can you let us know the reasons why we need to remove the work from
> queue in callback?

Each work could get queued only once at the same time. It can be either
in work_list or in delayed_work_list. But it must never be in both at
the same time.

Canceling the work solves race with the timer callback. We must be
sure that is not running in parallel and will not try to shuffle
the worker lists.

Note that there is a difference between queued work and running work.
Queued work is either in worker->work_list or in
worker->delayed_work_list. While worker->current_work points
to the currently running work.

Now, the work can be running and queued at the same time be design.
It is quite common usecase.

> > > Given the queue method, it is no win to modify delayed work from callback
> > > in any case because "we are not adding interfaces just because we can."
> >
> > What about ipmi_kthread_worker_func()? It is delayed work that
> > queues itself.
>
> Can you shed some light on where I can find ipmi_kthread_worker_func()
> in the mainline?

Ah, the right name is mv88e6xxx_irq_poll().

> > What is the motivation for this patch, please?
>
> See the subject line.
>
> > Does it solve some real problem?
>
> Once more
> 1) avoid running a delayed work more than once if it is one-ff.
>
> 2) cut the risk of modifying a running one-off delayed work with
> resources released in the callback.

Do you have any real bug where this happen, please?

The API works this way be design. It makes it generic and usable
for different use-cases. It is up to the caller to use it the correct
way:

+ queue the work only once when the work should be done only once
+ make sure that the work is not queued when the service is being
stopped

Note that it is not only about queuing the work from its own
callback. It is also about races between queuing and running.

There is no guarantee when the work will be running. It happens
when the worker thread gets scheduled and the work is the first
on the list. It is up to the API user to make sure that the work
will get queued when needed and vice versa.

It is possible that some other behavior might be more practical
in some scenarios but it would complicate situation in others.


If you need more details, please google the discussion when the API was
developed.

Best Regards,
Petr