Re: mm: deadlock between get_online_cpus/pcpu_alloc

From: Vlastimil Babka
Date: Tue Feb 07 2017 - 04:49:55 EST


On 02/07/2017 10:43 AM, Mel Gorman wrote:
> If I'm reading this right, a hot-remove will mark the pool
> POOL_DISASSOCIATED and unbind it. A work item queued for draining gets
> migrated during hot-remove, so a drain operation will execute twice on one
> CPU: once for the item that was queued there and once for the item migrated
> from the offlined CPU. It should still work with flush_work, which doesn't
> appear to block forever if an item got migrated to another workqueue. The
> actual drain work function uses the ID of the CPU it's currently running
> on, so it shouldn't get confused.

Is the worker that will process this migrated work item also guaranteed
to stay pinned to a single CPU for the whole duration of the work, though?
drain_local_pages() needs that guarantee.
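
To make the concern concrete, here is a minimal user-space sketch, not
kernel code. Everything in it is hypothetical and only for illustration:
pcp_count[] stands in for the per-CPU page lists, current_cpu stands in for
what smp_processor_id() would return, and migrate_to() models an unbound
worker being moved between CPUs. It assumes the drain samples the CPU ID
once and then touches that CPU's list.

```c
#include <assert.h>

#define NR_CPUS 2

/* Hypothetical stand-in for per-CPU page lists: cached pages per CPU. */
static int pcp_count[NR_CPUS];

/* Hypothetical stand-in for the CPU the worker is running on right now. */
static int current_cpu;

/* Models an unpinned worker being moved to another CPU in mid-work. */
static void migrate_to(int cpu)
{
	current_cpu = cpu;
}

/*
 * Drain the list of the CPU sampled at entry. If migrate_midway >= 0,
 * the worker moves to that CPU after sampling, so the list is touched
 * from a different CPU than the one it belongs to.
 *
 * Returns 1 if the access was really local, 0 if it was not.
 */
static int drain_sketch(int migrate_midway)
{
	int cpu = current_cpu;			/* sample the CPU ID once */

	if (migrate_midway >= 0)
		migrate_to(migrate_midway);	/* worker is not pinned */

	pcp_count[cpu] = 0;			/* touch the sampled CPU's list */
	return cpu == current_cpu;
}
```

With the worker pinned, drain_sketch(-1) drains the local list and reports
a local access. With a migration in between, the sampled CPU's list is
drained from another CPU while the list of the CPU the worker ended up on
is left alone, which is the case the question above is about.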

> Tejun, did I miss anything? Does a work item queued on a CPU that is
> being offlined get unbound, and can a caller still flush it safely? In
> this specific case, it's OK that the work item does not run on the CPU it
> was queued on.
>