Re: [PATCH 11/11] ksm: stop hotremove lockdep warning
From: Hugh Dickins
Date: Mon Feb 11 2013 - 17:13:41 EST
On Fri, 8 Feb 2013, Gerald Schaefer wrote:
> On Fri, 25 Jan 2013 18:10:18 -0800 (PST)
> Hugh Dickins <hughd@xxxxxxxxxx> wrote:
> > Complaints are rare, but lockdep still does not understand the way
> > ksm_memory_callback(MEM_GOING_OFFLINE) takes ksm_thread_mutex, and
> > holds it until the ksm_memory_callback(MEM_OFFLINE): that appears
> > to be a problem because notifier callbacks are made under down_read
> > of blocking_notifier_head->rwsem (so first the mutex is taken while
> > holding the rwsem, then later the rwsem is taken while still holding
> > the mutex); but is not in fact a problem because mem_hotplug_mutex
> > is held throughout the dance.
> > There was an attempt to fix this with mutex_lock_nested(); but if that
> > happened to fool lockdep two years ago, apparently it does so no
> > longer.
> > I had hoped to eradicate this issue in extending KSM page migration
> > not to need the ksm_thread_mutex. But then realized that although
> > the page migration itself is safe, we do still need to lock out ksmd
> > and other users of get_ksm_page() while offlining memory - at some
> > point between MEM_GOING_OFFLINE and MEM_OFFLINE, the struct pages
> > themselves may vanish, and get_ksm_page()'s accesses to them become a
> > violation.
> > So, give up on holding ksm_thread_mutex itself from MEM_GOING_OFFLINE
> > to MEM_OFFLINE, and add a KSM_RUN_OFFLINE flag, and
> > wait_while_offlining() checks, to achieve the same lockout without
> > being caught by lockdep. This is less elegant for KSM, but it's more
> > important to keep lockdep useful to other users - and I apologize for
> > how long it took to fix.
> Thanks a lot for the patch! I verified that it fixes the lockdep warning
> that we got on memory hotremove.
> > Reported-by: Gerald Schaefer <gerald.schaefer@xxxxxxxxxx>
> > Signed-off-by: Hugh Dickins <hughd@xxxxxxxxxx>
Thank you for reporting and testing and reporting back:
sorry again for taking so long to fix it.
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/