Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges

From: Christoph Lameter
Date: Wed Feb 27 2008 - 17:36:14 EST


On Wed, 20 Feb 2008, Nick Piggin wrote:

> On Friday 15 February 2008 17:49, Christoph Lameter wrote:
> > The invalidation of address ranges in a mm_struct needs to be
> > performed when pages are removed or permissions etc change.
> >
> > If invalidate_range_begin() is called with locks held then we
> > pass a flag into invalidate_range() to indicate that no sleeping is
> > possible. Locks are only held for truncate and huge pages.
>
> You can't sleep inside rcu_read_lock()!

Could you be specific? Does this refer to page migration? Hmmm... I guess
we would need to increment the refcount there instead?

> I must say that for a patch that is up to v8 or whatever and is
> posted twice a week to such a big cc list, it is kind of slack to
> not even test it and expect other people to review it.

It was tested with the GRU and XPmem. Andrea also reported success.

> Also, what we are going to need here are not skeleton drivers
> that just do all the *easy* bits (of registering their callbacks),
> but actual fully working examples that do everything that any
> real driver will need to do. If not for the sanity of the driver
> writer, then for the sanity of the VM developers (I don't want
> to have to understand xpmem or infiniband in order to understand
> how the VM works).

There are 3 different drivers that can already use it, but their code is
complex and not easy to review. Skeletons make it easy for people to get
started with it.

> >  	lru_add_drain();
> >  	tlb = tlb_gather_mmu(mm, 0);
> >  	update_hiwater_rss(mm);
> > +	mmu_notifier(invalidate_range_begin, mm, address, end, atomic);
> >  	end = unmap_vmas(&tlb, vma, address, end, &nr_accounted, details);
> >  	if (tlb)
> >  		tlb_finish_mmu(tlb, address, end);
> > +	mmu_notifier(invalidate_range_end, mm, address, end, atomic);
> >  	return end;
> >  }
> >
>
> Where do you invalidate for munmap()?

zap_page_range() called from unmap_vmas().
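
As an illustration only, the begin/end bracketing in the quoted hunk can be
modelled in userspace. This is a sketch, not kernel code: struct notifier_ops,
struct trace, trace_ops and model_zap_range() are hypothetical names invented
for the example, and the body of unmap_vmas() is replaced by a single
assignment. It only shows that both callbacks see the same range and the same
atomic flag, and that the unmap runs between them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for a driver's notifier callbacks. */
struct notifier_ops {
	void (*invalidate_range_begin)(void *ctx, unsigned long start,
				       unsigned long end, int atomic);
	void (*invalidate_range_end)(void *ctx, unsigned long start,
				     unsigned long end, int atomic);
};

/* Records what the callbacks saw so the ordering can be checked. */
struct trace {
	int begin_calls;
	int end_calls;
	int in_range;		/* set by begin, cleared by end */
	int unmapped_between;	/* 1 if the unmap ran inside the bracket */
	int last_atomic;	/* atomic flag seen by the begin callback */
};

static void trace_begin(void *ctx, unsigned long start, unsigned long end,
			int atomic)
{
	struct trace *t = ctx;

	(void)start;
	(void)end;
	t->begin_calls++;
	t->in_range = 1;
	t->last_atomic = atomic;
}

static void trace_end(void *ctx, unsigned long start, unsigned long end,
		      int atomic)
{
	struct trace *t = ctx;

	(void)start;
	(void)end;
	(void)atomic;
	t->end_calls++;
	t->in_range = 0;
}

static const struct notifier_ops trace_ops = {
	.invalidate_range_begin = trace_begin,
	.invalidate_range_end = trace_end,
};

/* Models the shape of the patched function: begin, unmap, end. */
static unsigned long model_zap_range(const struct notifier_ops *ops,
				     struct trace *t, unsigned long address,
				     unsigned long size, int atomic)
{
	unsigned long end = address + size;

	assert(ops && t);
	ops->invalidate_range_begin(t, address, end, atomic);
	/* Stand-in for unmap_vmas(): note whether we are inside the bracket. */
	t->unmapped_between = t->in_range;
	ops->invalidate_range_end(t, address, end, atomic);
	return end;
}
```

Calling model_zap_range(&trace_ops, &t, 0x1000, 0x2000, 1) on a zeroed
struct trace leaves one begin call, one end call, and unmapped_between set.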

> Also, how do you resolve the case where you are not allowed to sleep?
> I would have thought either you have to handle it, in which case nobody
> needs to sleep; or you can't handle it, in which case the code is
> broken.

That can be done in a variety of ways:

1. Change VM locking

2. Do not handle file-backed mappings (XPmem could mostly work in such a
config)

3. Keep the refcount elevated until pages are freed in another execution
context.
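
One possible reading of option 3, as a userspace sketch rather than kernel
code: when the callback runs in atomic context it does not tear down the
external mapping itself. It takes an extra reference on the page and queues
it, and a later pass in a context that may sleep does the teardown and drops
that reference. All names here (struct model_page, struct deferred_list,
invalidate_page_model(), run_deferred_model(), and so on) are hypothetical
and chosen only for this example; a real driver would need locking around the
list, which is omitted.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a page that an external device still maps. */
struct model_page {
	int refcount;
	int device_mapped;	/* 1 while the device still has a mapping */
	struct model_page *next_deferred;
};

/* Pages whose teardown was postponed because the caller could not sleep. */
struct deferred_list {
	struct model_page *head;
};

/* Teardown that might block in a real driver; here it only clears a flag. */
static void device_unmap_may_sleep(struct model_page *page)
{
	page->device_mapped = 0;
}

static void put_page_model(struct model_page *page)
{
	assert(page->refcount > 0);
	page->refcount--;
}

/*
 * Invalidate callback. If sleeping is allowed, tear down immediately.
 * Otherwise pin the page and queue it for a later, sleepable context.
 */
static void invalidate_page_model(struct deferred_list *list,
				  struct model_page *page, int atomic)
{
	if (!atomic) {
		device_unmap_may_sleep(page);
		return;
	}
	page->refcount++;	/* keep the page alive until deferred work runs */
	page->next_deferred = list->head;
	list->head = page;
}

/* Runs later where sleeping is allowed; returns the number of pages handled. */
static int run_deferred_model(struct deferred_list *list)
{
	int handled = 0;

	while (list->head) {
		struct model_page *page = list->head;

		list->head = page->next_deferred;
		page->next_deferred = NULL;
		device_unmap_may_sleep(page);
		put_page_model(page);	/* drop the reference taken above */
		handled++;
	}
	return handled;
}
```

The point of the extra reference is that the VM can drop its own reference
in the atomic path without the page being freed while the device mapping is
still live; the final put happens only after the deferred teardown.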

