Re: [PATCH] Minimal mmu notifiers for kvm

From: Robin Holt
Date: Fri Apr 25 2008 - 07:12:56 EST


This patch would require GRU to maintain its own page tables and hold
reference counts on the pages. That seems like a complete waste of
memory compared to Andrea's most recent patch. The invalidate_range_start
and invalidate_range_end pair is needed to eliminate the page reference
counts. The _start callout puts an internal structure into a state that
prevents GRU from satisfying faults, then executes the GRU instruction
to flush the TLB entry. The _end callout releases the block on faults.
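
The _start/_end protocol described above can be modelled in user space
roughly as follows. This is only a sketch: the struct, the function names
and the use of a counter rather than a single flag are assumptions made
for illustration, not code from the GRU driver or from either patch, and
the model is single-threaded (a real driver would need locking).

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-context state; not taken from the GRU driver. */
struct gru_ctx_model {
	int range_active;	/* invalidations currently in flight */
	int tlb_flushes;	/* counts simulated TLB flush instructions */
};

/* _start: block faults first, then flush the stale translation. */
static void model_invalidate_range_start(struct gru_ctx_model *ctx)
{
	ctx->range_active++;
	ctx->tlb_flushes++;	/* stands in for the GRU flush instruction */
}

/* _end: release the block on faults. */
static void model_invalidate_range_end(struct gru_ctx_model *ctx)
{
	assert(ctx->range_active > 0);
	ctx->range_active--;
}

/* Fault path: a fault may be satisfied only outside a _start/_end pair. */
static bool model_try_fault(const struct gru_ctx_model *ctx)
{
	return ctx->range_active == 0;
}
```

Because faults are refused for the whole window between the two callouts,
no page in the range can be re-instantiated in the TLB while it is being
torn down, which is why no page reference count is needed in this scheme.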

On Fri, Apr 25, 2008 at 08:13:00AM +1000, Rusty Russell wrote:
> +static DEFINE_SPINLOCK(notifier_lock);
> +
> +/*
> + * Must not hold mmap_sem nor any other VM related lock when calling
> + * this registration function.
> + */
> +int mm_add_notifier_ops(struct mm_struct *mm,
> +			const struct mmu_notifier_ops *mops)
> +{
> +	int err;
> +
> +	spin_lock(&notifier_lock);

This one global lock will get extremely hot when a 4096-rank MPI job
is starting up and every rank goes to use the GRU at once. I am
not sure where x86_64 tops out, but on ia64, going beyond approximately
32 CPUs contending for the same lock made starvation a very important issue.
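
One way registration could avoid a global lock is to keep the
serialization inside each mm, so that ranks registering on different
mms never touch the same cache line. The following is a user-space
sketch under that assumption, using a C11 compare-and-swap on a per-mm
slot; model_mm, model_notifier_ops and model_add_notifier_ops are
hypothetical names, and nothing here is proposed in this thread.

```c
#include <assert.h>
#include <errno.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for the ops table in the quoted patch. */
struct model_notifier_ops {
	void (*invalidate_page)(void *ctx, unsigned long addr);
};

/* Hypothetical stand-in for mm_struct: the slot itself is atomic. */
struct model_mm {
	_Atomic(const struct model_notifier_ops *) notifier_ops;
};

/*
 * Install mops if the slot is empty, else fail with -EBUSY.
 * Only this mm's slot is written, so there is no shared global state.
 */
static int model_add_notifier_ops(struct model_mm *mm,
				  const struct model_notifier_ops *mops)
{
	const struct model_notifier_ops *expected = NULL;

	assert(mm != NULL && mops != NULL);
	if (atomic_compare_exchange_strong(&mm->notifier_ops, &expected, mops))
		return 0;
	return -EBUSY;
}
```

This keeps the single-slot semantics of the quoted function and only
changes where the contention lands.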

> +	if (mm->mmu_notifier_ops)
> +		err = -EBUSY;

So we can only use one of KVM or GRU or Quadrics or IB or (later) XPMEM
per mm?

> +	else {
> +		mm->mmu_notifier_ops = mops;
> +		err = 0;
> +	}
> +	spin_unlock(&notifier_lock);
> +	return err;
> +}
> +EXPORT_SYMBOL_GPL(mm_add_notifier_ops);
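
On the -EBUSY question above: the one-user-per-mm limit comes from the
single mmu_notifier_ops pointer. A list of registrations per mm would
let several users coexist. The following user-space sketch shows the
idea only; all names are hypothetical, it is not the interface of either
patch, and locking and unregistration are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical registration record, one per user (KVM, GRU, ...). */
struct notifier_model {
	void (*invalidate_page)(struct notifier_model *n, unsigned long addr);
	struct notifier_model *next;
	int hits;		/* bookkeeping for the example only */
};

/* Hypothetical stand-in for mm_struct holding a list head. */
struct mm_list_model {
	struct notifier_model *notifiers;
};

/* Registration pushes onto the list and never fails with -EBUSY. */
static void model_register(struct mm_list_model *mm, struct notifier_model *n)
{
	assert(mm != NULL && n != NULL);
	n->next = mm->notifiers;
	mm->notifiers = n;
}

/* Invalidation walks every registered user; returns how many were called. */
static int model_invalidate_page(struct mm_list_model *mm, unsigned long addr)
{
	int called = 0;

	for (struct notifier_model *n = mm->notifiers; n; n = n->next) {
		n->invalidate_page(n, addr);
		called++;
	}
	return called;
}

/* Example callback: just records that it ran. */
static void count_hit(struct notifier_model *n, unsigned long addr)
{
	(void)addr;
	n->hits++;
}
```

With two records registered on the same mm, one invalidation reaches
both of them, which is the behaviour a job using GRU and XPMEM together
would need.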

Robin