Re: [PATCH 02/11] Add "handle page fault" PV helper.

From: Ingo Molnar
Date: Mon Nov 02 2009 - 11:29:56 EST



* Gleb Natapov <gleb@xxxxxxxxxx> wrote:

> On Mon, Nov 02, 2009 at 05:12:48PM +0100, Ingo Molnar wrote:
> >
> > * Gleb Natapov <gleb@xxxxxxxxxx> wrote:
> >
> > > On Mon, Nov 02, 2009 at 10:22:14AM +0100, Ingo Molnar wrote:
> > > >
> > > > * Gleb Natapov <gleb@xxxxxxxxxx> wrote:
> > > >
> > > > > diff --git a/arch/x86/mm/fault.c b/arch/x86/mm/fault.c
> > > > > index f4cee90..14707dc 100644
> > > > > --- a/arch/x86/mm/fault.c
> > > > > +++ b/arch/x86/mm/fault.c
> > > > > @@ -952,6 +952,9 @@ do_page_fault(struct pt_regs *regs, unsigned long error_code)
> > > > > int write;
> > > > > int fault;
> > > > >
> > > > > + if (arch_handle_page_fault(regs, error_code))
> > > > > + return;
> > > > > +
> > > >
> > > > This patch is not acceptable unless it's done cleaner. Currently we
> > > > already have 3 callbacks in do_page_fault() (kmemcheck, mmiotrace,
> > > > notifier), and this adds a fourth one. Please consolidate them into a
> > > > single callback site, this is a hotpath on x86.
> > > >
> > > This call is patched out by paravirt patching mechanism so overhead
> > > should be zero for non paravirt cases. [...]
> >
> > arch_handle_page_fault() isnt upstream yet - precisely what is the
> > instruction sequence injected into do_page_fault() in the patched-out
> > case?
>
> It is introduced by the same patch. The instruction inserted is:
> xor %rax, %rax

ok.

My observations still stand:

> > > [...] What do you want to achieve by consolidate them into single
> > > callback? [...]
> >
> > Less bloat in a hotpath and a shared callback infrastructure.
> >
> > > [...] I mean the code will still exist and will have to be executed on
> > > every #PF. Is the goal to move them out of line?
> >
> > The goal is to have a single callback site for all the users - which
> > call-site is patched out ideally - on non-paravirt too if needed. Most
> > of these callbacks/notifier-chains have are inactive most of the time.
> >
> > I.e. a very low overhead 'conditional callback' facility, and a single
> > one - not just lots of them sprinkled around the code.

looks like a golden opportunity to get this right.

Ingo
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/