Re: [RFC][PATCH] x86: make text_poke() atomic

From: Ingo Molnar
Date: Mon Mar 02 2009 - 17:24:18 EST



* Masami Hiramatsu <mhiramat@xxxxxxxxxx> wrote:

> Mathieu Desnoyers wrote:
>> * Masami Hiramatsu (mhiramat@xxxxxxxxxx) wrote:
>>> Index: linux-2.6/init/main.c
>>> ===================================================================
>>> --- linux-2.6.orig/init/main.c
>>> +++ linux-2.6/init/main.c
>>> @@ -676,6 +676,9 @@ asmlinkage void __init start_kernel(void
>>> taskstats_init_early();
>>> delayacct_init();
>>>
>>> +#ifdef CONFIG_X86
>>> + text_poke_init();
>>> +#endif
>>
>> All good, except this above. There should be an empty text_poke_init()
>> in some header file, and an implementation for the X86 arch rather than
>> an ifdef in init/main.c.
>
> Hmm, I'd rather use __weak function instead of defining it in some header
> files, because text_poke() and alternatives exist only on x86.
>
> I know that we need to discuss cross modifying code on x86 with
> Arjan or other Intel engineers. This patch may still be useful
> for removing unnecessary vm_area allocation in text_poke().
>
> Thank you,
>
> ---
>
> Use map_vm_area() instead of vmap() in text_poke() for
> avoiding page allocation and delayed unmapping, and call
> vunmap_page_range() and local_flush_tlb() directly because
> this mapping is temporary and local.
>
> As a result of the above change, text_poke() becomes atomic and
> can be called from stop_machine() etc.

That looks like a good fix in itself - see a few minor details
below.

(Note, i could not try your patch because it has widespread
whitespace damage - please watch out for this for future
patches.)

> +static struct vm_struct *text_poke_area[2];
> +static DEFINE_SPINLOCK(text_poke_lock);
> +
> +void __init text_poke_init(void)
> +{
> + text_poke_area[0] = get_vm_area(PAGE_SIZE, VM_ALLOC);
> + text_poke_area[1] = get_vm_area(2 * PAGE_SIZE, VM_ALLOC);
> + BUG_ON(!text_poke_area[0] || !text_poke_area[1]);

BUG_ON() for non-100%-essential init code is a no-no. Please
change it to WARN_ON() so that people have a chance to report it.
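
A rough userspace sketch of the warn-and-continue pattern suggested
above, not the actual kernel code. warn_on(), get_area_stub(),
text_poke_init_sketch(), text_poke_ready and SKETCH_PAGE_SIZE are all
hypothetical stand-ins invented for illustration; the real WARN_ON()
and get_vm_area() live in the kernel and behave differently in detail.

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Hypothetical userspace stand-in for the kernel's WARN_ON():
 * print a warning and hand the condition back to the caller,
 * instead of halting the way BUG_ON() would.
 */
static int warn_on(int cond, const char *expr, const char *file, int line)
{
	if (cond)
		fprintf(stderr, "WARNING: %s at %s:%d\n", expr, file, line);
	return cond;
}
#define WARN_ON(cond) warn_on(!!(cond), #cond, __FILE__, __LINE__)

#define SKETCH_PAGE_SIZE 4096	/* assumption: 4 KiB pages */

static void *text_poke_area[2];
static int text_poke_ready;	/* hypothetical "feature usable" flag */

/* Hypothetical allocator stub; 'fail' forces an allocation failure. */
static void *get_area_stub(size_t size, int fail)
{
	return fail ? NULL : malloc(size);
}

/*
 * Init that warns and leaves the feature disabled on failure,
 * so the system keeps running and the problem can be reported.
 */
int text_poke_init_sketch(int simulate_failure)
{
	text_poke_area[0] = get_area_stub(SKETCH_PAGE_SIZE, simulate_failure);
	text_poke_area[1] = get_area_stub(2 * SKETCH_PAGE_SIZE, simulate_failure);

	if (WARN_ON(!text_poke_area[0] || !text_poke_area[1]))
		return -1;

	text_poke_ready = 1;
	return 0;
}
```

With this shape, a failed allocation prints a warning and returns an
error while the rest of boot carries on; callers would have to check
the flag (or the area pointers) before poking.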

Also, i think all these vma complications came from the decision
to use vmap - and vmap enhancements in .29 complicated this
supposedly-simple interface.

So perhaps another approach to (re-)consider would be to go back
to atomic fixmaps here. It spends 3 slots but that's no big
deal.

In exchange it will be conceptually simpler, and will also scale
much better than a global spinlock. What do you think?

Ingo