Re: [PATCH] i386: fix vmalloc_sync_all() for Xen

From: Jeremy Fitzhardinge
Date: Thu Jun 19 2008 - 14:17:57 EST


Jan Beulich wrote:
Since start is a static variable, it must be updated this way. The intention
here is to shorten the loop in later runs - since kernel page table entries
never go away, this is possible. Possibly just using the insync array would
be sufficient, but when I first coded this I wanted to avoid as much
overhead as was possible.

Yes, I see. How often does this get called? alloc_vm_area() and register_notify_die(). alloc_vm_area is only called by the grant-table code, and register_notify_die() is boot-time init. Is this worth optimising at all?


spin_lock_irqsave(&pgd_lock, flags);
+ if (unlikely(list_empty(&pgd_list))) {
+ spin_unlock_irqrestore(&pgd_lock, flags);
+ return;
+ }
This seems a bit warty. If the list is empty, then won't the list_for_each_entry() just fall through? Presumably this only applies to boot, since pgd_list won't be empty on a running system with usermode processes. Is there a correctness issue here, or is it just a micro-optimisation?
No, it isn't. Note the setting to NULL of page, which after the loop gets
tested for. list_for_each_entry() would never yield a NULL page, even
if the list is empty.
Does that matter? If pgd_list is empty, then it's in sync by definition. Why does it need special-casing?

Yes, certainly. But it would result in all insync bits set, which would be
wrong - only non-empty page directory entries can be in sync.

I think it would be better to separately test whether the vmalloc mapping is present in the init_mm and skip the syncing loop in that case, rather than this somewhat convoluted logic to overload the test in vmalloc_sync_one.

list_for_each_entry(page, &pgd_list, lru) {
if (!vmalloc_sync_one(page_address(page),
- address))
+ address)) {
+ BUG_ON(list_first_entry(&pgd_list,
+ struct page,
+ lru) != page);
What condition is this testing for?
This is a replacement of the BUG_ON() that an earlier patch from you
removed: Failure of vmalloc_sync_one() must happen on the first
entry or never, and this is what is being checked for here.
Could you add a comment?

Sure, though there was none originally, and the intention seemed
quite clear to me.

Well, looks to me like vmalloc_sync_one can only return NULL iff the vmalloc mapping is absent in init_mm, so that's going to be invariant with respect to any other pgd you pass in. So I don't think the BUG_ON will ever fire, and it's unclear what actual logical property it's testing for.

I think all this can be cleaned up quite a bit, but this patch is an improvement over what's currently there.

Acked-by: Jeremy Fitzhardinge <jeremy.fitzhardinge@xxxxxxxxxx>

J

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/