Re: [PATCHv2, RFC 13/30] thp, mm: implement grab_cache_huge_page_write_begin()

From: Dave Hansen
Date: Thu Mar 21 2013 - 14:13:58 EST


On 03/14/2013 10:50 AM, Kirill A. Shutemov wrote:
> From: "Kirill A. Shutemov" <kirill.shutemov@xxxxxxxxxxxxxxx>
>
> The function is grab_cache_page_write_begin() twin but it tries to
> allocate huge page at given position aligned to HPAGE_CACHE_NR.

The obvious question, then, is whether we should just replace
grab_cache_page_write_begin() with this code and pass in HPAGE_CACHE_NR
or 1 based on whether we're doing a huge or normal page.

> diff --git a/mm/filemap.c b/mm/filemap.c
> index 38fdc92..bdedb1b 100644
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2332,6 +2332,64 @@ found:
> }
> EXPORT_SYMBOL(grab_cache_page_write_begin);
>
> +#ifdef CONFIG_TRANSPARENT_HUGEPAGE
> +/*
> + * Find or create a huge page at the given pagecache position, aligned to
> + * HPAGE_CACHE_NR. Return the locked huge page.
> + *
> + * If, for some reason, it's not possible allocate a huge page at this
> + * possition, it returns NULL. Caller should take care of fallback to small
> + * pages.
> + *
> + * This function is specifically for buffered writes.
> + */
> +struct page *grab_cache_huge_page_write_begin(struct address_space *mapping,
> + pgoff_t index, unsigned flags)
> +{
> + int status;
> + gfp_t gfp_mask;
> + struct page *page;
> + gfp_t gfp_notmask = 0;
> +
> + BUG_ON(index & HPAGE_CACHE_INDEX_MASK);

--
> + gfp_mask = mapping_gfp_mask(mapping);
> + BUG_ON(!(gfp_mask & __GFP_COMP));
> + if (mapping_cap_account_dirty(mapping))
> + gfp_mask |= __GFP_WRITE;
> + if (flags & AOP_FLAG_NOFS)
> + gfp_notmask = __GFP_FS;

This whole hunk is both non-obvious and copy-n-pasted from
grab_cache_page_write_begin(). That makes me worry that bugs/features
will get added/removed in one and not the other. I really think they
need to get consolidated somehow.

> +repeat:
> + page = find_lock_page(mapping, index);
> + if (page) {
> + if (!PageTransHuge(page)) {
> + unlock_page(page);
> + page_cache_release(page);
> + return NULL;
> + }
> + goto found;
> + }
> +
> + page = alloc_pages(gfp_mask & ~gfp_notmask, HPAGE_PMD_ORDER);

I alluded to this a second ago, but what's wrong with alloc_hugepage()?

> + if (!page) {
> + count_vm_event(THP_WRITE_FAILED);
> + return NULL;
> + }
> +
> + count_vm_event(THP_WRITE_ALLOC);
> + status = add_to_page_cache_lru(page, mapping, index,
> + GFP_KERNEL & ~gfp_notmask);
> + if (unlikely(status)) {
> + page_cache_release(page);
> + if (status == -EEXIST)
> + goto repeat;
> + return NULL;
> + }

I'm rather un-fond of sprinking likely/unlikelies around. But, I guess
this is really just copied from the existing one. <sigh>

> +found:
> + wait_on_page_writeback(page);
> + return page;
> +}
> +#endif

So, I diffed :

-struct page *grab_cache_page_write_begin(struct address_space
vs.
+struct page *grab_cache_huge_page_write_begin(struct address_space

They're just to similar to ignore. Please consolidate them somehow.

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/