Re: [RFC] Tight check of pfn_valid on sparsemem

From: KAMEZAWA Hiroyuki
Date: Mon Jul 12 2010 - 23:24:40 EST


On Tue, 13 Jul 2010 00:53:48 +0900
Minchan Kim <minchan.kim@xxxxxxxxx> wrote:

> Kukjin, could you test the patch below?
> I don't have any sparsemem system. Sorry.
>
> -- CUT DOWN HERE --
>
> Kukjin reported that an oops happens when he changes min_free_kbytes:
> http://www.spinics.net/lists/arm-kernel/msg92894.html
> It is caused by the memory map on sparsemem.
>
> The system has the following memory map:
> section 0: 0x20000000-0x25000000
> section 1: 0x40000000-0x50000000
> section 2: 0x50000000-0x58000000
> SECTION_SIZE_BITS 28 (256M)
>
> This means section 0 is an incompletely filled section.
> Nonetheless, the current pfn_valid of sparsemem checks the pfn loosely.
>
> It only checks whether the mem_section is valid.
> So in the above case, a pfn at 0x25000000 can pass pfn_valid's validation check.
> That is not what we want.
>
> The following patch adds a valid pfn range check to pfn_valid of sparsemem.
>
> Signed-off-by: Minchan Kim <minchan.kim@xxxxxxxxx>
> Reported-by: Kukjin Kim <kgene.kim@xxxxxxxxxxx>
>
> P.S.
> This is just an RFC. If we agree on this, I will make the patch against mmotm.
>
> --
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index b4d109e..6c2147a 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -979,6 +979,8 @@ struct mem_section {
> struct page_cgroup *page_cgroup;
> unsigned long pad;
> #endif
> + unsigned long start_pfn;
> + unsigned long end_pfn;
> };
>

I have 2 concerns.
1. This doubles the size of mem_section. That wastes too much memory and is not good for the cache.
But yes, you could put this under some CONFIG that is used only where the number of mem_section[] entries is small.

2. This does not help in a case where a section has multiple small holes.
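
To make the reported case concrete, here is a minimal user-space sketch of the section arithmetic. It is a model, not kernel code. SECTION_SIZE_BITS comes from the report. PAGE_SHIFT of 12, the table size, and every name below (section_model, loose_pfn_valid, tight_pfn_valid, setup_reported_map) are assumptions made for illustration. Note that with this arithmetic the raw section numbers are 2, 4 and 5; the report's "section 0/1/2" labels are just ordinal.

```c
#include <assert.h>
#include <stdbool.h>

/* SECTION_SIZE_BITS is from the report; PAGE_SHIFT (4 KiB pages) is assumed. */
#define PAGE_SHIFT        12
#define SECTION_SIZE_BITS 28
#define PFN_SECTION_SHIFT (SECTION_SIZE_BITS - PAGE_SHIFT)
#define NR_SECTIONS       16 /* hypothetical table size, covers 4 GiB */

/* Hypothetical per-section record holding ONE populated range [start, end). */
struct section_model {
	bool valid;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

static struct section_model sections[NR_SECTIONS];

static unsigned long addr_to_pfn(unsigned long long addr)
{
	return (unsigned long)(addr >> PAGE_SHIFT);
}

static unsigned long pfn_to_section(unsigned long pfn)
{
	return pfn >> PFN_SECTION_SHIFT;
}

/* Record a populated physical range; it must not cross a section boundary. */
static void add_range(unsigned long long start, unsigned long long end)
{
	unsigned long nr = pfn_to_section(addr_to_pfn(start));

	assert(nr < NR_SECTIONS);
	assert(pfn_to_section(addr_to_pfn(end - 1)) == nr);
	sections[nr].valid = true;
	sections[nr].start_pfn = addr_to_pfn(start);
	sections[nr].end_pfn = addr_to_pfn(end);
}

/* The memory map from the report. */
static void setup_reported_map(void)
{
	add_range(0x20000000ULL, 0x25000000ULL);
	add_range(0x40000000ULL, 0x50000000ULL);
	add_range(0x50000000ULL, 0x58000000ULL);
}

/* Section-granular check: only asks whether the section exists. */
static bool loose_pfn_valid(unsigned long pfn)
{
	unsigned long nr = pfn_to_section(pfn);

	return nr < NR_SECTIONS && sections[nr].valid;
}

/* Range check in the spirit of the start_pfn/end_pfn fields quoted above. */
static bool tight_pfn_valid(unsigned long pfn)
{
	const struct section_model *s;

	if (!loose_pfn_valid(pfn))
		return false;
	s = &sections[pfn_to_section(pfn)];
	return pfn >= s->start_pfn && pfn < s->end_pfn;
}
```

After setup_reported_map(), the pfn for 0x25000000 passes loose_pfn_valid() but fails tight_pfn_valid(). Because the model keeps a single [start_pfn, end_pfn) pair per section, it cannot describe a section with several holes, which is concern 2.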


So, my proposal for sparsemem with HOLES_IN_MEMMAP is below.
==
Some architectures unmap memmap[] for memory holes even with SPARSEMEM.
To handle that, pfn_valid() should check whether the memmap is really there.
__get_user() can be used for that purpose.
The idea is taken from ia64_pfn_valid().

Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@xxxxxxxxxxxxxx>
---
include/linux/mmzone.h | 12 ++++++++++++
mm/sparse.c | 17 +++++++++++++++++
2 files changed, 29 insertions(+)

Index: mmotm-2.6.35-0701/include/linux/mmzone.h
===================================================================
--- mmotm-2.6.35-0701.orig/include/linux/mmzone.h
+++ mmotm-2.6.35-0701/include/linux/mmzone.h
@@ -1047,12 +1047,24 @@ static inline struct mem_section *__pfn_
return __nr_to_section(pfn_to_section_nr(pfn));
}

+#ifndef CONFIG_ARCH_HAS_HOLES_IN_MEMMAP
static inline int pfn_valid(unsigned long pfn)
{
if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
return 0;
return valid_section(__nr_to_section(pfn_to_section_nr(pfn)));
}
+#else
+extern int pfn_valid_mapped(unsigned long pfn);
+static inline int pfn_valid(unsigned long pfn)
+{
+ if (pfn_to_section_nr(pfn) >= NR_MEM_SECTIONS)
+ return 0;
+ if (!valid_section(__nr_to_section(pfn_to_section_nr(pfn))))
+ return 0;
+ return pfn_valid_mapped(pfn);
+}
+#endif

static inline int pfn_present(unsigned long pfn)
{
Index: mmotm-2.6.35-0701/mm/sparse.c
===================================================================
--- mmotm-2.6.35-0701.orig/mm/sparse.c
+++ mmotm-2.6.35-0701/mm/sparse.c
@@ -799,3 +799,20 @@ void sparse_remove_one_section(struct zo
free_section_usemap(memmap, usemap);
}
#endif
+
+#ifdef CONFIG_ARCH_HAS_HOLES_IN_MEMMAP
+int pfn_valid_mapped(unsigned long pfn)
+{
+ struct page *page = pfn_to_page(pfn);
+ char *lastbyte = (char *)(page+1)-1;
+ char byte;
+
+ if (__get_user(byte, (char *)page) != 0)
+ return 0;
+
+ if ((((unsigned long)page) & PAGE_MASK) ==
+ (((unsigned long)lastbyte) & PAGE_MASK))
+ return 1;
+ return (__get_user(byte, lastbyte) == 0);
+}
+#endif

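The two-probe logic in pfn_valid_mapped() can be modelled in user space as follows. This is only a sketch under stated assumptions: the 4096-byte page size is assumed, probe_fn is a hypothetical stand-in for __get_user(), and probe_only_page_one() is a made-up mapping used for the demonstration. The point is that one memmap entry may straddle a page boundary, so the last byte needs its own probe only in that case.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_PAGE_SIZE 4096u /* assumed page size */
#define MODEL_PAGE_MASK (~(uintptr_t)(MODEL_PAGE_SIZE - 1))

/* True when the object [addr, addr + size) crosses a page boundary. */
static bool needs_second_probe(uintptr_t addr, size_t size)
{
	uintptr_t last;

	assert(size > 0);
	last = addr + size - 1;
	return (addr & MODEL_PAGE_MASK) != (last & MODEL_PAGE_MASK);
}

/* Hypothetical stand-in for __get_user(): true if addr is readable. */
typedef bool (*probe_fn)(uintptr_t addr);

/* Probe the first byte, and the last byte only if it is on another page. */
static bool entry_is_mapped(uintptr_t addr, size_t size, probe_fn probe)
{
	if (!probe(addr))
		return false;
	if (!needs_second_probe(addr, size))
		return true;
	return probe(addr + size - 1);
}

/* Made-up mapping for the demo: only the page [0x1000, 0x2000) is mapped. */
static bool probe_only_page_one(uintptr_t addr)
{
	return addr >= 0x1000 && addr < 0x2000;
}
```

With a 32-byte entry, one that starts at 0x1fe0 ends at 0x1fff and needs a single probe, while one that starts at 0x1ff0 spills into the unmapped next page and is rejected by the second probe.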