Re: [RFC] Tight check of pfn_valid on sparsemem

From: KAMEZAWA Hiroyuki
Date: Tue Jul 13 2010 - 00:28:08 EST


On Tue, 13 Jul 2010 13:11:14 +0900
Minchan Kim <minchan.kim@xxxxxxxxx> wrote:

> On Tue, Jul 13, 2010 at 12:19 PM, KAMEZAWA Hiroyuki
> <kamezawa.hiroyu@xxxxxxxxxxxxxx> wrote:
> > On Tue, 13 Jul 2010 00:53:48 +0900
> > Minchan Kim <minchan.kim@xxxxxxxxx> wrote:
> >
> >> Kukjin, could you test the patch below?
> >> I don't have any sparsemem system. Sorry.
> >>
> >> -- CUT DOWN HERE --
> >>
> >> Kukjin reported an oops that happens when he changes min_free_kbytes:
> >> http://www.spinics.net/lists/arm-kernel/msg92894.html
> >> It is caused by the memory map on sparsemem.
> >>
> >> The system has the following memory map.
> >>       section 0              section 1              section 2
> >> 0x20000000-0x25000000, 0x40000000-0x50000000, 0x50000000-0x58000000
> >> SECTION_SIZE_BITS 28(256M)
> >>
> >> It means section 0 is an incompletely filled section.
> >> Nonetheless, the current pfn_valid of sparsemem checks the pfn loosely.
> >>
> >> It checks only whether the mem_section is valid.
> >> So in the above case, the pfn at 0x25000000 can pass pfn_valid's check.
> >> It's not what we want.
> >>
> >> The following patch adds a valid pfn range check to pfn_valid of sparsemem.
> >>
> >> Signed-off-by: Minchan Kim <minchan.kim@xxxxxxxxx>
> >> Reported-by: Kukjin Kim <kgene.kim@xxxxxxxxxxx>
> >>
> >> P.S)
> >> It is just an RFC. If we agree on this, I will make the patch against mmotm.
> >>
> >> --
> >>
> >> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >> index b4d109e..6c2147a 100644
> >> --- a/include/linux/mmzone.h
> >> +++ b/include/linux/mmzone.h
> >> @@ -979,6 +979,8 @@ struct mem_section {
> >>         struct page_cgroup *page_cgroup;
> >>         unsigned long pad;
> >>  #endif
> >> +       unsigned long start_pfn;
> >> +       unsigned long end_pfn;
> >>  };
> >>
> >
> > I have 2 concerns.
> >  1. This makes mem_section twice as large. It wastes too much memory and is not good for the cache.
> >     But yes, you can put this under some CONFIG which has a small number of mem_section[].
> >
>
> I think memory usage isn't a big deal. But for the cache, we can move
> the fields to just after section_mem_map.
>
I don't think so. These additional fields can eat up the amount of memory you saved
by unmapping.
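
The size cost can be sketched in plain C. This is only a rough model: the two
structs below are hypothetical stand-ins for mem_section without the
page_cgroup fields, and model_extra_bytes() is an illustrative helper, not a
kernel function. On a target where pointers and unsigned long have the same
width, the patched stand-in is twice the size of the original one.

```c
#include <stddef.h>

/* Hypothetical stand-in for mem_section with the page_cgroup fields
 * compiled out (an assumption about the config). */
struct model_mem_section {
	unsigned long section_mem_map;
	unsigned long *pageblock_flags;
};

/* The same stand-in with the two fields the RFC patch adds. */
struct model_mem_section_patched {
	unsigned long section_mem_map;
	unsigned long *pageblock_flags;
	unsigned long start_pfn;
	unsigned long end_pfn;
};

/* Extra bytes needed for nr_sections entries of the patched struct. */
static size_t model_extra_bytes(size_t nr_sections)
{
	return nr_sections * (sizeof(struct model_mem_section_patched) -
			      sizeof(struct model_mem_section));
}
```

Whether that extra space outweighs the memmap freed for the holes depends on
how many mem_section entries the configuration allocates.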

> >  2. This can't help in a case where a section has multiple small holes.
>
> I agree. But this problem (a section that is not fully filled, rather than
> a punched hole) isn't such a case. Still, it would be better to handle both together. :)
>
> >
> > Then, my proposal for HOLES_IN_MEMMAP sparsemem is below.
> > ==
> > Some architectures unmap memmap[] for memory holes even with SPARSEMEM.
> > To handle that, pfn_valid() should check whether the memmap really exists.
> > For that purpose, __get_user() can be used.
>
> Look at free_unused_memmap. We don't unmap the ptes of the hole's memmap.
> Is __get_user still effective?
>
__get_user() goes through the TLB and page tables, so it tells whether the vaddr is
really mapped or not. If the access faults (the case where you would get SEGV),
__get_user() returns -EFAULT. It works at page granularity.
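
A rough user-space model of the idea, using the memory map quoted above. It is
only a sketch under stated assumptions: 4K pages, a hypothetical 32-byte
struct page, and model_probe_memmap() standing in for a __get_user() read of
the memmap entry. None of the model_* names exist in the kernel.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define PAGE_SHIFT		12	/* assumed 4K pages */
#define SECTION_SIZE_BITS	28	/* 256M sections, as in the report */
#define PFN_SECTION_SHIFT	(SECTION_SIZE_BITS - PAGE_SHIFT)
#define MODEL_STRUCT_PAGE_SIZE	32UL	/* hypothetical sizeof(struct page) */
#define PFNS_PER_MEMMAP_PAGE	((1UL << PAGE_SHIFT) / MODEL_STRUCT_PAGE_SIZE)

struct model_bank { unsigned long start_pfn, end_pfn; };	/* end exclusive */

/* The three ranges from the report, expressed as pfns. */
static const struct model_bank banks[] = {
	{ 0x20000, 0x25000 },
	{ 0x40000, 0x50000 },
	{ 0x50000, 0x58000 },
};
#define NR_BANKS (sizeof(banks) / sizeof(banks[0]))

/* True if the pfn range [lo, hi) overlaps any bank. */
static bool model_overlaps_bank(unsigned long lo, unsigned long hi)
{
	for (size_t i = 0; i < NR_BANKS; i++)
		if (lo < banks[i].end_pfn && hi > banks[i].start_pfn)
			return true;
	return false;
}

/* Loose check: the pfn's section contains some memory. */
static bool model_section_valid(unsigned long pfn)
{
	unsigned long lo = (pfn >> PFN_SECTION_SHIFT) << PFN_SECTION_SHIFT;

	return model_overlaps_bank(lo, lo + (1UL << PFN_SECTION_SHIFT));
}

/* Stand-in for the __get_user() probe: 0 if the memmap page that holds
 * this pfn's entry is backed, -EFAULT if it was unmapped. */
static int model_probe_memmap(unsigned long pfn)
{
	unsigned long lo = pfn - (pfn % PFNS_PER_MEMMAP_PAGE);

	return model_overlaps_bank(lo, lo + PFNS_PER_MEMMAP_PAGE) ? 0 : -EFAULT;
}

/* Section check first, then the per-page probe. */
static bool model_pfn_valid(unsigned long pfn)
{
	return model_section_valid(pfn) && model_probe_memmap(pfn) == 0;
}
```

In this model the pfn at 0x25000000 passes the section check but fails the
probe, and the probe also covers a section with several small holes, as long
as each hole spans at least one whole memmap page.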

Thanks,
-Kame
