Re: linux-next: Tree for Nov 7

From: Michal Hocko
Date: Mon Nov 13 2017 - 10:31:49 EST


On Mon 13-11-17 15:09:09, Russell King - ARM Linux wrote:
> On Mon, Nov 13, 2017 at 03:11:40PM +0100, Michal Hocko wrote:
> > On Mon 13-11-17 10:20:06, Michal Hocko wrote:
> > > [Cc arm and ppc maintainers]
> > >
> > > Thanks a lot for testing!
> > >
> > > On Sun 12-11-17 11:38:02, Joel Stanley wrote:
> > > > On Fri, Nov 10, 2017 at 11:00 PM, Michal Hocko <mhocko@xxxxxxxxxx> wrote:
> > > > > Hi Joel,
> > > > >
> > > > > On Wed 08-11-17 15:20:50, Michal Hocko wrote:
> > > > > [...]
> > > > >> > There are a lot of messages on the way up that look like this:
> > > > >> >
> > > > >> > [ 2.527460] Uhuuh, elf segement at 000d9000 requested but the
> > > > >> > memory is mapped already
> > > > >> > [ 2.540160] Uhuuh, elf segement at 000d9000 requested but the
> > > > >> > memory is mapped already
> > > > >> > [ 2.546153] Uhuuh, elf segement at 000d9000 requested but the
> > > > >> > memory is mapped already
> > > > >> >
> > > > >> > And then trying to run userspace looks like this:
> > > > >>
> > > > >> Could you please run with debugging patch posted
> > > > >> http://lkml.kernel.org/r/20171107102854.vylrtaodla63kc57@xxxxxxxxxxxxxx
> > > > >
> > > > > Did you have chance to test with this debugging patch, please?
> > > >
> > > > Lots of this:
> > > >
> > > > [ 1.177266] Uhuuh, elf segement at 000d9000 requested but the memory is mapped already, got 000dd000
> > > > [ 1.177555] Clashing vma [dd000, de000] flags:100873 name:(null)
> > >
> > > This smells like the problem I've expected that mmap with hint doesn't
> > > respect the hint even though there is no clashing mapping. The above
> > > basically says that we didn't map at 0xd9000 but it has placed it at
> > > 0xdd000. The nearest (clashing) vma is at 0xdd000 so this is our new
> > > mapping. find_vma returns the closest vma (with addr < vm_end) for the
> > > given address 0xd9000 so this address cannot be mapped by any other vma.
> > >
> > > Now that I am looking at arm's arch_get_unmapped_area it does perform
> > > aligning for shared vmas.
> >
> > Sorry for confusion here. These are not shared mappings as pointed out
> > by Russell in a private email. I got confused by the above flags which I
> > have misinterpreted as bit 0 set => MAP_SHARED. These are vm_flags
> > obviously so the bit 0 is VM_READ. Sorry about the confusion. The real
> > reason we are doing the alignment is that we do a file mapping
> > 	/*
> > 	 * We only need to do colour alignment if either the I or D
> > 	 * caches alias.
> > 	 */
> > 	if (aliasing)
> > 		do_align = filp || (flags & MAP_SHARED);
> >
> > I am not familiar enough with this architecture to understand why we
> > need aliasing for file mappings, though.
>
> I think it's there so that flush_dcache_page() works - possibly
> get_user_pages() being used on a private mapping of page cache pages,
> but that's guessing.

I fail to see how a mixture of MAP_FIXED and regular mappings of the
same file works then, but as I've said I really do not understand this
code.
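
For what it is worth, the arithmetic behind the colour alignment can be
sketched in userspace. This is only a sketch modelled on arm's
COLOUR_ALIGN, not the kernel code itself. The 4kB page size and the 16kB
SHMLBA are assumptions, and colour_align() and the SKETCH_* names are
made up for illustration. The real pgoff of the failing segment is not
shown anywhere in this thread.

```c
#include <assert.h>   /* only needed for checks like the one in the note below */
#include <stdint.h>

/* Assumed values, not taken from the thread: 4kB pages, SHMLBA of 4 pages. */
#define SKETCH_PAGE_SHIFT 12u
#define SKETCH_SHMLBA     ((uint64_t)4 << SKETCH_PAGE_SHIFT)

/*
 * Round the hint up to an SHMLBA boundary and then add the cache colour
 * of the file offset, so that every mapping of the same file page ends
 * up with the same colour.
 */
static uint64_t colour_align(uint64_t addr, uint64_t pgoff)
{
	uint64_t base = (addr + SKETCH_SHMLBA - 1) & ~(SKETCH_SHMLBA - 1);
	uint64_t colour = (pgoff << SKETCH_PAGE_SHIFT) & (SKETCH_SHMLBA - 1);

	return base + colour;
}
```

With those assumed values, a hint of 0xd9000 and a hypothetical page
offset whose low two bits are 01 gives colour_align(0xd9000, 1) ==
0xdd000, which would be consistent with the address reported above.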

> I'm afraid I don't remember all the details, this is code from around
> 15 years ago, and I'd be very nervous about changing it now without
> fully understanding the issues.

Ohh, absolutely! I didn't dare to touch this code and that's why I took
the easy way and simply opted out from the hardening for all those archs
that basically share this pattern. But after a closer look it
seems that we can really introduce MAP_FIXED_SAFE that would keep the
arch mmap code intact yet we would get the hardening for all archs.
It would also allow a safer MAP_FIXED semantic for userspace.
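
To illustrate the semantic I have in mind, here is a rough userspace
emulation. It is only a sketch under stated assumptions: MAP_FIXED_SAFE
does not exist yet, mmap_at_or_fail() is a hypothetical helper name, and
EEXIST is merely my choice of error code for the example. Unlike a real
kernel flag, this emulation relies on the hint being honoured when the
range is free, so it would give false failures on the archs discussed
above.

```c
#define _DEFAULT_SOURCE
#include <assert.h>   /* only needed for checks on the helper's result */
#include <errno.h>
#include <stddef.h>
#include <sys/mman.h>

/*
 * Hypothetical helper: ask for a mapping at exactly 'want' but never
 * clobber an existing mapping the way plain MAP_FIXED would.
 */
static void *mmap_at_or_fail(void *want, size_t len, int prot, int flags,
			     int fd, off_t off)
{
	/* Drop MAP_FIXED so that the address is only a hint. */
	void *got = mmap(want, len, prot, flags & ~MAP_FIXED, fd, off);

	if (got == MAP_FAILED)
		return MAP_FAILED;

	if (got != want) {
		/* Placed somewhere else: undo it and report a failure. */
		munmap(got, len);
		errno = EEXIST;	/* assumed error code, for illustration only */
		return MAP_FAILED;
	}

	return got;
}
```

A caller like the ELF loader would then treat MAP_FAILED as "the range
is already taken" rather than silently replacing whatever was there.
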
--
Michal Hocko
SUSE Labs