Re: Move to unshared VMAs in NOMMU mode?

From: David Howells
Date: Thu Mar 15 2007 - 18:48:23 EST


Hugh Dickins <hugh@xxxxxxxxxxx> wrote:

> But if "the SYSV SHM problem" you mention at the beginning
> is just the "nattch" problem you mention at the end, I doubt
> that's worth such a redesign as you're considering here.

Yes, as far as I know that's the problem. nattch is exposed to userspace, and it
seems to misbehave as far as userspace programs are concerned (I think the
program sees that it is 1 and assumes itself to be the last user).

> Actually, I'm rather surprised SHM needs any such nattch count,
> I'd expect it to be deducible from file->f_count and mode&SHM_DEST
> (but haven't investigated whether that really works out at all).

Ummm... Currently file->f_count doesn't count the number of shmat()s because
the VMAs are shared. If they are no longer shared, then the problem goes away.

There may be several VMLs for a particular process pointing to a VMA.
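For reference, this is roughly the per-process link involved (quoted from
memory, so treat the details as approximate):

/* per-process list entry; several of these can refer to one shared VMA */
struct vm_list_struct {
	struct vm_list_struct	*next;
	struct vm_area_struct	*vma;
};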

sys_shmdt() doesn't malfunction because it's not possible to split a VMA in
NOMMU mode, and so the whole VMA must match.

Actually, looking carefully at it, it might go wrong if someone does shmat(),
munmap(), shmdt(). do_munmap(), however, protects against too many munmaps (in
whatever form they're issued).
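The sequence I mean is just this (shmid and len are placeholders):

#include <stddef.h>
#include <sys/mman.h>
#include <sys/shm.h>

static void attach_unmap_detach(int shmid, size_t len)
{
	void *addr = shmat(shmid, NULL, 0);

	munmap(addr, len);	/* tears the mapping down behind SYSV SHM's back */
	shmdt(addr);		/* the region is already gone; do_munmap() refusing
				 * the surplus unmap is what keeps this safe */
}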

> If you just need a little CONFIG_MMU in ipc/shm.c to solve your
> problem, I don't think more is justified.

Hmmm... I'm not sure it's quite that simple. SYSV SHM is provided by a chain
of shm -> tiny-shmem -> ramfs. The mapping is actually managed by ramfs.

> Your struct vm_region idea does look more to my taste than what
> you presently have; yet if you pursue it, I think it would just
> make divergence worse wouldn't it? NOMMU wanting vma to contain
> a pointer to vm_region, MMU wanting vm_region embedded in vma.

That bit of divergence is, in effect, already there: in NOMMU mode the VMA
owns the backing store; in MMU mode it does not. The vm_region approach would
at least rectify that, since the backing store would no longer be owned by the
VMA, and it would permit more flexibility in overlapping mappings.
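To be concrete about the shape I have in mind (the names and fields below are
purely illustrative, not existing code, and the vm_area_struct is trimmed to
the fields that matter here):

/* shared object: owns the backing store and counts its users */
struct vm_region {
	unsigned long	vm_start;	/* start of the backing allocation */
	unsigned long	vm_end;		/* end of the backing allocation */
	struct file	*vm_file;	/* backing file, if any */
	atomic_t	vm_usage;	/* number of VMAs using this region */
};

/* per-process record, never shared */
struct vm_area_struct {
	struct mm_struct	*vm_mm;
	unsigned long		vm_start;
	unsigned long		vm_end;
	struct vm_region	*vm_region;	/* the shared backing store */
};

shmat() would then create a fresh VMA per attachment and just take a reference
on the region, so attachments can be counted properly, and the backing store
would go away when the last reference on the region is dropped.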

> I don't really understand why NOMMU chooses to share vmas, or
> vm_regions, rather than just sharing the data which they indicate.

Where would that data be? How do you keep track of it? How do you know when
to deallocate it?

I have considered co-opting the pagecache attached to the mapped inode (which
is exactly how I do shared-writable mappings on ramfs), but that only works for
shared mappings. I still have to have a way to handle unshareable mappings.
At the moment, both are handled the same way (unless overridden by the
driver/fs): I just share the VMA.
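Seen from userspace the split is just this (the path and length are made up):
the shared mapping can sit directly on pages the inode already owns and
refcounts, whereas the private one needs its own copy, and something has to own
that copy and know when to free it.

#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>

static void map_both_ways(size_t len)
{
	int fd = open("/mnt/ramfs/data", O_RDWR);

	/* shared: backed by the inode's own pagecache pages */
	void *shr = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_SHARED, fd, 0);

	/* private: needs a separate copy with an owner and a lifetime */
	void *prv = mmap(NULL, len, PROT_READ | PROT_WRITE,
			 MAP_PRIVATE, fd, 0);

	/* ... */
}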

> Just because you can use less memory that way?

That's one consideration. The other is that it makes management of these
chunks of data simpler. If the memory isn't attached to the VMA then it must
be managed in some other manner.

David