[...]
> Also note that we actually do have the reverse mapping information from
> physical pages to virtual pages, although it is non-obvious:
>
> page_map -> inode
> inode -> vm_area_list
>
> so to find the virtual copy of a physical page you just look up the
> inode, and then follow all mappings of that inode (it's not as costly as
> it sounds, really).

but execution time depends on how many vmas are attached to the inode,
plus these vmas are not sorted by physical range in the ringlist. We've
got to do a linear scan on the ringlist to find out which virtual
address(es) refer to the physical address. So you can kill performance
just by creating 100 VM mappings (one page full, one page empty) in a
single file, and then doing some (future) mmuIO operation on it. Also, this
way of mapping scales linearly with the number of MM contexts, thus the
more processes mapping a file, the more overhead. Bad.
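
to make the cost concrete, here is a rough userspace sketch of that linear
scan. It is a simplified model, not the kernel's code: struct vma, its field
names, DEMO_PAGE_SIZE and find_virt() are all made up for illustration, and
4K pages are assumed.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_PAGE_SIZE 4096UL   /* assumption: 4K pages */

/* Simplified stand-in for a vma; field names are illustrative only. */
struct vma {
    unsigned long vm_start;   /* first virtual address of the mapping */
    unsigned long vm_end;     /* one past the last virtual address */
    unsigned long vm_offset;  /* file offset that vm_start maps */
    struct vma *next_share;   /* next vma on the inode's circular list */
};

/*
 * Walk the whole ring once and collect every virtual address that maps
 * file offset `off` (at most `max` of them are stored in `out`).
 * Returns the number of vmas visited. Because the ring is unsorted,
 * that is always the full ring length, match or no match.
 */
static size_t find_virt(struct vma *ring, unsigned long off,
                        unsigned long *out, size_t max, size_t *found)
{
    size_t visited = 0;
    struct vma *v = ring;

    assert(found != NULL);
    *found = 0;
    if (ring == NULL)
        return 0;

    do {
        unsigned long len = v->vm_end - v->vm_start;

        visited++;
        /* Does this vma cover the file offset we are looking for? */
        if (off >= v->vm_offset && off - v->vm_offset < len && *found < max)
            out[(*found)++] = v->vm_start + (off - v->vm_offset);
        v = v->next_share;
    } while (v != ring);

    return visited;
}
```

with 100 one-page mappings on the ring, every lookup touches all 100 entries,
even when the offset falls into a hole and nothing matches.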

currently the i_mmap ringlist is used by vmtruncate() only, so it doesn't
really show up performance-wise, and not doing backwards mappings speeds
up all other MM operations considerably, and decreases complexity by a lot.
So this is not a problem _now_, plus it's a feature currently, but this
i_mmap way of backward mapping isn't too healthy. _if_ we want to have
robust phys->virt mappings and use them in everyday block IO or networking
operations or in robust page-reaping, then i think we have to find another
way.

i'm not sure whether we want backwards mappings, but making the mapping
speed dependent on two such fundamental system parameters, the number of
mappings and the number of MM contexts, sounds very bad. Also, currently we
have no user-limits for mappings, so anyone who's able to open a, say, 10M
file can create 1200+ vmas without violating any limit. Now, if i
imagine a RT system thread like kswapd, scanning over physical ranges,
doing physical-address based page-reaping, and then hitting such a 1200
entry linear list ... or am i missing something?
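
the 1200+ figure can be checked with a few lines of arithmetic. This is only
a sketch under two assumptions: 4K pages, and every other page of the file
mapped so that neighbouring vmas cannot be merged. The helper name
max_alternating_vmas() is made up.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Number of separate one-page mappings you get when every other page
 * of a file is mapped (page 0, 2, 4, ...), so no two mappings touch.
 */
static size_t max_alternating_vmas(size_t file_bytes, size_t page_bytes)
{
    size_t pages;

    assert(page_bytes != 0);          /* guard the division below */
    pages = file_bytes / page_bytes;  /* whole pages in the file */
    return (pages + 1) / 2;           /* every other page, rounded up */
}
```

for a 10M file and 4K pages that is 2560 pages, so
max_alternating_vmas(10UL * 1024 * 1024, 4096) gives 1280 mappings.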

-- mingo