Re: [PATCH 2/3] shmem: update memory reservation on truncate

From: Hugh Dickins
Date: Wed Jun 25 2014 - 23:55:26 EST


On Wed, 25 Jun 2014, Konstantin Khlebnikov wrote:

> A shared anonymous mapping created without MAP_NORESERVE holds a memory
> reservation for the whole range of its shmem segment. Usually there is no
> way to change its size, but /proc/<pid>/map_files/...
> (available if CONFIG_CHECKPOINT_RESTORE=y) allows doing that.
>
> This patch adjusts the memory reservation in shmem_setattr().
>
> Signed-off-by: Konstantin Khlebnikov <koct9i@xxxxxxxxx>

Acked-by: Hugh Dickins <hughd@xxxxxxxxxx>

Thank you, I knew nothing about this backdoor to shmem objects. Scary.
Was this really the only problem map_files access leads to? If you
did not do so already, please try to think through other possibilities.

I haven't begun, but perhaps it's not so bad. I guess the interaction
with mremap extension is benign - it's annoyed people in the past that
the underlying shmem object is not extended, but now here's a way that
it can be.

(I'll leave it to others to comment on 3/3 if they wish.)

>
> ---
>
> exploit:
>
> #include <sys/mman.h>
> #include <unistd.h>
> #include <stdio.h>
>
> int main(int argc, char **argv)
> {
> 	unsigned long addr;
> 	char path[100];
>
> 	/* charge 4KiB */
> 	addr = (unsigned long)mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_SHARED|MAP_ANONYMOUS, -1, 0);
> 	sprintf(path, "/proc/self/map_files/%lx-%lx", addr, addr + 4096);
> 	truncate(path, 1 << 30);
> 	/* uncharge 1GiB */
> }
> ---
> mm/shmem.c | 17 +++++++++++++++++
> 1 file changed, 17 insertions(+)
>
> diff --git a/mm/shmem.c b/mm/shmem.c
> index 0aabcbd..a3c49d6 100644
> --- a/mm/shmem.c
> +++ b/mm/shmem.c
> @@ -149,6 +149,19 @@ static inline void shmem_unacct_size(unsigned long flags, loff_t size)
>  		vm_unacct_memory(VM_ACCT(size));
>  }
>
> +static inline int shmem_reacct_size(unsigned long flags,
> +		loff_t oldsize, loff_t newsize)
> +{
> +	if (!(flags & VM_NORESERVE)) {
> +		if (VM_ACCT(newsize) > VM_ACCT(oldsize))
> +			return security_vm_enough_memory_mm(current->mm,
> +					VM_ACCT(newsize) - VM_ACCT(oldsize));
> +		else if (VM_ACCT(newsize) < VM_ACCT(oldsize))
> +			vm_unacct_memory(VM_ACCT(oldsize) - VM_ACCT(newsize));
> +	}
> +	return 0;
> +}
> +
>  /*
>   * ... whereas tmpfs objects are accounted incrementally as
>   * pages are allocated, in order to allow huge sparse files.
> @@ -543,6 +556,10 @@ static int shmem_setattr(struct dentry *dentry, struct iattr *attr)
>  		loff_t newsize = attr->ia_size;
>
>  		if (newsize != oldsize) {
> +			error = shmem_reacct_size(SHMEM_I(inode)->flags,
> +					oldsize, newsize);
> +			if (error)
> +				return error;
>  			i_size_write(inode, newsize);
>  			inode->i_ctime = inode->i_mtime = CURRENT_TIME;
>  		}
>
>
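For anyone following the arithmetic in shmem_reacct_size(), here is a
minimal userspace sketch of the page accounting it performs. It is only an
illustration, not the kernel code: it assumes 4 KiB pages, the sketch_*
names and the SKETCH_NORESERVE value are made up for this example, and it
does not model the charge failing (in the patch, a failed
security_vm_enough_memory_mm() makes shmem_setattr() return an error).

```c
#include <assert.h>

/* Assumption: 4 KiB pages. The kernel derives this from PAGE_SHIFT. */
#define SKETCH_PAGE_SHIFT 12
#define SKETCH_PAGE_SIZE  (1LL << SKETCH_PAGE_SHIFT)

/* Hypothetical flag bit standing in for VM_NORESERVE. */
#define SKETCH_NORESERVE  0x1UL

/* Pages needed to back `size` bytes, rounded up (the role VM_ACCT() plays). */
static long long sketch_acct_pages(long long size)
{
	assert(size >= 0);
	return (size + SKETCH_PAGE_SIZE - 1) >> SKETCH_PAGE_SHIFT;
}

/*
 * Signed change in reserved pages when an object is resized:
 *   > 0  pages that must be charged (growing),
 *   < 0  pages that can be released (shrinking),
 *   = 0  no change, or the object was created with no reservation.
 */
static long long sketch_reacct_delta(unsigned long flags,
				     long long oldsize, long long newsize)
{
	if (flags & SKETCH_NORESERVE)
		return 0;
	return sketch_acct_pages(newsize) - sketch_acct_pages(oldsize);
}
```

With these assumptions, growing the 4096-byte object from the exploit to
1 << 30 bytes gives a delta of 262143 pages to charge, and shrinking it back
gives -262143. Without the patch neither adjustment happens at truncate time,
so teardown releases 1GiB against a 4KiB charge.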