Re: [PATCH 3/3] Squashfs: Directly decompress into the page cache for file data

From: Minchan Kim
Date: Sun Oct 13 2013 - 21:43:14 EST


On Fri, Oct 11, 2013 at 07:19:40AM +0100, Phillip Lougher wrote:
> This introduces an implementation of squashfs_readpage_block()
> that directly decompresses into the page cache.
>
> This uses the previously added page handler abstraction to push
> down the necessary kmap_atomic/kunmap_atomic operations on the
> page cache buffers into the decompressors. This enables
> direct copying into the page cache without using the slow
> kmap/kunmap calls.
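
(For other reviewers following along: my rough mental model of the direct
variant of the actor from patch 2/3 is below. The bodies are paraphrased
for illustration, not copied verbatim from the series.)

    /* Sketch: the direct actor hands the decompressor one
     * kmap_atomic()ed page cache page at a time.
     */
    static void *direct_first_page(struct squashfs_page_actor *actor)
    {
            actor->next_page = 1;
            return actor->pageaddr = kmap_atomic(actor->page[0]);
    }

    static void *direct_next_page(struct squashfs_page_actor *actor)
    {
            if (actor->pageaddr)
                    kunmap_atomic(actor->pageaddr);

            return actor->pageaddr = actor->next_page == actor->pages ?
                    NULL : kmap_atomic(actor->page[actor->next_page++]);
    }

    static void direct_finish_page(struct squashfs_page_actor *actor)
    {
            if (actor->pageaddr)
                    kunmap_atomic(actor->pageaddr);
    }
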
>
> The code detects when multiple threads are racing in
> squashfs_readpage() to decompress the same block, and avoids
> this regression by falling back to using an intermediate
> buffer.
>
> This patch enhances the performance of Squashfs significantly
> when multiple processes are accessing the filesystem simultaneously
> because it not only reduces memcopying, but it more importantly
> eliminates the lock contention on the intermediate buffer.
>
> Using single-thread decompression.
>
> dd if=file1 of=/dev/null bs=4096 &
> dd if=file2 of=/dev/null bs=4096 &
> dd if=file3 of=/dev/null bs=4096 &
> dd if=file4 of=/dev/null bs=4096
>
> Before:
>
> 629145600 bytes (629 MB) copied, 45.8046 s, 13.7 MB/s
>
> After:
>
> 629145600 bytes (629 MB) copied, 9.29414 s, 67.7 MB/s

It's really nice!
Below are just nitpicks.

>
> Signed-off-by: Phillip Lougher <phillip@xxxxxxxxxxxxxxx>
> ---
> fs/squashfs/Kconfig | 14 ++++
> fs/squashfs/Makefile | 7 +-
> fs/squashfs/file_direct.c | 178 ++++++++++++++++++++++++++++++++++++++++++++++
> fs/squashfs/page_actor.c | 104 +++++++++++++++++++++++++++
> fs/squashfs/page_actor.h | 32 +++++++++
> 5 files changed, 334 insertions(+), 1 deletion(-)
> create mode 100644 fs/squashfs/file_direct.c
> create mode 100644 fs/squashfs/page_actor.c
>
> diff --git a/fs/squashfs/Kconfig b/fs/squashfs/Kconfig
> index c70111e..8bc9e6a 100644
> --- a/fs/squashfs/Kconfig
> +++ b/fs/squashfs/Kconfig
> @@ -25,6 +25,20 @@ config SQUASHFS
>
> If unsure, say N.
>
> +config SQUASHFS_FILE_DIRECT
> + bool "Decompress files directly into the page cache"
> + depends on SQUASHFS
> + help
> + Squashfs has traditionally decompressed file data into an
> + intermediate buffer and then memcopied it into the page cache.
> +
> + Saying Y here includes support for directly decompressing
> + file data into the page cache. Doing so can significantly
> + improve performance because it eliminates a memcpy and
> + it also removes the lock contention on the single buffer.
> +
> + If unsure, say N.
> +
> config SQUASHFS_XATTR
> bool "Squashfs XATTR support"
> depends on SQUASHFS
> diff --git a/fs/squashfs/Makefile b/fs/squashfs/Makefile
> index 14cfc41..deaabc5 100644
> --- a/fs/squashfs/Makefile
> +++ b/fs/squashfs/Makefile
> @@ -5,8 +5,13 @@
> obj-$(CONFIG_SQUASHFS) += squashfs.o
> squashfs-y += block.o cache.o dir.o export.o file.o fragment.o id.o inode.o
> squashfs-y += namei.o super.o symlink.o decompressor.o decompressor_single.o
> -squashfs-y += file_cache.o
> squashfs-$(CONFIG_SQUASHFS_XATTR) += xattr.o xattr_id.o
> squashfs-$(CONFIG_SQUASHFS_LZO) += lzo_wrapper.o
> squashfs-$(CONFIG_SQUASHFS_XZ) += xz_wrapper.o
> squashfs-$(CONFIG_SQUASHFS_ZLIB) += zlib_wrapper.o
> +
> +ifdef CONFIG_SQUASHFS_FILE_DIRECT
> + squashfs-y += file_direct.o page_actor.o
> +else
> + squashfs-y += file_cache.o
> +endif
> diff --git a/fs/squashfs/file_direct.c b/fs/squashfs/file_direct.c
> new file mode 100644
> index 0000000..f31298e
> --- /dev/null
> +++ b/fs/squashfs/file_direct.c
> @@ -0,0 +1,178 @@
> +/*
> + * Copyright (c) 2013
> + * Phillip Lougher <phillip@xxxxxxxxxxxxxxx>
> + *
> + * This work is licensed under the terms of the GNU GPL, version 2. See
> + * the COPYING file in the top-level directory.
> + */
> +
> +#include <linux/fs.h>
> +#include <linux/vfs.h>
> +#include <linux/kernel.h>
> +#include <linux/slab.h>
> +#include <linux/string.h>
> +#include <linux/pagemap.h>
> +#include <linux/mutex.h>
> +
> +#include "squashfs_fs.h"
> +#include "squashfs_fs_sb.h"
> +#include "squashfs_fs_i.h"
> +#include "squashfs.h"
> +#include "page_actor.h"
> +
> +static int squashfs_read_cache(struct page *target_page, u64 block, int bsize,
> + int pages, struct page **page);
> +
> +/* Read separately compressed datablock directly into page cache */
> +int squashfs_readpage_block(struct page *target_page, u64 block, int bsize)
> +
> +{
> + struct inode *inode = target_page->mapping->host;
> + struct squashfs_sb_info *msblk = inode->i_sb->s_fs_info;
> +
> + int file_end = (i_size_read(inode) - 1) >> PAGE_CACHE_SHIFT;
> + int mask = (1 << (msblk->block_log - PAGE_CACHE_SHIFT)) - 1;
> + int start_index = target_page->index & ~mask;
> + int end_index = start_index | mask;
> + int i, n, pages, missing_pages, bytes, res = -ENOMEM;
> + struct page **page;
> + struct squashfs_page_actor *actor;
> + void *pageaddr;
> +
> + if (end_index > file_end)
> + end_index = file_end;
> +
> + pages = end_index - start_index + 1;
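
As a worked example (assuming the common case of a 128K block, i.e.
block_log = 17, with 4K pages, PAGE_CACHE_SHIFT = 12): mask = (1 << 5) - 1
= 31, so one Squashfs block spans 32 pages. A target_page->index of 100
gives start_index = 96 and end_index = 127, hence pages = 32, unless
end_index gets clamped to file_end first.
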
> +
> + page = kmalloc(sizeof(void *) * pages, GFP_KERNEL);
> + if (page == NULL)
> + goto error_out;
> +
> + /*
> + * Create a "page actor" which will kmap and kunmap the
> + * page cache pages appropriately within the decompressor
> + */
> + actor = squashfs_page_actor_init_special(page, pages, 0);
> + if (actor == NULL)
> + goto error_out2;
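
For context, the consumer side of this in a decompressor wrapper then has
roughly the following shape (again paraphrased for illustration, not
verbatim from the series):

    void *data = squashfs_first_page(output);     /* kmap_atomic page 0 */

    while (data != NULL) {
            /* ... decompress up to PAGE_CACHE_SIZE bytes into data ... */
            data = squashfs_next_page(output);    /* kunmap, kmap next */
    }

    squashfs_finish_page(output);                 /* drop any last mapping */
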
> +
> + /* Try to grab all the pages covered by the Squashfs block */
> + for (missing_pages = 0, i = 0, n = start_index; i < pages; i++, n++) {
> + page[i] = (n == target_page->index) ? target_page :
> + grab_cache_page_nowait(target_page->mapping, n);
> +
> + if (page[i] == NULL) {
> + missing_pages++;
> + continue;
> + }
> +
> + if (PageUptodate(page[i])) {
> + unlock_page(page[i]);
> + page_cache_release(page[i]);
> + page[i] = NULL;
> + missing_pages++;
> + }
> + }
> +
> + if (missing_pages) {
> + /*
> + * Couldn't get one or more pages - probably because we're
> + * racing with another thread in squashfs_readpage also
> + * trying to grab them. Fall back to using an intermediate
> + * buffer. Note as this page conflict is mutual, we
> + * know all the racing processes will do this.

I think another major reason for the fallback is that we may be trying to
read page A because it was reclaimed by the VM while the other pages in the
block remained in the page cache. In that case we will always fall back,
and I suspect it's normally more frequent than the racing case of
concurrent reads. But that cost could be mitigated by decompressor_multi.c.

> + */
> + kfree(actor);
> + return squashfs_read_cache(target_page, block, bsize, pages,

Please clean up the pages in case squashfs_read_cache fails (a rough
sketch of what I mean follows the quoted hunk below).

> + page);
> + }
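
Something along these lines, perhaps (a completely untested sketch,
reusing the function's existing locals):

    res = squashfs_read_cache(target_page, block, bsize, pages, page);
    if (res < 0) {
            /* The fallback failed: mark and release the extra pages we
             * grabbed above; target_page itself is handled by the caller.
             */
            for (i = 0; i < pages; i++) {
                    if (page[i] == NULL || page[i] == target_page)
                            continue;
                    SetPageError(page[i]);
                    unlock_page(page[i]);
                    page_cache_release(page[i]);
            }
    }

    kfree(page);
    return res;
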

Otherwise, it looks good to me. It's a really nice abstraction!
Thanks.

--
Kind regards,
Minchan Kim