Re: BTRFS: Unbelievably slow with kvm/qemu

From: Giangiacomo Mariotti
Date: Sat Jul 17 2010 - 01:29:27 EST


On Wed, Jul 14, 2010 at 9:49 PM, Christoph Hellwig <hch@xxxxxxxxxxxxx> wrote:
> There are a lot of variables when using qemu.
>
> The most important ones are:
>
>  - the cache mode on the device.  The default is cache=writethrough,
>    which is not quite optimal.  You generally do want to use cache=none
>    which uses O_DIRECT in qemu.
>  - if the backing image is sparse or not.
>  - if you use barrier - both in the host and the guest.
>
> Below I have a table comparing raw blockdevices, xfs, btrfs, ext4 and
> ext3.  For ext3 we also compare the default, unsafe barrier=0 version
> and the barrier=1 version you should use if you actually care about
> your data.
>
> The comparison is a simple untar of a Linux 2.6.34 tarball, including a
> sync after it.  We run this with ext3 in the guest, either using the
> default barrier=0, or for the later tests also using barrier=1.  It
> is done on an OCZ Vertex SSD, which gets reformatted and fully TRIMed
> before each test.
>
> As you can see you generally do want to use cache=none and every
> filesystem is about the same speed for that - except that on XFS you
> also really need preallocation.  What's interesting is how bad btrfs
> is for the default compared to the others, and that for many filesystems
> things actually get minimally faster when enabling barriers in the
> guest.  Things will look very different for barrier-heavy guests, I'll
> do another benchmark for those.
>
>                                          bdev       xfs        btrfs      ext4       ext3       ext3 (barrier)
>
> cache=writethrough  nobarrier  sparse    0m27.183s  0m42.552s  2m28.929s  0m33.749s  0m24.975s  0m37.105s
> cache=writethrough  nobarrier  prealloc  -          0m32.840s  2m28.378s  0m34.233s  -          -
>
> cache=none          nobarrier  sparse    0m21.988s  0m49.758s  0m24.819s  0m23.977s  0m22.569s  0m24.938s
> cache=none          nobarrier  prealloc  -          0m24.464s  0m24.646s  0m24.346s  -          -
>
> cache=none          barrier    sparse    0m21.526s  0m41.158s  0m24.403s  0m23.924s  0m23.040s  0m23.272s
> cache=none          barrier    prealloc  -          0m23.944s  0m24.284s  0m23.981s  -          -
>
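To put the quoted timings in perspective, here is a small sketch (not part of the original benchmark) that converts the `time`-style values of the sparse-image rows to seconds and expresses each host filesystem as a multiple of the raw block device. The helper names `to_seconds` and `slowdown_vs_bdev` are made up for this illustration; the figures are copied from the table above.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style string such as '2m28.929s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time format: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


# Sparse-image rows from the quoted table, keyed by
# (qemu cache mode, barrier setting as labelled in the table).
SPARSE_ROWS = {
    ("writethrough", "nobarrier"): {
        "bdev": "0m27.183s", "xfs": "0m42.552s", "btrfs": "2m28.929s",
        "ext4": "0m33.749s", "ext3": "0m24.975s", "ext3 (barrier)": "0m37.105s",
    },
    ("none", "nobarrier"): {
        "bdev": "0m21.988s", "xfs": "0m49.758s", "btrfs": "0m24.819s",
        "ext4": "0m23.977s", "ext3": "0m22.569s", "ext3 (barrier)": "0m24.938s",
    },
    ("none", "barrier"): {
        "bdev": "0m21.526s", "xfs": "0m41.158s", "btrfs": "0m24.403s",
        "ext4": "0m23.924s", "ext3": "0m23.040s", "ext3 (barrier)": "0m23.272s",
    },
}


def slowdown_vs_bdev(row):
    """Return each column's run time as a multiple of the raw block device."""
    base = to_seconds(row["bdev"])
    return {name: to_seconds(value) / base for name, value in row.items()}


if __name__ == "__main__":
    for (cache, barrier), row in SPARSE_ROWS.items():
        ratios = slowdown_vs_bdev(row)
        line = "  ".join(f"{name}={ratio:.2f}x" for name, ratio in ratios.items())
        print(f"cache={cache:<12} {barrier:<9} {line}")
```

With these figures, btrfs under cache=writethrough comes out at roughly 5.5 times the raw block device, against about 1.1 times under cache=none.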
Very interesting. I haven't had the time to try it again, but now I'm
going to try some of the cache options and see which gives me the best
results.
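For anyone wanting to try the same thing, a minimal sketch of how the two variables discussed above (cache mode, and sparse versus preallocated backing image) could be set up. It is only an illustration under assumptions: the helper names and the image path are hypothetical, the image is assumed to be a plain raw file, and the script only builds qemu's `-drive` arguments rather than launching a guest.

```python
import os

# The two cache modes compared in the quoted table.
CACHE_MODES = ("writethrough", "none")


def drive_option(image_path, cache):
    """Build the qemu -drive argument pair for one cache mode."""
    if cache not in CACHE_MODES:
        raise ValueError(f"unexpected cache mode: {cache!r}")
    return ["-drive", f"file={image_path},cache={cache}"]


def create_image(path, size_bytes, preallocate):
    """Create a raw image file, either sparse or fully preallocated.

    truncate() only sets the file length, which leaves the file sparse
    on filesystems that support holes; posix_fallocate() then asks the
    filesystem to reserve the blocks up front.
    """
    with open(path, "wb") as image:
        image.truncate(size_bytes)
        if preallocate:
            os.posix_fallocate(image.fileno(), 0, size_bytes)


if __name__ == "__main__":
    # Hypothetical image name; print the arguments for each cache mode.
    for mode in CACHE_MODES:
        print(" ".join(drive_option("guest.img", mode)))
```

The printed arguments would be appended to whatever qemu command line is already in use; the rest of that command line is left out here because it depends on the setup.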

--
Giangiacomo