Re: 2.6.38: XFS/USB/HW issue, or failing USB stick?

From: Arnd Bergmann
Date: Fri Mar 18 2011 - 15:11:05 EST


On Friday 18 March 2011 18:45:34 Justin Piszcz wrote:
> On Fri, 18 Mar 2011, Arnd Bergmann wrote:
> > Getting back to the rogiinal question, I'd recommend testing the
> > stick by doing raw accesses instead of a file system. A simple
>
> Ok, here are the results:
>
> root@sysresccd /root % time dd if=/dev/zero of=/dev/sda oflag=direct bs=4M
> dd: writing `/dev/sda': No space left on device
> 1961+0 records in
> 1960+0 records out
> 8220835840 bytes (8.2 GB) copied, 283.744 s, 29.0 MB/s

Ok, so no immediate problem there.

> > I'm also interested in results from flashbench
> > (git://git.linaro.org/people/arnd/flashbench.git, e.g. like
> > http://lists.linaro.org/pipermail/flashbench-results/2011-March/000039.html)
> > That might help explain how the stick failed.
>
> Certainly, testing below, following this:
> http://lists.linaro.org/pipermail/flashbench-results/2011-March/000039.html

I'm sorry, I should have been more specific. Unfortunately, running flashbench
is not very user friendly yet.

The results indicate that the device does not have a 2 MB erase block size
but rather 4 or 8, which is more common on 8 GB media.

> # ./flashbench --open-au --open-au-nr=1 /dev/sda --blocksize=8192 --erasesize=$[2* 1024 * 1024] --random
> 2MiB 29.5M/s
> 1MiB 29.1M/s
> 512KiB 28.5M/s
> 256KiB 22.8M/s
> 128KiB 23.8M/s
> 64KiB 24.4M/s
> 32KiB 18.9M/s
> 16KiB 13.1M/s
> 8KiB 8.22M/s
>
> # ./flashbench --open-au --open-au-nr=4 /dev/sda --blocksize=8192 --erasesize=$[2* 1024 * 1024] --random
> 2MiB 25.9M/s
> 1MiB 21.8M/s
> 512KiB 15M/s
> 256KiB 11.9M/s
> 128KiB 12.1M/s
> 64KiB 13.6M/s
> 32KiB 9.81M/s
> 16KiB 6.41M/s
> 8KiB 3.88M/s

The numbers are jumping around a bit with the incorrectly guessed erasesize.
These values should be more like the ones in the first test. Can you rerun
with --erasesize=$[4 * 1024 * 1024]?

Also, what is the output of 'lsusb' for this stick? I'd like to add the
data to https://wiki.linaro.org/WorkingGroups/KernelConsolidation/Projects/FlashCardSurvey

> # ./flashbench --open-au --open-au-nr=5 /dev/sda --blocksize=8192 --erasesize=$[2* 1024 * 1024] --random
> 2MiB 29.2M/s
> 1MiB 27.8M/s
> 512KiB 18.4M/s
> 256KiB 7.82M/s
> 128KiB 4.62M/s
> 64KiB 2.47M/s
> 32KiB 1.26M/s
> 16KiB 642K/s
> 8KiB 327K/s

This is where your drive stops coping with the accesses: Writing small
blocks to four different erase blocks (2MB for the test, probably
larger) works fine, but writing to five of them is devestating for
performance, going from 30 MB/s to 300 KB/s, or lower if you were
to write smaller than 8 KB blocks.

The cutoff at --open-au-nr=4 is coincidentally the same as for the
SD card I was testing. This is what happens in the animation in
http://lwn.net/Articles/428799/. The example given there is for
a drive that can only have two open AUs (allocation units aka
erase blocks), while yours does 4.

> (did not run one with 7)

Note that the test results I had with 6 and 7 are without --random,
so the cut-off there was higher for that card when writing an
multiple erase blocks from start to finish instead of writing random
sectors inside of them.

> # ./flashbench --findfat --fat-nr=10 /dev/sda --blocksize=1024 --erasesize=$[2* 1024 * 1024] --random
> 2MiB 22.7M/s 19.1M/s 15.5M/s 13.1M/s 29.5M/s 29.5M/s 29.6M/s 29.6M/s 29.5M/s 29.5M/s
> 1MiB 20.6M/s 13.3M/s 13.3M/s 20.8M/s 18.1M/s 17.8M/s 18M/s 18.3M/s 18.8M/s 18.6M/s
> 512KiB 18.4M/s 18.6M/s 18.3M/s 18.1M/s 23.5M/s 23.2M/s 23.5M/s 23.5M/s 23.4M/s 23.4M/s
> 256KiB 26.9M/s 21.3M/s 21.2M/s 21M/s 21.1M/s 21.2M/s 21.1M/s 21.1M/s 20.6M/s 21M/s
> 128KiB 22.2M/s 22.3M/s 22.6M/s 21.4M/s 21.5M/s 21.3M/s 21.6M/s 21.3M/s 21.4M/s 21.4M/s
> 64KiB 23.9M/s 22.6M/s 22.9M/s 23M/s 22.5M/s 22.4M/s 22.4M/s 22.4M/s 22.5M/s 22.4M/s
> 32KiB 18.2M/s 18.3M/s 18.3M/s 18.3M/s 18.3M/s 18.4M/s 18.3M/s 18.2M/s 18.3M/s 18.3M/s
> 16KiB 12.9M/s 12.9M/s 13M/s 13M/s 12.9M/s 13M/s 12.9M/s 12.9M/s 12.9M/s 12.9M/s
> 8KiB 8.14M/s 8.15M/s 8.15M/s 8.15M/s 8.15M/s 8.14M/s 8.14M/s 8.15M/s 8.15M/s 8.06M/s
> 4KiB 4.07M/s 4.08M/s 4.07M/s 4.06M/s 4.04M/s 4.04M/s 4.04M/s 4.04M/s 4.04M/s 4.04M/s
> 2KiB 2.02M/s 2.02M/s 2.02M/s 2.02M/s 2.02M/s 2.01M/s 2.01M/s 2.01M/s 2.01M/s 2.02M/s
> 1KiB 956K/s 954K/s 956K/s 953K/s 947K/s 947K/s 947K/s 950K/s 947K/s 948K/s
>

One thing that is very clear from this is that this stick has a page size
of 8KB, and that it requires at least 64 KB transfers for the maximum speed.

If your partition is not aligned to 8 KB or more (better: to the erase
block size, e.g. 4 MB) or if the file system writes smaller than 8 KB
naturally aligned blocks at once, the drive has to do read-modify-write
cycles that severely impact performance and the expected life-time.

I cannot see any block that is optimzied for storing the FAT, which is
good, as this means that the manufacturer did not exclusively design
the stick for FAT32, as is normally the case with flash memory cards.

For this stick, I would strongly recommend creating the file system
in a way that writes at least 16 KB naturally aligned blocks at all
times, but I don't know if that's supported by XFS.

Also, the limitation of forcing a garbage collection when writing to
more than four 4 MB (or so) segments may be a problem, depending on
how XFS stores its metadata. The good news is that it can do random
write access inside of the erase blocks.

Arnd
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/