Re: [patch] ext2/3: document conditions when reliable operation is possible

From: david
Date: Sat Aug 29 2009 - 12:29:05 EST


On Sat, 29 Aug 2009, Pavel Machek wrote:

On Fri 2009-08-28 07:46:42, david@xxxxxxx wrote:


so what sort of test would be needed to identify if a device has this
problem?

people can do ad-hoc tests by pulling the devices in use and then
checking the entire device, but something better should be available.

it seems to me that there are two things needed to define the tests.

1. a predictable write load so that it's easy to detect data getting lost

2. some statistical analysis to decide how many device pulls are needed
(under the write load defined in #1) to make the odds high that the
problem will be revealed.
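
A minimal sketch of what a "predictable write load" for point 1 could look like: every block is derived purely from its index, so after a pull any block that was acknowledged as written can be regenerated and compared. The block size, the layout, and the names make_block and find_damage are assumptions made for illustration; they are not taken from the scripts mentioned in this thread.

```python
import hashlib
import struct

BLOCK_SIZE = 4096  # assumed block size, chosen only for illustration


def make_block(index: int) -> bytes:
    """Build the block for a given index: 8-byte index plus derived filler."""
    header = struct.pack("<Q", index)
    # Filler is a SHA-256 digest of the index, repeated to fill the block,
    # so stale or shifted blocks never look like the expected one.
    digest = hashlib.sha256(header).digest()
    filler_len = BLOCK_SIZE - len(header)
    filler = (digest * (filler_len // len(digest) + 1))[:filler_len]
    return header + filler


def find_damage(image: bytes, blocks_acked: int) -> list:
    """Return indices of acknowledged blocks whose contents are not as written."""
    bad = []
    for i in range(blocks_acked):
        block = image[i * BLOCK_SIZE:(i + 1) * BLOCK_SIZE]
        if block != make_block(i):
            bad.append(i)
    return bad
```

Here image stands for the bytes read back from the device after it is reattached, and blocks_acked is the number of blocks the writer had been told were safely on the media before the pull.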

It's simpler than that. It usually breaks after the third unplug or so.

for USB devices there may be a way to use the power management functions
to cut power to the device without requiring it to physically be pulled.
if this is the case (even if this only works on some specific chipsets),
it would drastically speed up the testing.

This is really so easy to reproduce that such a speedup is not
necessary. Just try the scripts :-).

so if it doesn't get corrupted after 5 unplugs does that mean that that particular device doesn't have a problem? or does it just mean you got lucky?

would 10 successful unplugs mean that it's safe?

what about 20?

we need to get this beyond anecdotal evidence mode, to something that (even if not perfect, as you can get 100 'heads' in a row with an honest coin) gives you pretty good assurances that a particular device is either good or bad.
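
As a rough sketch of the arithmetic behind these questions: if each pull is assumed to corrupt an affected device independently with the same probability p, then n clean pulls in a row happen with probability (1 - p)^n, and the number of pulls needed to catch such a device with a given confidence follows directly. The independence assumption and the function names are illustrative assumptions, not measured properties of any device.

```python
import math


def clean_run_probability(p_fail: float, pulls: int) -> float:
    """Chance that a device failing with probability p_fail per pull
    survives `pulls` pulls in a row without showing corruption."""
    return (1.0 - p_fail) ** pulls


def pulls_needed(p_fail: float, confidence: float) -> int:
    """Smallest number of pulls such that a device with per-pull failure
    probability p_fail shows corruption at least once with the given
    confidence (assumes independent, identically likely failures)."""
    if not (0.0 < p_fail < 1.0 and 0.0 < confidence < 1.0):
        raise ValueError("p_fail and confidence must be strictly between 0 and 1")
    return math.ceil(math.log(1.0 - confidence) / math.log(1.0 - p_fail))
```

Under this model, if "breaks after the third unplug or so" is read loosely as p of about 1/3, then 5 clean pulls still leave roughly a 13% chance of a bad device slipping through, and pulls_needed(1/3, 0.95) gives 8. A device that only fails once in a hundred pulls would need about 300 clean pulls for the same 95% assurance, which is why the number of pulls has to be tied to the failure rate you want to rule out.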

David Lang