Re: [rfc] Ignore Fsync Calls in Laptop_Mode

From: Theodore Tso
Date: Thu May 26 2011 - 06:55:00 EST



On May 26, 2011, at 3:01 AM, D. Jansen wrote:

> That seems to be the big ordering issue. I had always assumed that
> user space writes (by the same app to the same file) would be
> committed in order. Is that really not the case?
>
> Wouldn't most app programmers assume ordering? Wouldn't that always
> possibly be an issue? Or do all the apps that require ordered writes
> use fsync. There will surely be some who require ordering but don't
> fsync. And without ordering, some apps won't be able to avoid fsync
> without data safety issues.

I really don't like using the word "ordering" the way Dave used it,
because it's file system lingo that *always* confuses civilians.
And "insider" language like that isn't helpful for communication,
unless you're certain there are only experts in the room...

As Dave said earlier, "ordering" in the sense he was using it
refers strictly to ensuring consistency after a crash.

Now, there are two levels of consistency; one is file system
level consistency, and the other is application level
consistency. It used to be that desktop drives would
lie about forcing data to disk in response to a FLUSH
CACHE command, "yes sir, I promise the data is on
the disk, sir!", because it resulted in higher WINBENCH
scores. File system engineers hated this, because
a primary tool we have for assuring that file systems
don't look like Swiss cheese after a crash was completely
unreliable. Fortunately, those disks have largely
disappeared from the marketplace.

The suggestion of making fsync a no-op is essentially
asking for a knob that breaks application-level consistency
the same way those broken hard drives broke file system
consistency by making the FLUSH CACHE command
unreliable. Maybe improving battery lifetime is a more
honorable excuse than the purely mercenary goal of
selling more disk drives, but it can still break applications
after a crash.
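
To make "application-level consistency" concrete, here is a
minimal sketch of one common pattern that depends on fsync():
replacing a file so that a crash leaves either the old contents
or the new contents, never a mix. It is an illustration only,
not code from any particular application; the function name
atomic_replace and the ".tmp" suffix are assumptions, and the
directory fsync assumes a POSIX system such as Linux.

```python
import os

def atomic_replace(path, data):
    """Replace `path` with `data` (bytes) using write, fsync, rename."""
    tmp = path + ".tmp"  # hypothetical temp-file naming convention
    with open(tmp, "wb") as f:
        f.write(data)
        f.flush()             # push Python's buffer into the kernel
        os.fsync(f.fileno())  # ask the kernel to push it to stable storage
    # Only after the new contents are durable do we swap the names.
    os.replace(tmp, path)
    # fsync the containing directory so the rename itself is durable.
    dirfd = os.open(os.path.dirname(os.path.abspath(path)), os.O_RDONLY)
    try:
        os.fsync(dirfd)
    finally:
        os.close(dirfd)
```

If fsync() silently does nothing, the rename may reach the disk
before the data it points at, and a crash at that moment can
leave an empty or partial file under the real name.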

Now, you may think that you're prepared for that. After all,
you're already prepared to say that you're willing to lose
the last 15 minutes of work or whatever, right?

Well, wrong. It's not so simple as that. If you're only
talking about simple, flat, human-readable text files,
maybe it would work that way. But what about complex,
binary databases? Like the SQLite databases used by
Firefox and Chrome? Or MySQL databases? More
and more, sophisticated applications, even desktop
applications, are using these complex data stores,
and the libraries which update these complex data
stores rely on fsync() to prevent their database files
from looking like Swiss cheese. If you crash while
fsync() has been disabled, the entire database file
could be completely trashed, which could be hours,
days, weeks, or months of work lost.
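
As a rough illustration of why these libraries need fsync(), the
sketch below shows a toy journaled commit. It is not how SQLite
or MySQL is actually implemented; the file layout, the
BEGIN/COMMIT record format, and the names durable_append and
commit are all assumptions made up for this example. The point
is only that each fsync() acts as a barrier between steps.

```python
import os

def durable_append(path, payload):
    """Append bytes to `path`; return only after fsync() has completed."""
    with open(path, "ab") as f:
        f.write(payload)
        f.flush()             # move data from Python's buffer to the kernel
        os.fsync(f.fileno())  # wait until the kernel reports it is on disk

def commit(journal_path, db_path, record):
    """Toy three-step commit; `record` is a bytes value without newlines."""
    # Step 1: record the intent, and make sure it is durable first.
    durable_append(journal_path, b"BEGIN " + record + b"\n")
    # Step 2: change the main file. After a crash here, recovery code
    # could see the BEGIN without a COMMIT and repair or redo the change.
    durable_append(db_path, record + b"\n")
    # Step 3: mark the transaction as complete.
    durable_append(journal_path, b"COMMIT " + record + b"\n")
```

With fsync() turned into a no-op, nothing stops the step 2 write
from reaching the disk before the step 1 journal entry, so after
a crash the recovery logic has no reliable record of what was in
flight.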

So the resistance that people like Dave have to your
proposal can be summed up by Confucius if you are
Chinese: "Never impose on others what you would
not choose for yourself." Or if you are Jewish, the Rabbi
Hillel said: "That which is hateful to you, do not do to
your fellow. That is the whole Torah; the rest is the
explanation; go and learn." Or if you are a Muslim,
the Prophet Mohammed: "Hurt no one so that no one
may hurt you." Breaking fsync() is like hard drives that
break faith with file system authors by lying when they
say everything is safely written to stable storage. And
what are databases but complex file systems living inside
a single file?

-- Ted
