Re: Should raw I/O be added to the kernel?

Gerard Roudier (groudier@club-internet.fr)
Fri, 22 Jan 1999 23:59:01 +0100 (MET)


On Fri, 22 Jan 1999, Stephen C. Tweedie wrote:

> Hi,
>
> On Thu, 21 Jan 1999 21:39:26 +0100 (MET), Gerard Roudier
> <groudier@club-internet.fr> said:
>
> > Providing integrity using only synchronous IO semantics is so costly for
> > performance that I do not even want to think for a second about such an
> > approach.
>
> Interesting to hear you say this, since that is _precisely_ what you get
> if you are running a large DB like Oracle or Informix on raw devices.
> All raw devices are _necessarily_ synchronous, and yet these devices are
> suggested as a means of improving performance.

The reason it improves performance is, IMO, that it eliminates a useless
memory copy and the kernel caching, which may have adverse effects for such
applications instead of optimizing IO throughput.
It is not a real optimization, in my opinion; we just need to eliminate
the kernel IO optimizations, since they are defeated by the way DBs are
doing IOs from user-land.

Implementing data caching from user-land may have the following costs, IMO:
- It increases the actual IO latency from the application to the device.
- Due to the increase in latency, you may need to use large caches and
  complex optimization algorithms/heuristics in order not to be too bad.
  This wastes memory and CPU.

Using synchronous writes may reduce the average number of SCSI commands
the drive is fed with at a time, and so prevents the drive from
optimizing IO based on actual head and angular positions.
If you do not want to be too bad, you may also need to implement some
complex thing for that in user-land, which also wastes resources.
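
To make the queue-depth point concrete, here is a toy model in Python. It
is only a sketch under loud assumptions: the "drive" picks the nearest
block among the commands it currently holds, angular position is ignored,
and the function name head_travel and the block numbers are made up for
the illustration.

```python
def head_travel(requests, queue_depth, start=0):
    """Total head movement when the drive sees at most `queue_depth`
    commands at a time and always serves the nearest one (toy model)."""
    pending = list(requests)   # commands the host has not issued yet
    window = []                # commands currently queued in the drive
    head = start
    travel = 0
    while pending or window:
        # The host refills the drive queue up to the allowed depth.
        while pending and len(window) < queue_depth:
            window.append(pending.pop(0))
        # The drive serves the queued command closest to the head.
        nxt = min(window, key=lambda pos: abs(pos - head))
        window.remove(nxt)
        travel += abs(nxt - head)
        head = nxt
    return travel


requests = [90, 10, 80, 20, 70, 30]
print(head_travel(requests, queue_depth=1))  # one command at a time: 390
print(head_travel(requests, queue_depth=6))  # drive sees everything: 90
```

With one command outstanding the drive has nothing to choose from and
simply follows the submission order; with the whole batch queued it can
sweep across the blocks once.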

The lower the IO latency is, the less you need to implement complex
caching and complex optimization heuristics.

If you add on top of a database manager some usually bad application
implemented by programmers who do not have any idea of how the thing
actually works, you get something that needs several orders of magnitude
more resources than traditional applications.

To make things still worse, you may add a RAID controller at the
bottom of that pile of ****, which may further increase the actual IO
latency significantly.

> > With 'full control' I meant 'full control over actual IO ordering
> > requirements to maintain integrity and consistency'. I want to think of
> > an IO sub-system that lets you specify 'ordering' requirements for the
> > actual IOs.
>
> Absolutely, there's no doubt about this. The only problem is that the
> Unix API does not allow it.

Agreed, I did mention that too.
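
For illustration only: lacking a way to express ordering, about all an
application can do with the plain Unix API is put a synchronous flush
between two writes. A minimal Python sketch, assuming a regular file and
assuming that fsync() really waits for stable storage; ordered_write,
demo_ordered and the "journal"/"data" payloads are hypothetical names
invented for this example.

```python
import os
import tempfile


def ordered_write(fd, first, second):
    """Write `first`, wait until it is flushed, and only then write `second`."""
    os.pwrite(fd, first, 0)
    os.fsync(fd)                       # barrier: `second` is not issued before this returns
    os.pwrite(fd, second, len(first))
    os.fsync(fd)                       # make `second` durable as well
    return len(first) + len(second)


def demo_ordered():
    # Scratch file so the sketch can be run anywhere.
    with tempfile.TemporaryFile() as tmp:
        fd = tmp.fileno()
        total = ordered_write(fd, b"journal;", b"data")
        return total, os.pread(fd, total, 0)


print(demo_ordered())
```

The ordering is obtained only by stalling the application between the
two writes, which is exactly the synchronous cost being objected to.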

> The nearest we can get is asynchronous IO
> (using posix.4 libaio) along with synchronous status returns: in other
> words, the writes are asynchronous but the completion status is not
> returned until the data has positively hit disk.

Indeed, this may help.
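
A rough sketch of that idea, in Python rather than with a real POSIX AIO
interface: the writes are submitted without waiting, and each completion
status becomes available only after the data has been flushed. The
thread pool is a stand-in chosen for this example, and durable_write,
submit_writes and demo are hypothetical names. Note that fsync() flushes
the whole file, not just the one write.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor


def durable_write(fd, data, offset):
    """Write `data` at `offset`; return only after fsync() has succeeded."""
    written = 0
    while written < len(data):         # handle short writes
        written += os.pwrite(fd, data[written:], offset + written)
    os.fsync(fd)
    return written


def submit_writes(fd, chunks, pool):
    """Queue every (offset, data) write without blocking; return the futures."""
    return [pool.submit(durable_write, fd, data, offset)
            for offset, data in chunks]


def demo():
    chunks = [(0, b"AAAA"), (4, b"BBBB"), (8, b"CCCC")]
    with tempfile.TemporaryFile() as tmp, ThreadPoolExecutor(max_workers=3) as pool:
        fd = tmp.fileno()
        futures = submit_writes(fd, chunks, pool)   # returns immediately
        statuses = [f.result() for f in futures]    # blocks until each write is flushed
        return statuses, os.pread(fd, 12, 0)


print(demo())
```

The application keeps several writes in flight at once, yet it never
sees a completion status for data that has not been flushed.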

Regards,
Gerard.
