Re: PATCH: Raw device IO for 2.1.131

WCOEKAER.US.ORACLE.COM (WCOEKAER@us.oracle.com)
21 Dec 98 10:49:27 -0800



And what about portability?

Cheers,
Wim.

Date: 21 Dec 98 10:27:07
From:"Khimenko Victor" <khim@sch57.msk.ru>
To:dominik.kubla@uni-mainz.de
Subject:Re: PATCH: Raw device IO for 2.1.131
Cc:WCOEKAER@us.oracle.com, h.milz@seneca.muc.de, linux-kernel@vger.rutgers.edu
References:<19981221180256.B27974@arthur.zdv.Uni-Mainz.DE> <199812160338.TAA17171@mailsun2.us.oracle.com> <ABoZsTs8nG@khim.sch57.msk.ru>
Message-Id:<ADxBfVsC31@khim.sch57.msk.ru>
Organization:MCCME

In <19981221180256.B27974@arthur.zdv.Uni-Mainz.DE> Dominik Kubla
(dominik.kubla@uni-mainz.de) wrote:
DK> On Wed, Dec 16, 1998 at 11:01:54AM +0300, Khimenko Victor wrote:
>> > OK, a possible "technical" problem is, I want to have 2 linux boxes (or
>> > more) connected to the same scsi disks (twin-tailed or what have you). I
>> > have 2 instances of the same software running, both accessing those
>> > disks, for obvious reasons: load balancing, spreading the load of jobs,
>> > and failover. If a node fails, at least the other instance still has
>> > access to the disk and can RECOVER the data, because my logfiles are also
>> > 'shared', so I can access the other node's logfiles and recover from
>> > that.
>>
>> I could not see how this all will work without specially designed software
>> and hardware!

DK> So what? That just tells you that you don't know everything: a lot of
DK> people could show you how it is done reliably with off-the-shelf hardware
DK> and software.

"Specially designed hardware" != "not off-the-shelf hardware" :-) BTW the
hardware part is more flexible here than the software part ...

>> Since this problem has not been raised in this thread yet :-)) IMO the only
>> clear solution would be changes in ext2fs or maybe a special filesystem.
DK> [...]
>> > There is also the fact that raw io for databases IS faster. Whatever type
>> > of filesystem you design doesn't matter, since we know which blocks to
>> > write where. An index entry points to a specific block/file/slot, so it's
>> > easy to calculate the offset in the 'file' ;) And except for full table
>> > scans, the data is spread all over the place, so read-ahead into the
>> > buffer cache doesn't do diddly-squat in that case.
>>
>> But you still have to keep track of the space used for different tables in
>> the database :-)) This is EXACTLY filesystem work. Of course you could make
>> an internal filesystem in the database, but of course a much clearer way is
>> to fix/extend the existing filesystem.

DK> No it is not, ever heard of OS/400?

Of course. There the database is the core of the system. *nix is not designed
this way. Both designs have benefits, but by analogy: both swiss roll and
stockfish are nice food, while swiss roll with stockfish is ...

DK> And in addition a filesystem can not do all the things databases would
DK> like it to do unless the filesystem was specially tailored for the
DK> specific database APPLICATION.

DK> RAW DEVICES are simply a shortcut offered by the system to allow databases
DK> to use "filesystems" specially tailored for the specific application (the
DK> on-site application, not the database as distributed by the vendor). No
DK> more and no less. So why don't they use the VFS API provided by most (but
DK> not all!) systems? Simple: because this makes the database
DK> system-dependent and unnecessarily complex. And complexity kills software.

Exactly. And raw device support makes the kernel unnecessarily complex and
(even more important!) much more flexible. "And complexity kills software"
:-)) And since this is not needed by 99% (more like 99.99% :-) of users, the
mainstream kernel will not include raw device support :-))

DK> Above you were stating that special filesystems would be the solution
DK> to the problem. They are, and the only system-independent way to do this
DK> is raw devices.

Create a module for raw access. Create a special filesystem (the better way
IMO). Do not pollute the mainstream kernel with stuff needed by only 1% (more
like 0.01%) of users just as a "temporary solution". Linux is not Solaris or
HP-UX: it can be tweaked as needed and when needed.
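
For reference, the point quoted above (an index entry names a block and a
slot, so the byte offset is easy to calculate) can be sketched roughly as
follows. This is only an illustration under stated assumptions: the block
size, the fixed slot size, and the helper names slot_offset, read_slot and
write_slot are all hypothetical, and a temporary regular file stands in for
the raw device. It is not how any particular database lays out its data.

```python
import os
import tempfile

BLOCK_SIZE = 2048   # assumed database block size in bytes (hypothetical)
SLOT_SIZE = 128     # assumed fixed row-slot size in bytes (hypothetical)


def slot_offset(block_no: int, slot_no: int) -> int:
    """Byte offset of a slot, computed from its block number and slot index."""
    if not 0 <= slot_no < BLOCK_SIZE // SLOT_SIZE:
        raise ValueError("slot index outside block")
    return block_no * BLOCK_SIZE + slot_no * SLOT_SIZE


def read_slot(fd: int, block_no: int, slot_no: int) -> bytes:
    """Read one slot with a single positioned read at the computed offset."""
    return os.pread(fd, SLOT_SIZE, slot_offset(block_no, slot_no))


def write_slot(fd: int, block_no: int, slot_no: int, payload: bytes) -> None:
    """Write one slot in place, zero-padded to the fixed slot size."""
    if len(payload) > SLOT_SIZE:
        raise ValueError("payload larger than slot")
    os.pwrite(fd, payload.ljust(SLOT_SIZE, b"\0"),
              slot_offset(block_no, slot_no))


def demo() -> bytes:
    # A temporary regular file stands in for the raw device in this sketch.
    with tempfile.TemporaryFile() as f:
        fd = f.fileno()
        os.ftruncate(fd, 8 * BLOCK_SIZE)   # pretend the "device" has 8 blocks
        write_slot(fd, 3, 5, b"row-42")
        return read_slot(fd, 3, 5).rstrip(b"\0")
```

Note that the sketch says nothing about which blocks are free or which table
owns them; that bookkeeping is exactly the "filesystem work" argued about
above.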


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.rutgers.edu
Please read the FAQ at http://www.tux.org/lkml/