RE: B*gg*r mallocs, mmap JOY!

Nathan Bryant (nathan@burgessinc.com)
Mon, 17 Feb 1997 16:42:28 -0500 (EST)


On Mon, 17 Feb 1997, John Carter wrote:

> On Mon, 17 Feb 1997, Michael Weller wrote:
>
> > On Fri, 14 Feb 1997, Christian Hardmeier wrote:
> >
> > > On Thu, 13 Feb 1997, John Carter wrote:
> > >
> > > > (Note you _can_ use mmap to write to files, the trick is
> > > > ftruncate())
> > > >
> > > > ftruncate( ofd, st.st_size); // Note this cutie...
> > > > // ftruncate is not in glibc, it's a system call, and works instantaneously.
> > >
> > > The behaviour of ftruncate() is OS-dependent when the given length
> > > exceeds the file's current size. It may or may not work. So if you
> > > want to write portable code...
> >
> > I agree and was quite surprised about this usage of ftruncate. What you
> > need to do is simply create an empty file of length <n> when you want to
> > mmap <n> bytes. From my knowledge THE standard way to do so is:
> >
> > fseek(fd, <n>, SEEK_SET);
> >
> > fflush(fd); /* probably not needed, fseek does that. */
> >
> > mmap(... fileno(fd) ...);
>
> Hokay Folks.....
> From the "info glibc" on the subject of fseek....
>
> INFO> You can move the stream's file position with `fseek' (*note File
> INFO> Positioning::.). Moving the file position past the end of the
> INFO> data already written fills the intervening space with zeroes.
>
> Note that ftruncate does it instantaneously. It isn't wasting my time
> scribbling zeros to disk when I'm about to scribble lots of more
> interesting things directly thereafter. (It takes a looong time to
> scribble 400MB of zeros and to then immediately fill it with
> satellite image. I know. Yawn. Too well. How about a cup of coffee? Has
> it got around to doing anything useful yet? How about a stroll? Yawn....)

I'm not sure this is true. I know that an lseek() past the end followed by
a write() leaves a hole in the file; dunno about fseek().

>
> From the "info glibc" on the subject of lseek...
> INFO> You can set the file position past the current end of the file.
> INFO> This does not by itself make the file longer; `lseek' never
> INFO> changes the file. But subsequent output at that position will
> INFO> extend the file. Characters between the previous end of file and
> INFO> the new position are filled with zeros.
> Again, that is no help.

Sure it is. The info page fails to mention that the zeros after an lseek()
past the end are never actually written to disk; the file gets a hole.
Holes are blocks of a file that aren't allocated on disk because they
contain nothing but zeros; reading from them just returns zeros.
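
If you want to see a hole for yourself, something like the following
should do. It's only a rough sketch: make_sparse_file() and the path you
hand it are names I made up for illustration, and whether the file really
ends up sparse depends on the filesystem.

```c
#include <assert.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Create a file of n bytes by seeking to offset n-1 and writing a
 * single zero byte. On success, fills *st with the file's stat data
 * and returns 0; returns -1 on any error.
 * (Hypothetical helper, not from any library.) */
int make_sparse_file(const char *path, off_t n, struct stat *st)
{
    int fd, ok;

    assert(n > 0);
    fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* The seek alone changes nothing; the one-byte write extends the file. */
    ok = lseek(fd, n - 1, SEEK_SET) != (off_t)-1
      && write(fd, "", 1) == 1
      && fstat(fd, st) == 0;

    close(fd);
    return ok ? 0 : -1;
}
```

Afterwards st.st_size is the logical length, while st.st_blocks (counted
in 512-byte units on Linux) says how much is really allocated. On a
filesystem that supports holes the second number should be far smaller.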

>
> From "man ftruncate"
> MAN> Truncate causes the file named by path or referenced by fd
> MAN> to be truncated to at most length bytes in size. If the
> MAN> file previously was larger than this size, the extra data
> MAN> is lost. With ftruncate, the file must be open for
> MAN> writing.
>
> ftruncate does the job and shuts up. I like ftruncate. So it is a
> security hole. So what.
>
> So if I'm really curious about what other users are doing on my disk I
> can ftruncate all the free space and grep around looking for old dirt. So

Does Linux actually do this or does it zero the space first? Or create
holes? Hmm...

In any case, if you're really concerned about security to that extent, you
should zero a file before you delete it.
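
A minimal sketch of what I mean, assuming the filesystem overwrites
blocks in place (not all of them do, so treat this as illustration rather
than a guarantee). zero_and_unlink() is a made-up name.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Overwrite every byte of the file at path with zeros, flush the data
 * to disk, then unlink the file. Returns 0 on success, -1 on error.
 * (Hypothetical helper; assumes in-place overwrites reach the old blocks.) */
int zero_and_unlink(const char *path)
{
    char buf[4096];
    struct stat st;
    off_t left;
    int fd;

    assert(path != NULL);
    memset(buf, 0, sizeof buf);

    fd = open(path, O_WRONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }

    /* Write zeros from offset 0 up to the current file size. */
    for (left = st.st_size; left > 0; ) {
        size_t chunk = left < (off_t)sizeof buf ? (size_t)left : sizeof buf;
        ssize_t w = write(fd, buf, chunk);
        if (w <= 0) {
            close(fd);
            return -1;
        }
        left -= w;
    }

    /* Push the zeros out before the name goes away. */
    if (fsync(fd) != 0) {
        close(fd);
        return -1;
    }
    close(fd);
    return unlink(path);
}
```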

> what. It's my disk and as Pournelle's law says, "One user, one (at
> least) CPU."

Regardless of what Jerry Pournelle may think, multiuser systems are NOT
going away any time soon.

> People are too paranoid. If someone hacks into my machine
> and does the same to me, well, he will discover spying is the most
> boring job in the world and go home and take up street sweeping. I
> _luv_ Linux.
>
> So it's not portable.

So use lseek() and write one byte at the new end. mmap() has no problems
with sparse files. lseek() plus write() does all we need, and is portable.
Extending a file with ftruncate() isn't.
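
Roughly like this. It's a sketch, not tested on anything exotic;
extend_with_lseek() and write_via_mmap() are names invented for the
example, and error handling is kept to the bare minimum.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Extend fd to exactly n bytes: seek to n-1 and write one zero byte.
 * Returns 0 on success, -1 on error. (Hypothetical helper.) */
int extend_with_lseek(int fd, off_t n)
{
    assert(n > 0);
    if (lseek(fd, n - 1, SEEK_SET) == (off_t)-1)
        return -1;
    return write(fd, "", 1) == 1 ? 0 : -1;
}

/* Create the file at path, size it to n bytes, map it shared, copy msg
 * to the start of the mapping, sync, and unmap.
 * Returns 0 on success, -1 on error. (Hypothetical helper.) */
int write_via_mmap(const char *path, size_t n, const char *msg)
{
    size_t len = strlen(msg);
    char *p;
    int fd, rc;

    if (len > n)
        return -1;
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    if (extend_with_lseek(fd, (off_t)n) != 0) {
        close(fd);
        return -1;
    }

    /* MAP_SHARED so that stores through p end up in the file itself. */
    p = mmap(NULL, n, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) {
        close(fd);
        return -1;
    }

    memcpy(p, msg, len);
    rc = msync(p, n, MS_SYNC);
    munmap(p, n);
    close(fd);
    return rc;
}
```

Only the pages you actually touch get allocated; the rest of the file
stays a hole until something is stored there.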

>
> True. From various other messages I have received and from notes in
> /usr/src/linux/mm/mmap.c there are a lot of nonportable things going
> on.
>
> Portability is something that happens to Operating System authors.
> Some of them anyway.
> Linus.
> Umm. Err...
> Not many others I can think of... ;-))
> (Hint to would be flamers: I'm poking fun at most other systems
> I have worked on)
>
> \begin{OFF_TOPIC}
> But if there is one thing that I have learnt in 20 years of computing,
> it is that portability matters a lot less than one thinks. (Well, especially
> less than the marketing department's hype says.)
>
> a) Restricting oneself to the Lowest Common Denominator is a very
> severe restriction. (Think about graphics, ever write a ye olde
> graphing program using line printer characters? That was portable.
> Just about nothing else between that level and 200 lines of X calls just to
> say "Hello World" is portable.)
>
> b) Even code that one ardently believes is portable is not because...
> i. Standards are such useful things, everyone has their own.
> ii. When the time comes to port, the problems are different...
> At the time of the last major port I did, I found...
> 1) The new software package did 99% of what our old homebrew
> software did anyway..
> 2) Our business had changed so I was off writing entirely new
> programs anyway.
> 3) Semantics shift at the most fundamental level, leaving strange
> mismatches in the best met standards. Eg.
> VAX/VMS was a Fortran world and hence everything was run by RMS
> (RECORD Management System) hence the Unix style byte streams and
> RMS never really quite met properly.
>
> c) New tools become available, new languages provide new
> opportunities, so I have moved from Fortran to Basic to Algol to
> Pascal to C to Actor to C++ to ???, and very little got ported along
> the way.
>
> \end{OFF_TOPIC}
>
> John Carter EMail: ece@dwaf-hri.pwv.gov.za
> Telephone : 27-12-808-0374x194 Fax:- 27-12-808-0338
>
> Founder of the Council for Unnatural Scientists.
>
>

+-----------------------+---------------------------------------+
| Nathan Bryant | Unsolicited commercial e-mail WILL be |
| nathan@burgessinc.com | charged an $80/hr proofreading fee. |
+-----------------------+---------------------------------------+