Large file I/O in 2.2.10 and 2.3.8 (which locks up!)

Dr Mark Hagger (mhagger@dera.gov.uk)
Thu, 24 Jun 1999 13:16:30 +0100 (GMT)


Hi,

I've been experimenting with writing large amounts of data (in moderate-sized
blocks) to disk, followed by repeatedly reading it back in (again in blocks).
(For those interested, it's because I'm contemplating doing a matrix-vector
multiply in an out-of-core fashion; a rough sketch of what I mean follows.)
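
Roughly what I have in mind is something like the code below. This is just a
sketch with illustrative names (the real code doesn't exist yet): the matrix
lives on disk in row-blocks, each block is read in, multiplied against the
in-core vector, and the matching slice of the result filled in.

/* Sketch only: out-of-core y = A*x.  A is stored on disk row-major
   as doubles; it is read back a block of rows at a time, so only
   one block plus x and y need to be in memory at once. */
#include <stdlib.h>
#include <fcntl.h>
#include <unistd.h>

int oc_matvec(const char *path, int n, int brows,
              const double *x, double *y)
{
    double *ablock = malloc((size_t)brows * n * sizeof(double));
    int fid = open(path, O_RDONLY);
    int row0, i, j;

    if (ablock == NULL || fid < 0)
        return -1;

    for (row0 = 0; row0 < n; row0 += brows) {
        int nrows = (row0 + brows <= n) ? brows : n - row0;

        /* read the next block of rows of A (short reads ignored
           for brevity) */
        if (read(fid, ablock, (size_t)nrows * n * sizeof(double)) < 0)
            return -1;

        /* multiply the block into the matching slice of y */
        for (i = 0; i < nrows; i++) {
            double sum = 0.0;
            for (j = 0; j < n; j++)
                sum += ablock[(size_t)i * n + j] * x[j];
            y[row0 + i] = sum;
        }
    }

    free(ablock);
    close(fid);
    return 0;
}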

However, I've run into strange problems while benchmarking my approach.
I've included the source code of this program for your perusal. I've run
it on both 2.2.10 and 2.3.8 on my PII 400MHz (Redhat 5.2) machine, which
has 256Mbytes RAM, an Ultra-wide SCSI bus and a 7000rpm SCSI drive.

Basically, the program generates a block of random data (16 Mbytes) and
writes it to a file on the disk 16 times (nblocks in the code below), for
256 Mbytes in all. It then lseeks back to the beginning of the file and
tries to read it all back in, again in blocks of 16 Mbytes.

So, for 2.2.10:

The writing proceeds quite well, getting bursts of 50 Mbytes/s for most of
the writes, with periodic very slow writes (~1 Mbyte/s) whilst (presumably)
buffered data is being flushed out to the disk. The readbacks are quite
slow, a fairly consistent 13 Mbytes/s, and over time (i.e. over multiple
passes through the entire file) this seems to degrade somewhat, which in
itself is odd.

Now, for 2.3.8:

I was a little surprised at how slowly the writing to disk goes, and
indeed how much variation there was in the write speed from block to block
(between 3 Mbytes/s and 9 Mbytes/s). I guess the main reason for this is
changes to the memory buffering of disk writes, and the generally
"bursty" nature of this process.

Now, when the program gets to the read stage the disk goes mad and hammers
away like crazy, with the read rate apparently varying between 1 and
13 Mbytes/s. When I ran this for a while (i.e. looped over the entire file
a number of times) the program eventually segfaulted and crashed, and
actually took the machine down with it, locking it up solid. I had to
power off and reboot.

On examining the messages file I saw a vast number of messages which read:

kernel: attempt to access beyond end of device
kernel: 08:09: rw=0, want=1072687759, limit=6827593

(We're talking hundreds of these, for anyone interested.) As far as I can
tell, 08:09 is the SCSI partition the file lives on, rw=0 means it was a
read, and want/limit are the requested block number and the size of the
device in blocks, so something is asking for blocks way off the end of the
partition. Examining this again, I saw that these messages poured out
during the first (block) read and then continued to pour out during
subsequent reads, up until the machine locked up.

This all seems a little odd to me. (As ever, I can't rule out the
possibility that I've done something silly, but the read/write/lseek calls
don't return any errors whilst this is happening.)

In general, though, changes to the block size seem to make quite
considerable differences to the performance: block sizes larger than
16 Mbytes cause things to slow down considerably, although smaller block
sizes seem to have effects as well. Possibly someone out there can tell me
the reason for this! (Or indeed any thoughts on how to make this type of
process run fast would be useful; one alternative I've sketched below is
doing the readback via mmap.)
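
For what it's worth, one thing I haven't tried yet (so this is an untested
sketch, not something I'm claiming behaves any better on these kernels) is
to mmap the file and walk through it sequentially instead of read()ing it
into a buffer:

/* Untested sketch: read the file back by mmap()ing it and summing
   its contents, rather than read()ing it into a buffer. */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>

int main(void)
{
    struct stat st;
    int fid = open("/local/block_file", O_RDONLY);
    double *map;
    double sum = 0.0;
    size_t i, n;

    if (fid < 0 || fstat(fid, &st) < 0) {
        perror("open/fstat");
        return 1;
    }
    n = (size_t)st.st_size / sizeof(double);

    map = mmap(NULL, (size_t)st.st_size, PROT_READ, MAP_SHARED, fid, 0);
    if (map == MAP_FAILED) {
        perror("mmap");
        return 1;
    }

    /* touch every element so the whole file is faulted in sequentially */
    for (i = 0; i < n; i++)
        sum += map[i];

    printf("sum = %g\n", sum);
    munmap(map, (size_t)st.st_size);
    close(fid);
    return 0;
}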

Mark

Here's the program:

/* test a large (blocked) file write followed by a series of
repeated readbacks */

#include <stdio.h>
#include <math.h>
#include <stdlib.h>

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <unistd.h>
#include <sys/time.h>
#include <errno.h>
#include <string.h>

int main(void) {
  double wtime;

  int nblocks = 16;     /* number of blocks written and read back        */
  int bsize = 1000;     /* block is bsize*nsize doubles = 16 Mbytes      */
  int nsize = 2000;
  double *matrix;
  struct timeval tim;
  struct timezone tz;
  int i, j, fid;
  ssize_t nread;

  matrix = malloc(bsize * nsize * sizeof(double));
  if (matrix == NULL) {
    printf("malloc failed\n");
    return 1;
  }

  fid = open("/local/block_file", O_CREAT | O_RDWR, S_IRWXU);
  if (fid < 0) {
    printf("open error: %s\n", strerror(errno));
    return 1;
  }

  for (i = 0; i < nblocks; i++) {
    /* create random numbers */
    for (j = 0; j < bsize * nsize; j++)
      matrix[j] = drand48();

    /* and write the matrix to the file */
    printf("Writing block %d to file\n", i);
    fflush(stdout);

    gettimeofday(&tim, &tz);
    wtime = -tim.tv_sec - tim.tv_usec / 1.0e+6;

    if (write(fid, matrix, sizeof(double) * nsize * bsize) < 0) {
      printf("write error: %s\n", strerror(errno));
    }

    gettimeofday(&tim, &tz);
    wtime += tim.tv_sec + tim.tv_usec / 1.0e+6;

    printf("Time to write = %le (%7.3f Mbytes/s)\n",
           wtime, sizeof(double) * bsize * nsize / (wtime * 1.0e+6));
    fflush(stdout);
  }

  /* now try reading it back a number of times */

  for (j = 0; j < 5; j++) {
    off_t offset = 0;

    if (lseek(fid, offset, SEEK_SET) < 0) {
      printf("lseek error: %s\n", strerror(errno));
    }

    for (i = 0; i < nblocks; i++) {
      printf("Reading block %d from file\n", i);
      fflush(stdout);

      gettimeofday(&tim, &tz);
      wtime = -tim.tv_sec - tim.tv_usec / 1.0e+6;

      nread = read(fid, matrix, sizeof(double) * nsize * bsize);
      if (nread < 0) {
        printf("read error: %s\n", strerror(errno));
      }
      if (nread < (ssize_t)(sizeof(double) * nsize * bsize)) {
        printf("read weirdness: %ld bytes read\n", (long)nread);
      }

      gettimeofday(&tim, &tz);
      wtime += tim.tv_sec + tim.tv_usec / 1.0e+6;

      printf("Time to read = %le (%7.3f Mbytes/s)\n",
             wtime, sizeof(double) * bsize * nsize / (wtime * 1.0e+6));
      fflush(stdout);
    }
  }

  free(matrix);
  close(fid);
  return 0;
}

-- END PROGRAM --

--

Mark Hagger
DERA Winfrith, Winfrith Technology Centre
Winfrith, Dorset, DT2 8XJ, UK
Tel:   01305 212803
Fax:   01305 212103
Email: mhagger@dera.gov.uk
