Re: smbfs caching

Nimrod Zimerman (zimerman@deskmail.com)
Sat, 23 Jan 1999 23:29:05 +0200


On Fri, Jan 22, 1999 at 04:36:46PM -0500, Jim Nance wrote:

> I have an application that does a lot of file system I/O, and it can
> be run in a parallel mode where different machines share data via a
> networked file system. Getting good performance and scaling in parallel
> mode requires good performance from both the filesystem clients and servers.
> Since the Linux NFS server is not that great I decided to try using
> samba and smbfs instead. To my surprise NFS performed significantly better
> than the SMB setup. Here are the results:

Maybe the problem you are seeing is something I stumbled upon recently. It
goes like this:

Unix utilities typically call 'stat' separately on each and every file they
need information about. For example, 'ls -l' queries every file that way.

Windows utilities typically use 'findfirst/findnext' to get the same
information, for whole directories.

SMB's implementation of findfirst/findnext is nice. It sends back the
information for all of the queried files at once, in one large packet (or a
few packets). That means querying a directory with 700 files needs only a
handful of round trips, so it finishes in a couple of seconds even on a slow
link.
stat, however, is issued for each file in turn (this is the way Unix
works...), and SMB can't do much about it. For 700 files, the client sends
700 tiny requests and waits for 700 tiny responses. A quick calculation
shows how scary that is: assume a network with a latency of 500ms (this is
my situation here, at times). 700 round trips at 500ms each means it takes
*at least* 350 seconds for the results of stating those 700 files to come
back to you. Ouch.
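
To make the pattern concrete, here is roughly the loop that 'ls -l' and
friends boil down to (just a plain user-space sketch, nothing smbfs-specific;
the point is that on smbfs every stat() in the loop becomes its own
request/response pair):

/* One readdir pass over a directory, then one stat() per entry.
 * Over smbfs each stat() is a separate round trip, so wall-clock
 * time is roughly (number of files) * (network latency). */
#include <stdio.h>
#include <dirent.h>
#include <sys/stat.h>

int main(int argc, char **argv)
{
    const char *dir = argc > 1 ? argv[1] : ".";
    char path[4096];
    struct dirent *de;
    struct stat st;
    DIR *d = opendir(dir);

    if (!d) {
        perror("opendir");
        return 1;
    }
    while ((de = readdir(d)) != NULL) {
        snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);
        if (stat(path, &st) == 0)       /* one network round trip each */
            printf("%10ld  %s\n", (long)st.st_size, de->d_name);
    }
    closedir(d);
    return 0;
}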

If your test involved stating a lot of files, this just might be the reason
for the speed difference.

A solution? Nothing that I can think of. A work-around? Yes: have the smbfs
code cache the information that findfirst/findnext returns for whole
directories. Why would that help? Most utilities that stat a lot of files
first use 'readdir' to see which files they have to query, and smbfs uses
findfirst/findnext to "emulate" that readdir. The attribute information is
therefore already on the client, so caching it would be a win: the
application will typically turn around and stat each and every one of those
files a few seconds later.
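
To illustrate the shape of the idea (this is not smbfs code - the real thing
would live in the kernel, be per-directory, and need proper invalidation -
just a small user-space sketch of keeping the attributes a directory scan
returned and answering later lookups from that instead of the server):

/* Illustration only.  cache_fill() is called for every entry the
 * directory scan (findfirst/findnext answering readdir) returns;
 * cache_lookup() answers a later per-file query from the cache. */
#include <stdio.h>
#include <string.h>
#include <time.h>

#define CACHE_SIZE 1024
#define CACHE_TTL  5                  /* seconds before an entry goes stale */

struct cached_attr {
    char   name[256];
    long   size;
    time_t fetched;                   /* when the directory scan filled this */
};

static struct cached_attr cache[CACHE_SIZE];
static int cache_used;

static void cache_fill(const char *name, long size)
{
    if (cache_used < CACHE_SIZE) {
        struct cached_attr *e = &cache[cache_used++];
        strncpy(e->name, name, sizeof(e->name) - 1);
        e->name[sizeof(e->name) - 1] = '\0';
        e->size = size;
        e->fetched = time(NULL);
    }
}

static int cache_lookup(const char *name, long *size)
{
    int i;
    for (i = 0; i < cache_used; i++) {
        if (strcmp(cache[i].name, name) == 0 &&
            time(NULL) - cache[i].fetched <= CACHE_TTL) {
            *size = cache[i].size;
            return 1;                 /* hit - no network traffic needed */
        }
    }
    return 0;                         /* miss - fall back to the server */
}

int main(void)
{
    long size;

    /* Pretend a directory scan just came back with these entries. */
    cache_fill("report.txt", 1234);
    cache_fill("data.bin", 98765);

    if (cache_lookup("report.txt", &size))
        printf("report.txt: %ld bytes (served from cache)\n", size);
    if (!cache_lookup("missing.txt", &size))
        printf("missing.txt: cache miss, would hit the server\n");
    return 0;
}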

I haven't started real work on that, and I don't know if I will - my usage
of smb has been rather limited. If I keep using it, I'll probably try to
hack this into my kernel sooner or later, as the slowness really annoyed me.
You can't entirely blame Linux for this - it is just the way Unix works, and
it is an incompatibility between the two operating systems - but I'm certain
people would blame Linux anyway (compared to a Windows client, a Unix client
performs badly here).

And maybe this isn't your problem, after all. ;-)

Nimrod
