Re: ncpfs: Connection invalid / Input-/Output Errors

From: schönfeld / in-medias-res
Date: Fri Sep 09 2005 - 06:15:16 EST


Petr Vandrovec wrote:
> schönfeld / in-medias-res wrote:
>
>> Hi Petr,
>>
>> the two servers is that the one with the problems does run a nagios nrpe
>> server and some plugins, e.g. to check disk space on the novell disk,
>> while the other server does not. Now I found that heavy operations on
>> the filesystem (e.g. stat'ing many small files in a short time) are
>> problematic if you want to do anything else on the filesystem
>> at the same time. The second process just hangs until the first one
>> accessing the ncp filesystem has finished its operation. Well if
>
>
> You need either another CPU or a semaphore that does not suffer from
> starvation.
> Or you have to rewrite ncpfs to use a queue instead of a simple
> semaphore. What happens is that your copy process, in a loop, acquires
> ncp_server's semaphore, sends a request to the server, waits for the
> response, and releases the semaphore. It does that for every request it
> sends. Now your other process comes in, finds that ncp_server's
> semaphore is locked, and starts waiting. The copy process gets the
> answer from the server, releases the semaphore, and as both processes
> were just waiting before this happened, they both have the same
> priority, and so the one which just did up() continues to run. And
> before the woken process gets a chance to do its task, the copy process
> sends another request, and so your second process goes to sleep again.

Ah thanks. That makes things a lot clearer.
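
To illustrate the "queue instead of simple semaphore" idea, here is a
rough user-space sketch in Python. It is not how ncpfs works and not a
proposed patch; FifoLock and demo are made-up names. The point it tries
to show is the direct hand-off: release() passes ownership to the
longest-waiting thread, so the releasing thread cannot grab the lock
again before the waiter has run.

```python
import threading
import time
from collections import deque


class FifoLock:
    """Hypothetical fair lock: waiters are served strictly in arrival order."""

    def __init__(self):
        self._mutex = threading.Lock()   # protects the fields below
        self._waiters = deque()          # one Event per sleeping thread
        self._held = False

    def acquire(self):
        with self._mutex:
            if not self._held and not self._waiters:
                self._held = True        # uncontended: take it right away
                return
            ticket = threading.Event()
            self._waiters.append(ticket)
        ticket.wait()                    # ownership is handed over by release()

    def release(self):
        with self._mutex:
            if self._waiters:
                # Hand-off: the lock stays held and now belongs to the
                # oldest waiter, so the releaser cannot overtake it.
                self._waiters.popleft().set()
            else:
                self._held = False

    def waiters(self):
        with self._mutex:
            return len(self._waiters)


def demo():
    """The 'copy' side releases and immediately re-acquires; 'check' still wins."""
    lock = FifoLock()
    order = []

    def check():
        lock.acquire()
        order.append("check")
        lock.release()

    lock.acquire()                       # "copy" holds the lock
    t = threading.Thread(target=check)
    t.start()
    while lock.waiters() == 0:           # wait until "check" is queued
        time.sleep(0.001)
    lock.release()                       # hands the lock to "check"
    lock.acquire()                       # "copy" has to queue behind it
    order.append("copy")
    lock.release()
    t.join()
    return order
```

With a plain semaphore the second acquire in demo() could succeed
before the waiter ever runs, which is the starvation described above.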

I found out that my assumption was correct: the plugin really does get a
KILL signal if it exceeds the timeout. That means the nagios check plugin
is the source of the problem (in combination with what you explained
AND the process that uses the ncpfs regularly and runs constantly).
Now we have found a solution for that: we just start the always-running
process with a lower priority. That makes ncpfs access possible
while this process is running and producing load. With the
always-running process at low priority (nice +5), the nagios plugin
can do its work on the ncpfs, runs
fine and exits gracefully. Problem solved, at least until we find a
solution that does not look like a workaround ;-)
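
For the record, a minimal sketch of the workaround, assuming the
always-running job is launched from a small Python wrapper on a
Unix-like system. run_niced is a made-up helper name, and the command
to run is whatever your job actually is. From a shell, nice -n 5
followed by the command does the same thing.

```python
import os
import subprocess
import sys


def run_niced(cmd, increment=5):
    """Run cmd with its niceness raised by `increment` (lower priority).

    Hypothetical helper for illustration. os.nice() is called in the
    child just before exec, so only the started job is affected.
    """
    return subprocess.run(
        cmd,
        preexec_fn=lambda: os.nice(increment),  # POSIX only
        capture_output=True,
        text=True,
    )


def child_niceness(increment=5):
    """Start a child that reports its own niceness, for a quick check."""
    result = run_niced(
        [sys.executable, "-c", "import os; print(os.nice(0))"], increment
    )
    return int(result.stdout.strip())
```

Lowering priority this way needs no special privileges; only raising
it again would.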

Thanks for your help! You helped me very much.

Bye
Patrick