Re: [PATCH][RFC] Signal-per-fd for RT signals

From: Dan Kegel (dank@kegel.com)
Date: Fri Sep 14 2001 - 20:33:51 EST


Vitaly Luban <vitaly@luban.org> wrote:
> Attached patch is an implementation of "signal-per-fd"
> enhancement to kernel RT signal mechanism, AFAIK first
> proposed by A. Chandra and D. Mosberger ...
> which should dramatically increase linux based network
> servers scalability.
> [ Patch lives at http://www.luban.org/GPL/gpl.html ]

I have been using variations on this patch while trying
to benchmark an FTP server at a load of 10000 simultaneous
sessions (at 1 kilobyte/sec each), and noticed a few issues:

1. If a SIGINT comes in, t->files may be null, so where
   send_signal() says
         if( (info->si_fd < files->max_fds) &&
   it should say
         if( files && (info->si_fd < files->max_fds) &&
   otherwise there will be a null pointer oops.
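
For illustration, here is a minimal user-space sketch of the guard in item 1. It is not the patch itself: struct files_model, struct task_model and fd_slot_usable() are hypothetical stand-ins for the kernel's files_struct, task_struct and the condition inside send_signal(). The lower-bound check on si_fd is an extra assumption, not something the patch is known to do.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures named in the mail. */
struct files_model {
    int max_fds;
};

struct task_model {
    struct files_model *files;   /* NULL once the task has dropped its file table */
};

/* Returns true only when it is safe to index the per-fd table with si_fd. */
bool fd_slot_usable(const struct task_model *t, int si_fd)
{
    const struct files_model *files = t->files;

    /* Test files first: && short-circuits, so max_fds is never read via NULL. */
    return files != NULL && si_fd >= 0 && si_fd < files->max_fds;
}
```

A task whose files pointer is already NULL simply takes the ordinary (non-per-fd) path instead of oopsing.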

2. If a signal has come in, and a reference to it is left
   in filp->f_infoptr, and for some reason the signal is
   removed from the queue without going through collect_signal(),
   a stale pointer may be left in filp->f_infoptr, which could
   cause a wild pointer oops. There are two places this can happen:
   a. if send_signal() returns -EAGAIN because we're out of memory
      or queue space
   b. if the user sets the signal handler to SIG_IGN, triggering a
      call to rm_sig_from_queue()
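
One way to close both holes is to route every removal through a single release routine that clears the file's back-reference before the entry is freed. The sketch below is a user-space model under that assumption, not my actual fix: struct sigq_model, struct filp_model, the owner field, release_entry() and remove_signal() are all hypothetical names. Only f_infoptr comes from the patch.

```c
#include <stddef.h>
#include <stdlib.h>

struct filp_model;

/* Hypothetical queued-signal entry; owner is the file whose f_infoptr points here. */
struct sigq_model {
    struct sigq_model *next;
    struct filp_model *owner;
    int signo;
};

struct filp_model {
    struct sigq_model *f_infoptr;   /* reference to the pending entry, or NULL */
};

/* Single release point: clear the file's back-reference, then free the entry. */
void release_entry(struct sigq_model *q)
{
    if (q->owner != NULL && q->owner->f_infoptr == q)
        q->owner->f_infoptr = NULL;
    free(q);
}

/* Models the SIG_IGN path: drop every queued entry for signo.
 * Returns the number of entries removed. */
int remove_signal(struct sigq_model **head, int signo)
{
    int removed = 0;

    while (*head != NULL) {
        struct sigq_model *q = *head;

        if (q->signo == signo) {
            *head = q->next;        /* unlink before releasing */
            release_entry(q);
            removed++;
        } else {
            head = &q->next;
        }
    }
    return removed;
}
```

The -EAGAIN case follows the same rule from the other side: in this model f_infoptr would be set only after the entry has actually been queued, so a failed send leaves nothing to clean up.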

I have seen the above problems in the field in my version of the patch,
and written and tested fixes for them. (Ah, the joys of ksymoops.)

3. Any reference to t->files probably needs to be protected by
   acquiring t->files->file_lock, else when the file table is
   expanded, any filp in use will become stale.
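
To make the concern in item 3 concrete, here is a rough user-space model of the race, assuming a plain spinlock guards the table. Everything in it is hypothetical (struct fdtable_model, table_lock(), table_expand(), table_lookup()), and the C11 atomic_flag is only a stand-in for files->file_lock. The real kernel lock type and the rules for nesting it with other locks may well differ.

```c
#include <stdatomic.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical model of a growable fd table guarded by one lock. */
struct fdtable_model {
    atomic_flag lock;   /* stand-in for files->file_lock */
    int max_fds;
    void **fd;          /* fd[i] plays the role of a filp */
};

void table_lock(struct fdtable_model *t)
{
    while (atomic_flag_test_and_set_explicit(&t->lock, memory_order_acquire))
        ;   /* spin until the holder releases */
}

void table_unlock(struct fdtable_model *t)
{
    atomic_flag_clear_explicit(&t->lock, memory_order_release);
}

/* Grow the table: the old array is replaced, which is why an unlocked
 * reader could be left holding a stale pointer. Returns 0 or -1. */
int table_expand(struct fdtable_model *t, int new_max)
{
    void **bigger = calloc((size_t)new_max, sizeof *bigger);
    void **old = NULL;

    if (bigger == NULL)
        return -1;
    table_lock(t);
    if (new_max > t->max_fds) {
        if (t->max_fds > 0)
            memcpy(bigger, t->fd, (size_t)t->max_fds * sizeof *bigger);
        old = t->fd;
        t->fd = bigger;
        t->max_fds = new_max;
    } else {
        old = bigger;   /* table is already large enough; discard the new array */
    }
    table_unlock(t);
    free(old);
    return 0;
}

/* Look up fd with the lock held so the array cannot be swapped mid-read.
 * Real code would also pin the result before unlocking. */
void *table_lookup(struct fdtable_model *t, int fd)
{
    void *filp = NULL;

    table_lock(t);
    if (fd >= 0 && fd < t->max_fds)
        filp = t->fd[fd];
    table_unlock(t);
    return filp;
}
```

The point is only that the bounds check and the array read happen under the same lock that the expansion path takes.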

I have seen this problem in my version of the patch, but have not yet
tackled it.
Is there any good guidance out there for how the various spinlocks
interact? Documentation/spinlocks.txt and Documentation/DocBook/kernel-locking.tmpl
are the best I've seen so far, but they don't get into specifics about, say,
files->file_lock and task->sigmask_lock. Guess I'll just have to read the source.

Also, while I have verified that the patch significantly reduces
reliable signal queue usage, I have not yet been able to measure
a reduction in CPU time in a real app. Presumably the benefits
are in response time, which I am not set up to measure yet.

This is my first excursion into the kernel, so please be gentle.
- Dan