Re: closefd: closes a file of any process

From: almesber@lrc.di.epfl.ch
Date: Sun Jul 02 2000 - 07:04:27 EST


Horst von Brand wrote:
> Which presuposes applications capable of handling said event gracefully.
> I.e., zilch.

That's what I suspect, yes.

This is a very interesting thread. I think the discussion circles around
three problems, which are similar in appearance, but quite different in
the possible solutions. Let me attempt a classification:

A) hide resource unavailability from a process
   This is the "tell process to go elsewhere while ejecting the floppy"
   case. May be generalized (with some difficulty) to more general fd
   replacement. I think there are four possible cases:
   1) process/fd is idle, can be removed/changed just by changing pointers
   2) fd is busy (system call in progress) but we can just wait for the
      operation to terminate (maybe prod it a little)
   3) fd is not busy, but process is doing something that may make it
      undesirable to mess with its fds (e.g. it's in the middle of exit(2)
      or close(2))
   4) fd is in the middle of an operation that won't come back anytime
      soon
   I'm afraid there aren't many cases of 2), and the prodding may get a
   little difficult for, say, NFS, so whenever the fd is busy, we probably
   must assume 4). 3) may simply mean that whenever the process is in the
   kernel, nothing should happen to it. 3) of course also covers 4), but
   one may special-case the latter. (After all, we may need this function
   just because we're in case 4). 1) is too good to be true and of little
   hack value.

   For 4), I can imagine the following two approaches:
   a) add a layer that allows a clean split between the process and the
      driver side of an fd, e.g. by handing all requests to some proxy
      process. May be slow.
   b) hackish, but probably quite efficient: by splicing the thread of
      execution at the process/driver boundary: driver operation
      returns immediately, while the still-running driver operation gets
      a new return address that makes the result evaporate. I think it
      would be interesting to see if somebody could come up with a clean
      solution for such a beast ;-)

   In both cases, it seems natural to hook the code that maintains the
   separation into VFS, as a new file system. This would limit things
   to regular files and directories, but that's probably okay at the
   beginning. Also, in either case, any access to "current" from the
   driver side would drastically complicate things.

B) de-access a resource (fd) to allow removal of device
   There are again several flavours of this:
   1) process is killable and nobody cares, so kill it
   2) process should not be killed, so we have to try to close the fd
      (and hope it doesn't get angry), or replace it with something
      else
   3) process stubbornly refuses to die (e.g. it's blocked in an access
      to the reource we want to get rid of)
   4) like 3), but the block is because of something else (e.g. we want
      to umount the partition with the log file of the backup process
      that is currently hung in the tape driver)

   For 2), any solution from A) should be suitable. 1) is again trivial.
   That leaves us with 3) and 4), which are essentially the same. The
   goal is simply to make that system call return.

   Since we're likely in the middle of a driver and may have accumulated
   an unknown amount of state (e.g. locks, driver flags, etc.), this
   probably can't be solved from outside the driver. Since we should be
   in an uninterruptible sleep at this time (anything else would be a
   serious bug that needs to be fixed no matter what is done about the
   issues discussed here), the obvious solution seems to be to teach
   drivers to allow interruption of uninterruptible sleeps. Maybe with a
   special new signal and, sigh, a new task state.

C) counter denial of service attacks based on keeping devices busy
   This is a combination of B with counter-measures to keep hostile
   code from thwarting the initial counter-measure. Some scheduler or
   kill tricks may be the most efficient way to deal with all such
   kinds of arms races. (E.g. write a kill(2) that accepts a list of
   "good" processes, and atomically delivers the signal to all others.)

Of course, A) and B) could be combined. Also, some instrumentation could
be added to A) to allow the process doing the replacement to retrieve
the exact state of an fd.

A) seems to be the hardest thing to do, particularly if performance is
an issue. If performance is a non-issue, one could use some kind of
demon that looks like NFS, Coda, or such, and spawns a thread for each
IO operation. If we want to revoke an fd, the demon could simply open
the substitute, continue working on that, and discard the result of any
pending operation as soon as it comes back (if ever). 100% user space.

Solving B) would solve a large class of user-visible problems. Since
there may not be all that many TASK_UNINTERRUPTIBLE sleeps that cause
problems in real life, it should be feasible just to put the
infrastructure in place, and to fix things when a problem is found.
I.e. no major code review required.

I'll leave C) to another thread ;-)

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, ICA, EPFL, CH       werner.almesberger@ica.epfl.ch /
/_IN_N_032__Tel_+41_21_693_6621__Fax_+41_21_693_6610_____________________/

- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.rutgers.edu Please read the FAQ at http://www.tux.org/lkml/



This archive was generated by hypermail 2b29 : Fri Jul 07 2000 - 21:00:10 EST