Re: [OKS] Module removal

From: Werner Almesberger (wa@almesberger.net)
Date: Tue Jul 02 2002 - 13:05:40 EST


Benjamin Herrenschmidt wrote:
> That was one of the solutions proposed by Rusty, that is basically
> waiting for all CPUs to have scheduled upon exit from module_exit
> and before doing the actual removal.

No, that's not what I mean. If you have a de-registration function,
it may give you one of the following assurances:

 1) none (e.g. if xxx_deregister just unlinks some ops structure,
    but makes no attempt to squash concurrent accesses)
 2) guarantee that no further access to ops structure will occur
    after xxx_deregister returns, and
    2a) one can probe for cached references/execution of callbacks,
        e.g.
          foo_deregister(&my_ops);
          while (foo_running()) schedule();
          /* can destroy local state/code now */
        This is a common construct in the Linux kernel. Note that
        none of the module code executes once foo_running returns
        false, so return-after-removal can't happen.
    2b) we're guaranteed that any running callbacks or such will
        have acquired their own locks/counts/etc. before
        xxx_deregister returns. Modules would (also) use the use
        count to protect code. Not sure if this happens anywhere
        in the kernel. return-after-removal would be possible
        here (unless we're going through a layer of trampolines
        or such). E.g.
          foo_deregister(&my_ops);
          while (myself_running()) schedule();
          /* may still have to execute code after unlocking */
          while (MOD_IN_USE) schedule();
        Note that this one still protects data !
 3) guarantee that no direct effects of accesses to ops structure
    occur after xxx_deregister returns, e.g.
      foo_deregister(&my_ops);
      /* can destroy local state/code now */
    This also eliminates return-after-removal, because deregister
    would wait for callbacks to return.
 4a/4b) Like 3, but if the callback signals back completion without
    returning (i.e. to indicate that it has copied all shared data,
    and is now working on a private copy), we're back to case 2a or
    2b. In the case of modules, the usage count would be used to
    protect code, so decrement_and_return would be sufficient here.

I think these are the most common cases. The point is that only
case 1) leaves you completely in the dark, and only cases 2b and
4b are special when it comes to modules. Cases 2a, 3, and 4a are
always safe.
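
To make case 3 concrete, here is a minimal user-space sketch, not kernel
code. It assumes a hypothetical "foo" subsystem with one registration
slot. foo_invoke, count_callback, and the use of a pthread mutex and
condition variable are illustration-only assumptions; the names
foo_register, foo_deregister, and the ops structure follow the examples
above. The point it shows is that deregister unlinks the ops and then
blocks until every callback that had already started has returned.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical ops structure; "foo" stands for any subsystem. */
struct foo_ops {
    void (*callback)(void *data);
};

static pthread_mutex_t foo_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t foo_idle = PTHREAD_COND_INITIALIZER;
static struct foo_ops *foo_cur;  /* currently registered ops, or NULL */
static void *foo_data;           /* argument handed to the callback */
static int foo_active;           /* callbacks currently executing */

void foo_register(struct foo_ops *ops, void *data)
{
    pthread_mutex_lock(&foo_lock);
    foo_cur = ops;
    foo_data = data;
    pthread_mutex_unlock(&foo_lock);
}

/* Called by the subsystem (assumed helper).
 * Returns 1 if a callback ran, 0 if nothing is registered. */
int foo_invoke(void)
{
    struct foo_ops *ops;
    void *data;

    pthread_mutex_lock(&foo_lock);
    ops = foo_cur;
    data = foo_data;
    if (ops)
        foo_active++;            /* counted before the lock is dropped */
    pthread_mutex_unlock(&foo_lock);
    if (!ops)
        return 0;

    ops->callback(data);

    pthread_mutex_lock(&foo_lock);
    assert(foo_active > 0);
    if (--foo_active == 0)
        pthread_cond_broadcast(&foo_idle);
    pthread_mutex_unlock(&foo_lock);
    return 1;
}

/* Case 3: unlink, then block until every running callback has returned. */
void foo_deregister(struct foo_ops *ops)
{
    pthread_mutex_lock(&foo_lock);
    if (foo_cur == ops) {
        foo_cur = NULL;
        foo_data = NULL;
    }
    while (foo_active > 0)
        pthread_cond_wait(&foo_idle, &foo_lock);
    pthread_mutex_unlock(&foo_lock);
}

/* Example callback: counts invocations in the int that data points to. */
void count_callback(void *data)
{
    ++*(int *) data;
}
```

With this shape, the caller may destroy its local state as soon as
foo_deregister returns. The price in this sketch is that a callback must
never call foo_deregister itself, since it would then wait for its own
return.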

Moreover, case 1, and cases 2a, 2b, 4a, and 4b without
synchronizing after de-registration, can also cause problems in
non-module code, e.g.

void bar_callback(void *my_data)
{
    **(int **) my_data = 1;	/* writes through ptr */
}

...
int whatever = 0,*ptr = &whatever;
...
foo_register(&bar_ops,&ptr);
...
foo_deregister(&bar_ops); /* case 1, 2a, 2b, 4a, or 4b */
ptr = NULL;
/* BUG: bar_callback may still be running ! */
...

Such code would only be data-safe (but not code-safe) if all shared
data is static, e.g. if the callback writes to some hardware
register, and the address is either static or constant.
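
For comparison, a user-space sketch of the same example with the
synchronization step added, modelled on case 2a. This is an illustration
under stated assumptions, not kernel code: foo_invoke stands in for
whatever fires the callback, foo_running is the probe from case 2a,
sched_yield replaces schedule(), bar_demo is a hypothetical wrapper, and
the single-slot registry with a pthread mutex is invented for the sketch.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>
#include <stddef.h>

struct foo_ops {
    void (*callback)(void *my_data);
};

static pthread_mutex_t foo_lock = PTHREAD_MUTEX_INITIALIZER;
static struct foo_ops *foo_cur;  /* currently registered ops, or NULL */
static void *foo_data;           /* argument handed to the callback */
static int foo_active;           /* callbacks currently executing */

void foo_register(struct foo_ops *ops, void *data)
{
    pthread_mutex_lock(&foo_lock);
    foo_cur = ops;
    foo_data = data;
    pthread_mutex_unlock(&foo_lock);
}

/* Assumed helper: the subsystem firing the callback.
 * Returns 1 if a callback ran, 0 if nothing is registered. */
int foo_invoke(void)
{
    struct foo_ops *ops;
    void *data;

    pthread_mutex_lock(&foo_lock);
    ops = foo_cur;
    data = foo_data;
    if (ops)
        foo_active++;
    pthread_mutex_unlock(&foo_lock);
    if (!ops)
        return 0;

    ops->callback(data);

    pthread_mutex_lock(&foo_lock);
    assert(foo_active > 0);
    foo_active--;
    pthread_mutex_unlock(&foo_lock);
    return 1;
}

/* Case 2a: only unlinks; callbacks already started may still be running. */
void foo_deregister(struct foo_ops *ops)
{
    pthread_mutex_lock(&foo_lock);
    if (foo_cur == ops) {
        foo_cur = NULL;
        foo_data = NULL;
    }
    pthread_mutex_unlock(&foo_lock);
}

/* Probe for callbacks that are still executing. */
int foo_running(void)
{
    int running;

    pthread_mutex_lock(&foo_lock);
    running = foo_active > 0;
    pthread_mutex_unlock(&foo_lock);
    return running;
}

void bar_callback(void *my_data)
{
    **(int **) my_data = 1;      /* writes through ptr */
}

struct foo_ops bar_ops = { bar_callback };

/* Returns the value the callback left in "whatever". */
int bar_demo(void)
{
    int whatever = 0, *ptr = &whatever;

    foo_register(&bar_ops, &ptr);
    foo_invoke();                /* the callback fires at some point */
    foo_deregister(&bar_ops);    /* no new callbacks start after this */
    while (foo_running())        /* wait out the ones already started */
        sched_yield();
    ptr = NULL;                  /* safe: nobody can dereference ptr now */
    return whatever;
}
```

The only difference from the buggy sequence is the foo_running loop
between foo_deregister and ptr = NULL.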

AFAIK, most de-registration functions should be in case 2a, 3, or
4a. Well, we probably have a bunch of cases 1, too :-)

Now, what does this all boil down to ? Cases 2a, 3, and 4a are
non-issues. Cases 2b and 4b still need decrement_and_return, so I
was a bit too optimistic in assuming we're totally safe. Case 1 is
pathological also for non-modules, unless a few unlikely
constraints are met.

- Werner

-- 
  _________________________________________________________________________
 / Werner Almesberger, Buenos Aires, Argentina         wa@almesberger.net /
/_http://icapeople.epfl.ch/almesber/_____________________________________/


