Re: Holding ref to /proc/<pid> dentry prevents task being freed
From: Chris Wright
Date: Fri May 06 2005 - 12:49:52 EST
* bmerry@xxxxxxxxxxxx (bmerry@xxxxxxxxxxxx) wrote:
> On Thu, May 05, 2005 at 09:34:13AM -0700, Chris Wright wrote:
> > * bmerry@xxxxxxxxxxxx (bmerry@xxxxxxxxxxxx) wrote:
> > > I'm busy writing a security module that does some very basic ACL stuff
> > > on a per-task basis. If my module obtains and holds a dentry for
> > > /proc/<pid> (via path_lookup), then the task_free_security hook is
> > > never called for that process. Since the module releases the dentry in
> > > task_free_security, this creates a chicken-and-egg problem and neither
> > > the task nor the dentry is ever released. A side-effect is that the
> > > module refcount never drops to 0.
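The deadlock described above can be modelled in userspace as two reference counts that wait on each other. This is only a sketch under a stated assumption: that the held /proc/<pid> dentry keeps a reference on the task, as the symptoms suggest. The names struct task, struct dentry, task_put and dentry_put are simplified stand-ins invented for the illustration, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-ins: only the reference counts are modelled. */
struct task   { int refs; bool freed; };
struct dentry { int refs; struct task *task; bool released; };

/* Drop one reference on the task; the last put frees it.
 * In the real kernel this is the point where the
 * task_free_security hook would be called. */
static void task_put(struct task *t)
{
    assert(t->refs > 0);
    if (--t->refs == 0)
        t->freed = true;
}

/* Drop one reference on the dentry; the last put releases it
 * and, by assumption, drops the pin it held on the task. */
static void dentry_put(struct dentry *d)
{
    assert(d->refs > 0);
    if (--d->refs == 0) {
        d->released = true;
        task_put(d->task);
    }
}
```

With the task at two references (its own lifetime plus the pin from the dentry) and the module holding the only dentry reference, a task exit calls task_put once and leaves the count at one. The module only calls dentry_put from the free hook, which never fires, so neither object goes away until something else drops the dentry.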
> > Why are you holding that dentry? Some more background please.
> Just realised that I never mentioned what kernel I've been using:
> The security module is for sandboxing. The basic idea is that a wrapper
> process passes a bunch of paths to the module, then execs the program
> that should be sandboxed. After the exec, the process should only be
> allowed access to those paths and their subdirectories (actually there
> are some flags passed too to say what permissions are granted, but
> that isn't really relevant).
> Rather than calling d_path on every access request and doing string
> matching (sounds hideously slow), I use path_lookup to get a dentry/mnt
> pair for each path passed in (once when it is passed in). Then the
> inode_permission hook walks up the filesystem, matching dentries.
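The walk described above can be modelled in userspace roughly as follows. This is a sketch, not the poster's module code: struct node and is_under_allowed are hypothetical names, and the model deliberately ignores mount points, hard links and locking.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a dentry: only the parent link matters here.
 * The root node is its own parent. */
struct node {
    const char *name;
    struct node *parent;
};

/* Return true if target is one of allowed[] or lies beneath one of them. */
static bool is_under_allowed(const struct node *target,
                             const struct node *const *allowed, size_t n)
{
    const struct node *cur = target;

    assert(target != NULL);
    for (;;) {
        /* Pointer comparison replaces building a path string and matching it. */
        for (size_t i = 0; i < n; i++)
            if (cur == allowed[i])
                return true;
        if (cur->parent == cur)   /* reached the root without a match */
            return false;
        cur = cur->parent;
    }
}
```

Each check costs one pointer comparison per allowed entry per path component, which is the appeal over string matching. The model has a single tree with one parent per node, so it cannot represent a file reachable through a second name or a second mount.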
This can break with hard links, bind mounts, etc. Can you not label the
> Some processes have a legitimate reason for accessing /proc/<pid> (pid
> of that process). Java, for example, does readlink("/proc/self/exe") to
> find the binary. So the wrapper passes /proc/<pid> to the module, which
> looks up and holds the dentry for it. I don't want to give blanket
> permission to /proc, since preventing the sandbox from getting
> information about what else is happening on the system is fairly
> important to my application.
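For reference, the kind of lookup mentioned above looks roughly like this from userspace. It is a minimal sketch for a Linux system with /proc mounted; the helper name self_exe_path is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Resolve /proc/self/exe into buf.
 * Returns the path length, or -1 on error.
 * readlink() does not NUL-terminate, so that is done here;
 * a path longer than size - 1 bytes is silently truncated. */
static ssize_t self_exe_path(char *buf, size_t size)
{
    ssize_t len;

    assert(buf != NULL);
    if (size == 0)
        return -1;
    len = readlink("/proc/self/exe", buf, size - 1);
    if (len < 0)
        return -1;
    buf[len] = '\0';
    return len;
}
```

A sandboxed process making this call needs access to its own /proc/<pid> directory, which /proc/self points to, and to nothing else under /proc.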
Did you look at security_task_to_inode? It's there to help you label the
task based proc entries' inodes directly.
> At the moment I'm looking at hard-coding special behaviour for /proc
> into the module, but I was wondering if there was a simpler way around
> this problem.
You'll likely regret hardcoding something like that.
> Incidentally, is it intentional that vfsmount_lock is not exported to
> modules? My walk-up-the-tree code is essentially d_path without
> constructing the string, but I've had to remove the lock and unlock of
> vfsmount_lock because I don't have the symbol.
It's on the grounds that you shouldn't be poking about vfsmounts as it's
very core to vfs. Right answer is to use helpers (or identify a
legitimate need for a new helper). In this case, your code is now
hopelessly racy, and I think would be better served by dealing with the
Hope that helps,
Linux Security Modules http://lsm.immunix.org http://lsm.bkbits.net