Re: What protection does sysfs_readdir have with SMP/Preemption?

From: Steven Rostedt
Date: Wed Nov 23 2005 - 07:56:27 EST


On Wed, 23 Nov 2005, Maneesh Soni wrote:
>
> The dir operation sysfs_readdir() is called under directory inode's i_sem
> taken in vfs_readdir() and create_dir() also takes parent directory inode's
> i_sem. So in this case I think following are the relevant steps happening
> which look safe to me.
>
> cpu 0
> vfs_readdir()
> down(dir inode i_sem)
> sysfs_readdir(dir)
> parse through dir->s_dirent s_children list
> up(dir inode i_sem)
>
>
> cpu 1
> sysfs_create_dir()
> create_dir()
> down(parent dir inode i_sem)
> lookup_one_len (allocates & makes the new directory dentry visible)
> sysfs_make_diret()
> sysfs_new_dirent()
> attach the new directory s_dirent to parent's s_children list)
> up(parent dir inode i_sem)
>
>
> Basically, sysfs_readdir for a directory is protected against any
> addition/deletion in the directory by directory inode's i_sem.

OK, I missed that, thanks.

>
> But the bad pointer reference seen in sysfs_readdir() has to be debugged.
> Assumption here is that if there is a dentry attached to s_dirent, there
> has to be a inode associated becuase negative dentries are not created
> in sysfs. Is it possible to get some more information about the recreation
> scenario. Could you enable DEBUG printks for lib/kobject.c and
> drivers/base/class.c to see the events happening.

The bug that I've been fighting in my own kernel is a memory leak. So I
started looking into this at what would happen in verious places if an
allocation didn't work.

In create_dir:

error = sysfs_make_dirent(p->d_fsdata, *d, k, mode, SYSFS_DIR);
// Above is where the entry is added to the parent link list.

if (!error) {
error = sysfs_create(*d, mode, init_dir);
// If sysfs_create fails to allocate an inode, when below
// does the element get removed from the parent?
if (!error) {
p->d_inode->i_nlink++;
(*d)->d_op = &sysfs_dentry_ops;
d_rehash(*d);
}
}
if (error && (error != -EEXIST)) {
sysfs_put((*d)->d_fsdata);
// sysfs_put only seems to handle the kobject portion

d_drop(*d);
// d_drop handles the unhash
}
dput(*d);

So I'm not sure an error from sysfs_create will remove the object from the
link list. In fact, it might be worst since now there's an object on the
link list that may no long even be an object.

I'll test this by forcing a failure at sysfs_create.

-- Steve


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/