Re: POHMELFS is back

From: Evgeniy Polyakov
Date: Tue Sep 20 2011 - 02:07:34 EST


Code part comments

On Mon, Sep 19, 2011 at 02:10:51PM -0400, Valdis.Kletnieks@xxxxxx wrote:
> +static ssize_t pohmelfs_write(struct file *filp, const char __user *buf, size_t len, loff_t *ppos)
> +{
> + ssize_t err;
> + struct inode *inode = filp->f_mapping->host;
> +#if 0
> + struct inode *inode = filp->f_mapping->host;
>
> Just remove the #if 0'ed code.

Actually, I need to remove the other part; it is a kind of debugging code :)

> in phhmelfs_fill_inode() (and probably other places):
> + pr_info("pohmelfs: %s: ino: %lu inode is regular: %d, dir: %d, link: %d, mode: %o, "
>
> pr_debug please. pr_info per inode reference is just insane.

Yes, I will clean up all the prints for sure.

> +void pohmelfs_print_addr(struct sockaddr_storage *addr, const char *fmt, ...)
> + pr_info("pohmelfs: %pI4:%d: %s", &sin->sin_addr.s_addr, ntohs(sin->sin_port), ptr);
>
> Gaak. This apparently gets called *per read*. pr_debug *and* additional
> "please spam my log" flags please.

Please do not be scared by the log prints, they will be cleaned up for
release.

> +static inline int dnet_id_cmp_str(const unsigned char *id1, const unsigned char *id2)
> +{
> + unsigned int i = 0;
> +
> + for (i*=sizeof(unsigned long); i<DNET_ID_SIZE; ++i) {
>
> strncmp()?

Actually yes
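
One caveat on the replacement: if the IDs are raw binary (as hash output
usually is), memcmp() is the safer choice, since strncmp() stops at the
first zero byte. A minimal userspace sketch, assuming DNET_ID_SIZE is 64
(a value picked here only for illustration) and that a plain byte-wise
ordering is what the caller wants:

```c
#include <assert.h>
#include <string.h>

/* Assumption: 64 is used for illustration only; take the real value from the headers. */
#define DNET_ID_SIZE 64

/*
 * Compare two raw binary IDs byte by byte.
 * Returns <0, 0 or >0, with the same meaning as memcmp().
 * Unlike strncmp(), this does not stop at an embedded zero byte.
 */
static inline int dnet_id_cmp_str(const unsigned char *id1, const unsigned char *id2)
{
	assert(id1 != NULL && id2 != NULL);
	return memcmp(id1, id2, DNET_ID_SIZE);
}
```

With two IDs that both start with a zero byte and differ only in the last
byte, strncmp() reports them as equal, while the memcmp() version above
does not.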

> Also, as a general comment - since this is an interface to Elliptics, which as
> far as I can tell runs in userspace, would this whole thing make more sense
> using FUSE?

I tried FUSE about 3 years ago, and it had _serious_ performance
problems. Disk storage can run in userspace, since it is never
limited by CPU or copies, but the client may not be allowed to waste that
many resources.

> I'm also assuming that Elliptics is responsible for all the *hard* parts of
> distributed filesystems, like quorum management and re-synching after a
> partition of the network, and so on? If so, you really need to discuss that
> some more - in particular how well this all works during failure modes.

Yes, POHMELFS is just another interface.
Elliptics uses an eventual consistency model, so when replicas go out of
sync, a special checking process starts to bring them back. It also
recovers missing copies if we replaced a disk/server or added a new one.

When some servers go down, all IO is handled by their neighbours
according to the route table generated on the first start. When the servers
are turned on again, they start getting those IO requests, and the eventual
consistency process rolls out the missed writes. If data is old or missing,
we will switch to another replica.
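
The "old or missing" rule can be pictured with the sketch below. It is
only an illustration, not the POHMELFS or Elliptics code:
struct replica_reply, its fields and pick_replica() are hypothetical
names, and it assumes every replica reports a last-update timestamp with
its reply.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-replica read result; not an Elliptics structure. */
struct replica_reply {
	int      ok;        /* non-zero if the replica answered and has the object */
	uint64_t timestamp; /* assumed last-update time reported by the replica */
};

/*
 * Return the index of the freshest replica that answered,
 * or -1 if no replica has the data.
 */
static int pick_replica(const struct replica_reply *r, size_t n)
{
	int best = -1;
	size_t i;

	assert(r != NULL || n == 0);

	for (i = 0; i < n; ++i) {
		if (!r[i].ok)
			continue; /* data is missing here, try the next replica */
		if (best < 0 || r[i].timestamp > r[best].timestamp)
			best = (int)i; /* newer copy wins over an old one */
	}
	return best;
}
```

A replica that has just come back and has not yet received the missed
writes would report an older timestamp, so the read goes to another copy
until the recovery process catches it up.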

Thanks a lot for the review!

--
Evgeniy Polyakov