Re: [Lsf] [RFC] writeback and cgroup

From: Steve French
Date: Tue Apr 10 2012 - 12:23:15 EST


On Sat, Apr 7, 2012 at 3:00 AM, Jan Kara <jack@xxxxxxx> wrote:
>  Hi Vivek,
>
> On Wed 04-04-12 10:51:34, Vivek Goyal wrote:
>> On Tue, Apr 03, 2012 at 11:36:55AM -0700, Tejun Heo wrote:
>> [..]
>> > IIUC, without cgroup, the current writeback code works more or less
>> > like this.  Throwing in cgroup doesn't really change the fundamental
>> > design.  Instead of a single pipe going down, we just have multiple
>> > pipes to the same device, each of which should be treated separately.
>> > Of course, a spinning disk can't be divided that easily and their
>> > performance characteristics will be inter-dependent, but the place to
>> > solve that problem is where the problem is, the block layer.
>>
>> How do you take care of throttling IO to NFS case in this model? Current
>> throttling logic is tied to block device and in case of NFS, there is no
>> block device.
>  Yeah, for throttling NFS or other network filesystems we'd have to come
> up with some throttling mechanism at some other level. The problem with
> throttling at higher levels is that you have to somehow extract information
> from lower levels about amount of work so I'm not completely certain now,
> where would be the right place. Possibly it also depends on the intended
> usecase - so far I don't know about any real user for this functionality...

Remember to distinguish between the two ends of a network file system;
they face slightly different problems. The client has to expose the
number of outstanding requests (and the size of its writes, or
equivalently the number of pages it can write at one time) so that
writeback is not done too aggressively. The server has to dynamically
discover the i/o limits of the underlying volume (not the block
device, but potentially a pool of devices) so it can tell the client
how much i/o it may send. Knowing how many simultaneous requests it
can support lets an SMB2 server (Samba), and eventually an NFS server,
sanely set the number of "credits" on each response - i.e. tell the
client how many requests are allowed in flight to a particular export.
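As a rough sketch of the credit scheme described above - the server
grants a per-response credit count bounded by what the underlying
volume can handle, and the client only submits while it holds credits.
All of the names here are illustrative, not the real Samba or kernel
interfaces:

```c
/* Hypothetical sketch of SMB2-style credit flow control.
 * Nothing here is the actual SMB2/Samba API; it only
 * illustrates the windowing behavior described above. */
#include <assert.h>

struct credit_state {
	int granted;	/* credits the server has handed out */
	int in_flight;	/* requests currently outstanding */
};

/* Server side: grant credits on each response, capped by the
 * parallelism the underlying volume (not block device) reports. */
static int server_grant(int requested, int volume_max_parallel,
			int outstanding)
{
	int avail = volume_max_parallel - outstanding;

	if (avail < 0)
		avail = 0;
	return requested < avail ? requested : avail;
}

/* Client side: only submit a request while a credit is free,
 * which keeps writeback from being issued too aggressively. */
static int client_try_submit(struct credit_state *cs)
{
	if (cs->in_flight >= cs->granted)
		return 0;	/* window full - must wait */
	cs->in_flight++;
	return 1;
}

/* On completion the server's response may grow or shrink the window. */
static void client_complete(struct credit_state *cs, int new_grant)
{
	cs->in_flight--;
	cs->granted = new_grant;
}
```

The point of the sketch is that the grant decision lives on the server,
next to whoever knows the volume's real limits, while the client just
honors the window it was given.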

As for block device throttling - other than a file system internally
using such APIs, who would use block-device-specific throttling? Only
the file system knows where it wants to put hot data, and in the case
of btrfs the file system manages the storage pool itself. In the long
run the block device should be transparent to the user, with only the
volume visible.


--
Thanks,

Steve