Re: [PATCHv4 4/9] zram: Introduce recompress sysfs knob
From: Minchan Kim
Date: Fri Nov 04 2022 - 13:28:00 EST
On Fri, Nov 04, 2022 at 12:48:42PM +0900, Sergey Senozhatsky wrote:
> On (22/11/03 10:00), Minchan Kim wrote:
> [..]
> > > Per my understanding this threshold can change quite often,
> > > depending on memory pressure and so on. So we may force
> > > user-space to issue more syscalls, without any gain in
> > > simplicity.
> >
> > Sorry, didn't understand your point. Let me clarify my idea.
> > If we have a separate knob for the recompress threshold, we could
> > work like this.
> >
> > # recompress any compressed pages which are larger than 888 bytes.
> > echo 888 > /sys/block/zram0/recompress_threshold
> >
> > # try to compress any pages greater than the threshold with the
> > # following algorithms.
> >
> > echo "type=lzo priority=1" > /sys/block/zram0/recompress_algo
> > echo "type=zstd priority=2" > /sys/block/zram0/recompress_algo
> > echo "type=deflate priority=3" > /sys/block/zram0/recompress_algo
>
> OK. We can always add more sysfs knobs and make threshold a global
> per-device value.
>
> I think I prefer the approach where threshold is part of the current
> recompress context, not something derived from another context. That
> is, when all values (page type, threshold, possibly algorithm index)
> are submitted by user-space for this particular recompression:
>
> echo "type=huge threshold=3000 ..." > recompress
>
> If threshold is a global value that is applied to all recompress calls
> then how does user-space say no-threshold? For instance, when it wants
> to recompress only huge pages. It probably still needs to supply something
> like threshold=0. So my personal preference for now - keep threshold
> as a context dependent value.
>
> Another thing that I like about threshold= being context dependent
> is that then we don't need to protect recompression against concurrent
> global threshold modifications with lock and so on. It keeps things
> simpler.
Sure. Let's go with per-algo threshold.
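To make the "all values come with the call" idea concrete, here is a
rough user-space sketch of parsing such a string. The struct and the
function name are made up for illustration, and only the "type" and
"threshold" keys from the example above are handled. It is not the
patch code.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-call context; not the struct used by the patch. */
struct recompress_ctx {
	char type[16];          /* "huge", "idle", ...; empty means any page */
	unsigned int threshold; /* bytes; 0 means no threshold */
};

/* Parse "key=value key=value" into ctx. Returns 0 on success, -1 on error. */
static int parse_recompress_args(const char *buf, struct recompress_ctx *ctx)
{
	char tmp[128];
	char *p = tmp;

	memset(ctx, 0, sizeof(*ctx));
	if (strlen(buf) >= sizeof(tmp))
		return -1;
	strcpy(tmp, buf);

	while (*p) {
		char *next, *val, *end;

		p += strspn(p, " \n");          /* skip separators */
		if (!*p)
			break;
		next = p + strcspn(p, " \n");   /* end of this token */
		if (*next)
			*next++ = '\0';

		val = strchr(p, '=');
		if (!val || !val[1])
			return -1;              /* need key=value */
		*val++ = '\0';

		if (!strcmp(p, "type")) {
			if (strlen(val) >= sizeof(ctx->type))
				return -1;
			strcpy(ctx->type, val);
		} else if (!strcmp(p, "threshold")) {
			if (val[0] < '0' || val[0] > '9')
				return -1;
			ctx->threshold = (unsigned int)strtoul(val, &end, 10);
			if (*end)
				return -1;
		} else {
			return -1;              /* unknown key */
		}
		p = next;
	}
	return 0;
}
```

With this, a call that omits threshold= simply leaves it at 0, so
there is no shared state to reset between calls.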
>
> [..]
> > > > Let's squeeze the comp algo index into meta area since we have
> > > > some room for the bits. Then could we remove the two
> > > > recomp-specific flags?
> > >
> > > What is meta area?
> >
> > zram->table[index].flags
> >
> > If we squeeze the algorithm index, we could work like this
> > without ZRAM_RECOMP_SKIP.
>
> We still need ZRAM_RECOMP_SKIP. Recompression may fail to compress an
> object further: sometimes we can get a recompressed object that is larger
> than the original one, sometimes of the same size, sometimes of a smaller
> size but still belonging to the same size class, which doesn't save us
> any memory. Without ZRAM_RECOMP_SKIP we will continue re-compressing
Indeed.
> objects that are incompressible (in a way that saves us memory in
> zsmalloc) by any of the ZRAM's algorithms.
>
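To spell out the size class point: a recompressed object only helps
if it lands in a smaller class. A minimal sketch, assuming a fixed
16-byte class step (an assumption for illustration; real zsmalloc
class boundaries differ and classes can be merged), with made-up
helper names:

```c
#include <stdbool.h>

/* Assumed class granularity in bytes; hypothetical, for illustration only. */
#define CLASS_DELTA 16u

/* Index of the smallest class that can hold an object of 'size' bytes. */
static unsigned int size_class(unsigned int size)
{
	return (size + CLASS_DELTA - 1) / CLASS_DELTA;
}

/* True only if the new length moves the object into a smaller class. */
static bool recompress_saves_memory(unsigned int old_len, unsigned int new_len)
{
	return size_class(new_len) < size_class(old_len);
}
```

Under that assumption, going from 1000 to 993 bytes stays in the same
class and saves nothing, which is the case the SKIP flag would mark.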
> > read_block_state
> > zram_algo_idx(zram, index) > 0 ? 'r' : '.');
> >
> > zram_read_from_zpool
> > if (zram_algo_idx(zram, idx) != 0)
> > idx = 1;
>
> As an idea, maybe we can store everything re-compression related
> in a dedicated meta field? SKIP flag, algorithm ID, etc.
>
> We don't have too many bits left in ->flags on 32-bit systems. We
> currently probably need at least 3 bits - one for RECOMP_SKIP and at
> least 2 for algorithm ID. 2 bits for algorithm ID put us into a situation
> where we can have only 00, 01, 10, 11 as IDs, that is maximum 3 recompress
> algorithms: 00 is the primary one and the rest are alternative ones.
> Maximum 3 re-compression algorithms sounds like a reasonable max value to
> me. Yeah, maybe we can use flags bits for it.
If possible, let's put those three bits into flags. We can factor them
out into a dedicated field any time later, since it's not ABI.
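
Something along these lines, as a rough sketch of the three bits (one
skip bit plus a 2-bit algorithm index). The bit positions, the 24-bit
size field, and the helper names are assumptions for illustration,
not the final layout:

```c
#include <stdint.h>

/* Hypothetical layout: low 24 bits hold the object size, flags sit above. */
#define FLAG_SHIFT      24
#define RECOMP_SKIP     (UINT32_C(1) << FLAG_SHIFT)
#define ALGO_IDX_SHIFT  (FLAG_SHIFT + 1)
#define ALGO_IDX_MASK   (UINT32_C(0x3) << ALGO_IDX_SHIFT)

/* Store a 2-bit algorithm index (0 = primary) without touching other bits. */
static uint32_t flags_set_algo_idx(uint32_t flags, unsigned int idx)
{
	flags &= ~ALGO_IDX_MASK;
	return flags | (((uint32_t)idx << ALGO_IDX_SHIFT) & ALGO_IDX_MASK);
}

/* Read the algorithm index back out of the flags word. */
static unsigned int flags_algo_idx(uint32_t flags)
{
	return (flags & ALGO_IDX_MASK) >> ALGO_IDX_SHIFT;
}
```

The skip bit and the index stay independent, so clearing one does not
disturb the other or the size stored in the low bits.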