Re: [PATCH] lightnvm: pblk: Introduce hot-cold data separation

From: Javier González
Date: Tue May 07 2019 - 01:28:46 EST


> On 6 May 2019, at 07.16, Heiner Litz <hlitz@xxxxxxxx> wrote:
>
> Igor, Javier,
>
> both of you are right. Here is what I came up with after some more thinking.
>
> We can avoid the races in 2. and 3. with the following two invariants:
> I1: If we have a GC line with seq_id X, only garbage collect from
> lines older than X (this addresses 2.)
> I2: Guarantee that the open GC line always has a smaller seq_id than
> all open user lines (this addresses 3.)
>
> We can enforce I2 by adding a minor seq_id. The major sequence id is
> only incremented when allocating a user line. Whenever a GC line is
> allocated we read the current major seq_id (open user line) and
> increment the minor seq_id. This allows us to order all GC lines
> before the open user line during recovery.
>
> Problem with this approach:
> Consider the following example: There exist user lines U0, U1, U2
> (where 0,1,2 are seq_ids) and a non-empty GC5 line (with seq_id 5). If
> we now do only sequential writes all user lines will be overwritten
> without GC being required. As a result, data will now reside on U6,
> U7, U8. If we now need to GC we cannot because of I1.
> Solution: We cannot fast-forward the GC line's seq_id because it
> contains old data, so pad the GC line with zeros, close it and open a
> new GC9 line.
>
> Generality:
> This approach extends to schemes that use e.g. hot, warm, cold open
> lines (adding a minor_minor_seq_id)
>
> Heiner


Looks like a good solution that allows us to maintain the current mapping model.
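
To check that I read the ordering correctly, here is a minimal user-space
sketch of the major/minor scheme. None of this is existing pblk code: the
struct names, alloc_user_line(), alloc_gc_line(), seq_cmp() and
gc_may_collect() are hypothetical. I am also assuming that a user line
carries minor 0, that the minor counter is never reset, and that at equal
major a GC line sorts before the user line, which is one way to get I2.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical two-level sequence id; not pblk's actual line metadata. */
struct line_seq {
	uint64_t major;	/* advances only when a user line is allocated */
	uint32_t minor;	/* 0 for a user line, > 0 for a GC line (assumption) */
};

/* Hypothetical allocator state, zero-initialized before first use. */
struct seq_alloc {
	uint64_t cur_major;	/* major id of the currently open user line */
	uint32_t last_minor;	/* last minor id handed to a GC line */
	bool user_open;		/* set once the first user line exists */
};

/* Allocating a user line is the only operation that bumps the major id. */
static struct line_seq alloc_user_line(struct seq_alloc *a)
{
	if (a->user_open)
		a->cur_major++;
	a->user_open = true;
	return (struct line_seq){ .major = a->cur_major, .minor = 0 };
}

/*
 * A GC line reads the major id of the open user line and takes the next
 * minor id. Minor wrap-around is ignored in this sketch.
 */
static struct line_seq alloc_gc_line(struct seq_alloc *a)
{
	assert(a->user_open);
	return (struct line_seq){ .major = a->cur_major,
				  .minor = ++a->last_minor };
}

static bool is_gc_line(struct line_seq s)
{
	return s.minor != 0;
}

/* Returns < 0 if a is older than b, > 0 if newer, 0 if equal. */
static int seq_cmp(struct line_seq a, struct line_seq b)
{
	if (a.major != b.major)
		return a.major < b.major ? -1 : 1;
	/* I2: at equal major, a GC line orders before the user line. */
	if (is_gc_line(a) != is_gc_line(b))
		return is_gc_line(a) ? -1 : 1;
	if (a.minor != b.minor)
		return a.minor < b.minor ? -1 : 1;
	return 0;
}

/* I1: only collect from lines strictly older than the open GC line. */
static bool gc_may_collect(struct line_seq gc_open, struct line_seq victim)
{
	return seq_cmp(victim, gc_open) < 0;
}
```

With this, a GC line opened while U2 is the open user line can take U0 and
U1 as victims. Once user writes have advanced to U8, that same GC line may
not collect from U6 or U7, so it has to be padded and closed; a freshly
allocated GC line picks up major 8 and can collect from them again.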

Javier
