Re: [PATCH v2 3/7] mm/vmalloc.c: allow vread() to read out vm_map_ram areas

From: Baoquan He
Date: Wed Jan 04 2023 - 03:02:44 EST


On 12/17/22 at 12:06pm, Lorenzo Stoakes wrote:
> On Sat, Dec 17, 2022 at 09:54:31AM +0800, Baoquan He wrote:
> > Currently, vread() can read out vmalloc areas which are associated with
> > a vm_struct, but this doesn't work for areas created by the vm_map_ram()
> > interface because they have no associated vm_struct, so in vread() these
> > areas are skipped.
> >
> > Here, add a new function vb_vread() to read out areas managed by
> > vmap_block specifically. Then recognize vm_map_ram areas via vmap->flags
> > and handle them accordingly.
> >
> > Signed-off-by: Baoquan He <bhe@xxxxxxxxxx>
> > ---
> > mm/vmalloc.c | 66 ++++++++++++++++++++++++++++++++++++++++++++++------
> > 1 file changed, 59 insertions(+), 7 deletions(-)
> >
> > diff --git a/mm/vmalloc.c b/mm/vmalloc.c
> > index 190f29bbaaa7..6612914459cf 100644
> > --- a/mm/vmalloc.c
> > +++ b/mm/vmalloc.c
> > @@ -3515,6 +3515,51 @@ static int aligned_vread(char *buf, char *addr, unsigned long count)
> > return copied;
> > }
> >
> > +static void vb_vread(char *buf, char *addr, int count)
> > +{
> > + char *start;
> > + struct vmap_block *vb;
> > + unsigned long offset;
> > + unsigned int rs, re, n;
> > +
> > + vb = xa_load(&vmap_blocks, addr_to_vb_idx((unsigned long)addr));
> > +
> > + spin_lock(&vb->lock);
> > + if (bitmap_empty(vb->used_map, VMAP_BBMAP_BITS)) {
> > + spin_unlock(&vb->lock);
> > + memset(buf, 0, count);
> > + return;
> > + }
> > + for_each_set_bitrange(rs, re, vb->used_map, VMAP_BBMAP_BITS) {
> > + if (!count)
> > + break;
> > + start = vmap_block_vaddr(vb->va->va_start, rs);
> > + if (addr < start) {
> > + if (count == 0)
> > + break;
> > + *buf = '\0';
> > + buf++;
> > + addr++;
> > + count--;
> > + }

Very sorry, Lorenzo, I just noticed this mail. It's very weird; earlier,
Uladzislau's reply to patch 2/7 only showed up in my mutt mail client 10
days later. I am not sure whether it's my mail client's problem or a mail
server delivery issue.

>
> I may be missing something here, but is this not essentially 'if the address is
> below a used region, write a single null byte into the buffer and continue,
> assuming we are now in a used area?'

I am not sure I follow you. for_each_set_bitrange() only iterates over the
used regions, so in the for loop what we do is zero-fill the buffer for the
hole below each used region, then read the used region out. I don't
understand what you mean by 'continue'.

Assume we have 3 used regions in one vmap block, see the diagram below:
|_______|______________|________|_____________|______|_____________|______|
|hole 0 |used region 0 |hole 1  |used region 1|hole 2|used region 2|hole 3|

Holes 0, 1 and 2 are zero-filled when we iterate to the used regions that
follow them, and the last hole 3 is zeroed at the end of this function.
Please point it out if I got it wrong.
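
In case it helps, below is a rough userspace model of that ordering (the
helper names and the layout are made up purely for illustration, it is not
the kernel code itself and it skips the "start reading from the middle of
a used region" handling):

#include <stdio.h>
#include <string.h>

/*
 * For each used region, first zero-fill the hole in front of it, then
 * copy the region itself; whatever is left after the last region
 * (hole 3 above) is zeroed at the end.
 */
struct region { size_t start, end; };	/* [start, end) offsets in the block */

static void toy_vb_read(char *buf, const char *block, size_t count,
			const struct region *used, size_t nr_used)
{
	size_t pos = 0;

	for (size_t i = 0; i < nr_used && count; i++) {
		/* zero-fill the hole below this used region */
		while (pos < used[i].start && count) {
			*buf++ = '\0';
			pos++;
			count--;
		}
		/* read out the used region itself */
		size_t n = used[i].end - used[i].start;
		if (n > count)
			n = count;
		memcpy(buf, block + pos, n);
		buf += n;
		pos += n;
		count -= n;
	}
	/* the trailing hole is zeroed last */
	if (count)
		memset(buf, 0, count);
}

int main(void)
{
	const char block[] = "..AB....CD......";	/* dots stand for holes */
	const struct region used[] = { {2, 4}, {8, 10} };
	char out[16];

	toy_vb_read(out, block, sizeof(out), used, 2);
	for (size_t i = 0; i < sizeof(out); i++)
		putchar(out[i] ? out[i] : '.');
	putchar('\n');		/* prints ..AB....CD...... */
	return 0;
}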

>
> This doesn't seem right, but I am happy to be corrected (perhaps we only expect
> to be a single byte below a start region?)
>
> > + /*it could start reading from the middle of used region*/
> > + offset = offset_in_page(addr);
> > + n = (re - rs + 1) << PAGE_SHIFT - offset;
>
> The kernel bot has already picked up on this paren issue :)

Right, it has been handled. Thanks.

>
> > + if (n > count)
> > + n = count;
> > + aligned_vread(buf, start+offset, n);
> > +
> > + buf += n;
> > + addr += n;
> > + count -= n;
> > + }
> > + spin_unlock(&vb->lock);
> > +
> > + /* zero-fill the left dirty or free regions */
> > + if (count)
> > + memset(buf, 0, count);
> > +}
> > +
> > /**
> > * vread() - read vmalloc area in a safe way.
> > * @buf: buffer for reading data
> > @@ -3545,7 +3590,7 @@ long vread(char *buf, char *addr, unsigned long count)
> > struct vm_struct *vm;
> > char *vaddr, *buf_start = buf;
> > unsigned long buflen = count;
> > - unsigned long n;
> > + unsigned long n, size, flags;
> >
> > addr = kasan_reset_tag(addr);
> >
> > @@ -3566,12 +3611,16 @@ long vread(char *buf, char *addr, unsigned long count)
> > if (!count)
> > break;
> >
> > - if (!va->vm)
> > + vm = va->vm;
> > + flags = va->flags & VMAP_FLAGS_MASK;
> > +
> > + if (!vm && !flags)
> > continue;
> >
>
> This seems very delicate now as going forward, vm _could_ be NULL. In fact, a
> later patch in the series then goes on to use vm and assume it is not null (will
> comment).
>
> I feel we should be very explicit after here asserting that vm != NULL.
>
> > - vm = va->vm;
> > - vaddr = (char *) vm->addr;
> > - if (addr >= vaddr + get_vm_area_size(vm))
> > + vaddr = (char *) va->va_start;
> > + size = flags ? va_size(va) : get_vm_area_size(vm);
>
> For example here, I feel that this ternary should be reversed and based on
> whether vm is null, unless we expect vm to ever be non-null _and_ flags to be
> set?

Right now only vm_map_ram areas set flags; all the other types have a
non-NULL vm. The transient states, e.g. the vm == NULL && flags == 0 case,
have already been filtered out above. Is the below what you suggested?

size = (!vm && flags) ? va_size(va) : get_vm_area_size(vm);
or
size = (vm && !flags) ? get_vm_area_size(vm) : va_size(va);
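
Or, to make the vm check fully explicit, something like the below (just a
sketch of the idea, assuming flags != 0 always means a vm_map_ram area):

	if (vm)
		size = get_vm_area_size(vm);
	else	/* vm_map_ram area: vm is NULL but flags is set */
		size = va_size(va);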

>
> > +
> > + if (addr >= vaddr + size)
> > continue;
> > while (addr < vaddr) {
> > if (count == 0)
> > @@ -3581,10 +3630,13 @@ long vread(char *buf, char *addr, unsigned long count)
> > addr++;
> > count--;
> > }
> > - n = vaddr + get_vm_area_size(vm) - addr;
> > + n = vaddr + size - addr;
> > if (n > count)
> > n = count;
> > - if (!(vm->flags & VM_IOREMAP))
> > +
> > + if ((flags & (VMAP_RAM|VMAP_BLOCK)) == (VMAP_RAM|VMAP_BLOCK))
> > + vb_vread(buf, addr, n);
> > + else if ((flags & VMAP_RAM) || !(vm->flags & VM_IOREMAP))
> > aligned_vread(buf, addr, n);
> > else /* IOREMAP area is treated as memory hole */
> > memset(buf, 0, n);
> > --
> > 2.34.1
> >
>