RE: [PATCH v4 4/4] RISC-V: Add arch functions to support hibernation/suspend-to-disk

From: JeeHeng Sia
Date: Tue Feb 28 2023 - 00:39:04 EST




> -----Original Message-----
> From: Andrew Jones <ajones@xxxxxxxxxxxxxxxx>
> Sent: Tuesday, 28 February, 2023 1:05 PM
> To: JeeHeng Sia <jeeheng.sia@xxxxxxxxxxxxxxxx>
> Cc: paul.walmsley@xxxxxxxxxx; palmer@xxxxxxxxxxx; aou@xxxxxxxxxxxxxxxxx; linux-riscv@xxxxxxxxxxxxxxxxxxx; linux-
> kernel@xxxxxxxxxxxxxxx; Leyfoon Tan <leyfoon.tan@xxxxxxxxxxxxxxxx>; Mason Huo <mason.huo@xxxxxxxxxxxxxxxx>
> Subject: Re: [PATCH v4 4/4] RISC-V: Add arch functions to support hibernation/suspend-to-disk
>
> On Tue, Feb 28, 2023 at 01:32:53AM +0000, JeeHeng Sia wrote:
> > > > > > load image;
> > > > > > loop: Create pbe chain, return error if failed;
> > > > >
> > > > > This loop pseudocode is incomplete. It's
> > > > >
> > > > > loop:
> > > > > if (swsusp_page_is_forbidden(page) && swsusp_page_is_free(page))
> > > > > return page_address(page);
> > > > > Create pbe chain, return error if failed;
> > > > > ...
> > > > >
> > > > > which I pointed out explicitly in my last reply. Also, as I asked in my
> > > > > last reply (and have been asking four times now, albeit less explicitly
> > > > > the first two times), how do we know at least one PBE will be linked?
> > > > > 1 PBE corresponds to 1 page; you shouldn't expect that only 1 page is saved.
> > >
> > > I know PBEs correspond to pages. *Why* should I not expect only one page
> > > is saved? Or, more importantly, why should I expect more than zero pages
> > > are saved?
> > >
> > > Convincing answers might be because we *always* put the restore code in
> > > pages which get added to the PBE list or that the original page tables
> > > *always* get put in pages which get added to the PBE list. It's not very
> > > convincing to simply *assume* that at least one random page will always
> > > meet the PBE list criteria.
> > >
> > > > > Hibernation core will do the calculation. If the PBEs (restore_pblist) are linked successfully, the hibernated image will be restored, else normal boot will take place.
> > > > > Or, even more specifically this time, where is the proof that for each
> > > > > hibernation resume, there exists some page such that
> > > > > !swsusp_page_is_forbidden(page) or !swsusp_page_is_free(page) is true?
> > > > > forbidden_pages and free_pages do not contribute to the restore_pblist (as you are already aware from the code). In fact, the forbidden_pages and free_pages are not saved to the disk.
> > >
> > > Exactly, so those pages are *not* going to contribute to the greater than
> > > zero pages. What I've been asking for, from the beginning, is to know
> > > which page(s) are known to *always* contribute to the list. Or, IOW, how
> > > do you know the PBE list isn't empty, a.k.a restore_pblist isn't NULL?
> > > Well, this keeps going around in a circle; I thought the answer is in the hibernation code. restore_pblist gets the pointer from the PBE, and the PBE has already been checked for validity.
>
> It keeps going around in circles because you keep avoiding my question by
> pointing out trivial linked list code. I'm not worried about the linked
> list code being correct. My concern is that you're using a linked list
> with an assumption that it is not empty. My question has been all along,
> how do you know it's not empty?
>
> I'll change the way I ask this time. Please take a look at your PBE list
> and let me know if there are PBEs on it that must be there on each
> hibernation resume, e.g. the resume code page is there or whatever.
>
> > Can I suggest that you submit a patch to the hibernation core?
>
> Why? What's wrong with it?
Kindly let me draw 2 scenarios for you. Option 1 is to add the restore_pblist check to the hibernation core, and option 2 is to add the restore_pblist check to the arch code.
I really don't think the check is needed. But if you really want to add it, I would suggest going with option 1.

// Option 1
// Pseudocode to illustrate the image loading
initialize restore_pblist to null;
initialize safe_pages_list to null;
allocate safe page list, return error if failed;
load image;
loop:
    create pbe chain, return error if failed;
    assign orig_addr and safe_page to pbe;
    link pbe to restore_pblist;
    /* Add checking here */
    return error if restore_pblist is equal to null;
    return pbe to handle->buffer;
    check handle->buffer;
    goto loop if no error, else return with error;
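
To make option 1 a bit more concrete, here is a minimal userspace sketch in C. It is not the hibernation core code: struct pbe_sketch, load_image_sketch() and the in_place[] flags are hypothetical names used only for illustration, the in_place[] flag stands in for the "forbidden and free" case quoted above where no pbe is created, the check is assumed to run once after all pages have been processed, and -EINVAL is an arbitrary choice of error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for a page backup entry (hypothetical, illustration only). */
struct pbe_sketch {
	void *address;       /* safe page holding the copy (unused in this sketch) */
	void *orig_address;  /* original location of the page (unused in this sketch) */
	struct pbe_sketch *next;
};

/*
 * Option 1 sketch: build the list while "loading" the image, then reject an
 * empty list in the core, so arch code never sees a NULL restore_pblist.
 *
 * in_place[i] != 0 models a page that was loaded straight into its original
 * location and therefore gets no pbe.
 */
static int load_image_sketch(struct pbe_sketch *pool, size_t n_pages,
			     const int *in_place,
			     struct pbe_sketch **restore_pblist)
{
	assert(restore_pblist != NULL);

	*restore_pblist = NULL;
	for (size_t i = 0; i < n_pages; i++) {
		if (in_place[i])
			continue;               /* no pbe for this page */
		pool[i].address = NULL;
		pool[i].orig_address = NULL;
		pool[i].next = *restore_pblist; /* link pbe to the list head */
		*restore_pblist = &pool[i];
	}

	/* The added check: fail if nothing was linked. */
	if (*restore_pblist == NULL)
		return -EINVAL;
	return 0;
}
```

With this shape, an image in which every page landed in place makes the load step fail, and the arch code is never invoked with an empty list.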

// Option 2
// Pseudocode to illustrate the image loading
initialize restore_pblist to null;
initialize safe_pages_list to null;
allocate safe page list, return error if failed;
load image;
loop:
    create pbe chain, return error if failed;
    assign orig_addr and safe_page to pbe;
    link pbe to restore_pblist;
    return pbe to handle->buffer;
    check handle->buffer;
    goto loop if no error, else return with error;
everything works correctly, continue the rest of the operation;
invoke swsusp_arch_resume;

// @swsusp_arch_resume()
loop2:
    return error if restore_pblist is null;
    increment restore_pblist and goto loop2;
create temp_pg_table;
continue the rest of the resume operation;
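
For comparison, option 2 could look roughly like the userspace sketch below. Again this is not the actual arch code: struct pbe_sketch and arch_resume_sketch() are hypothetical names, the NULL check is assumed to happen once on entry before the list is walked, the walk only counts entries instead of copying pages, and -EINVAL is an arbitrary choice of error code.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Userspace stand-in for a page backup entry (hypothetical, illustration only). */
struct pbe_sketch {
	void *address;
	void *orig_address;
	struct pbe_sketch *next;
};

/*
 * Option 2 sketch: the arch resume entry point validates the list itself
 * before using it. n_walked reports how many entries were visited.
 */
static int arch_resume_sketch(const struct pbe_sketch *restore_pblist,
			      size_t *n_walked)
{
	size_t n = 0;

	assert(n_walked != NULL);

	/* The added check: refuse to resume with an empty list. */
	if (restore_pblist == NULL)
		return -EINVAL;

	/* Walk the list; real code would restore each page here. */
	for (const struct pbe_sketch *p = restore_pblist; p != NULL; p = p->next)
		n++;

	*n_walked = n;
	/* Creating the temporary page table and the rest of resume would follow. */
	return 0;
}
```

The drawback of this variant is that every architecture has to repeat the same check, which is why I would rather see it in the core if it is added at all.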
>
> Thanks,
> drew