Re: [PATCH v2 1/5] selftests/sgx: Retry the ioctl()'s returned with EAGAIN

From: Jarkko Sakkinen
Date: Thu Sep 08 2022 - 19:19:50 EST


On Thu, Sep 08, 2022 at 03:43:06PM -0700, Reinette Chatre wrote:
> Hi Jarkko and Haitao,
>
> On 9/4/2022 7:04 PM, Jarkko Sakkinen wrote:
> > From: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx>
> >
> > For EMODT and EREMOVE ioctl()'s with a large range, the kernel
> > may not finish in one shot; it then returns the EAGAIN error code
> > and the count of bytes of EPC pages on which the operation
> > finished successfully.
> >
> > Change the unclobbered_vdso_oversubscribed_remove test
> > to rerun the ioctl()'s in a loop, updating offset and length
> > using the byte count returned in each iteration.
> >
> > Fixes: 6507cce561b4 ("selftests/sgx: Page removal stress test")
>
> Should this patch be moved to the "critical fixes for v6.0" series?

I think not, because it does not put the stability of the
kernel itself at risk. It's a "nice to have" but not mandatory.

>
> > Signed-off-by: Haitao Huang <haitao.huang@xxxxxxxxxxxxxxx>
> > Tested-by: Jarkko Sakkinen <jarkko@xxxxxxxxxx>
> > Signed-off-by: Jarkko Sakkinen <jarkko@xxxxxxxxxx>
> > ---
> > v3:
> > * Added a fixes tag. The bug is in v6.0 patches.
> > * Added my tested-by (the bug reproduced in my NUC often).
> > v2:
> > * Changed branching in EAGAIN condition so that else branch
> > is not required.
> > * Addressed Reinette's feedback:
> > ---
> > tools/testing/selftests/sgx/main.c | 42 ++++++++++++++++++++++++------
> > 1 file changed, 34 insertions(+), 8 deletions(-)
> >
> > diff --git a/tools/testing/selftests/sgx/main.c b/tools/testing/selftests/sgx/main.c
> > index 9820b3809c69..59cca806eda1 100644
> > --- a/tools/testing/selftests/sgx/main.c
> > +++ b/tools/testing/selftests/sgx/main.c
> > @@ -390,6 +390,7 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900)
> > struct encl_segment *heap;
> > unsigned long total_mem;
> > int ret, errno_save;
> > + unsigned long count;
> > unsigned long addr;
> > unsigned long i;
> >
> > @@ -453,16 +454,30 @@ TEST_F_TIMEOUT(enclave, unclobbered_vdso_oversubscribed_remove, 900)
> > modt_ioc.offset = heap->offset;
> > modt_ioc.length = heap->size;
> > modt_ioc.page_type = SGX_PAGE_TYPE_TRIM;
> > -
> > + count = 0;
> > TH_LOG("Changing type of %zd bytes to trimmed may take a while ...",
> > heap->size);
> > - ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc);
> > - errno_save = ret == -1 ? errno : 0;
> > + do {
> > + ret = ioctl(self->encl.fd, SGX_IOC_ENCLAVE_MODIFY_TYPES, &modt_ioc);
> > +
> > + errno_save = ret == -1 ? errno : 0;
> > + if (errno_save != EAGAIN)
> > + break;
> > +
> > + EXPECT_EQ(modt_ioc.result, 0);
>
> If this check triggers then there is something seriously wrong and in that case
> it may also be that this loop may be unable to terminate or the error condition would
> keep appearing until the loop terminates (which may be many iterations). Considering
> the severity and risk I do think that ASSERT_EQ() would be more appropriate,
> similar to how ASSERT_EQ() is used in patch 5/5.
>
> Apart from that I think that this looks good.
>
> Thank you very much for adding this.
>
> Reinette

Hmm... I could do something along these lines:

/*
 * Get time since the Epoch in milliseconds.
 * Needs <sys/time.h> for gettimeofday().
 */
unsigned long get_time(void)
{
	struct timeval start;

	gettimeofday(&start, NULL);

	return (unsigned long)start.tv_sec * 1000L + (unsigned long)start.tv_usec / 1000L;
}

and

#define IOCTL_RETRY_TIMEOUT 100 /* milliseconds, same unit as get_time() */

In the test function:

unsigned long start_time;

/* ... */

start_time = get_time();
do {
	EXPECT_LT(get_time() - start_time, IOCTL_RETRY_TIMEOUT);

	/* ... */
} while (/* ... */);

/* ... */
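
To make the idea concrete outside the selftest harness, here is a rough
standalone sketch of an EAGAIN retry loop bounded by a timeout. Everything
in it is an assumption made for illustration: struct range_op,
fake_range_ioctl() and retry_range_op() are hypothetical stand-ins and not
the SGX uAPI, and the clock is CLOCK_MONOTONIC rather than gettimeofday(),
so that wall-clock adjustments cannot distort the elapsed time. The
assert() plays the role of the ASSERT_EQ() suggested above: if the
reported progress is bogus, stop instead of looping.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <errno.h>
#include <time.h>

/* Hypothetical stand-in for the ioctl() argument: a byte range plus progress. */
struct range_op {
	unsigned long offset;
	unsigned long length;
	unsigned long count;	/* bytes completed by the last call */
};

/* Milliseconds read from a monotonic clock. */
static unsigned long now_ms(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);

	return (unsigned long)ts.tv_sec * 1000UL +
	       (unsigned long)ts.tv_nsec / 1000000UL;
}

/*
 * Hypothetical fake of a range ioctl(): handles at most 'chunk' bytes per
 * call and fails with EAGAIN while part of the range is still pending.
 */
static int fake_range_ioctl(struct range_op *op, unsigned long chunk)
{
	op->count = op->length < chunk ? op->length : chunk;
	if (op->count < op->length) {
		errno = EAGAIN;
		return -1;
	}

	return 0;
}

/*
 * Retry until the call succeeds, fails with something other than EAGAIN,
 * or timeout_ms has elapsed. Returns 0 on success and -1 with errno set
 * otherwise (ETIMEDOUT when the time budget ran out).
 */
static int retry_range_op(struct range_op *op, unsigned long chunk,
			  unsigned long timeout_ms)
{
	unsigned long start = now_ms();
	int ret;

	for (;;) {
		ret = fake_range_ioctl(op, chunk);
		if (ret == 0 || errno != EAGAIN)
			return ret;

		/* Bogus progress would corrupt the range, so bail out hard. */
		assert(op->count <= op->length);

		/* Skip the bytes that the last call already completed. */
		op->offset += op->count;
		op->length -= op->count;
		op->count = 0;

		if (now_ms() - start >= timeout_ms) {
			errno = ETIMEDOUT;
			return -1;
		}
	}
}
```

With a range of ten 4096-byte pages and three pages handled per call, the
loop ends on the fourth call. A fake that never makes progress (chunk of 0)
runs into the timeout instead of spinning forever.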

What do you think?

BR, Jarkko