Re: linux-next test error

From: Theodore Y. Ts'o
Date: Thu Sep 06 2018 - 09:12:30 EST


On Thu, Sep 06, 2018 at 05:56:31PM +0530, Souptick Joarder wrote:
> > Yes, I'd start with converting ext4_page_mkwrite() - that should be pretty
> > straightforward - and we can leave block_page_mkwrite() as is for now. I
> > don't think allocating other VM_FAULT_ codes is going to cut it as
> > generally the filesystem may need to communicate different error codes back
> > and you don't know in advance which are interesting.

Changing the return values of ext4_page_mkwrite() and
ext4_filemap_fault() is definitely safe. If you want to start
changing the type of "ret" to vm_fault_t and introduce a new variable
"err", now you have to be super careful not to screw things up. (I
believe one of the earlier patches didn't get that right.)

> As block_page_mkwrite() is getting called from 2 places in ext4 and nilfs and
> both places fault handler code convert errno to VM_FAULT_CODE using
> block_page_mkwrite_return(), is it required to migrate block_page_mkwrite()
> to use vm_fault_t return type and further complicate the API or better to
> leave this API in current state ??

So I don't see the point of changing return value block_page_mkwrite()
(although to be honest I haven't see the value of the vm_fault_t
change at all in the first place, at least not compared to the pain it
has caused) but no, I don't think it's worth it.

The API for block_page_mkwrite() can simply be defined as "0 on
success, < 0 on error". You can add documentation that it's up to
caller of block_page_mkwrite() to call block_page_mkwrite_return() to
translate the error to a vm_fault_t.

> > One solution for passing error codes we could use with vm_fault_t is a
> > scheme similar to ERR_PTR. So we can store full error code in vm_fault_t
> > and still have a plenty of space for the special VM_FAULT_ return codes...
> > With that scheme converting block_page_mkwrite() would be trivial.
> >
> I didn't get this part. Any reference code will be helpful ?

So what we do for functions that need to either return an error or a
pointer is to call encode the error as a "pointer" by using ERR_PTR(),
and the caller can determine whether or not it is a valid pointer or
an error code by using IS_ERR_VALUE() and turning it back into an
error by using PTR_ERR(). See include/linux/err.h.

Similarly, all valid vm_fault_t's composed of VM_FAULT_xxx are
positive integers, and all errors are passed using the kernel's
convention of using a negative error code. So going through lots of
machinations to return both an error code and a vm_fault_t *really*
wasn't necessary.

The issue, as near as I can understand things, for why we're going
through all of this churn, was there was a concern that in the mm
code, that all of the places which received a vm_fault_t would
sometimes see a negative error code. The proposal here is to just
*accept* that this will happen, and just simply have them *check* to
see if it's a negative error code, and convert it to the appropriate
vm_fault_t in that case. It puts the onus of the change on the mm
layer, where as the "blast radius" of the vm_fault_t "cleanup" is
spread out across a large number of subsystems.

Which I wouldn't mind, if it wasn't causing pain. But it *is* causing
pain.

And it's common kernel convention to overload an error and a pointer
using the exact same trick. We do it *all* over the place, and quite
frankly, it's less error prone than changing functions to return a
pointer and an error. No one has said, "let's do to the ERR_PTR
convention what we've done to the vm_fault_t -- it's too confusing
that a pointer might be an error, since people might forget to check
for it." If they did that, it would be NACK'ed right, left and
center. But yet it's a good idea for vm_fault_t?

- Ted