Re: status of block-integrity

From: Hannes Reinecke
Date: Thu Jan 09 2014 - 04:17:25 EST

Next message: Michal Nazarewicz: "Re: [PATCH 5/7] mm/page_alloc: separate interface to set/get migratetype of freepage"
Previous message: Gao feng: "[PATCH audit-next 2/2] Audit: make audit netlink socket net namespace unaware"
In reply to: Martin K. Petersen: "Re: status of block-integrity"
Next in thread: Martin K. Petersen: "Re: status of block-integrity"

On 01/08/2014 04:23 PM, Martin K. Petersen wrote:

"Hannes" == Hannes Reinecke <hare@xxxxxxx> writes:

Hannes,

Hannes> As there is no user (apart from oracleasm) no-one can attach
Hannes> protection information to any data, so even the most dedicated
Hannes> admin cannot exercise this path, let alone find issues here.

That's not how it works!

If the filesystem has not attached protection information to a bio the
block layer will do it for you. The block layer generates protection
information for writes and verifies it for reads. That's how it's worked
since day one. The code is there, it is used by everyone with a
DIX-capable HBA. See Documentation/block/data-integrity.txt.
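
For readers who have not seen the mechanism, here is a minimal user-space sketch of the "generate on write, verify on read" idea described above. It is not the kernel code. It assumes T10 DIF Type 1 protection with a CRC guard tag: an 8-byte tuple per sector holding a 16-bit guard (CRC-16 with polynomial 0x8BB7), a 16-bit application tag, and a 32-bit reference tag taken from the low bits of the LBA. The function names are made up for illustration.

```python
import struct

T10_DIF_POLY = 0x8BB7  # generator polynomial of CRC-16/T10-DIF


def crc16_t10dif(data: bytes) -> int:
    """Bitwise CRC-16/T10-DIF: init 0, no reflection, no final XOR."""
    crc = 0
    for byte in data:
        crc ^= byte << 8
        for _ in range(8):
            if crc & 0x8000:
                crc = ((crc << 1) ^ T10_DIF_POLY) & 0xFFFF
            else:
                crc = (crc << 1) & 0xFFFF
    return crc


def generate_pi(sector: bytes, lba: int, app_tag: int = 0) -> bytes:
    """Build the 8-byte tuple for one sector (write path, hypothetical helper)."""
    guard = crc16_t10dif(sector)
    # Big-endian: guard tag, application tag, reference tag (low 32 bits of LBA).
    return struct.pack(">HHI", guard, app_tag, lba & 0xFFFFFFFF)


def verify_pi(sector: bytes, lba: int, pi: bytes) -> bool:
    """Check guard and reference tags for one sector (read path, hypothetical helper)."""
    guard, _app_tag, ref_tag = struct.unpack(">HHI", pi)
    return guard == crc16_t10dif(sector) and ref_tag == (lba & 0xFFFFFFFF)
```

A caller would run generate_pi() once per sector before submitting a write and verify_pi() on each sector of a completed read; a mismatch in either tag means the data was corrupted or landed at the wrong LBA.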

Normal applications do not want to have to deal with generating
protection information, using an async I/O model, keeping completion
state around for extended periods of time to figure out whether the I/O
actually completed or not and so on. So the kernel-to-platter protection
scheme we have in place now is good enough.

Ah. I stand corrected.
Sorry.

That doesn't mean that I'm not interested in augmenting libaio. I
am. Very. And I know of several applications that are keen to use
it. But getting page cache passthrough and filesystem interaction
working is non-trivial. That's what has inhibited progress, not
extending the libaio API.

Same here. Actually I _like_ DIX, but the missing libaio / userland API makes it very hard to utilize it.

Hannes> Doug Gilbert and I are currently discussing LID4 / ROD Token
Hannes> copy for sg3_utils and the block layer, so any patches would be
Hannes> very helpful here.

I'm only doing LID1 right now. Any particular reason you are exploring
LID4 and ROD?

Yes. LID4/ROD token is far easier to use (conceptually).
With LID1/XCopy it is ambiguous where to actually send the command; the spec is silent on this.
Also for LID1/XCopy you have three steps:
- Query the source disk
- Query the target disk
- Send the command to either source or target
This is very awkward, and one has to think really carefully about how to implement it without all sorts of layering violations.
And you have to mix and match between all the various xcopy descriptors;
if there's only one it's easy enough, but with several, things get interesting.

With LID4/ROD Token copy you basically have _two_ steps:
- Get the ROD Token from the source device
- Send the ROD Token to the target device

Which is far easier (conceptually).
Or that's the hope, anyway.
Also, the ROD Token in principle has an independent lifetime, so
you could take an arbitrary amount of time between those steps.
It might expire, though, but then failure is always an option when
working with copy offload.
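
To make the two-step flow and the token lifetime concrete, here is a toy model. It is not SCSI and not sg3_utils code. ToyCopyManager, TokenError and both method names are hypothetical; devices are plain Python lists of blocks; time is passed in explicitly so that expiry is easy to see. It also assumes the token captures the source data at the moment it is created, which is only one possible behaviour.

```python
import secrets


class TokenError(Exception):
    """Raised when a token is unknown or has expired."""


class ToyCopyManager:
    """Stands in for whatever entity resolves tokens on the storage side."""

    def __init__(self):
        self._tokens = {}  # token -> (expiry time, captured blocks)

    def populate_token(self, source, start, count, now, lifetime):
        """Step 1: ask the source for a token describing a block range."""
        token = secrets.token_hex(8)
        captured = list(source[start:start + count])
        self._tokens[token] = (now + lifetime, captured)
        return token

    def write_using_token(self, target, start, token, now):
        """Step 2: hand the token to the target; the data never passes the host."""
        entry = self._tokens.get(token)
        if entry is None:
            raise TokenError("unknown token")
        expiry, data = entry
        if now > expiry:
            # The token outlived its lifetime; the caller must fall back
            # to a new token or to an ordinary read/write copy.
            del self._tokens[token]
            raise TokenError("token expired")
        target[start:start + len(data)] = data
        return len(data)
```

The caller only ever holds the opaque token between the two steps, so the steps can be separated in time and issued from different layers. The cost is that the second step can fail on expiry, and the caller needs a fallback path.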

As I said, Doug and I are working on putting this into sg3_utils; then
we'll have a better idea of the actual workings.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
hare@xxxxxxx +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)