Re: PING^7 (was Re: [PATCH v2 00/14] Corrections and customization of the SG_IO command whitelist (CVE-2012-4542))

From: Martin K. Petersen
Date: Tue May 28 2013 - 16:25:10 EST


>>>>> "Vladislav" == Vladislav Bolkhovitin <vst@xxxxxxxx> writes:

Vladislav> Linux block layer is purely artificial creature slowly
Vladislav> reinventing wheel creating more problems, than solving.

On the contrary. I do think we solve a whole bunch of problems.


Vladislav> It enforces approach, where often "impossible" means
Vladislav> "impossible in this interface".

I agree we have limitations. I do not agree that all limitations are
bad. Sometimes it's OK to say no.


Vladislav> For instance, how about copy offload? How about atomic
Vladislav> writes?

I'm actively working on copy offload. Nobody appears to be interested in
atomic writes. Otherwise I'd work on those as well.


Vladislav> Why was it needed to create special blk integrity interface
Vladislav> with the only end user - SCSI?

Simple. Because we did not want to interleave data and PI (512+8+512+8),
either in memory or at DMA time. Furthermore, the ATA EPP proposal
was still on the table so I also needed to support ATA.

And finally, NVM Express uses the blk_integrity interface as well.
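
To make the layout point concrete, here is a rough sketch (not kernel
code) comparing the two buffer arrangements. The struct and helper names
are hypothetical and chosen only for illustration; the 8-byte tuple
follows the T10 PI field sizes (guard, application and reference tags).
With interleaving, sector data stops lining up with page boundaries;
with a separate PI buffer the data stays contiguous.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SECTOR_SIZE 512u   /* bytes of data per sector */
#define PI_SIZE       8u   /* bytes of protection information per sector */

/* Hypothetical stand-in for a T10 PI tuple (fields are big-endian on the wire). */
struct pi_tuple {
    uint16_t guard_tag;    /* checksum over the sector data */
    uint16_t app_tag;      /* owned by the application/OS */
    uint32_t ref_tag;      /* typically the low 32 bits of the LBA */
};

/* Interleaved layout: 512 data + 8 PI + 512 data + 8 PI ... */
static size_t interleaved_data_offset(size_t sector)
{
    return sector * (SECTOR_SIZE + PI_SIZE);
}

/* Separate layout: the data buffer stays densely packed ... */
static size_t separate_data_offset(size_t sector)
{
    return sector * SECTOR_SIZE;
}

/* ... and the PI tuples live in their own, much smaller buffer. */
static size_t separate_pi_offset(size_t sector)
{
    assert(sizeof(struct pi_tuple) == PI_SIZE);
    return sector * PI_SIZE;
}
```

In the interleaved case a 4096-byte page no longer holds a whole number
of sectors (4096 is not a multiple of 520), which is the sort of thing
we did not want to push onto the page cache and the DMA engines.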


Vladislav> The block layer keeps repeating SCSI. So, maybe, after all,
Vladislav> it's better to acknowledge that direct usage of SCSI without
Vladislav> any intermediate layers and translations is more productive?
Vladislav> And for those minors not using SCSI internally, translate
Vladislav> from SCSI to their internal commands? Creating and filling
Vladislav> CDB fields for most cases isn't anyhow harder, than creating
Vladislav> and filling bio fields.

This is quite possibly the worst idea I have heard all week.

As it stands it's a headache for the disk ULD driver to figure out which
of the bazillion READ/WRITE variants to send to a SCSI/ATA device. What
makes you think that an application or filesystem would be better
equipped to make that call?

See also: WRITE SAME w/ zeroes vs. WRITE SAME w/ UNMAP vs. UNMAP

See also: EXTENDED COPY vs. the PROXY command set

See also: USB-ATA bridge chips
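
As a rough illustration of just the simplest part of that decision, here
is a sketch of picking a READ variant from the request geometry. This is
not the actual sd driver logic; the function name and the prefer_10 flag
are hypothetical, and it ignores protection information, READ(12),
READ(32) and every device quirk. The opcodes and field widths are the
ones from SBC: READ(6) carries a 21-bit LBA, READ(10) a 32-bit LBA and
16-bit length, READ(16) a 64-bit LBA and 32-bit length.

```c
#include <assert.h>
#include <stdint.h>

/* SCSI READ opcodes as defined in SBC. */
enum {
    OP_READ_6  = 0x08,
    OP_READ_10 = 0x28,
    OP_READ_16 = 0x88
};

/*
 * Hypothetical helper: choose the smallest READ CDB that can encode the
 * request. prefer_10 models devices (many USB bridges, for instance)
 * that do not cope with the 6-byte variant.
 */
static int pick_read_opcode(uint64_t lba, uint32_t nr_blocks, int prefer_10)
{
    assert(nr_blocks > 0);

    /* LBA beyond 32 bits or length beyond 16 bits: only READ(16) fits. */
    if (lba > 0xffffffffULL || nr_blocks > 0xffff)
        return OP_READ_16;

    /* LBA beyond 21 bits or length beyond 8 bits: READ(6) cannot encode it. */
    if (prefer_10 || lba > 0x1fffff || nr_blocks > 0xff)
        return OP_READ_10;

    return OP_READ_6;
}
```

Even this toy version needs a per-device flag. The real decision also
depends on what the device reported at probe time, which is exactly the
knowledge an application or filesystem does not have.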

You make it sound like all the block layer does is fill out CDBs. In
fact, it has nothing to do with CDBs at all.

When you are talking about CDBs we're down in the SBC/SSC territory.
Which is such a tiny bit of what's going on. We have transports, we have
SAM, we have HBA controller DMA constraints, system DMA constraints,
buffer bouncing, etc. There's a ton of stuff that needs to happen before
the CDB and the data physically reach the storage.

You seem to be advocating that everything up to the point where the
device receives the command is in the way. Well, by all means. Why limit
ourselves to the confines of SCSI? Why not get rid of POSIX
read()/write(), page cache, filesystems and let applications speak
ST-506 directly?

I know we're doing different things. My job is to make a general purpose
operating system with interfaces that make sense to normal applications.
That does not preclude special cases where it may make sense to poke at
the device directly. For testing purposes, for instance. But I consider
it a failure when we start having applications that know about hardware
intricacies, cylinders/heads/sectors, etc. That road leads straight to
the 1980s...

--
Martin K. Petersen Oracle Linux Engineering