Re: [bisected] Re: todays git: WARNING: at drivers/ata/libata-sff.c:1017 ata_sff_hsm_move+0x45e/0x750()

From: Alan Cox
Date: Sat Jan 10 2009 - 10:29:40 EST


> All the S/G counts printed out were divisible by 4 (36 for INQUIRY and 96
> for REQUEST SENSE). It's the *actual* byte count for the REQUEST SENSE that's
> not divisible. The SCSI/ATAPI devices are free to send less data than requested
> on non block transfer commands.

That is just fine - if the sg list is not corrupt or being mishandled and
the atapi pio code is not buggy.

RTFS a bit and it becomes obvious that the core libata code has a bug:


From libata-sff.c:

/* consumed can be larger than count only for the last transfer */
WARN_ON_ONCE(qc->cursg && count != consumed);

The big clue turns out to be that the code doesn't match the comment.

Next note the check on qc->cursg. If my input sg list is a 36 byte single
sg entry then qc->cursg should be NULL by the WARN_ON() - but it isn't.

If qc->cursg is NULL once the sg_next() has been run then we don't warn because
we are quite happy with the last segment being padded or underrunning.
What we actually want to explode on is a case where we transfer more
bytes than are wanted and where there are more sg entries to perform - at
that point we would corrupt.
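
To make the difference concrete, here is a rough userspace sketch of the two
conditions side by side. It is not kernel code and not a proposed patch: the
function names and the more_sg_to_fill flag are made up for illustration, and
the real function may not have that information to hand.

```c
#include <stdbool.h>
#include <stddef.h>

/* The check as it stands: any mismatch warns while cursg is non-NULL,
 * whether the transfer overran or merely got padded. */
static bool warn_as_written(bool cursg_nonnull, size_t count, size_t consumed)
{
    return cursg_nonnull && count != consumed;
}

/* The condition described in the text: more bytes moved than wanted
 * while further sg entries still have to be performed.
 * more_sg_to_fill is a hypothetical input, assumed known to the caller. */
static bool warn_as_described(bool more_sg_to_fill, size_t count, size_t consumed)
{
    return more_sg_to_fill && consumed > count;
}
```

With count = 18 and consumed = 20 on what is really the final transfer, the
first predicate fires as long as cursg is still set, while the second stays
quiet because nothing further is waiting to be filled.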


So at least one failure case is

Core code issues an SG list for 96 bytes
Drive indicates it wishes to return 18 bytes

data_xfer transfers 18 bytes + 2 padding (correctly) -> 20 bytes


At this point __atapi_pio_bytes breaks

it updates qc->curbytes by 18
it updates the offset by 18

The last segment is not exhausted so it does not update qc->cursg

qc->cursg is not updated and the WARN erroneously uses !=

The bogus WARN_ON_ONCE() triggers.
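
The sequence above can be replayed with a small userspace model. This is only
a sketch under assumptions: the struct and function names are invented, the
list is limited to a single entry, and padded_xfer() stands in for a data_xfer
that rounds up to 4 bytes as in the 18 + 2 case above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures, reduced to what matters. */
struct sg_entry { size_t length; };

struct cmd_state {
    const struct sg_entry *cursg; /* current entry, NULL once exhausted */
    size_t cursg_ofs;             /* offset into the current entry */
    size_t curbytes;              /* bytes accounted for so far */
};

/* Assumption: the transfer pads the byte count up to a multiple of 4. */
static size_t padded_xfer(size_t count)
{
    return (count + 3) & ~(size_t)3;
}

/* One pass over a single-entry list. Returns true if the check as written
 * (cursg non-NULL and count != consumed) would warn. */
static bool pio_bytes_step(struct cmd_state *qc, size_t bytes)
{
    const struct sg_entry *sg = qc->cursg;
    size_t remaining = sg->length - qc->cursg_ofs;
    size_t count = bytes < remaining ? bytes : remaining;
    size_t consumed = padded_xfer(count);

    qc->curbytes += count;
    qc->cursg_ofs += count;

    /* cursg only advances when the entry is used up. */
    if (qc->cursg_ofs == sg->length) {
        qc->cursg = NULL; /* single-entry list: nothing follows */
        qc->cursg_ofs = 0;
    }

    return qc->cursg != NULL && count != consumed;
}
```

Feeding it a 96 byte entry and an 18 byte transfer leaves cursg pointing at the
entry with count = 18 and consumed = 20, so the model reports a warning. A
36 byte entry with a 36 byte transfer exhausts the entry, cursg goes NULL and
nothing is reported.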

So the bug is the WARN_ON being wrong. In fact __atapi_pio_bytes doesn't
know enough to do the WARN check correctly as it doesn't know if it is
the last request being made. It just happens it didn't break before
because all our transfers are word aligned.

We can remove the WARN for the moment, but someone should probably fix
the sanity check logic.

Alan