Re: [PATCH] [RFC] make hd_struct->in_flight atomic to avoid diskstat corruption
From: Tejun Heo
Date: Thu Apr 16 2009 - 10:41:23 EST
Hello, Nikanth, Jens.
Nikanth Karthikesan wrote:
>> Hmm. Did you observe this behaviour?
>
> Sorry, not on current kernels. But on a very old 2.6.5 kernel.
>
> Reading Documentation/iostats.txt and the changelog of commit
> e71bf0d0ee89e51b92776391c5634938236977d5 made me assume that this could be a
> problem even today.
The only problem we can run into there is a request that isn't
attributed to a partition on issue but is attributed to one on
completion. That seems possible if a new partition is added while IO
on the whole device, falling into the new partition's area, is
already in progress. At first glance, this could happen if the admin
tries really hard. I think we can get around the problem by doing
part->in_flight = max(min(new_val, part0->in_flight), 0) in
dec_in_flight(). This is a pretty extreme corner case, though.
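A minimal user-space sketch of the clamping idea, not the actual kernel
code: the partition counter is kept within [0, part0->in_flight] on
decrement. struct part_sketch, clamp_int() and dec_in_flight_sketch()
are hypothetical names made up for illustration, and the sketch assumes
the caller already holds the queue lock.

```c
/* Hypothetical stand-in for the kernel's per-partition state; only the
 * counter discussed in this thread is modelled. */
struct part_sketch {
	int partno;	/* 0 means the whole device (part0) */
	int in_flight;	/* requests issued but not yet completed */
};

/* Equivalent to max(min(val, hi), lo): the lower bound wins if lo > hi. */
static int clamp_int(int val, int lo, int hi)
{
	if (val > hi)
		val = hi;
	if (val < lo)
		val = lo;
	return val;
}

/* Assumed to be called with the queue lock held, like the real helpers. */
static void dec_in_flight_sketch(struct part_sketch *part0,
				 struct part_sketch *part)
{
	part0->in_flight--;
	/* A partition can never have a negative count, nor more requests
	 * in flight than the whole device it belongs to. */
	if (part->partno)
		part->in_flight = clamp_int(part->in_flight - 1, 0,
					    part0->in_flight);
}
```

With this, a completion attributed to a partition that never saw the
matching issue leaves that partition's count at 0 instead of -1.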
>> A quick glance at the code reveals
>> that the callers of part_inc_in_flight() and part_dec_in_flight() in the
>> block layer are always done under the queue lock. Ditto
>> part_round_stats(), which calls part_round_stats_single() and also needs
>> protection for in_flight.
>>
>> That basically just leaves the code reading this out and reporting, and
>> driver calls to part_round_stats(). I'd suggest looking there instead,
>> we're not going to make ->in_flight an atomic just because of some
>> silliness there that could be fixed.
>
> Isn't this also true for the stats protected by the
> part_stat_lock()? Only places where we are only reading seems to be
> called without the queue lock.
part_stat_lock() doesn't protect against simultaneous access. I don't
think we have any place where in_flight is updated without the queue
lock, and since the counters are no larger than a ulong, reading them
shouldn't be a problem.
I don't think the bug you saw in the 2.6.5 kernel applies to the
upstream kernel. The negative in_flight value was seen in the diskstats
of the whole device, which can't be affected by a partition coming up
while IOs are in progress.
Thanks.
--
tejun