Re: multi-second application stall in open()

From: Josh Hunt
Date: Mon Jun 25 2012 - 12:22:30 EST


On Mon, Jun 25, 2012 at 8:30 AM, Vivek Goyal <vgoyal@xxxxxxxxxx> wrote:
> On Fri, Jun 22, 2012 at 04:34:07PM -0500, Josh Hunt wrote:
>
> [..]
>> Shouldn't the queue stay on the RR list until it is empty?
>
> This does look odd. cfqq should stay on service tree as long as it has
> requests.
>
> Can you attach the full log again. Also make sure that blktrace is not
> dropping any trace events.
>
> In slice_expire() we check following.
>
>        if (cfq_cfqq_on_rr(cfqq) && RB_EMPTY_ROOT(&cfqq->sort_list))
>                cfq_del_cfqq_rr(cfqd, cfqq);
>
> So for some reason RB_EMPTY_ROOT() is returning true. But we must have
> added the request and it should not have been empty.
>
> cfq_insert_request()
>  cfq_add_rq_rb()
>    elv_rb_add()
>
> So may be little more tracing after request addition will help. Just check
> that RB_EMPTY_ROOT() is not true after addition of request and also print
> number of requests queued.
>
> In slice_expired() we can probably put a BUG_ON() which checks following.
>
> BUG_ON(RB_EMPTY_ROOT(&cfqq->sort_list) && (cfqq->queued[0] || cfqq->queued[1]));
>
> Thanks
> Vivek

Vivek

First off, thanks for all the time you've spent helping me on this :)

I'm attaching the log. I will add more instrumentation based on your
mail and let you know the results.
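
For reference, here is a rough userspace mock-up of the condition you
suggested trapping, just to pin down what the extra tracing would report.
Everything prefixed mock_ is a made-up stand-in, not the cfq-iosched.c code:
a NULL pointer plays the role of RB_EMPTY_ROOT(&cfqq->sort_list), and
queued[] is assumed to mirror cfqq->queued[].

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical stand-in for struct cfq_queue; only the fields the check reads. */
struct mock_cfqq {
    void *sort_root;  /* NULL stands in for an empty sort_list rb-tree */
    int queued[2];    /* assumed to mirror cfqq->queued[0] and cfqq->queued[1] */
};

/* Stand-in for RB_EMPTY_ROOT(&cfqq->sort_list). */
static bool mock_sort_list_empty(const struct mock_cfqq *q)
{
    return q->sort_root == NULL;
}

/*
 * True for the state the suggested BUG_ON() would trap:
 * the sort list looks empty while requests are still counted as queued.
 */
static bool mock_inconsistent(const struct mock_cfqq *q)
{
    return mock_sort_list_empty(q) && (q->queued[0] || q->queued[1]);
}

/* Trace line to emit right after a request has been added to the queue. */
static void mock_trace_after_add(const struct mock_cfqq *q)
{
    printf("sort_list empty=%d queued[0]=%d queued[1]=%d\n",
           (int)mock_sort_list_empty(q), q->queued[0], q->queued[1]);
}
```

In the kernel the same test would sit where you suggested, with the trace
line emitted after the request is added.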

Thanks
--
Josh

Attachment: sda.parsed-moreverbose2.bz2
Description: BZip2 compressed data