Re: [GIT PULL] aio: fix sleeping while TASK_INTERRUPTIBLE

From: Benjamin LaHaise
Date: Sun Feb 01 2015 - 17:15:36 EST


On Sun, Feb 01, 2015 at 01:01:06PM -0800, Linus Torvalds wrote:
> On Sun, Feb 1, 2015 at 6:40 AM, Benjamin LaHaise <bcrl@xxxxxxxxx> wrote:
> >
> > Chris Mason (1):
> > fs/aio: fix sleeping while TASK_INTERRUPTIBLE
>
> Ugh.
>
> This patch is too ugly to live. As far as I can tell, this is another
> case of people just mindlessly trying to make the warning go away,
> rather than fixing anything in the code itself. In fact, the code gets
> less readable, and more hacky, with that insane "running" variable
> that doesn't actually add anything to the logic, just makes the code
> harder to read, and it's *very* non-obvious why this is done in the
> first place.
>
> If you want to shut up the warning without actually changing the code,
> use sched_annotate_sleep(). The comment about why the nested sleep
> isn't a problem ("sleeps in kmap or copy_to_user don't trigger
> warnings: If we don't copy enough events out, we'll loop through
> schedule() one time before sleeping").

It's ugly, but it actually is revealing a bug. The task has already been
added to ctx->wait by the time it calls into mutex_lock(), so a wake up on
ctx->wait can spuriously wake it while it sleeps on the mutex, and that
could inadvertently cause things to go wrong. I can envision code being
invoked that expects a 1-1 relationship between sleeps and wake ups, which
being on the additional wait queue might break.
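
For reference, the nested-sleep pattern under discussion can be modeled in
plain userspace C. This is only a toy sketch, not the fs/aio.c code: every
name in it (toy_task, toy_prepare_to_wait, toy_mutex_lock, toy_schedule,
toy_wait_for_events) is hypothetical, and it assumes that a contended
mutex_lock() returns with the task state reset to TASK_RUNNING. It models
only the extra pass through schedule() mentioned in the quoted comment; it
does not model wake ups arriving on ctx->wait.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model only: these are not the kernel's definitions. */
enum task_state { TASK_RUNNING, TASK_INTERRUPTIBLE };

struct toy_task {
    enum task_state state;
    int real_sleeps;  /* schedule() calls that would actually block */
    int busy_passes;  /* schedule() calls that return immediately */
};

/* Stand-in for prepare_to_wait(): mark the task as about to sleep. */
static void toy_prepare_to_wait(struct toy_task *t)
{
    t->state = TASK_INTERRUPTIBLE;
}

/*
 * Stand-in for mutex_lock(). Assumption: when the lock is contended the
 * task sleeps inside the lock and comes back in TASK_RUNNING, which
 * clobbers the state set by toy_prepare_to_wait().
 */
static void toy_mutex_lock(struct toy_task *t, bool contended)
{
    if (contended)
        t->state = TASK_RUNNING;
}

/* Stand-in for schedule(): it only blocks if the task is not runnable. */
static void toy_schedule(struct toy_task *t)
{
    if (t->state == TASK_RUNNING)
        t->busy_passes++;
    else
        t->real_sleeps++;
    t->state = TASK_RUNNING;  /* the task is runnable again on return */
}

/*
 * Wait loop in which the condition check takes a sleeping lock after the
 * task state has been set. Events are modeled as arriving during the
 * first real sleep.
 */
static struct toy_task toy_wait_for_events(bool first_lock_contended)
{
    struct toy_task t = { TASK_RUNNING, 0, 0 };
    bool contended = first_lock_contended;
    bool events_ready = false;

    for (;;) {
        toy_prepare_to_wait(&t);
        toy_mutex_lock(&t, contended);  /* nested sleep if contended */
        contended = false;              /* only the first lock contends */
        if (events_ready)
            break;
        toy_schedule(&t);
        if (t.real_sleeps > 0)
            events_ready = true;
    }
    assert(t.real_sleeps >= 1);
    t.state = TASK_RUNNING;             /* stand-in for finish_wait() */
    return t;
}
```

In this model, calling toy_wait_for_events(true) records one pass through
toy_schedule() that does not sleep before the real sleep happens, while
toy_wait_for_events(false) sleeps on the first pass.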

> I'm just about to push out a commit that makes
> "sched_annotate_sleep()" do the right thing, and *not* set
> TASK_RUNNING, but instead just disable the warning for that case.
> Which makes all these games unnecessary. I'm just waiting for my
> 'allmodconfig' build to finish before I push it out. Just another
> minute or two, I think.
>
> I really detest debug code (or compiler warnings) that encourage
> people to write code that is *worse* than the code that causes the
> debug code or warning to trigger. It's fundamentally wrong when those
> "fixes" actually make the code less readable and maintainable in the
> long run.

I think in this case the debug code reveals an actual bug. I looked at
other ways to fix it, and after a few attempts, Chris Mason's solution was
the least-bad. An alternative approach would be to turn ctx->ring_lock
back into a spinlock, but that ends up being just as much (or even more)
code churn.

> Linus

-ben
--
"Thought is the essence of where you are now."