Re: [PATCH 3.10 090/180] xfs: xfs_iflush_cluster fails to abort on error

From: Willy Tarreau
Date: Mon Aug 22 2016 - 01:18:46 EST


Hi Dave,

On Mon, Aug 22, 2016 at 02:21:08PM +1000, Dave Chinner wrote:
> > - if (error || !bp) {
> > + if (error == -EAGAIN) {
>
> Wrong. Errors changed sign in XFS in 3.17.

Ah my bad, sorry for this.
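
For anyone else backporting this, here is a minimal sketch of the sign
mismatch. It assumes, as Dave says, that XFS internals return positive
errnos before 3.17 and negative ones from 3.17 on. All function names
below are hypothetical stand-ins, not real XFS functions.

```c
#include <errno.h>

/* Hypothetical stand-in for a pre-3.17 XFS internal call: positive errno. */
int flush_old_convention(void)
{
	return EAGAIN;
}

/* Hypothetical stand-in for a 3.17+ XFS internal call: negative errno. */
int flush_new_convention(void)
{
	return -EAGAIN;
}

/* The check as written in the upstream patch (negative errno convention). */
int is_eagain_upstream(int error)
{
	return error == -EAGAIN;
}

/* The same check adjusted for a pre-3.17 tree (positive errno convention). */
int is_eagain_backport(int error)
{
	return error == EAGAIN;
}
```

Applied unmodified to an old tree, the upstream check never matches the
positive EAGAIN, so the EAGAIN path is silently skipped. That is why the
3.10 version of the hunk needs the sign flipped.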

> /rant
>
> So, I have just had to point this out (again!) for a different
> stable kernel patchset review, and this specific problem caused a
> user-reported stable kernel regression and filesystem corruption
> *months ago*. That resulted in discussion and new stable commits to
> fix the problem. So now I'm left to wonder about the process of
> stable kernels.

Yep I remember this discussion now, I'm sorry.

> AFAICT, stable kernel maintainers are not watching what happens with
> other stable kernels, nor are they talking to other stable kernel
> maintainers. I should not have to tell every single stable kernel
> maintainer that a specific patch needs to be changed after it's
> already been reported broken, triaged and fixed in other stable
> kernels. You've all got a record that the patch needs to be included
> in a stable kernel, but nobody seems to notice when it comes to
> fixing problems with a stable patch even when that all happens on
> stable@xxxxxxxxxxxxxxxx
>
> Seriously, guys, pick up your act a bit and start talking between
> yourselves and tracking regressions and fixes so the burden of
> catching known reported and fixed problems with backports doesn't
> rely on the upstream developers noticing the problem when hundreds
> of patches for random stable kernels go past on lkml every week...

We definitely do exchange quite a bit and I pick patches from 3.14 for
3.10, but sometimes I can simply pick the original one for various
reasons (e.g. if I had queued its upstream ID earlier). That's also why
the review process helps. I'm sincerely sorry that I failed on this one
and that you had to deal with it again. I'm going to fix it now.

Thanks,
Willy