Re: Could not set non-blocking flag with 2.6.24-rc5

From: Andrew Morton
Date: Mon Dec 17 2007 - 05:28:29 EST


On Fri, 14 Dec 2007 01:45:39 +0100 "Rafael J. Wysocki" <rjw@xxxxxxx> wrote:

> On Thursday, 13 of December 2007, Tino Keitel wrote:
> > Hi folks,
> >
> > I often build Debian packages inside a chroot. Today I discovered a
> > failure during an "aptitude update", which is a command to download new
> > package lists for the package management. In strace, the lines around
> > the failure look like this:
> >
> > 99% [Working]) = 14 14
> > [pid 5986] select(6, [3 4 5], [], NULL, {0, 500000}) = 0 (Timeout)
> > [pid 5986] gettimeofday({1197576353, 670510}, NULL) = 0
> > [pid 5986] rt_sigprocmask(SIG_BLOCK, [WINCH], [], 8) = 0
> > [pid 5986] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> > 99% [Working]) = 14 14
> > [pid 5986] select(6, [3 4 5], [], NULL, {0, 500000}) = 0 (Timeout)
> > [pid 5986] gettimeofday({1197576354, 173902}, NULL) = 0
> > [pid 5986] rt_sigprocmask(SIG_BLOCK, [WINCH], [], 8) = 0
> > [pid 5986] rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
> > 99% [Working]) = 14 14
> > [pid 5986] select(6, [3 4 5], [], NULL, {0, 500000} <unfinished ...>
> > [pid 5988] <... select resumed> ) = 1 (in [3], left {105, 0})
> > [pid 5988] read(3, "", 56559) = 0
> > [pid 5988] fcntl64(-1, F_GETFL) = -1 EBADF (Bad file descriptor)
> > [pid 5988] fcntl64(-1, F_SETFL, O_ACCMODE|O_CREAT|O_EXCL|O_NOCTTY|O_TRUNC|O_APPEND|O_SYNC|O_ASYNC|O_DIRECT|O_LARGEFILE|O_DIRECTORY|O_NOFOLLOW|O_NOATIME|0xfff8003c) = -1 EBADF (Bad file descriptor)
> > [pid 5988] write(2, ""..., 41FATAL -> Could not set non-blocking flag
> > ) = 41
> > [pid 5988] write(2, ""..., 19Bad file descriptor) = 19
> > [pid 5988] write(2, ""..., 1
> > ) = 1
> > [pid 5988] exit_group(100) = ?
> > Process 5988 detached
> >
> > This happened with a kernel after 2.6.24-rc5
> > (4af75653031c6d454b4ace47c1536f0d2e727e3e). I rebooted into 2.6.23.8
> > and it worked. Now I rebooted into 2.6.24-rc5 again and was able to
> > reproduce the failure, so it looks like a kernel issue to me.
>
> Yes, it does.
>
> I'm not sure who to forward it to, though. Andrew, can you help, please?

hm, I missed this email, sorry.

That application is passing an fd of -1 into the kernel's fcntl64(). Could
be an application bug, could be a kernel change which triggered an
application bug, could be a kernel bug.

Can we see more of the strace output please? Let's see if we can find
where the application got that -1 from.
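
For what it's worth, the odd-looking F_SETFL argument in the trace would be
consistent with the application feeding an unchecked F_GETFL result back
into F_SETFL: fcntl(-1, F_GETFL) returns -1, and -1 with O_NONBLOCK masked
off has every other bit set, which strace then decodes as that long list of
flags (O_NONBLOCK is the one name missing from it). That is a guess about
the userspace code, not something taken from the application's source. A
minimal sketch, with hypothetical function names:

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>

/*
 * Hypothetical illustration only: these helpers are NOT taken from the
 * application's source. They show how a bad descriptor can turn into
 * an "all flags set" F_SETFL argument.
 */

/* Unchecked pattern: the F_GETFL result is used without testing for -1. */
int unchecked_flags(int fd)
{
    /* For fd == -1, fcntl() returns -1, so this yields ~O_NONBLOCK. */
    return fcntl(fd, F_GETFL) & ~O_NONBLOCK;
}

/* Checked pattern: stop as soon as F_GETFL reports an error. */
int set_nonblocking_checked(int fd)
{
    int flags = fcntl(fd, F_GETFL);

    if (flags == -1)
        return -1;      /* errno (EBADF for fd == -1) is left for the caller */
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}

/* Small self-check of the behaviour described above. */
void check_bad_fd_behaviour(void)
{
    /* Every bit except O_NONBLOCK ends up set. */
    assert(unchecked_flags(-1) == ~O_NONBLOCK);

    errno = 0;
    assert(set_nonblocking_checked(-1) == -1);
    assert(errno == EBADF);
}
```

Either way, the flag value is only a symptom. The interesting question is
still where the -1 descriptor came from.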
