Re: scheduler bug: process running since 5124095h

From: Török Edwin
Date: Sun Mar 28 2010 - 04:49:32 EST


On 03/27/2010 11:46 AM, Török Edwin wrote:
> Hi Ingo, Peter,
>
> top has just shown me this:
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 6524 edwin 20 0 228m 10m 8116 R 2 0.3 5124095h gkrellm
>
> Now obviously that process has not been running for 5124095h!
> It looks like an overflow to me: the time in nanoseconds would be
> approx. 0xFFFFFE1D2D476000, which is approx. minus 34 minutes.
> That's roughly consistent with the uptime, but I don't know why it became
> negative:
> 11:45:48 up 42 min, 9 users, load average: 0.56, 0.25, 0.19
>
> I've attached the cfs-debug-info.sh output.
>
> This happens when using Linux 2.6.33 (actually glisse's drm-radeon tree,
> which is based on 2.6.33); it's the first time I have noticed this.
>
> I don't know what caused it; the last things I did were:

I have a simple way to reproduce this:
1. Boot the system, run top, confirm everything is normal
2. Run latencytop, and quit (I used version 0.5)
3. Run top, see 5124095h in the TIME column

For example:
6649 daemon 20 0 74500 8892 3796 S 6 0.2 0:00.03 debsecan
4255 root 20 0 8908 360 260 S 2 0.0 0:00.04 irqbalance
1 root 20 0 10332 692 580 S 0 0.0 5124095h init
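
The 5124095h figure can be checked against the overflow estimate quoted above. This is only a sketch of the arithmetic: it assumes the underlying counter is a 64-bit nanosecond value and that top shows whole hours, so anything below one hour is unknown. The names used here are made up for illustration.

```python
HOURS_SHOWN = 5124095                      # value from top's TIME+ column

# Convert the displayed hours to nanoseconds (whole hours only; top
# does not show the remainder, so this is a lower bound on the raw value).
ns_unsigned = HOURS_SHOWN * 3600 * 10**9
print(hex(ns_unsigned))                    # 0xfffffe1d2d476000


def as_signed64(value):
    """Reinterpret an unsigned 64-bit integer as two's-complement signed."""
    value &= (1 << 64) - 1
    return value - (1 << 64) if value >= (1 << 63) else value


ns_signed = as_signed64(ns_unsigned)
minutes = ns_signed / (60 * 10**9)
print(ns_signed)                           # -2073709551616
print(round(minutes, 2))                   # -34.56
```

So a small negative runtime of roughly 34.5 minutes, read as unsigned, is enough to produce the number top prints.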

Which processes show 5124095h seems random, but there are plenty of
them (if you sort by 'T', the entire top display is filled with
processes showing 5124095h).
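
To list the affected PIDs without scrolling through top, something like the following could be used. It is a rough sketch, not what top itself does: it reads utime and stime (fields 14 and 15) from /proc/<pid>/stat and flags any process whose accumulated CPU time exceeds uptime times the number of CPUs. The function names and the threshold are my own choices for illustration.

```python
import glob
import os


def cpu_ticks(stat_line):
    """Return utime + stime (in clock ticks) from one /proc/<pid>/stat line."""
    # comm (field 2) may contain spaces or ')', so split after the last ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so fields 14 and 15 are rest[11] and rest[12].
    return int(rest[11]) + int(rest[12])


def suspicious(ticks, hz, uptime_s, ncpus):
    """True if the CPU time is larger than the uptime could possibly allow."""
    return ticks / hz > uptime_s * ncpus


def scan():
    """Return the PIDs whose reported CPU time is impossible for the uptime."""
    try:
        with open("/proc/uptime") as f:
            uptime_s = float(f.read().split()[0])
    except OSError:
        return []                          # no procfs available
    hz = os.sysconf("SC_CLK_TCK")
    ncpus = os.cpu_count() or 1
    hits = []
    for path in glob.glob("/proc/[0-9]*/stat"):
        try:
            with open(path) as f:
                ticks = cpu_ticks(f.read())
        except (OSError, IndexError, ValueError):
            continue                       # process exited while scanning
        if suspicious(ticks, hz, uptime_s, ncpus):
            hits.append(int(path.split("/")[2]))
    return sorted(hits)


if __name__ == "__main__":
    print(scan())
```

On a healthy system this should print an empty list; after the latencytop step above I would expect it to list the same PIDs that top shows with 5124095h.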

Best regards,
--Edwin