Starvation of one RT task when the runtime of another exceeds period.

From: Daniel K.
Date: Wed Jun 18 2008 - 11:31:34 EST


I will demonstrate how to get an RT task stuck, so that it is never
rescheduled, by (ab)using cgroups and RT scheduling. This is on a 4-core
system running 2.6.26-rc6 with two patches applied to make it work at all.

http://marc.info/?i=1213732878.3223.95.camel@xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
http://marc.info/?i=1213789854.16944.216.camel@twins

mkdir /dev/cgroup
mount -t cgroup -o cpu,cpuset cgroup /dev/cgroup

# Set up cgroup 0
mkdir /dev/cgroup/0
echo 3 > /dev/cgroup/0/cpuset.cpus
echo 0 > /dev/cgroup/0/cpuset.mems
echo 25000 > /dev/cgroup/0/cpu.rt_period_us
echo 5000 > /dev/cgroup/0/cpu.rt_runtime_us

# Set up cgroup 1
mkdir /dev/cgroup/1
echo 3 > /dev/cgroup/1/cpuset.cpus
echo 0 > /dev/cgroup/1/cpuset.mems
echo 25000 > /dev/cgroup/1/cpu.rt_period_us
echo 5000 > /dev/cgroup/1/cpu.rt_runtime_us
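
The two setup sequences above differ only in the directory name, so they
could be folded into one helper. A minimal sketch, assuming the hierarchy is
mounted as shown; the name setup_rt_cgroup and its argument order are made up
for illustration and are not part of any existing tool:

```shell
# Hypothetical helper: create one cgroup under <root> and apply the
# cpuset and RT bandwidth settings in the same order as the manual steps.
# usage: setup_rt_cgroup <root> <name> <cpus> <mems> <period_us> <runtime_us>
setup_rt_cgroup() {
    dir="$1/$2"
    mkdir -p "$dir" || return 1
    echo "$3" > "$dir/cpuset.cpus" &&
    echo "$4" > "$dir/cpuset.mems" &&
    echo "$5" > "$dir/cpu.rt_period_us" &&
    echo "$6" > "$dir/cpu.rt_runtime_us"
}
```

It would be called once per group, e.g.
setup_rt_cgroup /dev/cgroup 0 3 0 <period_us> <runtime_us>.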

# Start task 1, and assign it to cgroup 0
schedtool -R -p 1 -e burnP6 &
[1] 3309
echo 3309 > /dev/cgroup/0/tasks

At this point task 1 uses 20% CPU.

# Start task 2, and assign it to cgroup 1
schedtool -R -p 1 -e burnP6 &
[2] 3313
echo 3313 > /dev/cgroup/1/tasks

At this point task 2 uses 20% CPU.
Together, the two tasks use 40% of CPU core #3.
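
The percentages quoted here are consistent with each group's share being
rt_runtime_us divided by rt_period_us (for example, 5000 us of runtime in a
25000 us period is 20%). A small sketch for reading that ratio back from a
cgroup directory; rt_share_percent is a made-up name, and it assumes both
files hold plain non-negative integers:

```shell
# Hypothetical helper: print the RT bandwidth of a cgroup directory as
# an integer percentage of one CPU, i.e. 100 * runtime / period.
# usage: rt_share_percent <cgroup_dir>
rt_share_percent() {
    runtime=$(cat "$1/cpu.rt_runtime_us") || return 1
    period=$(cat "$1/cpu.rt_period_us") || return 1
    echo $(( 100 * runtime / period ))
}
```

A result above 100 means the runtime exceeds the period, which is the case
the subject line refers to.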

# Assign an insane amount of runtime (over 100%, ref. my other mail)
echo 30000 > /dev/cgroup/1/cpu.rt_runtime_us

Now task 2 uses 100% of the CPU and completely starves task 1, which
ceases to get scheduled.

# Cut down on the insanity
echo 5000 > /dev/cgroup/1/cpu.rt_runtime_us

Now task 2 uses only 20% of the CPU again, but task 1 still does not get
scheduled.

Let's call this state 'stuck'.
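
One way to confirm that a task is in this state, rather than merely
throttled, is to check whether its accumulated CPU time still grows. A
sketch, assuming the /proc/<pid>/stat field layout documented in proc(5);
task_cpu_ticks is a name made up here, and the optional second argument only
exists so the parser can be pointed at a directory other than /proc:

```shell
# Hypothetical helper: print utime + stime (in clock ticks) for a PID,
# read from <procroot>/<pid>/stat; <procroot> defaults to /proc.
# usage: task_cpu_ticks <pid> [procroot]
task_cpu_ticks() {
    stat=$(cat "${2:-/proc}/$1/stat") || return 1
    # Drop "pid (comm) " so a command name containing spaces or
    # parentheses cannot shift the fields; $1 is then the state field.
    set -- ${stat##*") "}
    # utime and stime are fields 14 and 15 of the full line, which are
    # positions 12 and 13 once the first two fields are removed.
    echo $(( ${12} + ${13} ))
}
```

Sampling it twice a few seconds apart for the same PID and getting the same
number both times would indicate that the task received no CPU time in
between.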

I can make task 1 get unstuck by assigning its PID to another cgroup.

# Kick task 1, so it gets scheduled again.
echo 3309 > /dev/cgroup/1/tasks

Assuming we go back to state 'stuck', a 'killall burnP6' will only kill
task 2; task 1 is still waiting for someone to come and kick it in the
butt. As soon as that happens, it gets killed as well.

Once, both tasks even got stuck and stopped being scheduled, and I
needed to kick both of them to get them going again.

Well, this wasn't really a question, but I'm sure this is not how it's
supposed to behave?


Daniel K.