On 18/07/25 14:35, Huang Shijie wrote:Okay, I will change the commit message later..
When detach_tasks() scans the src_cpu's task list, the task listHow about something like so:
may shrink during the scanning. For example, the task list
may have four tasks at the beginning, it may becomes to two
during the scanning in detach_tasks():
Task list at beginning : "ABCD"
Task list in scanning : "CD"
(ABCD stands for differnt tasks.)
In this scenario, the env->loop_max is still four, so
detach_tasks() may scan twice for some tasks:
the scanning order maybe : "DCDC"
"""
detach_tasks() uses struct lb_env.loop_max as an env.src_rq->cfs_tasks
iteration count limit. It is however set without the source RQ lock held,
and besides detach_tasks() can be re-invoked after releasing and
re-acquiring the RQ lock per LBF_NEED_BREAK.
This means that env.loop_max and the actual length of env.src_rq->cfs_tasks
as observed within detach_tasks() can differ. This can cause some tasks to
be unnecessarily iterated over more than once, for instance:
env.loop_max := 4
detach_tasks()
// Here env.src->cfs_tasks only contains two tasks which can't be
// migrated anywhere, so they're put back in the list each time.
env.src->cfs_tasks := [p1, p0]
// The iteration goes:
p0; cfs_tasks := [p0, p1]
p1; cfs_tasks := [p1, p0]
p0; cfs_tasks := [p0, p1]
p1; cfs_tasks := [p1, p0]
// IOW we iterate over each task twice
"""
In the Specjbb test, this issue can be catched many times.^^^^^^^
caught
(Over 330,000 times in a 30-min Specjbb test)Reviewed-by: Valentin Schneider <vschneid@xxxxxxxxxx>
The patch introduces "first_back" to record the first task which
is put back to the task list. If we get a task which is equal to
first_back, we break the loop, and avoid to scan twice for it.
Signed-off-by: Huang Shijie <shijie@xxxxxxxxxxxxxxxxxxxxxx>