Re: s390 && user_enable_single_step() (Was: odd utrace testingresults on s390x)

From: Roland McGrath
Date: Wed Jan 06 2010 - 15:56:51 EST


> do_signal is called before do_single_step. The order of checks of the
> TIF_ bits is 1) machine checks, 2) need resched, 3) signal pending, 4)
> notify resume, 5) restarting system call, 6) single step.

I see. Then the potential issue I raised would exist.

> But why is that important ? If the TIF_SINGLE_STEP bit is set the order
> of do_signal vs. do_single_step does not seem to be important to me.
> There will be a SIGTRAP if TIF_SINGLE_STEP is set, no ?

Right. It only becomes relevant if something else clears TIF_SINGLE_STEP,
which does not happen now. I was discussing the scenario of having
user_disable_single_step clear it. That might happen inside a stop,
i.e. inside do_signal (get_signal_to_deliver). So given that order of
checks, it becomes possible for user_disable_single_step to affect the
pending-step-should-SIGTRAP situation.

> But I agree, it is probably better to make all arches look the same in
> regard to that pending step report.

Right. That means we should leave the status quo of not clearing
TIF_SINGLE_STEP in user_disable_single_step.

> > Martin, I suggest having copy_thread clear TIF_SINGLE_STEP.
> > That bit is always task-private state that should not be copied.
>
> Then let us do this.

Yes, good.

> The LCTLG of multiple control registers is rather expensive. Does it
> happen often that user_*_single_step is called without need? For gdb is
> doesn't matter, the cost to switch between tracer and tracee is already
> large, the cycles added to FixPerRegisters won't matter much. For
> utrace things might be different.

In old (current) ptrace, user_*_single_step is never called on current.
In utrace, it's always called on current, so utrace-based ptrace alone
adds this second reload overhead beyond the same context-switch overhead
old ptrace has. Indeed that addition may be neglible.

In other circumstances with utrace, it is very possible to wind up with
user_disable_single_step being called superfluously when there was no
stop (and so not necessarily any context switch or other high overhead).
On other machines, user_disable_single_step is pretty cheap even where
user_enable_single_step is quite costly. Given how simple and cheap it
is to short-circuit the excess work on s390, I think it is worthwhile.

It looks like the context-switch code already checks some magic bits in
per_info to decide whether to do the costly reload. So it seems vaguely
consistent with that to conditionalize FixPerRegisters similarly. The
single_step cases are a special case with an easy one-bit check to
short-circuit, so skipping all of FixPerRegisters seems worthwhile IMHO.

To be really optimization-happy about it, you'd also hack FixPerRegisters
itself to do the reload on current only if PSW_MASK_PER is or was set (if
I'm following the code correctly). Or perhaps it should check PER_EM_MASK
instead to match what __switch_to does. (I don't understand the
distinction between those two tests, if there is one.) Extra frobbity
would be to leave the old state too when clearing PSW_MASK_PER, and then
just have the trap handler lazily ignore a hit when current doesn't have
it set. Then in a case where there is no hit before context switch,
you haven't done anything. But that is probably both excessive and
not even a win if the PER use is single-step and so there will really
very likely be a hit before context switch.


Thanks,
Roland
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@xxxxxxxxxxxxxxx
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/