ptrace attach in multi-threaded processes
Konstantin Belousov
kostikbel at gmail.com
Wed Jul 13 03:30:49 UTC 2016
On Tue, Jul 12, 2016 at 11:24:14AM -0700, Mark Johnston wrote:
> On Tue, Jul 12, 2016 at 08:51:50PM +0300, Konstantin Belousov wrote:
> > On Tue, Jul 12, 2016 at 10:05:02AM -0700, Mark Johnston wrote:
> > > On Tue, Jul 12, 2016 at 08:57:53AM +0300, Konstantin Belousov wrote:
> > > I suppose it is not strictly incorrect. I find it surprising that a
> > > PT_ATTACH followed by a PT_DETACH may leave the process in a different
> > > state than it was in before the attach. This means that it is not
> > > possible to gcore a process without potentially leaving it stopped, for
> > > instance. This result may occur in a single-threaded process
> > > as well, since a signal may already be queued when the PT_ATTACH handler
> > > sends SIGSTOP.
> > I am still missing something. Isn't this an expected outcome of sending a
> > signal with the STOP action?
>
> It is. But I also expect a PT_DETACH operation to resume a stopped
> process, assuming that a second SIGSTOP was not posted while the
> process was suspended.
But as far as we have discussed the situation, it seems that a real SIGSTOP
raced with PT_ATTACH. The offered interpretation, that the SIGSTOP was
delivered 'a bit later' than PT_ATTACH, would fit the description.
>
> >
> > > Indeed, I somehow missed that. I had assumed that the leaked TDB_XSIG
> > > represented a bug in ptracestop().
> > It could. I did not make any statement that denies the bug:
>
> To be clear, the root of my issue comes from the following: the SIGSTOP
> from PT_ATTACH may be handled concurrently with a second signal
> delivered to a second thread in the same process. Then, the resulting
> behaviour depends on the order in which the recipient threads suspend in
> ptracestop(). If the thread that received SIGSTOP suspends last, its
> td_xsig will be overwritten with the userland-provided value in the
> PT_DETACH handler. If it suspends first, its td_xsig will be preserved,
> and upon PT_DETACH the process will be suspended again in issignal().
>
> I'm not sure if this is considered a bug. ptracestop() is handling all
> signals (including the SIGSTOP generated by the PT_ATTACH handler) in a
> consistent way, but this results in inconsistent behaviour from the
> perspective of a ptrace(2) consumer.
I still do not understand what is inconsistent.
Let us look at it from the other side (so far we have discussed the
implementation in the kernel). Does this happen in gcore(1)? If yes, the
gcore interaction with ptrace(2) looks like this:
	ptrace(PT_ATTACH, g_pid);
	waitpid(g_pid, &g_status, 0);
	...
	if (sig == SIGSTOP)
		sig = 0;
	ptrace(PT_DETACH, g_pid, 1, sig);
It sounds as if you want to modify gcore(1) to consume all signals, or
at least all STOP signals, before PT_DETACH. I do not understand why you
want that, but it would probably give you the behaviour you are after:
	ptrace(PT_ATTACH, g_pid);
	waitpid(g_pid, &g_status, 0);
	...
	/* still consume implicit SIGSTOP from attach */
	if (sig == SIGSTOP)
		sig = 0;
	do {
		error = waitpid(g_pid, &g_status, WNOHANG | WSTOPPED);
	} while (error == 0);
	ptrace(PT_DETACH, g_pid, 1, sig);
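
A side note on the return values the loop above depends on: waitpid(2)
with WNOHANG returns 0 when no status is pending and the child's pid when
a status was collected, so a non-blocking drain would normally keep going
while the result is positive. Below is a minimal sketch of that idea, not
the gcore code. The helper names drain_pending_stops() and demo_drain() are
made up for illustration, and the demo uses a plain kill(SIGSTOP) on a
forked child instead of ptrace(2), with the portable WUNTRACED flag in
place of WSTOPPED, so that it stays self-contained.

```c
#define _XOPEN_SOURCE 700
#include <sys/types.h>
#include <sys/wait.h>
#include <assert.h>
#include <signal.h>
#include <unistd.h>

/*
 * Hypothetical helper: collect every stop status already pending for
 * pid without blocking. Returns the number of stop reports consumed,
 * or -1 if waitpid() failed.
 */
static int
drain_pending_stops(pid_t pid)
{
	int status, n = 0;
	pid_t r;

	/* > 0: a status was collected; 0: nothing pending; -1: error. */
	while ((r = waitpid(pid, &status, WNOHANG | WUNTRACED)) > 0) {
		if (!WIFSTOPPED(status))
			break;		/* child exited or was killed */
		n++;
	}
	return (r == -1 ? -1 : n);
}

/*
 * Hypothetical demo: stop a forked child, then drain twice. The first
 * drain sees the pending stop report, the second sees none.
 */
static void
demo_drain(int *first, int *second)
{
	siginfo_t info;
	pid_t child;
	int rc;

	child = fork();
	assert(child != -1);
	if (child == 0) {
		for (;;)
			pause();	/* child just sleeps until killed */
	}

	kill(child, SIGSTOP);
	/* Block until the stop is reported, but leave it pending. */
	rc = waitid(P_PID, child, &info, WSTOPPED | WNOWAIT);
	assert(rc == 0);
	(void)rc;

	*first = drain_pending_stops(child);
	*second = drain_pending_stops(child);

	/* Clean up the child. */
	kill(child, SIGKILL);
	waitpid(child, NULL, 0);
}
```

Calling demo_drain(&a, &b) from a test program shows the drain picking up
the one pending stop report on the first call and nothing on the second.
Whether the same pattern picks up additional stops from a ptrace-attached
process is a separate question that this sketch does not answer.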
More information about the freebsd-current
mailing list