<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
 "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<title>N3618: What can signal handlers do? (CWG 1441)</title>
</head>
<body>
<table summary="Identifying information for this document.">
	<tr>
                <th>Doc. No.:</th>
                <td>WG21/N3618</td>
        </tr>
        <tr>
                <th>Date:</th>
                <td>2013-03-17</td>
        </tr>
        <tr>
                <th>Reply to:</th>
                <td>Hans-J. Boehm</td>
        </tr>
        <tr>
                <th>Phone:</th>
                <td>+1-650-857-3406</td>
        </tr>
        <tr>
                <th>Email:</th>
                <td><a href="mailto:Hans.Boehm@hp.com">Hans.Boehm@hp.com</a></td>
        </tr>
</table>
	
<H1>N3618: What can signal handlers do? (CWG 1441)</h1>

<P>
This is an attempt to summarize the current state of discussions
around CWG issue 1441.  Much of this discussion has occurred within
SG1, and it has moved significantly past the original issue,
so it seemed appropriate to turn it into a separate paper.  This
attempts to reflect the contributions of many people, especially
Lawrence Crowl, Jens Maurer, Clark Nelson, and Detlef Vollmann.

<H2>Background</h2>

<P>
<A HREF="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3501.html#1441">CWG Issue 1441</a> points out that in the process of
relaxing the restrictions on asynchronous signal handlers to allow
use of atomics, we inadvertently
made it impossible to use even local variables of non-volatile,
non-atomic type.
As a result of an initial discussion within CWG, Jens Maurer generated a
<A HREF="http://wiki.edg.com/twiki/pub/Wg21kona2012/CoreWorkingGroup/proposed_resolution_core-1441.html">proposed resolution</a>, which addresses
that specific issue.
<P>
Later discussion in SG1, both in
<A HREF="http://wiki.edg.com/twiki/bin/view/Wg21portland2012/SG1Minutes10-18">
Portland</a> and during
<A HREF="http://wiki.edg.com/twiki/bin/view/Wg21bristol/FebPhoneCallMinutes">the February 2013 SG1 teleconference</a>, raised a number of additional
issues.  Both Jens' solution and all prior versions of the standard
still give undefined behavior to some signal-handler code that
we believe should clearly be legal.  For example, a signal handler
should be allowed to access "read-only" data that has not been modified
since the signal handler was installed.  Our goal is to correct such
oversights and to allow some realistic signal handlers to be portable,
while preserving a significant amount of implementation
freedom with respect to what is allowable in a signal handler.  In
particular, we do not want to reinvent POSIX's notion of
async-signal-safe functions here.

<H2>Proposed resolution and discussion</h2>

<P>
We give several proposed changes and, for each, summarize the reasoning
behind it as well as some of the past discussion:

<H3>Replace 1.9p6, the paragraph imposing the restrictions on signal handlers</h3>
<P>
Replace 1.9p6 [intro.execution]:
<BLOCKQUOTE>
	<DEL>When the processing of the abstract machine is interrupted
		by receipt of a signal, the values of objects which are neither</del>
<OL>
	<LI> <DEL>of type <CODE>volatile std::sig_atomic_t</code> nor</del>
	<LI> <DEL>lock-free atomic objects (29.4)</del>
</ol>
<DEL>are unspecified during the execution of the signal handler, and the
	value of any object not in either of these two categories that
	is modified by the handler becomes undefined.</del>
</blockquote>
<P>
with
<BLOCKQUOTE>
	<P>
	<ins>If a signal handler is executed as a result of a call to the
		<code>raise</code> function, then the execution of the
		handler is sequenced after the invocation of the
		<code>raise</code> function and before its return.
		When a signal is received for another reason,
		the execution of the signal handler is
		unsequenced with respect to the rest of the program.</ins>
</BLOCKQUOTE>
<P>
The original restriction would now be expressed elsewhere (see below)
in terms of data races.  This means that signal handlers can now access
variables also accessed in mainline code, so long as the required
happens-before orders are established.
<P>
We concluded during the February discussion that the old "interrupted by
a signal" phrase referred to an asynchronous signal, and was basically OK.
But after reading the C standard I'm not sure, and it makes sense to me
to be more explicit.  This is my latest attempt to do so.
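<P>
As an illustration only (not part of the proposed wording), the
<code>raise</code> case might look like the sketch below.  The names
<code>g_handled</code>, <code>on_sigint</code> and
<code>handled_after_raise</code> are hypothetical, and the flag is
deliberately a <code>volatile std::sig_atomic_t</code> so that the example
is acceptable under both the old and the new wording.

```cpp
#include <cassert>
#include <csignal>

// Hypothetical flag written by the handler.
volatile std::sig_atomic_t g_handled = 0;

// Signal handlers are required to have C linkage.
extern "C" void on_sigint(int) { g_handled = 1; }

// Installs the handler, raises the signal synchronously, and reports what
// the mainline code observes once raise() has returned.
int handled_after_raise() {
    g_handled = 0;
    auto previous = std::signal(SIGINT, on_sigint);
    assert(previous != SIG_ERR);  // installation is assumed to succeed
    (void)previous;
    std::raise(SIGINT);           // the handler runs inside this call
    std::signal(SIGINT, SIG_DFL); // restore the default disposition
    return g_handled;
}
```

<P>
Because the handler is sequenced before the return of <code>raise</code>,
a caller of <code>handled_after_raise()</code> should observe the value 1.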

<H3>Expand the discussion of data races to cover signal handler invocations we want to prohibit</h3>

<P>
Change the normative part of 1.10p21 [intro.multithread] as follows:
<BLOCKQUOTE>
<P>
<ins>Two actions are potentially concurrent if they are performed by
different threads, or at least one is performed by a signal handler and they
are not both performed by the same signal-handler invocation.</ins>
The execution of a program contains a data race if it contains two
<ins>potentially concurrent</ins> conflicting actions
<del>in different threads</del>, at least one of which is not atomic,
and neither happens before the other. Any such data race results in
undefined behavior.
</blockquote>

<P>
<B>Discussion:</b>
<P>
There was some discussion during the February phone call as to whether
we should view signal handlers as being performed by a specific thread at
all, and I think we were moving towards removing that notion.
A signal handler probably cannot portably tell which thread
it's running on.  But after thinking about this more, I don't
know how to reconcile this change with <code>atomic_signal_fence</code>,
so I am once again inclined to leave things more like they are.

<P>
There was some earlier discussion about the difference in treatment
between "data races" and undefined behavior due to unsequenced operations
in the same full expression (1.9p15).  The conclusion appears to be
that the behavior is in fact the same, though for reasons that appear to
apply only to C++, not C.  In more detail:
<P>
Question 1: Do unsequenced atomic operations on the same object cause undefined
behavior?  Is <code> {atomic&lt;int&gt; i; i = i++;}</code>
legal?

<P>
Answer: The expression is legal.  The operations on <code>i</code> are all
function calls, and hence indeterminately sequenced, not unsequenced.
Hence there is no undefined behavior.  The answer for C may be different.
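<P>
A minimal sketch of the Question 1 expression, assuming
<code>std::atomic&lt;int&gt;</code> from <code>&lt;atomic&gt;</code>; the
function name <code>assign_post_increment</code> is hypothetical.  The
statement expands to <code>i.operator=(i.operator++(0))</code>, so the
increment completes before the assignment call begins.

```cpp
#include <atomic>
#include <cassert>

// Evaluates the expression from Question 1 and returns the final value of i.
int assign_post_increment() {
    std::atomic<int> i{0};
    // Two member-function calls: operator++(int) returns the old value 0
    // and stores 1; operator=(0) then overwrites the stored 1 with 0.
    i = i++;
    int result = i.load();
    assert(result == 0);  // the increment is overwritten by the assignment
    return result;
}
```

<P>
The same statement written with a plain <code>int</code> would have two
unsequenced side effects on the same scalar under the C++11 rules.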

<P>
Question 2:
Is <code>{ atomic&lt;int *&gt;p = 0; int i; (i = 17, p = &amp;i, 1) + (p? *p : 0)}
</code>legal?

<P>
Answer: After some false starts, the answer appears to be yes.
The store to <code>p</code> and the initial test of <code>p</code>
are indeterminately sequenced.  If the latter occurs first, the potentially
unsequenced access to <code>*p</code> doesn't occur.  In the other
case, the store to <code>i</code> is sequenced before the store
to <code>p</code>, which is sequenced before the test on <code>p</code>,
which is sequenced before the questionable load from <code>*p</code>.
This again relies heavily on the fact that atomic operations are
function calls in C++.
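<P>
The Question 2 expression can be wrapped in a function as in the following
sketch.  The name <code>question2</code> is hypothetical, and
<code>i</code> is initialized here only to keep the example tidy.  Which of
the two operands of <code>+</code> is evaluated first is unspecified, so
both outcomes described above are accepted.

```cpp
#include <atomic>
#include <cassert>

// Evaluates the expression from Question 2 and returns its value.
int question2() {
    std::atomic<int*> p{nullptr};
    int i = 0;
    // Left operand: store 17 to i, publish &i through p, yield 1.
    // Right operand: test p (a conversion-function call) and read *p
    // only if the pointer has already been published.
    int r = (i = 17, p = &i, 1) + (p ? *p : 0);
    // Either the test saw a null pointer (1 + 0) or it saw &i (1 + 17).
    assert(r == 1 || r == 18);
    return r;
}
```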

<P>
Thus there appears to be no real difference between the treatment of unsequenced
conflicting operations and data races, and we could model races between
a signal handler and mainline code in the same thread using either mechanism.
We could possibly even remove the distinction entirely.  The situation in C
is potentially (and accidentally) different.

<H3>Ensure that signal handler invocation happens after signal handler
	installation</h3>

<P>
Insert in 18.10 [support.runtime] after p7:

<BLOCKQUOTE>
<P>
The function <code>signal</code> defined in <code>&lt;csignal&gt;</code>
shall ensure that a call to <code>signal</code> synchronizes
with any resulting invocation of the newly installed signal handler.
</blockquote>

<P>
<B>Discussion:</b>

<P>
Note that 29.8p6 already talks about synchronizes-with relationships between
a thread and a signal handler in the same thread, so I don't think this is a
very fundamental change in perspective.

<P>
This does have the effect of allowing full atomics to be used in
communicating with a signal handler.  I can now allocate an object,
assign it to an atomic pointer variable, and have a signal handler access
the non-atomic objects through that variable, just as another thread could.
Since signal handlers obey strictly more scheduling constraints than threads,
I think this is entirely expected, and what we had in mind all along.
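<P>
The pattern described above might be sketched as follows.  Everything named
here (<code>Settings</code>, <code>g_settings</code>, <code>g_seen</code>,
<code>on_sigterm</code>, <code>publish_then_signal</code>) is hypothetical,
the atomic pointer is assumed to be lock-free, and the signal is delivered
with <code>raise</code> only to keep the example deterministic.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

struct Settings { int verbosity; };           // hypothetical non-atomic payload

std::atomic<Settings*> g_settings{nullptr};   // pointer published to the handler
volatile std::sig_atomic_t g_seen = -1;       // what the handler observed

extern "C" void on_sigterm(int) {
    // Non-member atomic load; the non-atomic field is read through it.
    Settings* s = std::atomic_load(&g_settings);
    if (s != nullptr) g_seen = s->verbosity;
}

// Allocates an object, publishes it through the atomic pointer, lets the
// handler read it, and returns the value the handler saw.
int publish_then_signal(int verbosity) {
    assert(std::atomic_is_lock_free(&g_settings));  // assumption of this sketch
    Settings* s = new Settings{verbosity};
    std::atomic_store(&g_settings, s);               // publish
    std::signal(SIGTERM, on_sigterm);
    std::raise(SIGTERM);                             // handler runs here
    std::signal(SIGTERM, SIG_DFL);
    std::atomic_store(&g_settings, static_cast<Settings*>(nullptr));
    delete s;
    return g_seen;
}
```

<P>
The store to <code>s-&gt;verbosity</code> is sequenced before the atomic
store, which is what lets the handler read the field without a data race.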

<H3>Clarify which C-like functions can be used in a signal
	handler</h3>

<P>
Change 18.10 [support.runtime] p9 as follows:

<BLOCKQUOTE>
<P>
The common subset of the C and C++ languages consists of all declarations,
definitions, and expressions that may appear in a well formed C++ program
and also in a conforming C program.
<ins>A plain lock-free atomic operation is an invocation of a function
<i>f</i>
from clause 29,
such that <i>f</i> is not a member function, and either <i>f</i> is
the function <code>atomic_is_lock_free</code>, or
for any atomic argument <code>a</code> passed to <i>f</i>,
<code>atomic_is_lock_free(a)</code> yields true.</ins>
A POF ("plain
old function") is a function that uses only features from this common
subset, and that does not directly or indirectly use any function that is
not a POF, except that it may use <del>functions defined in Clause 29
that are not member functions
</del><ins>plain lock-free atomic operations</ins>.
All signal handlers shall have C linkage.
<del>A POF that could be used as
a signal handler in a conforming C program does not produce undefined
behavior when used as a signal handler in a C++ program.</del>
The behavior of any <del>other</del> function <ins>other than a POF</ins>
used as a signal handler in a C++ program
is implementation-defined.
</blockquote>
<P>
<B>Discussion:</b>
<P>
Some of the phone call discussion seems to have overlooked the existing
clause 29 exemption which, for example, makes calls to
<code>is_atomic</code> legal.
<P>
That exemption was too broad, since it allowed non-lock-free calls.
Calls that acquire locks need to be prohibited in signal handlers,
since they typically deadlock if the mainline thread already holds the
lock.
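<P>
A handler restricted to such operations might look like the sketch below.
The names <code>g_signal_count</code>, <code>count_signal</code> and
<code>count_raised_signals</code> are hypothetical.  The handler uses only
a non-member function on an object that is checked to be lock-free before
the handler is installed, so it never needs to acquire a lock.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

std::atomic<int> g_signal_count{0};   // hypothetical counter

extern "C" void count_signal(int) {
    // Non-member function applied to a lock-free atomic object.
    std::atomic_fetch_add(&g_signal_count, 1);
}

// Raises SIGINT n times and returns how many times the handler ran,
// or -1 if the counter is not lock-free on this implementation.
int count_raised_signals(int n) {
    assert(n >= 0);
    if (!std::atomic_is_lock_free(&g_signal_count)) return -1;
    std::atomic_store(&g_signal_count, 0);
    for (int k = 0; k < n; ++k) {
        // Reinstall each time: whether the disposition is reset when the
        // handler is invoked is implementation-defined.
        std::signal(SIGINT, count_signal);
        std::raise(SIGINT);
    }
    std::signal(SIGINT, SIG_DFL);
    return std::atomic_load(&g_signal_count);
}
```

<P>
If the lock-free check fails, the sketch refuses to install the handler
rather than risk the deadlock described above.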
<P>
I don't understand the meaning of a normative sentence that says "X does not
have undefined behavior".  We otherwise define its meaning, so why would it
possibly have undefined behavior without this sentence?  Hence I'm
proposing to rephrase.
<P>
(This paragraph removes the need for a 1.10p5 change I previously proposed.)

</body>
</html>
