<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
 "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<title>N3787: What can signal handlers do? (CWG 1441)</title>
</head>
<body>
<table summary="Identifying information for this document.">
	<tr>
                <th>Doc. No.:</th>
                <td>WG21/N3787</td>
        </tr>
	<tr>
		<th>Revision of:</th>
		<td>WG21/N3633</td>
	</tr>
        <tr>
                <th>Date:</th>
                <td>2013-10-14</td>
        </tr>
        <tr>
                <th>Reply to:</th>
                <td>Hans-J. Boehm</td>
        </tr>
        <tr>
                <th>Phone:</th>
                <td>+1-650-857-3406</td>
        </tr>
        <tr>
                <th>Email:</th>
                <td><a href="mailto:Hans.Boehm@hp.com">Hans.Boehm@hp.com</a></td>
        </tr>
</table>
	
<H1>N3787: What can signal handlers do? (CWG 1441)</h1>

<P>
This is an attempt to summarize the current state of discussions
around CWG issue 1441.  Much of this discussion has occurred within
SG1, and it has moved significantly past the original issue,
so it seemed appropriate to turn it into a separate paper.  This
attempts to reflect the contributions of many people, especially
Lawrence Crowl, Jens Maurer, Clark Nelson, and Detlef Vollmann.
<P>
This version is a minor revision of N3633.  It reflects some
changes due to SG1 concerns in Chicago, and reflects the
easiest-to-address comments from the CWG discussion in Chicago.
It does not reflect all of the latter.
More discussion, and almost certainly another paper revision, is required.

<H2>Background</h2>

<P>
<A HREF="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3501.html#1441">CWG Issue 1441</a> points out that in the process of
relaxing the restrictions on asynchronous signal handlers to allow
use of atomics, we inadvertently
made it impossible to use even local variables of non-volatile,
non-atomic type.
As a result of an initial discussion within CWG, Jens Maurer generated a
<A HREF="http://wiki.edg.com/twiki/pub/Wg21kona2012/CoreWorkingGroup/proposed_resolution_core-1441.html">proposed resolution</a>, which addresses
that specific issue.
<P>
Pre-Bristol discussion in SG1, both in
<A HREF="http://wiki.edg.com/twiki/bin/view/Wg21portland2012/SG1Minutes10-18">
Portland</a> and during
<A HREF="http://wiki.edg.com/twiki/bin/view/Wg21bristol/FebPhoneCallMinutes">the February 2013 SG1 teleconference</a>, raised a number of additional
issues.  Both Jens' solution and all prior versions of the standard
still give undefined behavior to code involving signal handlers which
we believe should clearly be legal.  Our goal was to correct such
oversights, and allow some realistic signal handlers to be portable,
while preserving a significant amount of implementation
freedom with respect to what is allowable in a signal handler.  In
particular, we do not want to reinvent the POSIX notion of
async-signal-safe functions here.
<P>
This issue was revisited as part of the concurrency group (SG1)
<A HREF="http://wiki.edg.com/twiki/bin/view/Wg21bristol/SG1">
meetings in Bristol</a>.
After some initial debate,
that discussion concluded with several straw polls reaffirming that we
want lock-free atomics to be usable in signal handlers, that signal
handlers should be  able to read data written before the handler installation,
and that accesses to ordinary variables in signal handlers should be OK,
so long as there are happens-before relationships separating such code
from mainline code accesses.  Some of us remember this as the original
intent behind the C++11 changes, but recollections vary.
<P>
In spite of reaffirming the intent of this paper,
the changes below were not brought
forward as a change to the C++14 working paper, since there was a feeling we
should take more time to consider alternate approaches to the wording,
and that input from WG14 would be desirable.
<P>
My current hope is that something along the lines of this proposal,
but probably not precisely these words, will find its way into the draft.

<H2>Proposed resolution and discussion</h2>

<P>
We give several proposed changes and summarize the reasoning behind the
change as well as some of the past discussion:

<H3>Replace 1.9p6, the paragraph imposing the restrictions on signal handlers</h3>
<P>
Replace 1.9p6 [intro.execution]:
<BLOCKQUOTE>
	<DEL>When the processing of the abstract machine is interrupted
		by receipt of a signal, the values of objects which are neither</del>
<OL>
	<LI> <DEL>of type <CODE>volatile std::sig_atomic_t</code> nor</del>
	<LI> <DEL>lock-free atomic objects (29.4)</del>
</ol>
<DEL>are unspecified during the execution of the signal handler, and the
	value of any object not in either of these two categories that
	is modified by the handler becomes undefined.</del>
</blockquote>
<P>
with
<BLOCKQUOTE>
	<P>
	<ins>If a signal handler is executed as a result of a call to the
		<code>raise</code> function, then the execution of the
		handler is sequenced after the invocation of the
		<code>raise</code> function and before its return.
		When a signal is received for another reason,
		the execution of the signal handler is
		unsequenced with respect to the rest of the program.</ins>
</BLOCKQUOTE>
<P>
The original restriction would now be expressed elsewhere (see below)
in terms of data races.  This means that signal handlers can now access
variables also accessed in mainline code, so long as the required
happens-before orders are established.
<P>
We concluded during the February discussion that the old "interrupted by
a signal" phrase referred to an asynchronous signal, and was basically OK.
But after reading the C standard I'm not sure, and it makes sense to me
to be more explicit.  This is my latest attempt to do so.
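<P>
As a minimal sketch of the <code>raise</code> case, assuming the proposed
wording is adopted: the handler below touches a plain <code>int</code>,
which is fine only because the handler runs sequenced inside the call to
<code>raise</code>.  The names <code>handled_count</code>,
<code>on_sigterm</code>, and <code>raise_and_count</code> are hypothetical
and not part of the proposal.

```cpp
#include <cassert>
#include <csignal>

// Hypothetical example state: a plain, non-volatile, non-atomic int.
int handled_count = 0;

// Handlers need C linkage; this one only uses the common C/C++ subset.
extern "C" void on_sigterm(int) { ++handled_count; }

// Returns the counter value observed after a synchronous raise(),
// or -1 if installing or raising the signal failed.
int raise_and_count() {
    if (std::signal(SIGTERM, on_sigterm) == SIG_ERR) return -1;
    handled_count = 41;                       // sequenced before raise()
    if (std::raise(SIGTERM) != 0) return -1;  // handler runs inside this call
    return handled_count;                     // sequenced after the handler
}
```

<P>
If the same handler were instead triggered asynchronously, its increment
would be unsequenced with respect to the mainline accesses, and the data
race rules discussed below would apply.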

<H3>Weaken the restriction on unsequenced operations</h3>

<P>
Change one sentence in 1.9p15 [intro.execution]

<blockquote>
<P>If a side effect on a scalar
object is unsequenced relative to either another side effect on the same scalar object or a value computation
using the value of the same scalar object, <ins>and they are not potentially
concurrent (1.10, intro.multithread),</ins> the behavior is undefined.
<ins>[<i>Note:</i> The next section imposes similar, but more complex
restrictions on potentially concurrent computations. <i>-- end note</i>]</ins>
</blockquote>

<P>
<B>Discussion:</b>
<P>
This is a delicate area.  Asynchronous signal handlers are unsequenced.
If we said nothing else, even atomic operations in mainline code and the
signal handler might introduce undefined behavior.  We don't want
that.

<P>
This is not an issue for regular unsequenced expressions.  Consider
the question of whether the following is legal:
<blockquote>
<P>
<code>
	{
	<br>
	&nbsp;&nbsp;atomic&lt;int *&gt; p(0);
	<br>
	&nbsp;&nbsp;int i;
	<br>
	&nbsp;&nbsp;(i = 17, p = &amp;i, 1) + (p? *p : 0);
	<br>
	}
</code>
</blockquote>

<P>
After some false starts, we concluded in the February phone call that
the answer is yes, for reasons having more to do with function calls in
expressions than atomics.
The store to <code>p</code> and the initial test of <code>p</code>
are indeterminately sequenced.  If the latter occurs first, the potentially
unsequenced access to <code>*p</code> doesn't occur.  In the other
case, the store to <code>i</code> is sequenced before the store
to <code>p</code>, which is sequenced before the test on <code>p</code>,
which is sequenced before the questionable load from <code>*p</code>.
This again relies heavily on the fact that atomic operations are
function calls in C++.  The situation in C is unfortunately different.

<P>
In spite of earlier contradictory conclusions, there are, however, strong
reasons to treat unsequenced expressions differently from data races in
signal handlers.  These have to do with weaker memory orders.
Consider the following example:

<P>
<B>Mainline code:</b>
<code>x = 1; y.store(1, memory_order_mo1);</code>
<P>
<B>Signal handler:</b>
<code>if (y.load(memory_order_mo2)) tmp = x;</code>
<P>
Whether this results in undefined behavior should depend on <code>mo1</code>
and <code>mo2</code>.  I don't think this is expressible without relying on
happens-before.
<P>
Fortunately, I think this doesn't apply within expressions:
<blockquote>
	<P>
	<code>(x = 1, y.store(1, memory_order_relaxed), 0) + (y.load(memory_order_relaxed)? x : 1)</code>
</blockquote>
<P>
With all variables initially zero, as usual, this expression
must evaluate to 1.  A compiler that violates this by reordering the two
initial stores and performing the <code>y.load()</code> between them is broken.
(At least so we claim, with only mild uncertainty.)
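<P>
For concreteness, here is a sketch that wraps the two expression examples
from this discussion in functions.  The function names are hypothetical,
and the stated results rely on the reasoning above rather than on anything
the current standard spells out.

```cpp
#include <atomic>
#include <cassert>

// First example: the store to p and the test of p are indeterminately
// sequenced, so the result is 1 (test ran first) or 18 (store ran first).
int guarded_unsequenced_read() {
    std::atomic<int*> p(nullptr);
    int i = 0;
    return (i = 17, p = &i, 1) + (p ? *p : 0);
}

// Second example: even with relaxed ordering, the load of y either
// precedes both stores (right operand is 1) or follows both (right
// operand is x, which is 1), so the sum is claimed to be 1 either way.
int relaxed_within_expression() {
    int x = 0;
    std::atomic<int> y(0);
    return (x = 1, y.store(1, std::memory_order_relaxed), 0)
         + (y.load(std::memory_order_relaxed) ? x : 1);
}
```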

<P>
Thus the restriction on unsequenced operations should apply only to code
that may not run concurrently.  For code that may run concurrently
(threads and signal handlers) we need the happens-before-based notion of
data races that reflects memory_order specifications.
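<P>
The mainline/handler example above can be sketched as follows, taking
mo1 to be <code>memory_order_release</code> and mo2 to be
<code>memory_order_acquire</code>.  This is an illustration under stated
assumptions, not proposed wording: <code>publish_then_signal</code> is a
hypothetical driver, <code>tmp</code> is made a
<code>volatile sig_atomic_t</code> so that mainline code may read it back,
and <code>raise</code> is used only to make the sketch deterministic.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Assumption of this sketch: atomic<int> is always lock-free here.
static_assert(ATOMIC_INT_LOCK_FREE == 2, "sketch assumes lock-free atomic<int>");

int x = 0;                            // ordinary data written by mainline code
std::atomic<int> y(0);                // flag publishing x
volatile std::sig_atomic_t tmp = -1;  // handler's copy of x

// Uses only non-member atomic functions, as a POF is required to.
extern "C" void read_x_if_published(int) {
    if (std::atomic_load_explicit(&y, std::memory_order_acquire))  // mo2
        tmp = x;  // safe: the release store to y happens before this read
}

// Hypothetical driver; returns the value the handler copied, or -1 on failure.
int publish_then_signal() {
    if (std::signal(SIGTERM, read_x_if_published) == SIG_ERR) return -1;
    x = 1;
    std::atomic_store_explicit(&y, 1, std::memory_order_release);  // mo1
    if (std::raise(SIGTERM) != 0) return -1;
    return tmp;
}
```

<P>
With an asynchronous signal and both orders weakened to
<code>memory_order_relaxed</code>, nothing would order <code>x = 1</code>
before <code>tmp = x</code>, and the two accesses would constitute a data
race under the happens-before-based rule.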

<H3>Expand the discussion of data races to cover signal handler invocations we want to prohibit</h3>

<P>
Change the normative part of 1.10p21 [intro.multithread] as follows: 
<BLOCKQUOTE>
<P>
<ins>Two actions are <i>potentially concurrent</i> if they are performed by
different threads, or if at least one is performed by a signal handler,
they are unsequenced, and they
are not both performed by the same signal handler invocation.</ins>
The execution of a program contains a data race if it contains two
<ins>potentially concurrent</ins> conflicting actions
<del>in different threads</del>, at least one of which is not atomic,
and neither happens before the other.
Any such data race results in undefined behavior.
<ins>As a special exception
to the preceding rule, two accesses to the same
object of type <code>volatile sig_atomic_t</code> cannot result in
a data race if both occur in the same thread t (with one occurring
in a signal handler).  The evaluations
of such <code>volatile sig_atomic_t</code> objects take values as
though the execution of the signal handler had been sequenced at
one particular point, consistent with the actual happens-before
ordering, during the execution of t.</ins>
</blockquote>

<P>
<B>Discussion:</b>
<P>
By the above reasoning, we need to give signal handlers the same
data-race-based treatment
as threads.  Memory_order specifications
must be respected in determining whether there is undefined behavior.
<P>
There was some discussion during the February phone call as to whether
we should view signal handlers as being performed by a specific thread at
all, and I think we were moving towards removing that notion.
A signal handler probably cannot portably tell which thread
it's running on.  But after thinking about this more, I don't
know how to reconcile this change with <code>atomic_signal_fence</code>,
so I am once again inclined to leave things more like they are.

<P>
These changes should now have the effect of allowing full atomics to be used in
communicating with a signal handler.  I can now allocate an object,
assign it to an atomic pointer variable, and have a signal handler access
the non-atomic objects through that variable, just as another thread could.
Since signal handlers obey strictly more scheduling constraints than threads,
I think this is entirely expected, and what we had in mind all along.
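<P>
A rough sketch of that pattern, assuming the proposed rules and an
implementation on which atomic pointers are lock-free.  The type
<code>Message</code> and the names <code>mailbox</code>,
<code>read_mailbox</code>, and <code>publish_and_signal</code> are
hypothetical.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

// Assumption of this sketch: atomic pointers are always lock-free here.
static_assert(ATOMIC_POINTER_LOCK_FREE == 2, "sketch assumes lock-free pointers");

struct Message { int code; };            // ordinary, non-atomic object
std::atomic<Message*> mailbox(nullptr);  // atomic pointer used for publication
volatile std::sig_atomic_t last_code = 0;

extern "C" void read_mailbox(int) {
    Message* m = std::atomic_load_explicit(&mailbox, std::memory_order_acquire);
    if (m) last_code = m->code;  // non-atomic read through the atomic pointer
}

// Hypothetical driver; returns the code seen by the handler, or -1 on failure.
int publish_and_signal(int code) {
    if (std::signal(SIGTERM, read_mailbox) == SIG_ERR) return -1;
    // Deliberately never freed: a later handler invocation may still read it.
    Message* m = new Message{code};
    std::atomic_store_explicit(&mailbox, m, std::memory_order_release);
    if (std::raise(SIGTERM) != 0) return -1;
    return last_code;
}
```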


<H3>Ensure that signal handler invocation happens after signal handler
	installation</h3>

<P>
Insert in 18.10 [support.runtime] after p7:

<BLOCKQUOTE>
<P>
<ins>The function <code>signal</code> defined in <code>&lt;csignal&gt;</code>
shall ensure that a call to <code>signal</code> synchronizes
with any resulting invocation of the newly installed signal handler.</ins>
</blockquote>

<P>
<B>Discussion:</b>

<P>
This is necessary to allow signal handlers to access data that is
read-only after installation of the handler.  I expect this happens
all the time already.
<P>
Note that 29.8p6 already talks about synchronizes-with relationships between
a thread and a signal handler in the same thread, so I don't think this is a
very fundamental change in perspective.
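<P>
The kind of code this is meant to bless looks roughly like the sketch
below: plain data is written before the call to <code>signal</code> and
only read afterwards.  The names <code>exit_codes</code>,
<code>pick_code</code>, and <code>install_then_raise</code> are
hypothetical, and <code>raise</code> merely stands in for a real signal
source.

```cpp
#include <cassert>
#include <csignal>

int exit_codes[3];                      // plain data, read-only after installation
volatile std::sig_atomic_t chosen = 0;  // result reported back by the handler

extern "C" void pick_code(int) { chosen = exit_codes[1]; }

// Returns the table entry the handler read, or -1 on failure.
int install_then_raise() {
    // All writes to the table precede the call to signal().
    exit_codes[0] = 10;
    exit_codes[1] = 20;
    exit_codes[2] = 30;
    if (std::signal(SIGTERM, pick_code) == SIG_ERR) return -1;
    // From here on the table is never written again, so under the proposed
    // wording any invocation of the handler may read it without a data race.
    if (std::raise(SIGTERM) != 0) return -1;
    return chosen;
}
```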


<H3>Clarify which C-like functions can be used in a signal
	handler</h3>

<P>
Change 18.10 [support.runtime] p9 as follows:

<BLOCKQUOTE>
<P>
The common subset of the C and C++ languages consists of all declarations,
definitions, and expressions that may appear in a well formed C++ program
and also in a conforming C program.
<ins>A <i>plain lock-free atomic operation</i> is an invocation of a function
<i>f</i>
from clause 29,
such that <i>f</i> is not a member function, and either <i>f</i> is
the function <code>is_lock_free</code>, or
for any atomic argument <code>a</code> passed to <i>f</i>,
<code>is_lock_free(a)</code> yields true.</ins>
A POF ("plain
old function") is a function that uses only features from this common
subset, and that does not directly or indirectly use any function that is
not a POF, except that it may use <del>functions defined in Clause 29
that are not member functions
</del><ins>plain lock-free atomic operations</ins>.
All signal handlers shall have C linkage.
<del>A POF that could be used as
a signal handler in a conforming C program does not produce undefined
behavior when used as a signal handler in a C++ program.</del>
The behavior of any <del>other</del> function <ins>other than a POF</ins>
used as a signal handler in a C++ program
is implementation-defined.
</blockquote>
<P>
<B>Discussion:</b>
<P>
Since we currently refer to C99 as the base document and C99 does not
support <code>thread_local</code>, this somewhat accidentally prohibits
use of <code>thread_local</code> in signal handlers.  Discussion in Bristol
suggests this may be a good thing, since <code>thread_local</code> might
be implemented with, e.g., a locked hash table, which would result in
deadlocks if access from signal handlers were allowed.
<P>
Some of the earlier phone call discussion seems to have overlooked the existing
clause 29 exemption which, for example, makes calls to
<code>atomic_is_lock_free</code> legal.
<P>
That exemption was too broad, since it allowed non-lock-free calls.
All calls that acquire locks need to be prohibited in signal handlers,
since they typically deadlock if the mainline thread already holds the
lock.
<P>
I don't understand the meaning of a normative sentence that says "X does not
have undefined behavior".  We otherwise define its meaning, so why would it
possibly have undefined behavior without this sentence?  Hence I'm
proposing to rephrase.
<P>
(This paragraph removes the need for a 1.10p5 change I previously proposed.)
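<P>
To illustrate what a handler restricted to plain lock-free atomic
operations might look like, here is a small sketch.  It assumes the
non-member function <code>std::atomic_is_lock_free</code> is the check
the wording has in mind; <code>tick_count</code>, <code>count_tick</code>,
and <code>install_tick_counter</code> are hypothetical names.

```cpp
#include <atomic>
#include <cassert>
#include <csignal>

std::atomic<unsigned> tick_count(0);

// The handler calls only a non-member clause 29 function, and only on an
// object that was checked to be lock-free before installation.
extern "C" void count_tick(int) {
    std::atomic_fetch_add_explicit(&tick_count, 1u, std::memory_order_relaxed);
}

// Hypothetical helper: refuses to install the handler if the counter is
// not lock-free, since the handler's atomic operation could then block.
bool install_tick_counter(int signo) {
    if (!std::atomic_is_lock_free(&tick_count)) return false;
    return std::signal(signo, count_tick) != SIG_ERR;
}
```

<P>
A caller would invoke <code>install_tick_counter(SIGTERM)</code> once at
startup and read the counter with <code>std::atomic_load</code>.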

</body>
</html>
