﻿<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
 "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<title>N3786: Prohibiting "out of thin air" results in C++14</title>
</head>
<body>
<table summary="Identifying information for this document.">
	<tr>
                <th>Doc. No.:</th>
                <td>WG21/N3786</td>
        </tr>
        <tr>
                <th>Date:</th>
                <td>2013-9-24</td>
        </tr>
        <tr>
                <th>Reply to:</th>
                <td>Hans-J. Boehm</td>
	</tr>
	<tr>
		<th>Related to:</th>
		<td>N3710, LWG 2265</td>
        </tr>
        <tr>
                <th>Phone:</th>
                <td>+1-650-857-3406</td>
        </tr>
        <tr>
                <th>Email:</th>
                <td><a href="mailto:Hans.Boehm@hp.com">Hans.Boehm@hp.com</a></td>
        </tr>
</table>
<h1>N3786: Prohibiting "out of thin air" results in C++14</h1>
<p>
We argued in N3710 that the current attempt to prohibit out-of-thin-air
results in 29.3p9 [atomics.order] is both insufficient, in that it leaves
it largely impossible to reason about programs with
<code>memory_order_relaxed</code>, and seriously
harmful, in that it arguably disallows all reasonable implementations
of <code>memory_order_relaxed</code> on architectures like ARM and
POWER.  This proposal attempts
to address the latter "seriously harmful" part for C++14, without
addressing the "insufficient" part, which would still need to be addressed
in C++17, possibly along the lines of N3710.
<p>
This proposal thus eliminates some complicated standardese
that we actually want vendors like
ARM and IBM to ignore.
<p>
At this time, i.e. in time for C++14, there is no consensus in SG1
for a more extensive proposal,
such as that in N3710.

<h2>Proposed change to 29.3</h2>
<p>
This removes the current badly formulated requirement, and instead
promotes the existing note discouraging "out-of-thin-air" results
to normative encouragement.
<p>
The core problem here is that we do not
know (after 10 years of failed attempts, originally in a Java context)
how to precisely state this requirement without strengthening it so
that it impacts code performance.  (See N3710 for details.)
Thus the normative encouragement remains of little use to anyone trying to formulate a precise
correctness argument.  It is roughly analogous to our current statements
about progress guarantees, which are also phrased as normative
encouragement in large part because they would be extremely difficult to
make precise while remaining implementable.
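<p>
As a rough illustration of the difficulty (a sketch assembled from the
examples in 29.3, not code taken from N3710), consider the program below.
It differs from the circular "copy each other's value" case only in that
Thread 2 stores the constant 42 rather than the value it loaded.  The existing
note in 29.3 allows <code>r1</code> == <code>r2</code> == 42 here, and a
restriction on out-of-thin-air results has to keep allowing it.  The names
<code>Outcome</code> and <code>run_independent_store</code> are hypothetical
and appear only in this sketch.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical result type: the values loaded by the two threads.
struct Outcome {
    int r1;
    int r2;
};

// Runs the "independent store" variant once, with x and y initially zero.
Outcome run_independent_store() {
    std::atomic<int> x{0}, y{0};
    int r1 = 0, r2 = 0;

    std::thread t1([&] {
        r1 = y.load(std::memory_order_relaxed);
        x.store(r1, std::memory_order_relaxed);
    });
    std::thread t2([&] {
        r2 = x.load(std::memory_order_relaxed);
        // The stored value does not depend on r2, so the implementation
        // may make this store visible before the load above completes.
        y.store(42, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();

    // x is only ever written with the value of r1, so r2 can be 42
    // only if r1 was 42 as well.
    assert(r2 != 42 || r1 == 42);
    return Outcome{r1, r2};
}
```

<p>
A given machine may never actually produce <code>r1</code> ==
<code>r2</code> == 42 for this program.  The point is only that the result
is permitted, whereas the same result in the circular case is not supposed
to occur.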

<h2>Proposed wording</h2>
<p>
Change 29.3p9-11 [atomics.order] as follows:
<blockquote>
<p>
<del>An atomic store shall only store a value that has been computed from
constants and program input values by a finite sequence of program
evaluations, such that each evaluation observes the values of variables as
computed by the last prior assignment in the sequence. The ordering of
evaluations in this sequence shall be such that:</del>
<ul>
<li> <del>if an evaluation <i>B</i> observes a value computed by
<i>A</i> in a different thread, then <i>B</i> does not happen
before <i>A</i>, and</del>
<li> <del>if an evaluation A is included in the sequence, then every
evaluation that assigns to the same variable and happens before A is included.
</del>
</ul>
<p>
<ins>
Implementations should ensure that no "out-of-thin-air" values
are computed that circularly depend on their own computation.
</ins>
<p>
<del>[ Note: The second requirement disallows
"out-of-thin-air" or "speculative"
stores of atomics when relaxed atomics are used.
Since unordered operations
are involved, evaluations may appear in this sequence out of
thread order.  For example, with x and y initially zero,</del>
<pre>
<del>
// Thread 1:
r1 = y.load(memory_order_relaxed);
x.store(r1, memory_order_relaxed);
// Thread 2:
r2 = x.load(memory_order_relaxed);
y.store(42, memory_order_relaxed);
</del>
</pre>
<p><del>
is allowed to produce <code>r1</code> = <code>r2</code> = 42.
The sequence of evaluations justifying this consists of:</del>
<pre>
<del>
y.store(42, memory_order_relaxed);
r1 = y.load(memory_order_relaxed);
x.store(r1, memory_order_relaxed);
r2 = x.load(memory_order_relaxed);
</del>
</pre>
<p>
<del>On the other hand,</del>
<ins>[<i>Note:</i>  For example, with <code>x</code> and <code>y</code>
initially zero,</ins>
<pre>
// Thread 1:
r1 = y.load(memory_order_relaxed);
x.store(r1, memory_order_relaxed);
// Thread 2:
r2 = x.load(memory_order_relaxed);
y.store(r2, memory_order_relaxed);
</pre>
<p>
<del>may</del> <ins>should</ins>
not produce <code>r1</code> =<ins>=</ins> <code>r2</code> =<ins>=</ins> 42,
since <ins>the store of 42 to <code>y</code> is only possible
if the store to <code>x</code> stores 42, which circularly depends on
the store to <code>y</code> storing 42.  Note that without
this restriction, such an execution is possible.</ins>
<del>there is no sequence of evaluations
that results in the computation of 42. In the absence of "relaxed"
operations and read-modify-write operations with weaker than
<code>memory_order_acq_rel</code> ordering,
the second requirement has no impact.</del> -end note ]
<p>
[ Note: <del>The requirements do allow</del>
<ins>The recommendation similarly disallows</ins>
<code>r1</code> == <code>r2</code> == 42 in the following example,
with <code>x</code> and <code>y</code> <ins>again</ins> initially zero:

<pre>
// Thread 1:
r1 = x.load(memory_order_relaxed);
if (r1 == 42) y.store(<del>r1</del><ins>42</ins>, memory_order_relaxed);
// Thread 2:
r2 = y.load(memory_order_relaxed);
if (r2 == 42) x.store(42, memory_order_relaxed);
</pre>
<p>
<del>However, implementations should not allow such behavior.</del> -end note ]
</blockquote>
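<p>
For experimentation, the two examples in the proposed notes can be wrapped
in complete functions along the following lines.  This is only a sketch:
<code>Outcome</code>, <code>run_copy_example</code>,
<code>run_conditional_example</code> and <code>check_no_thin_air</code> are
hypothetical names, and the use of <code>std::thread</code> to start the two
threads is an assumption that the wording itself does not prescribe.  On an
implementation that follows the recommendation, both functions yield
<code>r1</code> == <code>r2</code> == 0, since neither program ever computes
42 without first reading it.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Hypothetical result type: the values loaded by the two threads.
struct Outcome {
    int r1;
    int r2;
};

// First example: each thread copies the other thread's variable.
Outcome run_copy_example() {
    std::atomic<int> x{0}, y{0};
    int r1 = 0, r2 = 0;
    std::thread t1([&] {
        r1 = y.load(std::memory_order_relaxed);
        x.store(r1, std::memory_order_relaxed);
    });
    std::thread t2([&] {
        r2 = x.load(std::memory_order_relaxed);
        y.store(r2, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return Outcome{r1, r2};
}

// Second example: each thread stores 42 only if it has already read 42.
Outcome run_conditional_example() {
    std::atomic<int> x{0}, y{0};
    int r1 = 0, r2 = 0;
    std::thread t1([&] {
        r1 = x.load(std::memory_order_relaxed);
        if (r1 == 42) y.store(42, std::memory_order_relaxed);
    });
    std::thread t2([&] {
        r2 = y.load(std::memory_order_relaxed);
        if (r2 == 42) x.store(42, std::memory_order_relaxed);
    });
    t1.join();
    t2.join();
    return Outcome{r1, r2};
}

// Aborts if a run observed a value that could only have come out of thin air.
void check_no_thin_air(const Outcome& o) {
    assert(o.r1 == 0 && o.r2 == 0);
}
```

<p>
Passing each result to <code>check_no_thin_air</code> confirms only that a
particular run behaved as recommended.  No amount of testing can show that
an implementation never produces an out-of-thin-air result.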

</body>
</html>

