<?xml version="1.0" encoding="us-ascii"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.1//EN"
	"http://www.w3.org/TR/xhtml11/DTD/xhtml11.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en-us">

<head>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii" />
<title>WG21/N2171: Sequencing and the concurrency memory model (revised)</title>
<style type="text/css">
.deleted {
	text-decoration: line-through;
}
.inserted {
	text-decoration: underline;
}
</style>
</head>

<body>

<table summary="This table provides identifying information for this document.">
	<tr>
		<th>Doc. No.:</th>
		<td>WG21/N2171<br />
		J16/07-0031</td>
	</tr>
	<tr>
		<th>Date:</th>
		<td>2007-03-12</td>
	</tr>
	<tr>
		<th>Reply to:</th>
		<td>Clark Nelson</td>
		<td>Hans-J. Boehm</td>
	</tr>
	<tr>
		<th>Phone:</th>
		<td>+1-503-712-8433</td>
		<td>+1-650-857-3406</td>
	</tr>
	<tr>
		<th>Email:</th>
		<td><a href="mailto:clark.nelson@intel.com">clark.nelson@intel.com</a></td>
		<td><a href="mailto:Hans.Boehm@hp.com">Hans.Boehm@hp.com</a></td>
	</tr>
</table>
<h1>Sequencing and the concurrency memory model (revised)</h1>
<p>This paper is a revision of
<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n2052.htm">N2052</a>. 
Significant changes relative to that paper are detailed below.</p>
<p>This paper is also a successor to, but not a revision of,
<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1944.htm">N1944</a>. 
N1944 was basically an exploratory paper, despite the amount of nearly-WD-ready 
text proposed; its style of presentation was very heavy on explanation and motivation. 
Consequently, it may still be useful as a tutorial introduction and/or rationale 
for this paper.</p>
<p>But based on the amount of positive feedback received, the exploratory phase 
could hopefully be considered complete. Furthermore, some of the feedback received 
would have been difficult to address in a document organized as N1944 was. It now 
seems highly desirable to have a cohesive presentation of the changed WD text, emphasizing 
the result rather than the process. This paper also presents work on aspects of 
sequencing explicitly related to concurrency, addressing other feedback on N1944.</p>
<p>This paper should also be viewed as a successor to
<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2006/n1942.html">N1942</a>, 
the memory model proposal. Again, much of the explanatory material from N1942 is 
not repeated here. In an attempt to simplify, some of the terminology has changed 
from N1942.</p>
<h2>Contents</h2>
<ul>
	<li><a href="#n2052">Significant changes since N2052</a></li>
	<li><a href="#n1944">Significant changes in the proposed wording since N1944</a></li>
	<li><a href="#rearranging">Rearranging the text of &quot;Program execution&quot;</a></li>
	<li><a href="#execution">The text proposed for &quot;Program execution&quot;</a></li>
	<li><a href="#location">The definition of &quot;memory location&quot;</a></li>
	<li><a href="#races">Multi-threaded executions and data races</a></li>
	<li><a href="#operators">Sequencing for specific operators</a></li>
	<li><a href="#temporaries">Sequencing for destruction of temporaries</a></li>
	<li><a href="#miscellaneous">Fixes for miscellaneous sequencing issues</a></li>
	<li><a href="#loops">Semantics of some non-terminating loops</a></li>
</ul>
<h2><a id="n2052">Significant changes since N2052</a></h2>
<p>Some editorial notes were added, pointing out significant implications and observations which 
may interest the reader.</p>
<p>A statement on the sequencing of value computations has been added to
<a href="#s1.9p16">1.9p16</a>.</p>
<p>Transitivity was added to the definition of the &quot;precedes&quot; relation in
<a href="#s1.10p9">1.10p9</a>.</p>
<p>The section on library thread-safety has been deleted, in anticipation of a more 
comprehensive paper by a more qualified author.</p>
<h3>Sequencing of compound assignment and post-increment</h3>
<p>The following troublesome example was deleted from the proposed wording for
<a href="#s1.9p17">1.9p17</a>:</p>
<blockquote>
	<pre>int increment_x() { return x++; }
<!--      -->x++ + increment_x();                <em>// Evaluation order unspecified; x may be incremented only once</em>
<!--      -->increment_x() + increment_x();      <em>// </em>x<em> is incremented twice</em></pre>
</blockquote>
<p>Instead, wording was added explicitly forbidding the sequencing of an indeterminately-sequenced 
function call &quot;within&quot; a postfix increment (and therefore a postfix decrement) or a compound assignment 
(and therefore a prefix increment or decrement); see <a href="#s5.2.6p1">5.2.6p1</a> and
<a href="#s5.17p1">5.17p1</a>.</p>
<p>This change was motivated by a combination of factors. Firstly, it is not clear 
from the existing standard whether the interpretation that allows the outcome described 
in the example is correct or not; therefore, the status quo is a matter of opinion, 
with different experts expressing different opinions.</p>
<p>Secondly, it is clear that the existing standard describes the example as having 
unspecified behavior, but not undefined behavior. Unspecified behavior is described 
in the C++ standard as applying to &quot;a well-formed program construct and correct 
data&quot;. If the example expression is truly as unreliable as is claimed, it seems 
unhelpful (if not disingenuous) to classify it into such a benign-sounding category.</p>
<p>Between the alternatives of reclassifying the expression as having undefined 
behavior, and of tightening the sequencing rules sufficiently to eliminate the possibility 
of (apparently) losing a side effect, the latter is far less radical, easier to 
specify, more conducive to writing reliable programs, and probably has minimal if 
any impact on existing implementations.</p>
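<p>As a rough illustration of the tightened rule, the following sketch reuses <code>x</code> 
and <code>increment_x</code> from the deleted example; the wrapper <code>sum_with_call</code> 
is a hypothetical name introduced here. Assuming that postfix increment behaves as a single 
evaluation with respect to the indeterminately sequenced call, both permitted orders should 
give the same sum, and neither increment is lost.</p>

```cpp
#include <cassert>

int x = 0;

// From the example deleted from the proposed 1.9p17.
int increment_x() { return x++; }

// Hypothetical wrapper (not in the paper). The call is indeterminately
// sequenced with respect to x++, but may not be sequenced "within" it.
int sum_with_call() {
    x = 0;
    // Order 1: x++ yields 0 (x becomes 1), then the call yields 1 (x becomes 2).
    // Order 2: the call yields 0 (x becomes 1), then x++ yields 1 (x becomes 2).
    int sum = x++ + increment_x();
    assert(x == 2);  // neither increment is lost
    return sum;      // 1 under either order
}
```

<p>Under the interpretation that the deleted example described, the final value of 
<code>x</code> could instead have been 1.</p>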
<h3>Concurrency memory model changes</h3>
<p>Some sections of the threads memory model (1.10) have been in flux recently and 
are subject to further debate.</p>
<p>The notion of a &quot;modification order&quot; was added recently to ensure that the perceived 
value of a single atomic variable does not &quot;flip-flop&quot; in unexpected ways. The precise 
way in which this should be integrated into the rest of the model was not clear, 
though we have significantly more confidence in the current approach than its predecessors.</p>
<p>There have also been recent back-and-forth changes in the definition of &quot;synchronizes 
with&quot;, and particularly in whether synchronizes-with relationships should exist 
between &quot;relaxed&quot; accesses to the same atomic variable. This impacts whether atomic 
read-modify-write accesses can effectively be used as fences to order other accesses, 
and whether synchronization operations on variables accessed by only a single thread 
can be effectively optimized by the compiler. The latter in turn impacts how well 
the compiler can combine &quot;small&quot; threads. Some of this may also need to be revisited 
if explicit fences are added to the library, as proposed in N2153.</p>
<h2><a id="n1944">Significant changes in the proposed wording since N1944</a></h2>
<p>The WD text proposed in N1944 introduced ambiguity in the use of the term &quot;evaluation&quot;. 
Most new uses of that term were intended to reflect usage in mathematics, as in 
the computation of a value, without side effects. This usage is inconsistent with 
C/C++ tradition, and the way the term is used in the standard. So when it is necessary 
to talk about evaluations that do not have side effects, the term &quot;value computation&quot; 
is now used.</p>
<p>There is a new paragraph defining and explaining the &quot;sequenced before&quot; relation; 
see <a href="#s1.9p14">1.9p14</a>.</p>
<p>To reflect the consensus from the discussion in Berlin, a note has been added 
clearly stating that there is no requirement of consistency for operations whose 
sequencing is not constrained; see <a href="#s1.9p16">1.9p16</a>.</p>
<p>The statement of the &quot;no interleaving&quot; rule for functions has been updated; see
<a href="#s1.9p17">1.9p17</a>. Also, an example has been added pointing out a possibly-surprising 
interpretation of &quot;unspecified behavior&quot;.</p>
<p>Resolutions are proposed for several questions raised but not answered in N1944, 
mostly in <a href="#miscellaneous">Fixes for miscellaneous sequencing issues</a>.</p>
<h2><a id="rearranging">Rearranging the text of &quot;Program execution&quot;</a></h2>
<p>The changes proposed in N1944 were mainly in section 1.9 (Program execution) 
and various locations in clause 5 (Expressions), plus a couple of spots in clause 
12 (Special member functions). The &quot;undefined behavior&quot; rule, a key paragraph in 
the understanding of sequencing, which basically describes what may be called an 
&quot;intra-thread data race&quot;, is currently in 5p4, which is widely separated from the 
bulk of the discussion of the principles of sequencing in 1.9. Furthermore, it would 
seem logical to describe concurrency &#8212; and particularly inter-thread data races 
&#8212; in a new section building on and immediately following 1.9. Therefore we propose 
to move the &quot;undefined behavior&quot; rule from 5p4 to 1.9.</p>
<p>Within 1.9 with the changes proposed in N1944, the bulk of the discussion of 
sequencing is in p15-16. Paragraph 8, which currently contains the &quot;no overlap&quot; 
rule for function execution, should be merged into p16, which discusses many other 
sequencing constraints on function calls. And if, as proposed, the references to 
sequence points and evaluation are removed from p11 (the &quot;least requirements&quot;), 
then the definitions in p7 are not needed until p15; moving paragraph 7 down would 
result in a more cohesive presentation.</p>
<p>Finally, it could be argued that cohesiveness would be increased still further 
by moving the discussion of reassociation (concerning implications of the &quot;as-if&quot; 
rule) to immediately follow the &quot;least requirements&quot; (which is basically the normative 
statement of the &quot;as-if&quot; rule), instead of showing up in the middle of the discussion 
of expressions and sequencing.</p>
<p>This table shows the proposed shifting of content, assuming regular paragraph 
(re-)numbering. The letters in the central columns are just tags, intended to illustrate 
how text moves around (in lieu of arrows): the tag stays with the content.</p>
<table>
	<tr>
		<th>Paragraph number</th>
		<th>Old content</th>
		<th colspan="2">Tags (old, new)</th>
		<th>New content</th>
	</tr>
	<tr>
		<td>1.9p7</td>
		<td>Definitions of &quot;side effect&quot;, &quot;sequence point&quot;</td>
		<td>A</td>
		<td>C</td>
		<td>Effect of asynchronous signal</td>
	</tr>
	<tr>
		<td>1.9p8</td>
		<td>&quot;No overlap&quot; rule for function execution</td>
		<td>B</td>
		<td>C</td>
		<td>Allocation of automatic objects</td>
	</tr>
	<tr>
		<td>1.9p9</td>
		<td>Effect of asynchronous signal</td>
		<td>C</td>
		<td>C</td>
		<td>The &quot;least requirements&quot;</td>
	</tr>
	<tr>
		<td>1.9p10</td>
		<td>Allocation of automatic objects</td>
		<td>C</td>
		<td>E</td>
		<td>Note concerning reassociation</td>
	</tr>
	<tr>
		<td>1.9p11</td>
		<td>The &quot;least requirements&quot;</td>
		<td>C</td>
		<td>D</td>
		<td>Definition of &quot;full-expression&quot;</td>
	</tr>
	<tr>
		<td>1.9p12</td>
		<td>Definition of &quot;full-expression&quot;</td>
		<td>D</td>
		<td>D</td>
		<td>Note concerning default arguments</td>
	</tr>
	<tr>
		<td>1.9p13</td>
		<td>Note concerning default arguments</td>
		<td>D</td>
		<td>A</td>
		<td>Definition of &quot;side effect&quot;, &quot;evaluation&quot;</td>
	</tr>
	<tr>
		<td>1.9p14</td>
		<td>Note concerning reassociation</td>
		<td>E</td>
		<td>[new]</td>
		<td>Definition of &quot;sequenced before&quot;</td>
	</tr>
	<tr>
		<td>1.9p15</td>
		<td>Sequencing between full-expressions</td>
		<td>F</td>
		<td>F</td>
		<td>Sequencing between full-expressions</td>
	</tr>
	<tr>
		<td>1.9p16</td>
		<td>Sequencing constraints on function calls</td>
		<td>G</td>
		<td>5p4</td>
		<td>The &quot;undefined behavior&quot; rule</td>
	</tr>
	<tr>
		<td>1.9p17</td>
		<td>Operators that impose a sequence point</td>
		<td>[delete]</td>
		<td>G+B</td>
		<td>Sequencing constraints on function calls, including the &quot;no overlap&quot; 
		rule</td>
	</tr>
</table>
<h2><a id="execution">The text proposed for &quot;Program execution&quot;</a></h2>
<p>So here is the proposed reading of section 1.9, beginning with p6 (just for the 
sake of context). Each paragraph is introduced with its proposed paragraph number, 
and an explanation of its source. Text from the current working draft to be replaced 
or deleted is <span class="deleted">stricken through</span>. Replacement or added 
text is <span class="inserted">underlined</span>. Footnotes are presented here in 
the same style as examples and notes. If the introductory paragraphs, editorial 
notes and stricken text were deleted, the result would be a longish block of consecutive 
paragraphs, as proposed for the standard.</p>
<p>1.9p6 (unchanged):</p>
<blockquote>
	<p>The observable behavior of the abstract machine is its sequence of reads 
	and writes to <code>volatile</code> data and calls to library I/O functions. 
	An implementation can offer additional library I/O functions as an extension. 
	[ <em>Footnote:</em> Implementations that do so should treat calls to those 
	functions as &quot;observable behavior&quot; as well. &#8212;<em>end footnote</em> ]</p>
</blockquote>
<p><strong>Editorial note:</strong> This definition of observable behavior is not 
clearly consistent with the &quot;least requirements&quot; described in the proposed 1.9p9 
below (and is arguably incorrect, especially for multithreaded programs). Core issue 
612 has been opened to consider this inconsistency, and any corrections necessary 
for multithreading will be drafted in accordance with its resolution.</p>
<p>1.9p7 (unchanged from the current p9, except for the addition of an omitted word):</p>
<blockquote>
	<p>When the processing of the abstract machine is interrupted by receipt of 
	a signal, the values of objects with type other than <code>volatile std::sig_atomic_t</code> 
	are unspecified, and the value of any object not of <span class="inserted">type</span>
	<code>volatile std::sig_atomic_t</code> that is modified by the handler becomes 
	undefined.</p>
</blockquote>
<p>1.9p8 (unchanged from the current p10):</p>
<blockquote>
	<p>An instance of each object with automatic storage duration (3.7.2) is associated 
	with each entry into its block. Such an object exists and retains its last-stored 
	value during the execution of the block and while the block is suspended (by 
	a call of a function or receipt of a signal).</p>
</blockquote>
<p>1.9p9 (original text from p11):</p>
<blockquote>
	<p>The least requirements on a conforming implementation are:</p>
	<ul>
		<li><span class="deleted">At sequence points, volatile objects are stable 
		in the sense that previous evaluations are complete and subsequent evaluations 
		have not yet occurred.</span> <span class="inserted">Accesses to volatile 
		objects are initiated strictly according to the rules of the abstract machine.</span></li>
		<li>At program termination, all data written into files shall be identical 
		to one of the possible results that execution of the program according to 
		the abstract semantics would have produced.</li>
		<li>The input and output dynamics of interactive devices shall take place 
		in such a fashion that prompting messages actually appear prior to a program 
		waiting for input. What constitutes an interactive device is implementation-defined.</li>
	</ul>
	<p>[ <em>Note:</em> more stringent correspondences between abstract and actual 
	semantics may be defined by each implementation. &#8212;<em>end note</em> ]</p>
</blockquote>
<p>1.9p10 (unchanged from p14):</p>
<blockquote>
	<p>[ <em>Note:</em> operators can be regrouped according to the usual mathematical 
	rules only where the operators really are associative or commutative.<sup>11)</sup> 
	For example, in the following fragment</p>
	<blockquote>
		<p><em>[unchanged text omitted]</em></p>
	</blockquote>
	<p>However on a machine in which overflows do not produce an exception and in 
	which the results of overflows are reversible, the above expression statement 
	can be rewritten by the implementation in any of the above ways because the 
	same result will occur. &#8212;<em>end note</em> ]</p>
</blockquote>
<p>1.9p11 (original text from p12):</p>
<blockquote>
	<p>A <dfn>full-expression</dfn> is an expression that is not a subexpression 
	of another expression. If a language construct is defined to produce an implicit 
	call of a function, a use of the language construct is considered to be an expression 
	for the purposes of this definition. <span class="inserted">A call to a destructor 
	generated at the end of the lifetime of an object other than a temporary object 
	is an implicit full-expression.</span> Conversions applied to the result of 
	an expression in order to satisfy the requirements of the language construct 
	in which the expression appears are also considered to be part of the full-expression. 
	[ <em>Example:</em></p>
	<blockquote>
		<p><em>[unchanged example omitted]</em></p>
	</blockquote>
</blockquote>
<p>1.9p12 (unchanged from p13):</p>
<blockquote>
	<p>[ <em>Note:</em> the evaluation of a full-expression can include the evaluation 
	of subexpressions that are not lexically part of the full-expression. For example, 
	subexpressions involved in evaluating default argument expressions (8.3.6) are 
	considered to be created in the expression that calls the function, not the 
	expression that defines the default argument. &#8212;<em>end note</em> ]</p>
</blockquote>
<p>1.9p13 (original text from p7):</p>
<blockquote>
	<p>Accessing an object designated by a <code>volatile</code> lvalue (3.10), 
	modifying an object, calling a library I/O function, or calling a function that 
	does any of those operations are all <dfn>side effects</dfn>, which are changes 
	in the state of the execution environment. <span class="deleted">Evaluation 
	of an expression might produce side effects.</span> <span class="inserted">
	<dfn>Evaluation</dfn> of an expression (or sub-expression) in general includes 
	both value computations (including fetching a value previously assigned to an 
	object) and initiation of side effects.</span> <span class="deleted">At certain 
	specified points in the execution sequence called <dfn>sequence points</dfn>, 
	all side effects of previous evaluations shall be complete and no side effects 
	of subsequent evaluations shall have taken place.</span> [ <em>Footnote:</em> 
	Note <span class="deleted">that some aspects of sequencing in the abstract machine 
	are unspecified; the preceding restriction upon side effects applies to that 
	particular execution sequence in which the actual code is generated. Also note</span> 
	that when a call to a library I/O function returns, the side effect is considered 
	complete, even though some external actions implied by the call (such as the 
	I/O itself) may not have completed yet. &#8212;<em>end footnote</em> ]</p>
</blockquote>
<p><a id="s1.9p14">1.9p14 (new paragraph):</a></p>
<blockquote class="inserted">
	<p>&quot;<dfn>Sequenced before</dfn>&quot; is an asymmetric, transitive, pair-wise relation 
	between evaluations executed by a single thread, which induces a partial order 
	among those evaluations. Given any two evaluations <var>A</var> and <var>B</var>, 
	if <var>A</var> is sequenced before <var>B</var>, then the execution of <var>
	A</var> shall precede the execution of <var>B</var>. If <var>A</var> is not 
	sequenced before <var>B</var> and <var>B</var> is not sequenced before <var>
	A</var>, then <var>A</var> and <var>B</var> are <dfn>unsequenced</dfn>. [
	<em>Note:</em> The execution of unsequenced evaluations can overlap. &#8212;<em>end 
	note</em> ] Evaluations <var>A</var> and <var>B</var> are <dfn>indeterminately 
	sequenced</dfn> when either <var>A</var> is sequenced before <var>B</var>, or
	<var>B</var> is sequenced before <var>A</var>, but it is unspecified which. 
	[ <em>Note:</em> Indeterminately sequenced evaluations shall not overlap, but 
	either could be executed first. &#8212;<em>end note</em> ]</p>
</blockquote>
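<p>The difference between unsequenced and indeterminately sequenced evaluations can be 
sketched with two function calls used as operands of <code>+</code>. The names 
<code>trace</code>, <code>step</code> and <code>run_two_calls</code> are hypothetical and 
exist only for this illustration. Either call may run first, but the two executions may 
not overlap.</p>

```cpp
#include <cassert>
#include <string>

std::string trace;  // records the order in which the calls run

// Hypothetical helper: appends its tag twice (think "entry" and "exit").
int step(char tag) {
    trace += tag;
    trace += tag;
    return 1;
}

// The two calls are indeterminately sequenced: either may run first,
// but their executions may not overlap.
std::string run_two_calls() {
    trace.clear();
    int sum = step('a') + step('b');
    assert(sum == 2);
    return trace;  // "aabb" or "bbaa", never "abab" or "abba"
}
```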
<p>1.9p15 (original text from p15):</p>
<blockquote>
	<p><span class="deleted">There is a sequence point at the completion of evaluation 
	of each full-expression.</span> <span class="inserted">Every value computation 
	and side effect associated with a full-expression is sequenced before every 
	value computation and side effect associated with the next full-expression to 
	be evaluated.</span> [ <em>Footnote:</em> As specified in 12.2,
	<span class="deleted">after the &quot;end-of-full-expression&quot; sequence point</span>
	<span class="inserted">after a full-expression is evaluated</span>, a sequence 
	of zero or more invocations of destructor functions for temporary objects takes 
	place, usually in reverse order of the construction of each temporary object. 
	&#8212;<em>end footnote</em> ]</p>
</blockquote>
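<p>The footnote's remark about temporaries can be illustrated with a minimal sketch. 
The type <code>Tracer</code> and the functions <code>use</code> and 
<code>run_full_expression</code> are assumptions made for this example only. The 
temporaries are bound to reference parameters, so they are destroyed at the end of the 
full-expression, in reverse order of construction, before the next full-expression begins.</p>

```cpp
#include <cassert>
#include <string>

std::string events;  // uppercase = constructed, lowercase = destroyed

// Hypothetical tracer type, introduced only for this illustration.
struct Tracer {
    char id;
    explicit Tracer(char c) : id(c) { events += id; }
    ~Tracer() { events += static_cast<char>(id - 'A' + 'a'); }
};

int use(const Tracer&, const Tracer&) { return 0; }

std::string run_full_expression() {
    events.clear();
    use(Tracer('A'), Tracer('B'));  // temporaries live until the end of this full-expression
    events += '|';                  // next full-expression: runs after both destructors
    assert(events.size() == 5);
    return events;                  // "ABba|" or "BAab|"
}
```

<p>Two results are possible because the order in which the two arguments are evaluated 
is not constrained.</p>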
<p><a id="s1.9p16">1.9p16 (original text from clause 5 paragraph 4):</a></p>
<blockquote>
	<p>Except where noted, <span class="deleted">the order of evaluation</span>
	<span class="inserted">evaluations</span> of operands of individual operators<span class="inserted">,</span> 
	and <span class="inserted">of</span> subexpressions of individual expressions<span class="deleted">, 
	and the order in which side effects take place, is unspecified</span>
	<span class="inserted">are unsequenced</span>. [ <em>Footnote:</em> The precedence 
	of operators is not directly specified, but it can be derived from the syntax. 
	&#8212;<em>end footnote</em> ] <span class="inserted">[ <em>Note:</em> In an expression 
	that is evaluated more than once during the execution of a program, unsequenced 
	and indeterminately sequenced evaluations of its subexpressions need not be 
	performed consistently in different evaluations. &#8212;<em>end note</em> ] Except 
	where noted, the value computations of the operands of an operator are sequenced 
	before the value computation of the result of the operator.</span>
	<span class="deleted">Between the previous and next sequence point a scalar 
	object shall have its stored value modified at most once by the evaluation of 
	an expression. Furthermore, the prior value shall be accessed only to determine 
	the value to be stored. The requirements of this paragraph shall be met for 
	each allowable ordering of the subexpressions of a full expression; otherwise 
	the behavior is undefined.</span> <span class="inserted">If a side effect on 
	a scalar object is unsequenced relative to either a different side effect on 
	the same scalar object, or a value computation using the value of the same scalar 
	object, the behavior is undefined.</span> [ <em>Example:</em></p>
	<blockquote>
		<pre>i = v[i++];       <em>// the behavior is undefined</em>
<!--      -->i = 7, i++, i++;  <em>//</em> i <em>becomes</em> 9
<!--      -->i = ++i + 1;      <em>// the behavior is undefined</em>
<!--      -->i = i + 1;        <em>// the value of</em> i <em>is incremented</em></pre>
	</blockquote>
	<p>&#8212;<em>end example</em> ]</p>
</blockquote>
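<p>One way to avoid the undefined cases in the example above is to give each side effect 
on <code>i</code> its own full-expression. The following is only a sketch: the function 
names are hypothetical, and the rewrite picks one of the meanings that 
<code>i = v[i++]</code> might have been intended to have.</p>

```cpp
#include <cassert>
#include <vector>

// Hypothetical rewrite of "i = v[i++];" in which no two side effects on i
// are unsequenced, because each one sits in its own full-expression.
int read_then_advance(const std::vector<int>& v, int i) {
    int old = i;   // value computation using i
    i++;           // first side effect on i, sequenced after the read above
    i = v[old];    // second side effect on i, in a later full-expression
    return i;
}

// The comma operator sequences its operands, so this line from the
// example is well defined as written.
int comma_example() {
    int i;
    i = 7, i++, i++;
    assert(i == 9);
    return i;
}
```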
<p><strong>Editorial note:</strong> It has been pointed out that, under this proposed 
wording, unsequenced read accesses to a single volatile object (clearly) entail 
undefined behavior, which was not clearly the case with the previous wording. The 
key difference is that the new words refer to a &quot;side effect&quot;, which definitely 
includes reading a volatile object, whereas the previous words referred to modifying 
an object &quot;by the evaluation of an expression&quot;, which is ambiguous with respect 
to reading a volatile object &#8212; since such an action is a side effect, modification 
of the object accessed (or of some other volatile object) is possible but not inevitable.</p>
<p><a id="s1.9p17">1.9p17 (original text is p16 with p8 inserted):</a></p>
<blockquote>
	<p>When calling a function (whether or not the function is inline),
	<span class="deleted">there is a sequence point after the evaluation of all 
	function arguments (if any) which takes place</span> <span class="inserted">
	every value computation and side effect associated with any argument expression, 
	or with the postfix expression designating the called function, is sequenced</span> 
	before execution of any <span class="deleted">expressions or statements</span>
	<span class="inserted">expression or statement</span> in the
	<span class="inserted">body of the called</span> function
	<span class="deleted">body</span>. <span class="inserted">[ <em>Note:</em> Value 
	computations and side effects associated with different argument expressions 
	are unsequenced. &#8212;<em>end note</em> ]</span> <span class="deleted">There is 
	also a sequence point after the copying of a returned value and before the execution 
	of any expressions outside the function. [ <em>Footnote:</em> The sequence point 
	at the function return is not explicitly specified in ISO C, and can be considered 
	redundant with sequence points at full-expressions, but the extra clarity is 
	important in C++. In C++, there are more ways in which a called function can 
	terminate its execution, such as the throw of an exception. &#8212;<em>end footnote</em> 
	]</span> <span class="deleted">Once the execution of a function begins, no expressions 
	from the calling function are evaluated until execution of the called function 
	has completed.</span> <span class="inserted">Every evaluation in the calling 
	function (including other function calls) that is not otherwise specifically 
	sequenced before or after the execution of the body of the called function is 
	indeterminately sequenced with respect to the execution of the called function.</span> 
	[ <em>Footnote:</em> In other words, function executions do not &quot;interleave&quot; 
	with each other. &#8212;<em>end footnote</em> ] Several contexts in C++ cause evaluation 
	of a function call, even though no corresponding function call syntax appears 
	in the translation unit. [ <em>Example:</em> evaluation of a new expression 
	invokes one or more allocation and constructor functions; see 5.3.4. For another 
	example, invocation of a conversion function (12.3.2) can arise in contexts 
	in which no function call syntax appears. &#8212;<em>end example</em> ] The
	<span class="deleted">sequence points at function-entry and function-exit</span>
	<span class="inserted">sequencing constraints on the execution of the called 
	function</span> (as described above) are features of the function calls as evaluated, 
	whatever the syntax of the expression that calls the function might be.</p>
</blockquote>
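<p>The constraints on function calls can be sketched as follows, using hypothetical 
helpers <code>arg</code> and <code>callee</code> that record when they run. The argument 
expressions may be evaluated in either order, but both are sequenced before the body of 
the called function.</p>

```cpp
#include <cassert>
#include <string>

std::string order;  // records the order of evaluation

// Hypothetical helper: logs its tag as a side effect of argument evaluation.
int arg(char tag) { order += tag; return 0; }

// Hypothetical callee: logs 'f' when its body starts executing.
int callee(int, int) { order += 'f'; return 0; }

std::string call_with_arguments() {
    order.clear();
    callee(arg('a'), arg('b'));
    assert(order.size() == 3);
    return order;  // "abf" or "baf": arguments in either order, body last
}
```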
<p>Deleted as redundant with descriptions of operators (original text from p17):</p>
<blockquote>
	<p><span class="deleted">In the evaluation of each of the expressions</span></p>
	<blockquote>
		<pre><span class="deleted">a &amp;&amp; b
a || b
a ? b : c
a , b</span></pre>
	</blockquote>
	<p><span class="deleted">using the built-in meaning of the operators in these 
	expressions (5.14, 5.15, 5.16, 5.18), there is a sequence point after the evaluation 
	of the first expression. [ <em>Footnote:</em> The operators indicated in this 
	paragraph are the built-in operators, as described in clause 5. When one of 
	these operators is overloaded (clause 13) in a valid context, thus designating 
	a user-defined operator function, the expression designates a function invocation, 
	and the operands form an argument list, without an implied sequence point between 
	them. &#8212;<em>end footnote</em> ]</span></p>
</blockquote>
<h2><a id="location">The definition of &quot;memory location&quot;</a></h2>
<p>New paragraphs inserted as 1.7p3 et seq.:</p>
<blockquote class="inserted">
	<p>A <dfn>memory location</dfn> is either an object of scalar type, or a maximal 
	sequence of adjacent bit-fields all having non-zero width. Two threads of execution 
	can update and access separate memory locations without interfering with each 
	other.</p>
	<p>[<em>Note</em>: Thus a bit-field and an adjacent non-bit-field are in separate 
	memory locations, and therefore can be concurrently updated by two threads of 
	execution without interference. The same applies to two bit-fields, if one is 
	declared inside a nested struct declaration and the other is not, or if the 
	two are separated by a zero-length bit-field declaration, or if they are separated 
	by a non-bit-field declaration. It is not safe to concurrently update two bit-fields 
	in the same struct if all fields between them are also bit-fields, no matter 
	what the sizes of those intervening bit-fields happen to be. &#8212;<em>end note</em> 
	]</p>
	<p>[<em>Example</em>: A structure declared as <code>struct {char a; int b:5, 
	c:11, :0, d:8; struct {int ee:8;} e;}</code> contains four separate memory locations: 
	The field <code>a</code>, and bit-fields <code>d</code> and <code>e.ee</code> 
	are each separate memory locations, and can be modified concurrently without 
	interfering with each other. The bit-fields <code>b</code> and <code>c</code> 
	together constitute the fourth memory location. The bit-fields <code>b</code> 
	and <code>c</code> cannot be concurrently modified, but <code>b</code> and
	<code>a</code>, for example, can be. &#8212;<em>end example</em> ]</p>
</blockquote>
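<p>The example structure can be exercised with two threads. This sketch assumes the 
<code>std::thread</code> facility that was standardized later, in C++11, and is not part 
of this paper; the names <code>S</code> and <code>update_separate_locations</code> are 
likewise hypothetical. The two threads write to <code>a</code> and <code>d</code>, which 
are separate memory locations.</p>

```cpp
#include <cassert>
#include <thread>

// The structure from the example above, given a hypothetical name.
struct S {
    char a;
    int b:5, c:11, :0, d:8;
    struct { int ee:8; } e;
};

// Assumes std::thread as later standardized in C++11.
// a and d are separate memory locations, so the two threads do not interfere.
S update_separate_locations() {
    S s{};
    std::thread t1([&s] { for (int i = 0; i < 100; ++i) s.a = 'x'; });
    std::thread t2([&s] { for (int i = 0; i < 100; ++i) s.d = 42; });
    t1.join();
    t2.join();
    // Updating b in one thread and c in the other would not be safe:
    // together they form a single memory location.
    assert(s.b == 0 && s.c == 0);
    return s;
}
```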
<h2><a id="races">Multi-threaded executions and data races</a></h2>
<p>Insert a new section between 1.9 and 1.10, titled &quot;Multi-threaded executions 
and data races&quot;.</p>
<p>1.10p1:</p>
<blockquote class="inserted">
	<p>Under a hosted implementation, a C++ program can have more than one <dfn>
	thread of execution</dfn> (a.k.a. <dfn>thread</dfn>) running concurrently. Each 
	thread executes a single function according to the rules expressed in this standard. 
	The execution of the entire program consists of an execution of all of its threads. 
	[<em>Note:</em> Usually the execution can be viewed as an interleaving of all 
	its threads. However some kinds of atomic operations, for example, allow executions 
	inconsistent with a simple interleaving, as described below. &#8212;<em>end note</em> 
	] Under a freestanding implementation, it is implementation-defined whether 
	a program can have more than one thread of execution.</p>
</blockquote>
<p>1.10p2:</p>
<blockquote class="inserted">
	<p>The execution of each thread proceeds as defined by the remainder of this 
	standard. The value of an object visible to a thread <var>T</var> at a particular 
	point might be the initial value of the object, a value assigned to the object 
	by <var>T</var>, or a value assigned to the object by another thread, according 
	to the rules below.</p>
</blockquote>
<p>1.10p3:</p>
<blockquote class="inserted">
	<p>Two expression evaluations <dfn>conflict</dfn> if one of them modifies a 
	memory location and the other one accesses or modifies the same memory location.</p>
</blockquote>
<!--
<blockquote class="inserted">
	<p>If two conflicting evaluations are performed by the same thread, and neither 
	is sequenced before the other, then the execution sequence contains an intra-thread 
	data race. Any intra-thread data race is an undefined operation, and no requirements 
	are placed on such an execution.</p>
</blockquote>
-->
<!--
<blockquote class="inserted">
	<p>[<em>Note:</em> The purpose of the rest of this section is (1) to define 
	an inter-thread data race, which will also give rise to an undefined operation, 
	and (2) to define how an assignment to an object in one thread might affect 
	the value of that object as seen by other threads. None of this is relevant 
	to implementations that are limited to a single thread.]</p>
</blockquote>
-->
<p>1.10p4:</p>
<blockquote class="inserted">
	<p>The library defines a number of operations, such as operations on locks and 
	atomic objects, that are specially identified as synchronization operations. 
	These operations play a special role in making assignments in one thread visible 
	to another. A <dfn>synchronization operation</dfn> is either an acquire operation 
	or a release operation, or both, on one or more memory locations. [<em>Note:</em> 
	For example, a call that acquires a lock will perform an acquire operation on 
	the locations comprising the lock. Correspondingly, a call that releases the 
	same lock will perform a release operation on those same locations. Informally, 
	performing a release operation on <var>A</var> forces prior side effects on 
	other memory locations to become visible to other threads that later perform 
	an acquire operation on <var>A</var>. &#8212;<em>end note</em> ]</p>
</blockquote>
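<p>As an informal illustration of the acquire and release roles described in the 
note, the following sketch uses <code>std::mutex</code> and <code>std::thread</code>. 
These facilities are an assumption on our part: they come from the later C++11 library, 
not from this paper, and the names <code>shared_value</code>, <code>add_one</code> 
and <code>run_two_threads</code> are hypothetical.</p>

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical shared state; the names are illustrative only.
std::mutex m;
int shared_value = 0;

// lock() performs an acquire operation on the lock;
// unlock() performs a release operation on the same lock.
void add_one() {
    m.lock();        // acquire
    ++shared_value;  // side effect on another memory location
    m.unlock();      // release: the increment becomes visible to later acquirers
}

// Runs add_one() in two threads and returns the final value.
int run_two_threads() {
    shared_value = 0;
    std::thread t1(add_one);
    std::thread t2(add_one);
    t1.join();
    t2.join();
    return shared_value;
}
```

<p>Because every access to <code>shared_value</code> in the two threads is bracketed 
by an acquire and a release on the same lock, neither increment can be lost.</p>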
<p><a id="s1.10p5">1.10p5-6, previously containing the definition of 
&quot;inter-thread ordered before&quot;,</a> have been deleted from this revision. Subsequent paragraphs 
will be renumbered eventually.</p>
<!--
<blockquote class="inserted">
	<p>An expression evaluation <var>A</var> is <dfn>inter-thread ordered before</dfn> 
	another evaluation <var>B</var> if:</p>
	<ul>
		<li><var>A</var> is sequenced before <var>B</var> and either <var>A</var> 
		performs an acquire operation, or <var>B</var> performs a release operation; 
		or</li>
		<li><var>A</var> is an unordered atomic read and <var>B</var> is an unordered 
		atomic write, and either the value written by <var>B</var> varies depending 
		on the value read by <var>A</var>, or the execution of <var>B</var> is conditioned 
		on the value read by <var>A</var>.</li>
	</ul>
	<p>[<em>Note:</em> Neither &quot;the value written varies depending on the value 
	read&quot; nor &quot;the execution is conditioned on the value read&quot; refers to a static 
	notion of dependence. For example, the value of <code>(0 * x)</code> does not 
	vary depending on the value of <code>x</code>. <em>&#8212;end note</em>]</p>
	<p>[<em>Note:</em> This definition is redundant for most synchronization 
	operations, since those that read a value will usually have acquire semantics, 
	and those that update a value will usually have release semantics. However, 
	for isolated operations that do not provide such guarantees, it avoids results 
	that can only be justified by inherently &quot;circular&quot; executions.]</p>
</blockquote>
<p>1.10p6:</p>
<blockquote class="inserted">
	<p>[<em>Note:</em> An evaluation <var>A</var> can only be inter-thread ordered 
	before <var>B</var> if <var>A</var> is also sequenced before <var>B</var>. For 
	race-free programs making conventional use of locks, the distinction between 
	&quot;inter-thread ordered before&quot; and &quot;sequenced before&quot; is unimportant. The distinction 
	becomes important with very weakly ordered library synchronization primitives. 
	&#8212;<em>end note</em> ]</p>
</blockquote>
-->
<p>The definitions that follow were rewritten in terms of &quot;synchronizes with&quot;, which is restricted to synchronization 
operations, instead of explicitly including store-load dependencies in a &quot;communicates 
with&quot; relation as in N1944. This version is intended to be equivalent, since we 
insist that &quot;happens before&quot; together with store-load dependencies remains acyclic. 
We need that acyclicity for the proof that race freedom implies sequential consistency, and for one 
of the examples.</p>
<p>1.10p7:</p>
<blockquote class="inserted">
	<p>All modifications to a particular atomic object <var>M</var> occur in some 
	particular total order, called the <dfn>modification order</dfn> of <var>M</var>. 
	An evaluation <var>A</var> that performs a release operation on an object
	<var>M</var> <dfn>synchronizes with</dfn> an evaluation <var>B</var> that performs 
	an acquire operation on <var>M</var> and reads either the value written by
	<var>A</var> or a later value in the modification order of <var>M</var>. [<em>Note:</em> 
	The specifications of the synchronization operations define when one reads the 
	value written by another. For atomic variables, the definition is clear. All 
	operations on a given lock occur in a single total order. Each lock acquisition 
	&quot;reads the value written&quot; by the last lock release. &#8212;<em>end note</em> ]</p>
</blockquote>
<p>1.10p8:</p>
<blockquote class="inserted">
	<p>An evaluation <var>A</var> <dfn>happens before</dfn> an evaluation <var>B</var> 
	if:</p>
	<ul>
		<li><var>A</var> is sequenced before <var>B</var> and either <var>A</var> 
		performs an acquire operation or <var>B</var> performs a release operation; 
		or</li>
		<li><var>A</var> synchronizes with <var>B</var>; or</li>
		<li>for some evaluation <var>X</var>, <var>A</var> happens before <var>X</var> 
		and <var>X</var> happens before <var>B</var>.</li>
	</ul>
</blockquote>
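<p>The way &quot;synchronizes with&quot; and &quot;happens before&quot; combine can be sketched with 
a simple message-passing pattern. The sketch assumes the <code>std::atomic</code> 
and <code>std::thread</code> facilities of the later C++11 library, which are not 
defined by this paper; <code>payload</code>, <code>ready</code> and the function 
names are hypothetical.</p>

```cpp
#include <atomic>
#include <cassert>
#include <thread>

int payload = 0;                 // ordinary (non-atomic) object
std::atomic<bool> ready(false);  // atomic object used for synchronization

void producer() {
    payload = 42;                                  // sequenced before the release below
    ready.store(true, std::memory_order_release);  // release operation on ready
}

int consumer() {
    // Acquire operation on ready; once it reads the value written by the
    // release store, that store synchronizes with this load.
    while (!ready.load(std::memory_order_acquire)) {
        std::this_thread::yield();
    }
    return payload;  // the assignment to payload happens before this read
}

// Runs both threads and returns the value the consumer observed.
int run_message_passing() {
    payload = 0;
    ready.store(false);
    int seen = -1;
    std::thread c([&seen] { seen = consumer(); });
    std::thread p(producer);
    p.join();
    c.join();
    return seen;
}
```

<p>The three bullets of the definition are each used once here: sequencing before 
a release, the synchronizes-with edge between the store and the load, and transitivity 
through the acquire.</p>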
<p><a id="s1.10p9">1.10p9</a>:</p>
<blockquote class="inserted">
	<p>An evaluation <var>A</var> <dfn>precedes</dfn> an evaluation <var>B</var> 
	if:</p>
	<ul>
		<li><var>A</var> happens before <var>B</var>; or</li>
		<li><var>A</var> is an assignment, and <var>B</var> observes the value stored 
		by <var>A</var>; or</li>
		<li>for some evaluation <var>X</var>, <var>A</var> precedes <var>X</var> 
		and <var>X</var> precedes <var>B</var>.</li>
	</ul>
</blockquote>
<p>1.10p10:</p>
<blockquote class="inserted">
	<p>A multi-threaded execution is <dfn>consistent</dfn> if each thread observes 
	values of objects that obey the following constraints:</p>
	<ul>
		<li>No evaluation precedes itself.</li>
		<li>Each read access <var>B</var> to a scalar object <var>M</var> observes 
		the value assigned to <var>M</var> by a side effect <var>A</var> only if 
		there is no other side effect <var>X</var> to <var>M</var> such that
		<ul>
			<li><var>A</var> is sequenced before or happens before <var>X</var>, 
			or <var>A</var> and <var>X</var> are synchronization operations and
			<var>A</var> precedes <var>X</var> in the modification order of <var>
			M</var>, and</li>
			<li><var>X</var> is sequenced before or happens before <var>B</var>.
			</li>
		</ul>
		</li>
	</ul>
	<p>[<em>Note:</em> The first condition implies that a read operation <var>B</var> 
	cannot &quot;see&quot; an assignment <var>A</var> if <var>B</var> happens before <var>
	A</var>. It also prevents a cyclic situation in which, for example, <code>x</code> 
	and <code>y</code> are initially zero, one thread evaluates <code>x = y;</code> 
	while another evaluates <code>y = x;</code>, each sees the result of the other 
	thread, and both <code>x</code> and <code>y</code> obtain a value of 42. The 
	second condition effectively asserts that later assignments hide earlier ones 
	if there is a well-defined order between them. &#8212;<em>end note</em> ]</p>
</blockquote>
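<p>The cyclic situation mentioned in the note can be written down as follows. This 
is only a sketch: it uses relaxed <code>std::atomic</code> operations from the later 
C++11 library (an assumption, chosen so that the program itself has no data race), 
and the function names are hypothetical.</p>

```cpp
#include <atomic>
#include <cassert>
#include <thread>

std::atomic<int> x(0);
std::atomic<int> y(0);

// Thread 1: x = y;
void copy_y_to_x() {
    int r = y.load(std::memory_order_relaxed);
    x.store(r, std::memory_order_relaxed);
}

// Thread 2: y = x;
void copy_x_to_y() {
    int r = x.load(std::memory_order_relaxed);
    y.store(r, std::memory_order_relaxed);
}

// Runs both copies concurrently and reports whether x and y are still zero.
bool both_still_zero() {
    x.store(0);
    y.store(0);
    std::thread t1(copy_y_to_x);
    std::thread t2(copy_x_to_y);
    t1.join();
    t2.join();
    return x.load() == 0 && y.load() == 0;
}
```

<p>Under the first constraint, an outcome in which both variables end up as 42 would 
require each assignment to precede itself through the other, so we would expect 
only the zero result.</p>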
<p>1.10p11:</p>
<blockquote class="inserted">
	<p>An execution contains an <dfn>inter-thread data race</dfn> if it contains 
	two conflicting actions in different threads, at least one of which is not atomic, 
	and neither happens before the other. Any inter-thread data race results in 
	undefined behavior. A multi-threaded program that does not contain a data race 
	exhibits the behavior of a consistent execution. [<em>Note:</em> It can be shown 
	that programs that correctly use simple locks to prevent all inter-thread data 
	races, and use no other synchronization operations, behave as though the executions 
	of their constituent threads were simply interleaved, with each observed value 
	of an object being the last value assigned in that interleaving. This is normally 
	referred to as &quot;sequential consistency&quot;. However, this applies only to race-free 
	programs, and race-free programs cannot observe most program transformations 
	that do not change single-threaded program semantics. In fact, most single-threaded 
	program transformations continue to be allowed, since any program that behaves 
	differently as a result must perform an undefined operation. &#8212;<em>end note</em> 
	]</p>
</blockquote>
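<p>The interleaving property described in the note can be observed in a small experiment. 
The sketch below assumes <code>std::mutex</code>, <code>std::lock_guard</code> and
<code>std::thread</code> from the later C++11 library; <code>event_log</code>,
<code>worker</code> and <code>log_is_interleaving</code> are hypothetical names.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

std::mutex log_mutex;
std::vector<std::pair<int, int>> event_log;  // (thread id, sequence number)

// Appends `steps` events for thread `id`, each under the lock.
void worker(int id, int steps) {
    for (int i = 0; i < steps; ++i) {
        std::lock_guard<std::mutex> guard(log_mutex);
        event_log.push_back(std::make_pair(id, i));
    }
}

// Returns true if the log is an interleaving of the two threads: every event
// is present exactly once and each thread's events appear in program order.
bool log_is_interleaving(int steps) {
    event_log.clear();
    std::thread a(worker, 0, steps);
    std::thread b(worker, 1, steps);
    a.join();
    b.join();
    int next[2] = {0, 0};
    for (std::size_t k = 0; k < event_log.size(); ++k) {
        int id = event_log[k].first;
        if (event_log[k].second != next[id]) return false;
        ++next[id];
    }
    return next[0] == steps && next[1] == steps;
}
```

<p>Removing the lock would introduce an inter-thread data race on <code>event_log</code>, 
and the program would then have undefined behavior rather than merely a different 
interleaving.</p>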
<p>1.10p12:</p>
<blockquote class="inserted">
	<p>[<em>Note:</em> Compiler transformations that introduce assignments to a 
	potentially shared memory location that would not be modified by the abstract 
	machine are generally precluded by this standard, since such an assignment might 
	overwrite another assignment by a different thread in cases in which an abstract 
	machine execution would not have encountered a data race. &#8212;<em>end note</em> 
	]</p>
</blockquote>
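<p>One transformation of the kind this note has in mind is a speculative store followed 
by an undo. The following sketch is our own illustration, with hypothetical names; 
it shows source-level equivalents of the original and transformed code.</p>

```cpp
#include <cassert>

int shared_x = 0;  // hypothetical, potentially shared memory location

// As written: shared_x is modified only when cond is true.
void store_if(bool cond) {
    if (cond) shared_x = 42;
}

// Equivalent for a single thread, but it writes shared_x even when cond is
// false, which the abstract machine would never do.
void store_if_transformed(bool cond) {
    int saved = shared_x;
    shared_x = 42;                // introduced assignment
    if (!cond) shared_x = saved;  // undo; may overwrite another thread's assignment
}
```

<p>In a single thread the two functions are indistinguishable. If another thread 
assigns to <code>shared_x</code> between the introduced store and the undo while
<code>cond</code> is false, that assignment is lost, even though the original program 
contained no data race.</p>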
<p>Various other changes in the base language are no doubt needed, but are not yet clear. 
There seems to be something of a consensus that thread-safety of static initialization 
should be explicitly indicated with a new keyword such as &quot;async&quot;. Exception issues 
should probably be deferred to the thread API proposal.</p>
<h2><a id="operators">Sequencing for specific operators</a></h2>
<p>It seems appropriate to remind the reader, at this point in the paper, that the 
proposal is to move 5p4 from its current location.</p>
<p>5.2.2p8 (function call); deleted as redundant with (new) 1.9p17:</p>
<blockquote>
	<p><span class="deleted">The order of evaluation of arguments is unspecified. 
	All side effects of argument expression evaluations take effect before the function 
	is entered. The order of evaluation of the postfix expression and the argument 
	expression list is unspecified.</span></p>
</blockquote>
<p><a id="s5.2.6p1">5.2.6p1</a> (post-increment):</p>
<blockquote>
	<p>The value <span class="deleted">obtained by applying</span>
	<span class="inserted">of</span> a postfix <code>++</code>
	<span class="inserted">expression</span> is the value <span class="deleted">
	that the</span> <span class="inserted">of its</span> operand
	<span class="deleted">had before applying the operator</span>. [ <em>Note:</em> 
	the value obtained is a copy of the original value &#8212;<em>end note</em> ] The 
	operand shall be a modifiable lvalue. The type of the operand shall be an arithmetic 
	type or a pointer to a complete object type. <span class="deleted">After the 
	result is noted, the</span> <span class="inserted">The</span> value of the
	<span class="inserted">operand</span> object is modified by adding <code>1</code> 
	to it, unless the object is of type <code>bool</code>, in which case it is set 
	to <code>true</code>. [ <em>Note:</em> this use is deprecated, see Annex D. 
	&#8212;<em>end note</em> ] <span class="inserted">The value computation of the
	<code>++</code> expression is sequenced before the modification of the operand 
	object. With respect to an indeterminately-sequenced function call, the operation 
	of postfix <code>++</code> is a single evaluation. [ <em>Note:</em> Therefore, 
	a function call shall not intervene between the lvalue-to-rvalue conversion 
	and the side effect associated with any single postfix <code>++</code> operator. 
	&#8212;<em>end note</em> ]</span> The result is an rvalue. The type of the result 
	is the cv-unqualified version of the type of the operand. See also 5.7 and 5.17.</p>
</blockquote>
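<p>A minimal sketch of what the revised wording means in practice; the function 
and variable names are hypothetical.</p>

```cpp
#include <cassert>

// Returns the value of the postfix ++ expression; n itself is incremented.
int take_then_increment(int& n) {
    int value = n++;  // value computation is sequenced before the modification of n
    return value;
}

int counter = 5;
int read_counter() { return counter; }

// The call to read_counter() is indeterminately sequenced with respect to
// counter++, so it runs entirely before or entirely after the increment.
int sum_with_call() {
    counter = 5;
    return counter++ + read_counter();  // 5 + 5 or 5 + 6
}
```

<p>The second function has two permitted results, but the call can never observe 
a state between the read and the write performed by <code>counter++</code>.</p>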
<p>5.14p2 (logical AND operator), and also 5.15p2 (logical OR operator):</p>
<blockquote>
	<p>The result is a <code>bool</code>. <span class="deleted">All side effects 
	of the first expression except for destruction of temporaries (12.2) happen 
	before the second expression is evaluated.</span> <span class="inserted">If 
	the second expression is evaluated, every value computation and side effect 
	associated with the first expression is sequenced before every value computation 
	and side effect associated with the second expression.</span></p>
</blockquote>
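<p>The inserted sentence covers both the ordering and the conditional evaluation 
of the second operand. A short sketch, with hypothetical function names:</p>

```cpp
#include <cassert>

// The increment of i in the first operand is sequenced before the read of i
// in the second operand.
bool increment_then_compare(int& i, int expected) {
    return ++i > 0 && i == expected;
}

// Counts how often the second operand of && is evaluated.
int second_operand_evaluations(bool first) {
    int evaluated = 0;
    bool result = first && (++evaluated, true);
    (void)result;
    return evaluated;
}
```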
<p>5.16p1 (conditional operator):</p>
<blockquote>
	<p>Conditional expressions group right-to-left. The first expression is implicitly 
	converted to <code>bool</code> (clause 4). It is evaluated and if it is
	<code>true</code>, the result of the conditional expression is the value of 
	the second expression, otherwise that of the third expression.
	<span class="deleted">All side effects of the first expression except for destruction 
	of temporaries (12.2) happen before the second or third expression is evaluated.</span> 
	Only one of the second and third expressions is evaluated.
	<span class="inserted">Every value computation and side effect associated with 
	the first expression is sequenced before every value computation and side effect 
	associated with the second or third expression.</span></p>
</blockquote>
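<p>For the conditional operator, the same guarantee might be illustrated as follows;
<code>pick</code> is a hypothetical name.</p>

```cpp
#include <cassert>

// The side effect of i++ in the first expression is complete before whichever
// of the second or third expressions is evaluated reads i.
int pick(int& i) {
    return (i++ == 0) ? i * 10 : -i;
}
```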
<p><a id="s5.17p1">5.17p1</a> (assignment and compound assignment operators):</p>
<blockquote>
	<p>The assignment operator (<code>=</code>) and the compound assignment operators 
	all group right-to-left. All require a modifiable lvalue as their left operand 
	and return <span class="deleted">an lvalue with the type and value of the left 
	operand after the assignment has taken place</span> <span class="inserted">an 
	lvalue referring to the left operand</span>. The result in all cases is a bit-field 
	if the left operand is a bit-field. <span class="inserted">In all cases, the 
	assignment is sequenced after the value computation of the right and left operands, 
	and before the value computation of the assignment expression. With respect 
	to an indeterminately-sequenced function call, the operation of a compound assignment 
	is a single evaluation. [ <em>Note:</em> Therefore, a function call shall not 
	intervene between the lvalue-to-rvalue conversion and the side effect associated 
	with any single compound assignment operator. &#8212;<em>end note</em> ]</span></p>
</blockquote>
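<p>The following sketch shows two consequences of the inserted sequencing rule, 
under the assumption that the implementation follows the proposed wording (as later 
adopted in C++11). The function names are hypothetical.</p>

```cpp
#include <cassert>

// Assignment groups right-to-left: a = (b = 7).
int chained(int& a, int& b) {
    a = b = 7;
    return a + b;
}

// The result of (a += 2) is an lvalue referring to a, and that assignment is
// sequenced before the value computation of the parenthesized expression, so
// the outer *= operates on the updated value.
int compound_on_result(int& a) {
    (a += 2) *= 3;
    return a;
}
```

<p>Under the old sequence-point rules the second function modified <code>a</code> 
twice without an intervening sequence point, which was undefined.</p>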
<p>5.18p1 (comma operator):</p>
<blockquote>
	<p>A pair of expressions separated by a comma is evaluated left-to-right and 
	the value of the left expression is discarded. The lvalue-to-rvalue (4.1), array-to-pointer 
	(4.2), and function-to-pointer (4.3) standard conversions are not applied to 
	the left expression. <span class="deleted">All side effects (1.9) of the left 
	expression, except for the destruction of temporaries (12.2), are performed 
	before the evaluation of the right expression.</span> <span class="inserted">
	Every value computation and side effect associated with the left expression 
	is sequenced before every value computation and side effect associated with 
	the right expression.</span> The type and value of the result are the type and 
	value of the right operand; the result is an lvalue if its right operand is 
	an lvalue, and is a bit-field if its right operand is an lvalue and a bit-field.</p>
</blockquote>
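<p>A brief sketch of the comma operator's ordering and of its lvalue result; the 
function names are hypothetical.</p>

```cpp
#include <cassert>

// The left operand's side effect is complete before the right operand is evaluated.
int add_then_double(int& i) {
    return (i += 5, i * 2);
}

// The result is an lvalue when the right operand is, so it can be assigned to.
void assign_to_right(int& i, int& j) {
    (++i, j) = 9;
}
```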
<h2><a id="temporaries">Sequencing for destruction of temporaries</a></h2>
<p>12.2p3:</p>
<blockquote>
	<p>When an implementation introduces a temporary object of a class that has 
	a non-trivial constructor (12.1, 12.8), it shall ensure that a constructor is 
	called for the temporary object. Similarly, the destructor shall be called for 
	a temporary with a non-trivial destructor (12.4). Temporary objects are destroyed 
	as the last step in evaluating the full-expression (1.9) that (lexically) contains 
	the point where they were created. This is true even if that evaluation ends 
	in throwing an exception. <span class="inserted">The value computations and 
	side effects of destroying a temporary object are associated only with the full-expression, 
	not with any specific subexpression.</span></p>
</blockquote>
<p>12.2p4:</p>
<blockquote>
	<p>There are two contexts in which temporaries are destroyed at a different 
	point than the end of the full-expression. The first context is when a default 
	constructor is called to initialize an element of an array. If the constructor 
	has one or more default arguments, <span class="inserted">the destruction of</span> 
	any <span class="deleted">temporaries</span> <span class="inserted">temporary</span> 
	created in <span class="deleted">the</span> <span class="inserted">a</span> 
	default argument <span class="deleted">expressions are destroyed immediately 
	after return from the constructor</span> <span class="inserted">expression is 
	sequenced before the construction of the next array element, if any</span>.</p>
</blockquote>
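<p>The array case can be traced with a small experiment. This is a sketch with hypothetical 
type and function names; it assumes an implementation that follows the wording above.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> array_events;  // hypothetical trace, in order of occurrence

struct DefaultArg {
    DefaultArg() { array_events.push_back("temp ctor"); }
    ~DefaultArg() { array_events.push_back("temp dtor"); }
};

struct Element {
    // Default constructor with a default argument that creates a temporary.
    Element(const DefaultArg& = DefaultArg()) { array_events.push_back("element ctor"); }
};

// Default-initializes a two-element array and returns the recorded trace.
std::vector<std::string> trace_array_construction() {
    array_events.clear();
    Element elements[2];
    (void)elements;
    return array_events;
}
```

<p>Each default-argument temporary is destroyed before the next element is constructed, 
rather than at the end of the declaration.</p>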
<p>12.2p5:</p>
<blockquote>
	<p>The second context is when a reference is bound to a temporary. The temporary 
	to which the reference is bound or the temporary that is the complete object 
	of a subobject to which the reference is bound persists for the lifetime of 
	the reference except as specified below. A temporary bound to a reference member 
	in a constructor&#8217;s ctor-initializer (12.6.2) persists until the constructor 
	exits. A temporary bound to a reference parameter in a function call (5.2.2) 
	persists until the completion of the full expression containing the call. A 
	temporary bound to the returned value in a function return statement (6.6.3) 
	persists until the function exits. <span class="deleted">In all these cases, 
	the temporaries created during the evaluation of the expression initializing 
	the reference, except the temporary to which the reference is bound, are destroyed 
	at the end of the full-expression in which they are created and in the reverse 
	order of the completion of their construction.</span> <span class="inserted">
	The destruction of a temporary whose lifetime is not extended by being bound 
	to a reference is sequenced before the destruction of any temporary which 
	is constructed earlier in the same full-expression.</span> If the lifetime of 
	two or more temporaries to which references are bound ends at the same point, 
	these temporaries are destroyed at that point in the reverse order of the completion 
	of their construction. In addition, the destruction of temporaries bound to 
	references shall take into account the ordering of destruction of objects with 
	static or automatic storage duration (3.7.1, 3.7.2); that is, if <code>obj1</code> 
	is an object with the same storage duration as the temporary and created before 
	the temporary is created the temporary shall be destroyed before <code>obj1</code> 
	is destroyed; if <code>obj2</code> is an object with the same storage duration as the temporary 
	and created after the temporary is created the temporary shall be destroyed 
	after <code>obj2</code> is destroyed. [ Example:</p>
</blockquote>
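<p>The ordering rules for temporaries in 12.2p3 and 12.2p5 can be traced as follows. 
This is a sketch only: <code>Tracer</code>, <code>use</code> and <code>trace_temporaries</code> 
are hypothetical, and the comma operator is used solely to fix the order in which 
the two unbound temporaries are constructed.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> events;  // hypothetical trace, in order of occurrence

struct Tracer {
    std::string name;
    explicit Tracer(const std::string& n) : name(n) { events.push_back("+" + name); }
    ~Tracer() { events.push_back("-" + name); }
};

int use(const Tracer& t) {
    events.push_back("use " + t.name);
    return 0;
}

std::vector<std::string> trace_temporaries() {
    events.clear();
    {
        const Tracer& kept = Tracer("bound");  // lifetime extended to that of kept
        // One full-expression containing two temporaries, constructed a then b.
        int r = (use(Tracer("a")), use(Tracer("b")));
        events.push_back("end of block");
        (void)kept;
        (void)r;
    }
    return events;
}
```

<p>The temporaries <code>a</code> and <code>b</code> are destroyed at the end of 
their full-expression in reverse order of construction, while the temporary bound 
to <code>kept</code> persists until the reference goes out of scope.</p>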
<h2><a id="miscellaneous">Fixes for miscellaneous sequencing issues</a></h2>
<p>3.6.2p1 (initialization of non-local objects):</p>
<blockquote>
	<p>Objects with static storage duration (3.7.1) shall be zero-initialized (8.5) 
	before any other initialization takes place. A reference with static storage 
	duration and an object of POD type with static storage duration can be initialized 
	with a constant expression (5.19); this is called <dfn>constant initialization</dfn>. 
	Together, zero-initialization and constant initialization are called <dfn>static 
	initialization</dfn>; all other initialization is <dfn>dynamic initialization</dfn>. 
	Static initialization shall be performed before any dynamic initialization takes 
	place. Dynamic initialization of an object is either ordered or unordered. Definitions 
	of explicitly specialized class template static data members have ordered initialization. 
	Other class template static data members (i.e., implicitly or explicitly instantiated 
	specializations) have unordered initialization. Other objects defined in namespace 
	scope have ordered initialization. Objects defined within a single translation 
	unit and with ordered initialization shall be initialized in the order of their 
	definitions in the translation unit. The order of initialization is unspecified 
	for objects with unordered initialization and for objects defined in different 
	translation units. <span class="inserted">An unordered initialization is indeterminately 
	sequenced with respect to every other dynamic initialization.</span> [ <em>Note:</em> 
	8.5.1 describes the order in which aggregate members are initialized. The initialization 
	of local static objects is described in 6.7. &#8212;<em>end note</em> ]</p>
</blockquote>
<p>8.5.1p17 (aggregate initialization); new paragraph:</p>
<blockquote>
	<p><span class="inserted">The full-expressions in an <var>initializer-clause</var> 
	are evaluated in the order in which they appear.</span></p>
</blockquote>
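<p>A small sketch of the guarantee in the new paragraph, assuming an implementation 
that follows it; the names are hypothetical.</p>

```cpp
#include <cassert>

int ticket = 0;
int next_ticket() { return ticket++; }

struct Pair {
    int first;
    int second;
};

// The initializers are evaluated in the order in which they appear, so
// first receives the earlier ticket.
Pair make_pair_in_order() {
    ticket = 0;
    Pair p = { next_ticket(), next_ticket() };
    return p;
}
```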
<p>12.6.2p3 (mem-initializers):</p>
<blockquote>
	<p>The <var>expression-list</var> in a <var>mem-initializer</var> is used to 
	initialize the base class or non-static data member subobject denoted by the
	<var>mem-initializer-id</var>. The semantics of a <var>mem-initializer</var> 
	are as follows:</p>
	<ul>
		<li>if the <var>expression-list</var> of the <var>mem-initializer</var> 
		is omitted, the base class or member subobject is value-initialized (see 
		8.5);</li>
		<li>otherwise, the subobject indicated by <var>mem-initializer-id</var> 
		is direct-initialized using <var>expression-list</var> as the <var>initializer</var> 
		(see 8.5).</li>
	</ul>
	<blockquote>
		<p><em>[unchanged example omitted]</em></p>
	</blockquote>
	<p><span class="deleted">There is a sequence point (1.9) after the initialization 
	of each base and member.</span> <span class="inserted">The initialization of 
	each base and member constitutes a full-expression.</span>
	<span class="deleted">The <var>expression-list</var> of</span>
	<span class="inserted">Any expression in</span> a <var>mem-initializer</var> 
	is evaluated as part of the <span class="deleted">initialization of the corresponding 
	base or member</span> <span class="inserted">full-expression that performs the 
	initialization</span>.</p>
</blockquote>
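<p>Because each base or member initialization constitutes a full-expression, a temporary 
created in one <var>mem-initializer</var> is destroyed before the next member is 
initialized. The following sketch traces this; all names are hypothetical.</p>

```cpp
#include <cassert>
#include <string>
#include <vector>

std::vector<std::string> init_events;  // hypothetical trace, in order of occurrence

struct Scratch {
    Scratch() { init_events.push_back("temp ctor"); }
    ~Scratch() { init_events.push_back("temp dtor"); }
    int value() const { return 1; }
};

// Records that an initializer ran and passes the value through.
int note(const std::string& what, int v) {
    init_events.push_back(what);
    return v;
}

struct Widget {
    int a;
    int b;
    // The Scratch temporary belongs to the full-expression that initializes a.
    Widget() : a(note("init a", Scratch().value())), b(note("init b", 2)) {}
};

std::vector<std::string> trace_member_init() {
    init_events.clear();
    Widget w;
    (void)w;
    return init_events;
}
```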
<p>14.2 (template arguments):</p>
<blockquote>
	<dl>
		<dt><var>template-argument:</var></dt>
		<dd><var><span class="deleted">assignment-expression</span>
		<span class="inserted">constant-expression</span></var></dd>
		<dd><var>type-id</var></dd>
		<dd><var>id-expression</var></dd>
	</dl>
</blockquote>
<h2><a id="loops">Semantics of some non-terminating loops</a></h2>
<!--
<p>Insert a new paragraph just before 6.5.1:</p>
<p>6.5p5:</p>
<blockquote class="inserted">
	<p>A non-terminating loops that occurs in a program with more than one thread, 
	and fails to perform an acquire operation after a finite number of initial operations, 
	has unspecified behavior. [ <em>Note:</em> This allows compilers to move assignments 
	above non-terminating loops under certain conditions, and allows some such erroneous 
	loops to be diagnosed. Intentionally infinite loops should contain an acquire 
	operation, such as accessing an atomic variable. &#8212;<em>end note</em> ]</p>
</blockquote>
-->
<p>Concern has been expressed about whether it is safe and legal for a compiler 
to optimize based on the assumption that a loop will terminate. The canonical example:</p>
<blockquote>
	<pre>for (T * p = q; p != 0; p = p-&gt;next)
<!--  -->    ++count;
<!--  -->x = 42;</pre>
</blockquote>
<p>Is it valid for the compiler to move the assignment to <code>x</code> above the 
loop? If the loop terminates, clearly yes, because the overall effect of the code 
doesn&#39;t change; furthermore, in the absence of synchronization, there is no guarantee 
that the assignment to <code>x</code> will not be visible to a different thread 
before any assignments to <code>count</code>.</p>
<p>But what if the loop doesn&#39;t terminate? For example, may a user assume that a 
non-terminating loop effects synchronization, and may therefore be used to prevent 
a data race? Clearly, a loop that contains any explicit synchronizations must be 
assumed to interact with a different thread, and a loop that contains a volatile 
access or a call to an I/O function must be assumed to interact with the environment, 
so optimization opportunities for such a loop are already limited. But what about 
a simple loop, as above?</p>
<p>If such a loop does not terminate, then clearly neither the loop itself nor any 
code following the loop can have any observable behavior. Moreover, as the &quot;least 
requirements&quot; refer to data written to files &quot;at program termination&quot;, the presence 
of a non-terminating loop may even nullify observable behavior preceding entry to 
the loop (for example, because of buffered output). For these reasons, there are 
problems with concluding that a strictly-conforming program can contain any non-terminating 
loop. We therefore conclude that a compiler is free to assume that a simple loop 
will terminate, and to optimize based on that assumption.</p>
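<p>For concreteness, the canonical example and the transformed version under discussion 
can be written as two functions. This is a sketch with a hypothetical <code>Node</code> 
type standing in for <code>T</code>; it only demonstrates the terminating case.</p>

```cpp
#include <cassert>

struct Node {
    Node* next;
};

int count = 0;
int x = 0;

// The loop as written in the example above.
void as_written(Node* q) {
    for (Node* p = q; p != 0; p = p->next)
        ++count;
    x = 42;
}

// The same code after the compiler moves the assignment above the loop.
void as_transformed(Node* q) {
    x = 42;
    for (Node* p = q; p != 0; p = p->next)
        ++count;
}
```

<p>When the list is finite, both functions leave <code>count</code> and <code>x</code> 
in the same state. They differ only if the list is circular, in which case neither 
version reaches any observable behavior after the loop.</p>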
<!--
<h2><a id="library">Library thread-safety</a></h2>
<p>Add a new section after 17.4.4.8, entitled &quot;Thread safety&quot;:</p>
<blockquote class="inserted">
	<p>Unless otherwise specified:</p>
	<ul>
		<li>Every data type (e.g. container) implemented by the library shall be 
		thread-safe in the same sense as an ordinary scalar object: The client must 
		ensure that an operation that logically updates an object is not executed 
		concurrently with another operation that reads or writes the same object. 
		The implementation must protect against accesses to shared data that do 
		not correspond to conflicting accesses at the abstract level, i.e. updates 
		that occur in response to logical &quot;read&quot; operations, or against accesses 
		to a data structure shared by multiple abstract objects. For example, implementations 
		of &quot;read operations&quot; that maintain an internal shared cache must use internal 
		synchronization mechanisms to protect that cache, as will any implementations 
		that maintain other forms of per class, as opposed to per object, data.</li>
		<li>Library calls do not introduce synchronizes-with relationships.</li>
		<li>Operations that allocate memory, such as <code>allocator&lt;T&gt;::allocate()</code>, 
		do not modify shared data. Hence they can be invoked concurrently from different 
		threads without introducing a data race. </li>
	</ul>
</blockquote>
-->

</body>

</html>
