<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
 "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<meta http-equiv="Content-Type" content="text/html;charset=utf-8">
<title>N3069: Various threads issues in the library (LWG 1151) (revised)</title>
</head>
<body>
<table summary="Identifying information for this document.">
	<tr>
                <th>Doc. No.:</th>
                <td>WG21/N3069<br>
		J16/10-0059</td>
        </tr>
	<tr>
		<th>Revision of:</th>
		<td>WG21/N3040 = J16/10-0030</td>
	</tr>
        <tr>
                <th>Date:</th>
                <td>2010-3-11</td>
        </tr>
        <tr>
                <th>Reply to:</th>
                <td>Hans-J. Boehm</td>
        </tr>
        <tr>
                <th>Phone:</th>
                <td>+1-650-857-3406</td>
        </tr>
        <tr>
                <th>Email:</th>
                <td><a href="mailto:Hans.Boehm@hp.com">Hans.Boehm@hp.com</a></td>
        </tr>
</table>
	
<H1>N3069: Various threads issues in the library (LWG 1151)</h1>
<P>
LWG 1151 (US 63) points out that we had made only an incomplete attempt at
dealing with threading issues throughout the library.  This is an attempt
to improve matters by addressing as many of the remaining issues as we
are currently aware of.  Based on experience with other languages,
it seems likely that more issues will arise in this area.  But it appears
to be getting harder to find the holes.
<P>
This paper relies on discussions with many people, including Lawrence Crowl,
Howard Hinnant, Peter Dimov, Nick Maclaren, and Paul McKenney.  The last
section, including the solution, was mostly copied from LWG issue 1329,
which was authored by Jeffrey Yasskin.

<H2>Communication through I/O</h2>
<P>
27.4p4 [iostreams.objects] currently specifies:

<BLOCKQUOTE>
<P>
Concurrent access to a synchronized (27.5.2.4) standard iostream object&rsquo;s
formatted and unformatted input (27.7.1.1) and output (27.7.2.1) functions or a
standard C stream by multiple threads shall not result in a data race (1.10).
[ Note: users must still synchronize concurrent use of these objects and streams
by multiple threads if they wish to avoid interleaved characters. -end note ]
</blockquote>
<P>
This raises the question of whether I can use iostreams to synchronize threads.
If I set <I>X</i> to 1 in thread 1 and then write to a stream <I>f</i>, and
thread 2 reads the written value (or one that depends on it), is thread 2
guaranteed to see the write to <I>X</i> (or a later one)?
<P>
The above paragraph appears to apply only to the standard iostream objects.
Concurrent access to any other stream already produces a data race, so
the question does not arise there, which makes it a bit unclear whether
it arises at all.  Based on discussions in the concurrency group, there still
appear to be fairly esoteric ways to use I/O for thread communication,
either by rebinding the standard streams to files, or through the
C library.  Hence we probably do need to address the issue, in spite
of statements in N3040 to the contrary.
<P>
We should be consistent with the
"sequential consistency for data-race-free programs" rule.  Anything that
does not introduce a race should ensure the appropriate happens-before
relationship unless weaker memory ordering is explicitly specified.
Thus, in the above example case, thread 2 should be guaranteed
to see the write to <I>X</i>. 
<P>
<B>Proposed resolution:</b>
<P>
Insert a second paragraph into [iostreams.threadsafety], after 27.2.3p1:
<BLOCKQUOTE>
	<P><INS>
		If one thread makes a library call <I>a</i> that writes a
		value to a stream, and as a result another thread
		reads this value from the stream through a library call
		<I>b</i> such that this does not result in a data race,
		then <I>a</i> happens before <I>b</i>.
</ins></blockquote>

<H2>Additional non-modifying non-const member functions</h2>
<P>
A number of standard container member functions are treated as const
for race detection, in spite of the fact that they are not const
functions.  The <TT>at()</tt> function was inadvertently omitted from this
list.
<P>
During LWG discussions, Alisdair pointed out that "unordered associative
containers" are not "associative containers", and should be listed
separately.
<P>
<B>Proposed Resolution:</b>
<P>
Edit 23.2.2p1 [container.requirements.dataraces] as follows:

<BLOCKQUOTE>
<P>
For purposes of avoiding data races (17.6.4.8), implementations
shall consider the following functions to be
const: <TT>begin</tt>, <TT>end</tt>, <TT>rbegin</tt>, <TT>rend</tt>,
<TT>front</tt>, <TT>back</tt>, <TT>data</tt>, <TT>find</tt>,
<TT>lower_bound</tt>, <TT>upper_bound</tt>, <TT>equal_range</tt><INS>,
<TT>at</tt></ins>
and, except in associative <INS>or unordered associative</ins> containers,
<TT>operator[]</tt>.
</blockquote>
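<P>
As an illustration of what the edited paragraph permits, here is a minimal
sketch in which two threads call non-const member functions on the same
non-const <TT>vector</tt> without synchronization.  The function name
<TT>read_concurrently</tt> is hypothetical and not part of any proposal;
the point is only that <TT>begin()</tt> and <TT>at()</tt> are to be treated
as reads for race detection.

```cpp
#include <cassert>
#include <thread>
#include <vector>

// Hypothetical helper: both threads call non-const members that
// 23.2.2p1 (as edited) treats as const for race detection, so the
// calls do not conflict with each other.
int read_concurrently(std::vector<int>& v) {
    assert(v.size() >= 2);  // at(1) below needs at least two elements
    int first = 0;
    int second = 0;
    std::thread t1([&] { first = *v.begin(); });  // non-const begin()
    std::thread t2([&] { second = v.at(1); });    // non-const at()
    t1.join();
    t2.join();
    // Each thread wrote a different local, and join() orders those
    // writes before the reads below.
    return first + second;
}
```

<P>
Calling this with a vector holding 10, 20, 30 would return 30.  The same
code with <TT>operator[]</tt> on a <TT>map</tt> would not be covered, since
that call may insert an element.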

<H2>More precise definition of "atomic"</h2>
<P>
The current spec is not very clear about what it means for an "atomic
read-modify-write" operation to be "atomic".  The following is intended
as a clarification:
<P>
<B>Proposed Resolution:</b>
<P>
Before 29.3p11 [atomics.order] add a paragraph:
<BLOCKQUOTE>
<P>
<INS>
Atomic read-modify-write operations shall always read the last value (in
the modification order) written before the write associated with the
read-modify-write operation.
</ins>
</blockquote>
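<P>
The practical consequence of this wording can be sketched with a shared
counter.  Because each <TT>fetch_add</tt> reads the value immediately
preceding its own write in the modification order, no increment can be lost,
even with relaxed memory ordering.  The function name and parameters below
are hypothetical and chosen only for illustration.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical helper: several threads increment one atomic counter.
// Each fetch_add is a read-modify-write, so it must read the value
// written just before its own write in the modification order.
int count_with_fetch_add(int num_threads, int increments_per_thread) {
    std::atomic<int> counter(0);
    std::vector<std::thread> workers;
    for (int t = 0; t < num_threads; ++t) {
        workers.emplace_back([&counter, increments_per_thread] {
            for (int i = 0; i < increments_per_thread; ++i) {
                // Relaxed ordering: atomicity of the update itself
                // does not depend on the memory order argument.
                counter.fetch_add(1, std::memory_order_relaxed);
            }
        });
    }
    for (auto& w : workers) {
        w.join();
    }
    return counter.load();
}
```

<P>
With 4 threads and 10000 increments each, the result is exactly 40000.  A
separate <TT>load</tt> followed by a <TT>store</tt> would carry no such
guarantee.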

<H2>Iterator issues</h2>
<P>
The current draft is unclear about when iterator operations may conflict with
accesses to other iterators or to the underlying container.
<P>
<B>Proposed resolution:</b>
<P>
Add the following paragraph after 17.6.4.8p5 [res.on.data.races]:
<BLOCKQUOTE>
<P>
<INS>
Operations on 
iterators obtained by calling a standard library container
or string member function may
access the underlying container, but shall not modify it.
[<I>Note:</i> In particular, container operations that invalidate iterators
conflict with operations on iterators associated with that container.
<I> -- end note.</i>]
</ins></blockquote>
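<P>
To make the note concrete, here is a minimal sketch of one way a program
might avoid the conflict: every <TT>push_back</tt>, which can invalidate
iterators, and every iterator operation is performed under the same mutex.
The helper <TT>sum_while_appending</tt> is hypothetical; a real program
might choose a different locking granularity.

```cpp
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper: one thread appends while another iterates.
// push_back may reallocate and so conflicts with any iterator use;
// here both sides take the same mutex to order the accesses.
long sum_while_appending(std::vector<int>& v, int extra) {
    std::mutex m;
    long sum = 0;
    std::thread writer([&] {
        for (int i = 0; i < extra; ++i) {
            std::lock_guard<std::mutex> lock(m);
            v.push_back(1);  // may invalidate all iterators into v
        }
    });
    std::thread reader([&] {
        std::lock_guard<std::mutex> lock(m);
        // The iterators are obtained and used entirely under the lock,
        // so no invalidating operation can run in between.
        for (auto it = v.begin(); it != v.end(); ++it) {
            sum += *it;
        }
    });
    writer.join();
    reader.join();
    return sum;
}
```

<P>
The returned sum depends on how many appends ran before the reader took the
lock, so only its range is predictable.  Holding an iterator across an
unlocked region would reintroduce the conflict.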
<P>
There was a question about whether the race behavior of algorithms
such as <TT>equal()</tt> is sufficiently described.  <TT>equal()</tt>
appears to be adequately described, since it uses InputIterators, implying
that, according to 17.6.4.8p5 [res.on.data.races], it is not allowed to update
anything through the iterators.  Unfortunately, many of the algorithms
use ForwardIterators only for input, which means they are underspecified.
An implementation currently could read and write back the objects
referenced by the iterators, which is clearly unacceptable.
<P>
<B>Proposed resolution:</b>
<P>
Add after 25.1p3 [algorithms.general]:

<BLOCKQUOTE>
<P>
<INS>
For purposes of determining the existence of data races,
algorithms shall not modify objects referenced through an iterator argument
unless the specification requires such modification.
</ins></blockquote>
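<P>
As a sketch of the kind of code this wording is meant to make safe, the
following runs two algorithms that take ForwardIterators purely for input
over the same non-const list at the same time.  The names
<TT>ScanResult</tt> and <TT>scan_concurrently</tt> are hypothetical.

```cpp
#include <algorithm>
#include <cassert>
#include <forward_list>
#include <thread>

// Hypothetical result type for the illustration.
struct ScanResult {
    int max_value;
    bool has_adjacent_duplicate;
};

// Both algorithms take ForwardIterators but only read through them,
// so under the proposed wording they may not modify the elements and
// the two calls do not race with each other.
ScanResult scan_concurrently(std::forward_list<int>& values) {
    assert(!values.empty());  // max_element result is dereferenced
    int max_value = 0;
    bool has_duplicate = false;
    std::thread t1([&] {
        max_value = *std::max_element(values.begin(), values.end());
    });
    std::thread t2([&] {
        has_duplicate =
            std::adjacent_find(values.begin(), values.end()) != values.end();
    });
    t1.join();
    t2.join();
    return ScanResult{max_value, has_duplicate};
}
```

<P>
For the list 3, 7, 7, 2 this yields a maximum of 7 and reports an adjacent
duplicate.  Without the added paragraph, an implementation that wrote each
element back after reading it would make this program racy.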
<P>
Howard Hinnant points out that this is not as clear as it should be
without more context.  I'm also not sure that it is sufficient to
deal with something like <TT>nth_element()</tt>, which has a specification that
already seems less clear than I would like about what is being modified.
But it was decided that this is a clear improvement over what we have.

<H2>Constructors and destructors of synchronization objects can race</h2>
<P>
This arose during discussion of LWG 1218.  LWG 1221 is also related.
<P>
I think we all agree that constructors and destructors of all library
objects, including synchronization objects like mutexes, are not protected
against data races.  It is the programmer's responsibility to ensure
that construction happens before any other use, and any other use
happens before destruction.
<P>
The one exception here seems to be the somewhat more liberal condition
variable rules, which allow a wait to complete after a condition variable
has been destroyed.
<P>
There seems to be some agreement that the standard mostly says all of this
already.  In particular, the object lifetime rules in 3.8 seem to
essentially state this, assuming 3.8 is interpreted as referring to
happens before (1.10) ordering.  Unfortunately, that's not as clear
as it should be.
<P>
I believe the CWG needs to add something like
the following at the beginning of 3.8 [basic.life].  This will become
a CWG issue.
<B>We are NOT requesting that this be added as part of this
proposal.</b>  However, we will assume an interpretation along these
lines, and believe that the general intent here is uncontroversial.
Possible addition to 3.8:
<BLOCKQUOTE>
<P>
All statements about the ordering of evaluations in this section,
using words like "before", "after", and "during", refer to the
happens before order defined in 1.10 [intro.multithread].  [Note: We
ignore situations in which evaluations are unordered by happens
before, since these require a data race (1.10 [intro.multithread]), which already
results in undefined behavior -- end note]
</blockquote>
<P>
We do propose to make the following related
clarification to the library description:
<P>
<B>Proposed resolution:</b>
<P>
Add the following as a second paragraph to
17.6.3.10 [res.on.objects]:
<BLOCKQUOTE>
<P>
<INS>
[<I>Note:</i>
In particular, the program is required to ensure that completion
of the constructor
for any standard-library-defined class happens before any other member function
invocation on that class object, and
unless otherwise specified, to ensure that completion of
any member function invocation, other than destruction, on a class object
happens before destruction of that object.  This applies even
to objects, such as mutexes, intended for thread synchronization.
<I>-- end note.</i>]
</ins></blockquote>
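<P>
A minimal sketch of a program that meets this requirement for a mutex
follows.  The thread constructor orders the mutex's construction before
any use by the new threads, and <TT>join()</tt> orders every use before
the destructor.  The function name <TT>guarded_increments</tt> is
hypothetical.

```cpp
#include <mutex>
#include <thread>

// Hypothetical helper showing the required ordering around a mutex:
// construction happens before all uses, all uses happen before
// destruction.
int guarded_increments(int per_thread) {
    int total = 0;
    {
        std::mutex m;  // fully constructed before either thread starts
        auto work = [&] {
            for (int i = 0; i < per_thread; ++i) {
                std::lock_guard<std::mutex> lock(m);
                ++total;
            }
        };
        std::thread a(work);
        std::thread b(work);
        a.join();
        b.join();
        // Both joins have returned, so every lock and unlock call
        // happens before the closing brace below.
    }  // m is destroyed here, after all other uses
    return total;
}
```

<P>
Destroying <TT>m</tt> before the joins, or publishing a pointer to it
through a plain non-atomic variable while it is still being constructed,
would violate the rule described in the note.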
<P>
I believe condition variables are the only intended exception to this.
However, the current precondition, as modified by LWG 1221, does not
make this clear.  It states "There shall be no thread blocked on *this",
which is entirely redundant with the default lifetime rules, which
would require that any wait calls happen before destruction.
<P>
<B>Proposed resolution:</b>
<P>
Add after 30.5.1p4 [thread.condition.condvar], whether
or not updated by LWG 1221, and after 30.5.2p3:
<BLOCKQUOTE>
<P>
<INS>
This relaxes the usual rules, which would have
required all wait calls to happen before destruction.  Only the
notifications to unblock the wait must happen before destruction.
</ins></blockquote>
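<P>
The relaxed rule can be illustrated with the following sketch, which is an
assumption about how a conforming program might look rather than a
recommended pattern.  The notifier destroys the condition variable right
after notifying, while the waiter may still be returning from
<TT>wait()</tt>.  The waiter uses an explicit loop so that it never starts
a new wait once <TT>ready</tt> is set.  The function name
<TT>notify_then_destroy</tt> is hypothetical.

```cpp
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical helper: destroy a condition variable after notifying,
// without first waiting for the waiter to return from wait().
bool notify_then_destroy() {
    std::mutex m;  // outlives the waiter, which must reacquire it
    bool ready = false;
    bool woke = false;
    std::unique_ptr<std::condition_variable> cv(new std::condition_variable);

    std::thread waiter([&] {
        std::unique_lock<std::mutex> lock(m);
        // Explicit loop: cv is only touched while ready is false,
        // that is, before the notifier below has run.
        while (!ready) {
            cv->wait(lock);
        }
        woke = true;
    });

    {
        std::lock_guard<std::mutex> lock(m);
        ready = true;
        cv->notify_all();  // the notification happens before destruction
    }
    cv.reset();  // the waiter may still be inside wait() at this point

    waiter.join();
    return woke;
}
```

<P>
Under the usual lifetime rules the <TT>reset()</tt> would have to follow
the join.  Note that the mutex gets no such relaxation and must still
outlive the waiter.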
<H2>Bitset, vector&lt;bool&gt; (LWG 1329)</h2>
<P>
The common implementation of <tt>vector&lt;bool&gt;</tt> is as an
unsynchronized bitfield.  The addition of 23.2.2 [container.requirements.dataraces]/2 would require either a
change in representation or a change in access synchronization, both of
which are undesirable with respect to compatibility and performance.
</p>
<P>
The <TT>bitset</tt> has a conceptually similar issue.  Unfortunately,
it appears easier to resolve a bit differently.

<p><b>Proposed resolution:</b></p>
<p>
Modify 23.2.2 [container.requirements.dataraces]:
</p>

<p>
Edit paragraph 2 as follows:
</p>

<blockquote>
<P>
2 Notwithstanding (17.6.4.8), implementations are required to avoid data
races when the contents of the contained object in different elements in
the same sequence<ins>, excepting <code>vector&lt;bool&gt;</code>,</ins>
are modified concurrently.
</blockquote>

<p>
Edit paragraph 3 as follows:
</p>

<blockquote>
<P>	
3 [<i>Note:</i>
For a <code>vector&lt;int&gt; x</code> with a size greater than one,
<code>x[1] = 5</code> and <code>*x.begin() = 10</code>
can be executed concurrently without a data race,
but <code>x[0] = 5</code> and <code>*x.begin() = 10</code>
executed concurrently may result in a data race.
<ins>As an exception to the general rule,
for a <code>vector&lt;bool&gt; y</code>,
<code>y[0] = true</code> may race with <code>y[1] = true</code>.</ins>
&mdash;<i>end note</i>]
</blockquote>
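<P>
The difference described in the note can be sketched as follows.  Distinct
elements of a <TT>vector&lt;int&gt;</tt> are written without
synchronization, while writes to distinct elements of a
<TT>vector&lt;bool&gt;</tt> are serialized with a mutex, since they may
share a word in the common packed representation.  The helper name
<TT>write_neighbors</tt> and the choice of a mutex are assumptions made for
illustration.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper contrasting vector<int> with vector<bool>.
void write_neighbors(std::vector<int>& x, std::vector<bool>& y) {
    assert(x.size() >= 2 && y.size() >= 2);
    std::mutex y_mutex;  // serializes all writes to y
    std::thread t1([&] {
        x[0] = 5;  // distinct element of vector<int>: no lock needed
        std::lock_guard<std::mutex> lock(y_mutex);
        y[0] = true;  // may share storage with y[1], so lock
    });
    std::thread t2([&] {
        x[1] = 10;  // distinct element of vector<int>: no lock needed
        std::lock_guard<std::mutex> lock(y_mutex);
        y[1] = true;
    });
    t1.join();
    t2.join();
}
```

<P>
After the call, <TT>x</tt> holds 5 and 10 and both elements of <TT>y</tt>
are true.  Removing the mutex would leave the <TT>vector&lt;int&gt;</tt>
writes well defined but could make the <TT>vector&lt;bool&gt;</tt> writes
race.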

<P>Add at the end of the returns clause for bitset operator[]
(20.5.2p57 [bitset.members]):

<BLOCKQUOTE>
<P>
<INS>
For the purpose of determining the presence of a data race (1.10)
any access or update through the resulting reference potentially
accesses or modifies, respectively, the entire underlying bitset.
</ins></blockquote>

</body>
</html>
