<html>
<head>
<title>Forward progress guarantees for Parallelism TS features</title>

<style type="text/css">
  ins { text-decoration:none; font-weight:bold; background-color:#A0FFA0 }
  .new { text-decoration:none; font-weight:bold; background-color:#D0FFD0 }
  .rule { background-color:#FFFF40 }
  .ex { background-color: #D0FFF0 }
  del { text-decoration:line-through; background-color:#FFA0A0 }  
  strong { font-weight: inherit; color: #2020ff }
</style>
</head>

<body>
<table>
<tr><td>Document number:</td> <td>P0299R1</td></tr>
<tr><td>Date:</td> <td>2016-06-24</td></tr>
<tr><td>Author:</td> <td>Torvald Riegel &lt;triegel@redhat.com&gt;</td></tr>
<tr><td>Audience:</td> <td>LWG</td></tr>
</table>

<h1>P0299R1: Forward progress guarantees for the Parallelism TS features</h1>

<p>This paper contains wording that clarifies the forward progress requirements
in the Parallelism TS.  See P0072R1 for background.</p>

<h2>Wording</h2>

<p>First, add the following paragraphs after the paragraphs added by P0296R2 to
1.10.2.

<blockquote class=new>
<p>For a thread of execution providing <em>parallel forward progress
guarantees</em>, the implementation is not required to ensure that the thread
will eventually make progress if it has not yet executed any execution step;
once this thread has executed a step, it provides concurrent forward progress
guaranteees.

<p>[Note: This does not specify a requirement for
when to start this thread of execution, which will typically be specified
by the entity that creates this thread of execution. For example, a thread of
execution that provides concurrent forward progress guarantees and executes
tasks from a set of tasks in an arbitrary order, one after the other,
satisfies the requirements of parallel forward progress for these tasks. ---
end note]

<p>For a thread of execution providing <em>weakly parallel forward progress
guarantees</em>, the implementation does not ensure that the thread will
eventually make progress.

<p>[Note: Threads of execution providing weakly parallel forward progress
guarantees cannot be expected to make progress regardless of whether
other threads make progress or not; however, blocking with forward
progress guarantee delegation, as defined below, can be used to ensure that such
threads of execution make progress eventually. --- end note]

<p>Concurrent forward progress guarantees are stronger than parallel forward
progress guarantees, which in turn are stronger than weakly parallel forward
progress guarantees. [Note: For example, some
kinds of synchronization between threads of execution may only make
progress if the respective threads of execution provide parallel forward
progress guarantees, but will fail to make progress under weakly parallel
guarantees. --- end note]

<p>When a thread of execution P is specified to <em>block with forward
progress guarantee delegation</em> on the completion of a set S of threads of
execution, then throughout the whole time of P being blocked on S,
the implementation shall ensure that the forward progress guarantees provided
by at least one thread of execution in S is at least as strong as P's forward
progress guarantees. [Note:
It is unspecified which thread or threads of execution in S are chosen and for which number of
execution steps. The strengthening is not permanent and not necessarily
in place for the rest of the lifetime of the affected thread of execution.
As long as P is blocked, the implementation has to eventually select and potentially
strengthen a thread of execution in S. --- end note]
Once a thread of execution in S terminates, it is removed from S. Once S is
empty, P is unblocked.

<p>[Note: A thread of execution B thus can temporarily provide an effectively
stronger forward progress guarantee for a certain amount of time, due to
a second thread of execution A being blocked on it with forward progress
guarantee delegation. In turn, if B then blocks with forward progress guarantee
delegation on C, this may also temporarily provide a stronger forward progress
guarantee to C.--- end note]

<p>[Note: If all threads of execution in S finish executing (e.g., they
terminate and do not use blocking synchronization incorrectly), then P's
execution of the operation that blocks with forward progress guarantee
delegation will not result in P's progress guarantee being effectively
weakened. --- end note]

<p>[Note: This does not remove any constraints regarding blocking
synchronization for threads of execution providing parallel or weakly parallel
forward progress guarantees because the implementation is not required to
strengthen a particular thread of execution whose too-weak progress guarantee
is preventing overall progress. --- end note]
</blockquote>

<p>Next, adapt 25.2.3 as follows.

<p>Generally, to avoid misinterpretations, replace all occurrences of ``thread''
with ``thread of execution'' (see N4231 for background); given that this is a
mechanical change, these changes are not spelled out.

<p>Specify which forward progress guarantees are provided by threads of
execution supplied by the library by changing 25.2.3p3 and 25.2.3p4:

<blockquote>
<p>The invocations of element access functions in parallel algorithms invoked
with an execution policy object of type <code>parallel_execution_policy</code>
are permitted to execute in an unordered fashion in either the invoking thread
<ins>of execution </ins>or in a thread <ins>of execution </ins>implicitly created by the library to
support parallel algorithm execution<ins>. If the threads of execution created by
<code>std::thread</code> provide concurrent forward progress guarantees
(see 1.10.2 [intro.progress]), then
a thread of execution implicitly created by the library will provide parallel
forward progress guarantees; otherwise, the provided
forward progress guarantee is implementation-defined</ins>. Any such invocations executing in the same thread
of execution are indeterminately sequenced with respect to each other. [Note:
It is the caller's responsibility to ensure correctness, for example that the
invocation does not introduce data races or deadlocks. --- end note]

<p><em>[ No change to the examples ]</em></p>

<p>The invocations of element access functions in parallel algorithms invoked
with an execution policy of type <code>parallel_vector_execution_policy</code>
are permitted to execute in an unordered fashion in unspecified threads<ins> of
execution</ins>, and unsequenced with respect to one another within each thread<ins> of
execution. These threads of execution are either the invoking thread of
execution or threads of execution implicitly created by the library; the latter will provide
weakly parallel forward progress guarantees.</ins> [Note: This means that multiple
function object invocations may be interleaved on a single thread of
execution. --- end note]

<p>[Note: This overrides the usual guarantee from the C++ standard, Section 1.9
[intro.execution] that function executions do not interleave with one another.
--- end note]

<p>Since <code>parallel_vector_execution_policy</code> allows the execution of
element access functions to be interleaved on a single thread<del>,</del><ins> of execution,
blocking</ins> synchronization, including the use of mutexes, risks deadlock. Thus<ins>,</ins>
the synchronization with <code>parallel_vector_execution_policy</code> is restricted as
follows:

<p>A standard library function is <em>vectorization-unsafe</em> if it is specified
to synchronize with another function invocation, or another function invocation
is specified to synchronize with it, and if it is not a memory allocation or
deallocation function. Vectorization-unsafe standard library functions may not
be invoked by user code called from <code>parallel_vector_execution_policy</code>
algorithms.

<p>[Note: Implementations must ensure that internal synchronization inside
standard library routines does not <del>induce deadlock</del><ins>prevent forward progress when those
routines are executed by threads of execution with weakly parallel forward
progress guarantees</ins>. This can be achieved by either avoiding synchronization
incompatible with those guarantees, or by executing these routines
differently. --- end note]
</blockquote>

Insert a new paragraph before paragraph 6:

<blockquote class=new>
<p> If an invocation of a parallel algorithm uses threads of execution
implicitly created by the library, then the invoking thread of execution will
either (1) temporarily block with forward progress guarantee delegation
(see 1.10.2 [intro.progress])
on the completion of these library-managed threads of execution,
or (2) eventually execute an element access function; the thread will
continue to do so until the algorithm is finished. [Note: In blocking with
forward progress guarantee delegation in this context, a thread of execution
created by the library is considered to have finished execution as soon as it
has finished the execution of the particular element access function that the
invoking thread of execution logically depends on. ---end note]
</blockquote>

</body>
</html>