<!DOCTYPE html>

<html>
<head>
    <meta charset="utf-8">
    <title>Better Integration of Sender Executors</title>
    <style type="text/css">
    body
    {
        font-size: 10pt;
        font-family: sans-serif;
    }
    code
    {
        display: block;
        white-space: pre;
        margin-top: -1.00em;
        margin-left: 2.0em;
    }
    p
    {
        text-align: justify
    }
    p > code
    {
        margin-top: initial;
        margin-left: initial;
    }
    li {
        text-align: justify
    }
    ol.p
    {
        margin-left: -2.0em;
    }
    blockquote.note
    {
        background-color:#E0E0E0;
        padding-left: 15px;
        padding-right: 15px;
        padding-top: 1px;
        padding-bottom: 1px;
    }
    p.note
    {
        background-color:#E0E0E0;
        padding-left: 15px;
        padding-right: 15px;
        padding-top: 1px;
        padding-bottom: 1px;
    }
    ins { background-color: #CCFFCC; }
    del { background-color: #FFCCCC; }
    .insert { background-color: #CCFFCC; }
    address {text-align:right;}
    h1 {text-align:center;}
    span.comment {color:#C80000;}
    table {
        border-collapse: collapse;
    }
    td, th {
        border: 1px solid silver;
    }
    </style>
</head>
<body>
<pre>
Doc. no:  P1349R0
Audience: LEWG, SG1
Date:     2018-11-06
Reply-To: Vinnie Falco (<a href="mailto:vinnie.falco@gmail.com">vinnie.falco@gmail.com</a>)
</pre>
<hr>
<h1>Better Integration of Sender Executors</h1>



<h2>Contents</h2>
<ol>
<li><a href="#overview">Overview</a></li>
<li><a href="#motivation">Motivation and Scope</a></li>
<li><a href="#impact">Impact On the Standard</a></li>
<li><a href="#discussion">Discussion</a></li>
<li><a href="#changes">Proposed Changes</a></li>
<li><a href="#credits">Acknowledgements</a></li>
<li><a href="#references">References</a></li>
</ol>

<!----------------------------------------------------------------------------->

<h2 id="overview">1. Overview</h2>
<p>
This document proposes changes to P1194r0 [1] which allow it to better
integrate with the proposed Executors design described in P0443r9 [2].
</p>

<!----------------------------------------------------------------------------->

<h2 id="motivation">2. Motivation and Scope</h2>
<p>
P1194r0 describes a new executor interface allowing the implementation of
Future-like asynchrony without the overhead of synchronization typically
required when <tt>std::promise::set_value</tt> may be called concurrently
with <tt>std::future::get</tt>. We fully support this goal, and assume
that the paper delivers on the claim that the proposed executor interfaces
<tt>make_value_task</tt> and <tt>make_bulk_value_task</tt> achieve it.
</p>
<p>
However, the paper also proposes harmful changes to P0443r9 which affect
performance, and proposes an integration into P0443r9 which conflicts
with the design of Executors. In this paper we highlight the problems with
P1194r0 stemming from a misunderstanding of Executors, roll back the harmful
changes, and propose a clean way to add lazy executors on top of P0443r9.
</p>

<!----------------------------------------------------------------------------->

<h2 id="impact">3. Impact on the Standard</h2>
<p>
This is a pure library proposal. It does not add any new language features,
nor does it alter any existing standard library headers. However, this
library also requires the library features offered in P0443r9.
</p>

<!----------------------------------------------------------------------------->

<h2 id="discussion">4. Discussion</h2>
<p>
In this section we review the problematic passages from P1194r0 by quoting each
passage and describing the issues. We assume that readers are already familiar
with P0443r9 and P1194r0.
</p>
<ul>
<li><em>"It achieves this by replacing P0443r9’s six <tt>Executor::*execute</tt>&hellip;"</em>
    </li>
    <p>
        The <em>Executor</em> concept does not prescribe any execution functions,
        it only specifies the base requirements of <em>CopyConstructible</em>,
        <em>Destructible</em>, and <em>EqualityComparable</em>. The execution
        functions referred to by P1194 are actually part of the interface of
        each interface concept. P0443r9 defines two interface concepts:
        <em>OneWayExecutor</em> and <em>BulkOneWayExecutor</em>. We note that
        these interface concepts are orthogonal. A particular executor can
        support zero, one, or both concepts. An executor is not required to
        implement all interfaces.
    </p>
<li><em>"Note: This paper only seeks to add support for lazy task composition
    and execution to P0443 while maintaining feature parity with P0443."
    </em></li>
    <p>
        The paper (P1194r0) goes beyond adding support for lazy task
        composition and attempts to simplify the existing interface concepts
        by expressing them in terms of a "grand unified theory" of execution
        which uses lazy execution as the basis operations.
    </p>
<li><em>"The <tt>then_execute</tt> API on the <em>Executor</em> concept is
    renamed <tt>make_value_task</tt>, and <tt>bulk_then_execute</tt> is renamed
    <tt>bulk_make_value_task</tt>."
    </em></li>
    <p>
        As stated earlier, the <em>Executor</em> concept does not specify any
        execution functions. The <tt>then_execute</tt> and <tt>bulk_then_execute</tt>
        interfaces are actually part of the <em>ThenExecutor</em> and
        <em>BulkThenExecutor</em> concepts which were removed prior to P0443r9
        and placed into P1244r0 [3].
    </p>
<li><em>"The following Executor APIs are removed: <tt>execute</tt>,
    <tt>twoway_execute</tt>, <tt>bulk_execute</tt>, and <tt>bulk_twoway_execute</tt>."
    </em></li>
    <p>
        As stated earlier, the <em>Executor</em> concept does not specify any
        execution functions. The interfaces <tt>exceute</tt> and <tt>bulk_execute</tt>
        are part of the <em>OneWayExecutor</em> and <em>BulkOneWayExecutor</em>
        concepts in P0443r9. The <tt>twoway_execute</tt> and <tt>bulk_twoway_execute</tt>
        are part of the <em>TwoWayExecutor</em> and <em>BulkTwoWayExecutor</em>
        concepts which were removed prior to P0443r9 and placed into P1244r0.
    </p>
<li><em>"&hellip;the Executor concept gets a <tt>submit</tt> API&hellip;"
    </em></li>
    <p>
        As stated earlier, the <em>Executor</em> concept only contains
        requirements which are common to all interface concepts. Forcing
        all executors, current and future, to implement <tt>submit</tt> is
        an overly broad and unnecessary requirement to implementing lazy
        execution.
    </p>
<li><em>"We do not expect any performance difference between the two forms of
    the one way execute definition&hellip;"
    </em></li>
    <p>
        This statement is partly false. A performance difference appears
        when a lazy executor is type-erased using the polymorphic wrapper, and
        used to simulate the behavior of the <em>OneWayExecutor</em>. It is
        the author's view that the use of a type-erased lazy executor to
        perform a one way execution cannot avoid incurring an additional memory
        allocation.
    </p>
</ul>
<p>
    While the paper continues on to refer incorrectly to interfaces of the
    <em>Executor</em> concept which do not exist, we refer to the statements
    above for why these references are incorrect.
</p>
<p>
    A stated goal of P1194r0 is to simplify the fundamental concepts involved
    in asynchronous execution. In addition to performance considerations, we
    believe the proposed change of basis operations actually makes
    user-defined executors more difficult to write. Feedback from SG1 during
    Rapperswil anticipated a "zoo of [user-defined] executors," a position
    with which we agree. The <em>OneWayExecutor</em> concept in P0443r9 is
    much simpler to implement than a sender executor concept, as can be seen
    by comparing two hypothetical implementations:
</p>
<blockquote><pre><tt>
//
// Models OneWayExecutor
//
struct inline_executor
{
  friend bool operator==(const inline_executor&, const inline_executor&amp;) noexcept
  {
    return true;
  }

  friend bool operator!=(const inline_executor&, const inline_executor&amp;) noexcept
  {
    return false;
  }

  template &lt;class Function&gt;
  void execute(Function f) const noexcept
  {
    f();
  }
};

//
// Models SenderExecutor
//
// Note: This implementation should be taken with a grain
//       of salt, as the specification in P1196 is insufficient
//       to produce a complete, working implementation.
//
struct inline_executor {

  template &lt;class F, class R&gt;
  struct __inline_receiver {
    F f_;
    R r_;
    template &lt;class... Args&gt;
    void set_value(Args&amp;&amp;... args) {
      std::move(r_).set_value(
        std::move(f_)(std::forward&lt;Args&gt;(args)...)
      );
    }
    template &lt;class E&gt;
    void set_error(E&amp;&amp; e) {
      std::move(r_).set_error(std::forward&lt;E&gt;(e));
    }
    void set_done() {
      std::move(r_).set_done();
    }

    static constexpr void query(std::experimental::execution::receiver_t) { }
  };

  template &lt;class T, class E=std::exception_ptr&gt;
  struct __subject {
    // Ed: omitted for brevity
    // ...
  };

  template &lt;class S, class F&gt;
  struct __task_submit_fn {
    S s_;
    F f_;

    template &lt;typename... Values&gt;
    struct _value_types_helper {
      using type = std::invoke_result_t&lt;F, Values...&gt;;
    };

    static constexpr void query(std::experimental::execution::sender_t) noexcept { }

    template &lt;class Receiver&gt;
    void submit(Receiver&amp;&amp; r) {
      std::move(s_).submit(
        __inline_receiver&lt;F, std::decay_t&lt;Receiver&gt;&gt;{std::move(f_), std::forward&lt;Receiver&gt;(r)}
      );
    }

    auto executor() const { return inline_executor{}; }
  };

  template &lt;class NullaryFunction&gt;
  void execute(NullaryFunction&amp;&amp; f) const {
    f();
  }

  template &lt;std::experimental::execution::ReceiverOf&lt;inline_executor&gt; R&gt;
  void submit(R&amp;&amp; r) const {
    std::forward&lt;R&gt;(r).set_value(*this);
  }

  template &lt;std::experimental::execution::Sender S, class Function&gt;
  auto make_value_task(S&amp;&amp; s, Function f) const {
    return __task_submit_fn&lt;std::decay_t&lt;S&gt;, Function&gt;{
      std::forward&lt;S&gt;(s), std::move(f)
    };
  }

  using sender_desc_t = std::experimental::execution::sender_desc&lt;std::exception_ptr, inline_executor&gt;;
  static constexpr sender_desc_t query(std::experimental::execution::sender_description_t) { return { }; }
  static constexpr void query(std::experimental::execution::sender_t) { }
};
</tt></pre></blockquote>
<p>
    While we are in generally in favor of using a single, more universal primitive
    to express multiple execution strategies we do not believe that the resulting
    complexity pushed onto users justifies adopting the sender executor model
    as that universal primitive.
</p>
<p>
    As shown above, P1194r0 currently has structural problems which prevent it
    from being seriously considered. But can we fix the problems by rigorously
    adopting P0443r9's interface properties in a way that preserves the lazy
    execution features? The answser is of course yes, and the next section
    explains how.
</p>

<!----------------------------------------------------------------------------->

<h2 id="changes">5. Proposed Changes</h2>
<ol>
<li>Introduce two new concepts, <em>SenderExecutor</em> and
    <em>BulkSenderExecutor</em>
    </li>
    <p>
        These two concepts are in addition to the <em>OneWayExecutor</em>
        and <em>BulkOneWayExecutor</em> concepts already described in P0443r9.
        Adding concepts as refinements of <em>Executor</em> is the prescribed
        method of adding additional executor models. This can be seen in
        P1124r0 which adds <em>TwoWayExecutor</em>, <em>BulkTwoWayExecutor</em>,
        <em>ThenExecutor</em>, and <em>BulkThenExecutor</em>. After Executors
        ships, new executor models may continue to be added as refinements.
        We believe the extensible system of interface properties which allows
        for both compile-time and runtime introspection of executor capabilities
        is a remarkably elegant and flexible design which caters to the
        strengths of C++.
    </p>
<li>Add the <tt>submit</tt> and <tt>make_value_task</tt> interfaces as
    requirements of <em>SenderExecutor</em>.
    </li>
    <p>
        As the <tt>execute</tt> interface is part of the <em>OneWayExecutor</em>
        refinement, so should the <tt>submit</tt> and <tt>make_value_task</tt>
        interfaces be part of the <em>SenderExecutor</em> refinement. The
        behavior of these interfaces remains the same as described in the paper.
    </p>
<li>Add a new interface property type <tt>sender_t</tt>.
    </li>
    <p>
        As the <tt>oneway_t</tt> interface property type described in P0443r9
        is used to require, prefer, or query an executor for the one-way
        execution interface capability, so should the <tt>sender_t</tt>
        interface property be defined to determine the sender execution
        interface capability. A possible implementation for the property may
        look like this:
<blockquote><pre><tt>// SenderExecutor interface property
struct sender_t
{
    static constexpr bool is_requirable = true;
    static constexpr bool is_preferable = false;

    template &lt;class... SupportableProperties&gt;
    class polymorphic_executor_type;

    using polymorphic_query_result_type = bool;

    template &lt;class Executor&gt;
    static constexpr bool static_query_v = <em>implementation-defined</em>

    static constexpr bool value() const { return true; }
};
static constexpr sender_t sender;
</tt></pre></blockquote>
    </p>
<li>Add the <tt>submit</tt> and <tt>make_bulk_value_task</tt> interfaces as
    requirements of <em>BulkSenderExecutor</em>.
    </li>
    <p>
        As the <tt>bulk_execute</tt> interface is part of the <em>BulkOneWayExecutor</em>
        refinement, so should the <tt>submit</tt> and <tt>make_bulk_value_task</tt>
        interfaces be part of the <em>BulkSenderExecutor</em> refinement. The
        behavior of these interfaces remains the same as described in the paper.
    </p>
<li>Add a new interface property type <tt>bulk_sender_t</tt>.
    </li>
    <p>
        As the <tt>bulk_oneway_t</tt> interface property type described in
        P0443r9 is used to require, prefer, or query an executor for the
        bulk one-way execution interface capability, so should the
        <tt>bulk_sender_t</tt> interface property be defined to determine the
        bulk sender execution interface capability. A possible implementation
        for the property may look like this:
<blockquote><pre><tt>// BulkSenderExecutor interface property
struct bulk_sender_t
{
    static constexpr bool is_requirable = true;
    static constexpr bool is_preferable = false;

    template &lt;class... SupportableProperties&gt;
    class polymorphic_executor_type;

    using polymorphic_query_result_type = bool;

    template &lt;class Executor&gt;
    static constexpr bool static_query_v = <em>implementation-defined</em>

    static constexpr bool value() const { return true; }
};
static constexpr bulk_sender_t bulk_sender;
</tt></pre></blockquote>
    </p>
</ol>
<p>
    The implementation of a particular lazy executor should add hooks for
    the <tt>require</tt>, <tt>prefer</tt>, and <tt>query</tt> customization
    points described in P0443r9. Users who desire a lazy executor should
    use the aforementioned customization points to obtain an executor with
    the lazy execution feature. This example shows how a generic algorithm
    which depends on lazy execution might acquire lazy executors:
<blockquote><pre><tt>template &lt;typename Executor&gt;
void perform (const Executor&amp; ex)
{
    // change ex to a SenderExecutor
    auto const lazy_ex = require(ex, sender);

    // change ex to a BulkSenderExecutor
    auto const bulk_lazy_ex = require(ex, bulk_sender);    

    &hellip;

</tt></pre></blockquote>
</p>
<p>
    If the changes described in this section are not adopted, then at the
    very least we would like to see the following in a future revision of
    P1194r0:
</p>
<ul>
<li>Correct references to requirements of <em>Executor</em> which
    are actually requirements of individual refinements such as
    <em>OneWayExecutor</em> and <em>BulkOneWayExecutor</em>.</li>
<li>Demonstrate with code that a polymorphic wrapper used to type-erase
    a sender executor does not incur an unnecessary memory allocation
    when used to perform a one-way execute.
    </li>
<li>Provide the rationale for why these changes are not adopted.
    </li>
</ul>
</p>

<!----------------------------------------------------------------------------->

<a name="credits"></a><h2>6. Acknowledgements</h2>
<p>
We thank Christopher Kohlhoff for reviewing the proposed changes for
accuracy, for editorial improvements, and for example implementations.
</p>

<!----------------------------------------------------------------------------->

<a name="references"></a><h2>7. References</h2>

<b>[1]</b>
<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1194r0.html">http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1194r0.html</a><br>
<br>

<b>[2]</b>
<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0443r9.html">http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p0443r9.html</a><br>
<br>

<b>[3]</b>
<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1244r0.html">http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1244r0.html</a><br>
<br>

</body>
</html>
