<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
  <head>
    <meta content="text/html;charset=UTF-8" http-equiv="Content-Type">
    <title>Range-based for statements with initializer</title>
    <style type="text/css">
      html { margin: 0; padding: 0; color: black; background-color: white; }
      body { padding: 2em; font-size: medium; font-family: "DejaVu Serif", serif; line-height: 150%; }
      code { font-family: "DejaVu Sans Mono", monospace; color: #006; }

      h1, h2, h3 { margin: 1.5em 0 .75em 0; line-height: 125%; }

      sup, sub { line-height: 0; }

      div.code { white-space: pre-line; font-family: "DejaVu Sans Mono", monospace;
                 border: thin solid #E0E0E0; background-color: #F8F8F8; padding: 1em;
                 border-radius: 4px; }

      div.strictpre { white-space: pre; }

      div.code em { font-family: "DejaVu Serif", serif; }

      table { border-top: 2px solid black; border-bottom: 2px solid black; border-collapse: collapse; margin: 3em auto; }
      thead th { border-bottom: 2px solid black; }
      th, td { text-align: left; padding: 1ex 1ex 1ex 5em; }
      th:first-child, td:first-child { padding-left: 1ex; }

      tr.new td { background-color: #EFE; }
      td.new:after { content: "new!"; font-family: "DejaVu Sans", sans-serif; font-weight: bold; font-size: xx-small;
                     vertical-align: top; top: -1em; right: -1em; position: relative; float: right; color: #090; }

      .docinfo { float: right }
      .docinfo p { margin: 0; text-align:right; }
      .docinfo address { font-style: normal; }

      .quote { display: inline-block; clear: both; margin-left: 1ex;
               border: thin solid #E0E0E0; background-color: #F8F8F8; padding: 1ex; }

      /*  Use DIV[insert] and DIV[delete] if the entire paragraph is added or removed; otherwise
       *  use DIV[modify] and use INS/DEL elements to mark up individual changes.
       */

      div.insert { border-left: thick solid #0A0; border-right: thick solid #0A0; padding: 0 1em; }
      div.modify { border-left: thick solid #999; border-right: thick solid #999; padding: 0 1em; }
      div.delete { border-left: thick solid #A00; border-right: thick solid #A00; padding: 0 1em; }

      del { color: #A00; text-decoration: line-through; }
      ins { color: #090; }
      ins code, del code { color: inherit; }

      .box { border: thin black solid; padding: 2px; }

      ol { counter-reset: item; }
      ol > li { counter-increment: item; }
      ol ol { list-style: none; padding: 0; }
      ol ol > li:before { content: counters(item, ".") " "; }
    </style>
  </head>
  <body>
    <div class="docinfo">
      <p>ISO/IEC JTC1 SC22 WG21 P0614R1</p>
      <p>Date: 2017-11-06</p>
      <p>To: CWG</p>
      <address>
        Thomas K&ouml;ppe &lt;<a href="mailto:tkoeppe@google.com">tkoeppe@google.com</a>&gt;<br>
      </address>
    </div>

    <h1>Range-based <code>for</code> statements with initializer</h1>

    <h2>Abstract</h2>

    <p>We propose a new versions of the range-based <code>for</code> statement for C++:
      &nbsp; <code>for (<em>init</em>; <em>decl</em> : <em>expr</em>)</code>.
      This statement simplifies common code patterns, help users keep scopes tight
      and offers an elegant solution to a common lifetime problem.</p>

    <h2>Contents</h2>
    <!-- fgrep -e "<h2 id=" for2.html | sed -e 's/.*id="\(.*\)">\(.*\)<\/h2>/<li><a href="#\1">\2<\/a><\/li>/g' -->
    <ol>
      <li><a href="#history">Revision history</a></li>
      <li><a href="#beforeafter">Before/After</a></li>
      <li><a href="#wording">Proposed wording</a></li>
      <li><a href="#discussion">Discussion</a></li>
      <li><a href="#impact">Impact&hellip;</a><ol>
        <li><a href="#impact-std">&hellip; on the Standard</a></li>
        <li><a href="#impact-impl">&hellip; on implementations</a></li>
        <li><a href="#impact-multirange">&hellip; on ongoing design work regarding multi-range iteration</a></li>
        <li><a href="#impact-lifetime">&hellip; on range adaptors and ongoing lifetime issues</a></li>
      </ol></li>
      <li><a href="#future">Future directions</a></li>
      <li><a href="#ack">Acknowledgements</a></li>
    </ol>

    <h2 id="history">Revision history</h2>

    <ul>
      <li><a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0614r0.html">P0614r0</a>: initial proposal</li>
      <li>P0614r1: simplified the formal wording to reuse the original range-based <code>for</code> rewrite rule</li>
    </ul>

    <h2 id="beforeafter">Before/After</h2>

    <table>
      <col style="width: 37em">
      <col style="width: 42em">

      <thead>
        <tr><th>Before the proposal</th><th>With the proposal</th></tr>
      </thead>

      <tbody>
        <tr>
          <td><div class="code" style="color: #A00">{
              &nbsp; T thing = f();
              &nbsp; for (auto&amp; x : thing.items()) {
              &nbsp; &nbsp; // Note: &ldquo;for (auto&amp; x : f().items())&rdquo; is WRONG
              &nbsp; &nbsp; mutate(&amp;x);
              &nbsp; &nbsp; log(x);
              &nbsp; }
              }</div></td>
          <td><div class="code" style="color: #0A0">for (T thing = f(); auto&amp; x : thing.items()) {
              &nbsp; mutate(&amp;x);
              &nbsp; log(x);
              }</div></td>
        </tr>
        <tr>
          <td><div class="code" style="color: #A00">{
              &nbsp; std::size_t i = 0;
              &nbsp; for (const auto&amp; x : foo()) {
              &nbsp; &nbsp; bar(x, i);
              &nbsp; &nbsp; ++i;
              &nbsp; }
              }</div></td>
          <td><div class="code" style="color: #0A0">for (std::size_t i = 0; const auto&amp; x : foo()) {
              &nbsp; bar(x, i);
              &nbsp; ++i;
              }</div></td>
        </tr>
      </tbody>
    </table>

    <h2 id="wording">Proposed wording</h2>

    <p>Change the grammar in [stmt.iter] as follows.</p>

    <div class="modify">
      <em>iteration-statement</em>:<br>
      &nbsp; &nbsp; . . .<br>
      &nbsp; &nbsp; <code>for (</code> <ins><em>init-statement</em><sub><em>opt</em></sub></ins> <em>for-range-declaration</em> <code>:</code> <em>for-range-initializer</em> <code>)</code> <em>statement</em><br>
      &nbsp; &nbsp; . . .<br>
    </div>

    <p>Modify paragraph 1 of subsection [stmt.ranged] as follows.</p>

    <div class="modify">
      <p>The range-based <code>for</code> statement</p>
      <div class="code">for ( <ins><em>init-statement</em><sub><em>opt</em></sub></ins> <em>for-range-declaration</em> : <em>for-range-initializer</em> ) <em>statement</em></div>
      <p>is equivalent to</p>
      <div class="code">{
        &nbsp; &nbsp; <ins><em>init-statement</em><sub><em>opt</em></sub></ins>
        &nbsp; &nbsp; auto &amp;&amp;__range = <em>for-range-initializer</em> ;
        &nbsp; &nbsp; auto __begin = <em>begin-expr</em> ;
        &nbsp; &nbsp; auto __end = <em>end-expr</em> ;
        &nbsp; &nbsp; for ( ; __begin != __end; ++__begin ) {
        &nbsp; &nbsp; &nbsp; &nbsp; <em>for-range-declaration</em> = *__begin;
        &nbsp; &nbsp; &nbsp; &nbsp; <em>statement</em>
        &nbsp; &nbsp; }
        }</div>
      <p>where [&hellip;]</p>
    </div>

    <h2 id="discussion">Discussion</h2>

    <p>This proposal shares a lot of the motivation of
      <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0305r1.html">P0305:
      Selection statements with initializer</a>, namely the desire for tight scopes and local
      code. However, there is a more pressing motivation that is unique the the range-based
      <code>for</code> statement:</p>

    <p>In a statement <code>for (auto&amp; x : <em>expr</em>)</code>, the expression
      <code><em>expr</em></code> is evaluated once and bound to a notional variable
      declared as <code>auto&amp;&amp;</code>. When <em>expr</em> is a prvalue, this
      works well and the lifetime of the value is extended to beyond the loop. However,
      when <em>expr</em> is a glvalue, it will happily bind to the notional reference,
      but its lifetime is not extended and the reference is invalid. This pattern is
      a common source of bugs that is hard to spot; it is particularly easily caused
      by member functions that return glvalues (but also by reference-forwarding functions
      such as <code>std::min</code>).</p>

    <p>For example, consider the following simple type that exposes an internal collection:</p>
    
    <div class="code">class T {
      &nbsp; std::vector&lt;int&gt; data_;
      public:
      &nbsp; std::vector&lt;int&gt;&amp; items() { return data_; }
      &nbsp; // ...
      };</div>

    <p>Consider further a function returning a prvalue:</p>

    <div class="code">T foo();</div>
    
    <p>Even users who are familiar with the intricacies of prvalue lifetime extension,
      and who would be confident about a hypothetical statement</p>
    <div class="code">for (auto&amp; x : foo()) { /* ... */ }</div>
    <p>can easily fail to spot that the similar looking</p>
    <div class="code">for (auto&amp; x : foo().items()) { /* ... */ }</div>
    <p>has undefined behaviour. While this particular pitfall will presumably stay with us
      for the foreseeable future (but <a href="#impact-lifetime">see below</a> for further
      discussion), the proposed new syntax will at least allow users to
      write correct code that looks almost as concise and local as the wrong code above:</p>
    <div class="code">for (T thing = foo(); auto&amp; x : thing.items()) { /* ... */ }</div>

    <p>Note that we are <em>not</em> proposing that the <em>init-statement</em> be in the same
      declarative region as any later part of the statement. In other words,
      <code>for (auto x = f(); auto x : x)</code> is valid and the outer
      <code>x</code> is hidden in the loop body. This is consistent with the proposed
      rewrite rule; in the current standard, <code> T x; for (auto x : x)</code> is
      already valid.</p>
    
    <h2 id="impact">Impact&hellip;</h2>

    <h3 id="impact-std">&hellip; on the Standard</h3>

    <p>The proposal is a core language extension. The proposed syntax is ill-formed in the
      current standard. As an extension to the language&rsquo;s statement syntax, this
      change is unlikely to have any impact on the design of the standard library.</p>

    <h3 id="impact-impl">&hellip; on implementations</h3>

    <p>Various implementers have reported that the proposal may pose certain implementation
      challenges, but should be doable in principle and in reasonable reality. The new syntax
      makes it harder to distinguish a range-based from an ordinary <code>for</code> statement
      and requires more sophisticated parsing to distinguish the two.</p>

    <h3 id="impact-multirange">&hellip; on ongoing design work regarding multi-range iteration</h3>

    <p>Proposal <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/p0026r0.pdf">P0026</a>
      was presented in at the 2015 Kona meeting that proposes a syntax extension</p>
    <div class="code">for (auto x : a; auto y : b; auto z : c) {
      &nbsp; f(a, b, c);
    }</div>
    <p>which iterates the ranges <code>a</code>, <code>b</code> and <code>c</code> in
      lock-step (&ldquo;zip&rdquo;) order. Although that proposal has not progressed,
      raises several technical concerns (e.g. how to handle ranges of unequal length),
      and the problem it addresses can be solved in library, we would nonetheless like
      to note that the present proposal is compatible with and orthogonal to that
      extension: One could support an optional initializer syntax with multiple
      ranges just as well:</p>
    <div class="code">for (T thing = f(); auto x : a; auto y : b; auto z : c) {
      &nbsp; thing.bar(a, b, c);
    }</div>
    <p>That syntax is still syntactically unique, although it would prevent a future
      extension to allow an optional increment statement. (The author has no intention
      of proposing such an extension.)</p>
    
    <h3 id="impact-lifetime">&hellip; on range adaptors and ongoing lifetime issues</h3>

    <p>It is pertinent to discuss a related set of range designs and current core and evolution
      issues. Let us revisit the motivating example where the lifetime of the range expression
      value ends prematurely because it is not a prvalue:</p>

    <div class="code" style="color: #A00">for (auto&amp; x : f().things()) { &nbsp;<span style="color: #666;">// dangling reference!</span>
      &nbsp; mutate(&amp;x);
    }</div>

    <p>This problem is exacerbated if we consider a generic design of <em>range adaptors</em>
      which is a central component of many range-based libraries (and has been considered by
      the Ranges study group, SG9). Depending on the details, we may end up with <em>many</em>
      temporaries which all need to be kept alive:</p>

    <div class="code" style="color: #A00">for (auto&amp; x : f().filter(pred).top(10).reverse()) { &nbsp;<span style="color: #666;">// how many temporaries?</span>
      &nbsp; mutate(&amp;x);
    }</div>

    <p>The proposed optional initializer is not sufficient to track all the temporary objects
      that may need to be kept alive during the iteration. In fact, this problem has been
      considered so serious that it is the subject of core issues <a href="http://wg21.link/cwg900">CWG
      900</a> and <a href="http://wg21.link/cwg1498">CWG 1498</a>, and at the 2017 Kona meeting
      CWG decided to send these issues back to EWG for review. One of the possible solutions that
      has been considered is to give the range-based <code>for</code> statement special semantics
      by which <em>all</em> temporary values that are part of the range expression are alive until
      the end of the loop.</p>

    <p>We would like to offer an alternative position and suggest that a core language change
      may not be needed here. First off, ongoing work in the Ranges study group has already
      come to the conclusion that range adaptors should not be constructible from rvalues.
      In such a design, the expression <code>f().filter(pred)</code> would not be allowed
      (assuming, as always, that <code>f()</code> is an rvalue). All we now need is that the
      entire state of the combined adaptor chain be accumulated in the final expression, and
      that that expression be a prvalue, so that no object except that of the final value
      in the adaptor chain is required during iteration. With that design constraint in place,
      and together with the present proposal, we can write iteration over an adapted range
      as follows:</p>

    <div class="code" style="color: #0A0">for (T container = f(); auto&amp; x : container.filter(pred).top(10).reverse()) {
      &nbsp; mutate(&amp;x); &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #666;">// ^^^^^^^^^ ^^^^^^^^^^^^ ^^^^^^^ ^^^^^^^^^</span>
    } &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; &nbsp;<span style="color: #666;">// lvalue &nbsp; &nbsp;prvalue &nbsp; &nbsp; &nbsp;prvalue prvalue</span></div>

    <h2 id="future">Future directions</h2>

    <p>Just for the record, without prejudice or promise, and with malice toward none,
      we would like to note possible future extensions in this area.</p>
    <ul>
      <li>Range-based <code>for</code> with increment: <code>for (<em>init</em>; auto x : e; <em>inc</em>)</code></li>
      <li>Initializer for the <code>do</code> statement: The <code>do ... while</code> statement does not have
        an optional initializer at the moment. One might ask for one, though we would need to think hard whether
        <code>do (<em>init</em>;) <em>statement</em> while (<em>expr</em>);</code> is syntactically unambiguous
        and aesthetically defensible.</li>
    </ul>
    
    <h2 id="ack">Acknowledgements</h2>

    <p>Thanks to Herb Sutter and Titus Winters for encouraging the proposal, to Eric Niebler
      and Ville Voutilainen for technical discussion regarding lifetime issues, and to all
      the implementers who provided feedback on the implementatbility of this idea.</p>

  </body>
</html>
