<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
        "http://www.w3.org/TR/html4/loose.dtd">
<html>
  <head>
    <title>##: Another response to N4074; explicit should never be implicit</title>

    <style>
      p {text-align:justify}
      li {text-align:justify}
      blockquote.note
      {
        background-color:#E0E0E0;
        padding-left: 15px;
        padding-right: 15px;
        padding-top: 1px;
        padding-bottom: 1px;
      }
      ins {color:#00A000}
      del {color:#A00000}
      h1 { text-align: center; }
      th, td { font-family: monospace }
      th { text-align: right }
      pre {
        background-color: #eee;
        padding: 1em;
      }
    </style>
  </head>
  <body>

    <table>
      <tr>
        <th>Document #:</th><td>N4131</td>
      </tr>
      <tr>
        <th>Date:</th><td>2014-08-09</td>
      </tr>
      <tr>
        <th>Reply to:</th><td>Filip Ros&eacute;en &lt;<a
            href="mailto:filip.roseen@gmail.com">filip.roseen@gmail.com</a>&gt;</td>
      </tr>
      <tr>
        <th>Summary:</th><td>Arguments for not allowing <code>return {expr}</code> to call an <code>explicit</code> constructor.</td>
      </tr>
    </table>

    <hr />

    <center>
      <h3>Another response to <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4074.pdf">N4074</a>;</h3>
      <h1>explicit should never be implicit</h1>
    </center>

    <h2>Contents</h2>
    <ul>
      <li><a href="#introduction">Introduction</a></li>
      <li><a href="#meaning-of-explicit">The Meaning of
          <code>explicit</code></a></li>
      <li><a href="#current-praxis">The Current Praxis of <em>braced-init-list</em></a></li>
      <li><a href="#narrowing-conversions">Narrowing Conversions</a></li>
      <li><a href="#return-statement">The <code>return</code>-statement</a></li>
      <li><a href="#conclusion">Conclusion</a></li>
    </ul>

    <hr />

    <a name="introduction"></a>
    <h2>Introduction</h2>
    <p>
      If one were to agree with the contents of N4074, the following snippet
      should compile without diagnostics:
    </p>
    <pre>
struct Type1 {
    explicit Type1 (int);
};

Type1 example_f1 () {
    return { 0 };
}</pre>

    <p>
      The main arguments of N4074:
    </p>

    <ul>
      <li><code>return { expr }</code> cannot mean anything besides that we,
        explicitly, want to initialize the <em>return-value</em>.
      </li>
      <li>
        Whoever is authoring and maintaining a function also knows about the
        return type, meaning that a developer is well aware of what is being
        initialized; and with what.
      </li>
      <li>
        Asking a developer to explicitly state that <code>explicit</code>
        initialization is allowed when writing the <em>return-statement</em>
        is redundant; both the compiler and the developer know what is going
        on.
      </li>
    </ul>

    <p>
      This paper argues that the change to ISO C++ proposed in N4074 should
      not be adopted. It does so by several means, among them:
    </p>

    <ul>
      <li>Discussion of the, sometimes hidden, implications of such a change.</li>
      <li>
        Arguments regarding how such initialization would differ from the
        current praxis of C++.
      </li>
      <li><em>Proofs of concept</em> that directly show why such a proposal is not sane.</li>
    </ul>

    <hr />

    <a name="meaning-of-explicit"></a>
    <h2>The Meaning of <code>explicit</code></h2>

    <p>
      Marking a constructor as <code>explicit</code> is often equivalent to
      saying: <em>"such initialization sure is possible, but it's potentially
      not what you want; if you really want to do this, go ahead, but I won't
      let it happen without your explicit consent."</em>
    </p> 

    <p>
      If a developer would like to use our <code>explicit</code> constructor,
      we'd like him to go the extra mile and <em>explicitly</em> show us that
      this is the case. We'd like him to show some effort and, more
      specifically, to consider whether this is really what he wants.
      By the invisible contract involved, <code>explicit</code> constructors
      are potentially dangerous.
    </p>

<pre>
// meaning-of-explicit.example.1

std::unique_ptr&lt;T&gt; func () {
  static T  x;
  return { &amp;x };     // error: chosen constructor is explicit in copy-initialization
}</pre>


    <p>
      There's no way for an implementation to force a developer to actually walk
      around the block every time he tries to initialize an object using an
      <code>explicit</code> constructor; instead we require him to explicitly
      state his request by writing out the type he'd like to initialize at the
      point where such initialization takes place.
    </p>

    <p>
      <em>"I'll refuse to do this unless you show some effort."</em>
    </p>
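    <p>
      As a hedged sketch of what that effort looks like under the current
      rules, the following uses <code>std::unique_ptr</code>, whose raw-pointer
      constructor is <code>explicit</code>. The names <code>Widget</code> and
      <code>make_owner</code> are hypothetical and only serve as illustration.
    </p>

```cpp
#include <cassert>
#include <memory>
#include <type_traits>

struct Widget { int value = 42; };  // hypothetical type, for illustration only

// The raw-pointer constructor of std::unique_ptr is explicit: there is no
// implicit conversion from Widget*, but direct-initialization is allowed.
static_assert(!std::is_convertible<Widget*, std::unique_ptr<Widget>>::value,
              "implicit conversion from a raw pointer is rejected");
static_assert(std::is_constructible<std::unique_ptr<Widget>, Widget*>::value,
              "explicit construction from a raw pointer is accepted");

std::unique_ptr<Widget> make_owner() {
  // return { new Widget };                      // ill-formed today: explicit constructor
  return std::unique_ptr<Widget>{ new Widget };  // consent given by naming the type
}
```

    <p>
      Writing out <code>std::unique_ptr&lt;Widget&gt;</code> in the
      <em>return-statement</em> is the "effort" the constructor's author asked for.
    </p>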

    <h4>Implications if N4074 is approved:</h4>

    <p>
      N4074 will effectively make the previously described contract disappear
      in the context of <code>return { <em>expr</em> }</code>, which further
      means that we completely disregard the original intent expressed by the
      author of said constructor.
    </p>

<pre>
// meaning-of-explicit.example.1

std::unique_ptr&lt;T&gt; func () {
  static T  x;
  return { &amp;x };     // compiles, but triggers undefined-behavior
}                    //           if/when the unique_ptr is destroyed</pre>

    <p>
      If the author didn't want the user to <em>"walk the extra mile"</em>,
      the author wouldn't have marked the constructor as <code>explicit</code>.
    </p>

    <hr />

    <a name="current-praxis"></a>
    <h2>The Current Praxis of <em>braced-init-list</em></h2>

    <p>
      A <em>braced-init-list</em> is often referred to as a means of <em>uniform
        initialization</em>, meaning that all types can be initialized using the
      same syntax. It doesn't matter if we are initializing a <em>fundamental
        type</em>, or a <em>user-defined type</em> that is initialized with one,
      or several, arguments; the initialization is <em>uniform</em>.
    </p>

    <p>
      The current praxis, backed up by the Standard, does not state that
      <em>uniform initialization</em> is a way to bypass the rules associated
      with initialization of an object of type <code>T</code>; we merely have a way to
      express initialization of any type.
    </p>
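    <p>
      A minimal sketch of this uniformity, assuming a hypothetical aggregate
      <code>Point</code> and a hypothetical helper function: the same brace
      syntax initializes every kind of type, while the usual rules (no
      narrowing, no <code>explicit</code> constructors in
      <em>copy-list-initialization</em>) still apply.
    </p>

```cpp
#include <cassert>
#include <string>
#include <vector>

struct Point { int x; int y; };  // hypothetical aggregate, for illustration only

int sum_of_uniformly_initialized() {
  int              n = { 3 };        // fundamental type
  Point            p = { 1, 2 };     // aggregate
  std::string      s = { "abc" };    // class type, non-explicit constructor
  std::vector<int> v = { 4, 5, 6 };  // class type, initializer_list constructor

  // int narrowed = { 3.5 };         // ill-formed: narrowing conversion

  return n + p.x + p.y + static_cast<int>(s.size()) + static_cast<int>(v.size());
}
```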

    <p>
      Another point of value is that you often hear developers state that one
      of the greatest perks of using a <em>braced-init-list</em> is that it's
      equivalent to saying: <em>"Dear compiler, if you know what type I'm
      trying to initialize... please, go ahead."</em>
    </p>
    
    <p>
      It is important to note the usage of
      <em>"you know"</em>; nowhere does it imply that both the compiler <b>and</b>
      the developer <em>"know the type"</em>. When an initialization requires
      the use of an <code>explicit</code> constructor the compiler certainly knows,
      but with the meaning of <code>explicit</code> in mind, an implementation
      should be worried that the developer doesn't, which is why we get a
      diagnostic in such a case.
    </p>

    <h4>Implications if N4074 is approved:</h4>

    <p>
      There are many rules to C++, some more complicated than others, but what
      really makes people go <em>"hmpf"</em> is when seemingly equivalent
      constructs behave differently.
    </p>
    <p>
      Allowing <code>return { ... }</code> to use an <code>explicit</code>
      constructor contradicts the previous, far simpler explanation:
      <em>"Unless a <em>braced-init-list</em> has a {type, object, cast}
      explicitly stated where it is being used, a potential conversion must
      be one that can happen implicitly."</em>
      
    </p>
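    <p>
      The simple rule quoted above can be sketched as follows. The type
      <code>Meters</code> and the functions around it are hypothetical; the
      point is only that, today, a <em>braced-init-list</em> without a stated
      type is treated the same way in a declaration, in an argument and in a
      <em>return-statement</em>.
    </p>

```cpp
#include <cassert>

struct Meters {                        // hypothetical type, for illustration only
  explicit Meters(int v) : value(v) {}
  int value;
};

int as_int(Meters m) { return m.value; }

Meters from_return(int v) {
  // return { v };      // ill-formed today; the one context N4074 would change
  return Meters{ v };   // type stated at the point of initialization
}

int demo_contexts() {
  // Meters a = { 1 };  // ill-formed: copy-list-initialization of a variable
  // as_int({ 2 });     // ill-formed: copy-list-initialization of a parameter
  Meters a{ 1 };        // direct-list-initialization, type stated
  return a.value + as_int(Meters{ 2 }) + from_return(3).value;
}
```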
    <p>
      C++ has enough rules that are cluttered with <em>"but if this applies,
      that doesn't hold"</em>. We don't need another such rule,
      especially when it impedes type-safety and the only real gain is to
      prolong the lifetime of keyboards. Laziness doesn't go well with writing
      safe initializations.
    </p>

    <p>
      Is the proposed change by N4074 really worth it?
    </p>

    <hr />


    <a name="narrowing-conversions"></a>
    <h2>Narrowing Conversions</h2>
    <p>
      There is a very close relationship between <em>narrowing conversions</em>,
      and the use of a constructor marked as <code>explicit</code>.
    </p>

    <p>
      If a fundamental type <code>T</code> is initialized with a compile-time
      known value which isn't suitable for that type, or if such type is
      initialized with an object of type <code>U</code> which potentially can
      hold a value that isn't representable in <code>T</code>, a diagnostic is
      required.
    </p>
    
    <p>
      The introduction of <em>narrowing conversions</em> in C++ was, and is, a
      very good step towards increased type-safety. It prevents developers from
      making mistakes that can potentially result in a program that behaves in a
      manner which was never intended.
    </p>

    <pre>
// narrowing-conversions.example.1

std::size_t
multiply (int x, int y) {
  return { x * y };  // error: non-constant-expression cannot be narrowed from
}                    //        type 'int' to 'std::size_t'
    </pre>

    <p>
      It is certainly possible to initialize a <code>std::size_t</code> with the
      result of <code>x * y</code>, but since <code>std::size_t</code> cannot
      handle negative numbers this is potentially unsafe.
    </p>
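    <p>
      To make the risk concrete, here is a small sketch with two hypothetical
      variants of <code>multiply</code>: one that converts silently (no braces,
      so no narrowing diagnostic) and one possible way to satisfy the
      diagnostic by checking the value and converting explicitly.
    </p>

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Without braces the int result is converted silently; a negative product
// wraps around to a very large std::size_t.
std::size_t multiply_unchecked(int x, int y) {
  return x * y;
}

// One possible fix (an assumption, not the only option): reject negative
// results, then state the conversion explicitly.
std::size_t multiply_checked(int x, int y) {
  const int product = x * y;
  if (product < 0) throw std::range_error("negative product");
  return { static_cast<std::size_t>(product) };  // no narrowing: size_t to size_t
}
```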

    <p>
    If we play with the idea of writing a wrapper around
    <code>std::size_t</code>, we could end up with something like the below:
    </p>

    <pre>
// narrowing-conversions.example.2

struct SizeType {
  explicit SizeType (  signed int);
           SizeType (unsigned int);

  &hellip;
};

SizeType
multiply (int x, int y) {
  return { x * y }; // error: chosen constructor is explicit in
}                   //        copy-initialization</pre>

    <p>
      The reason <code>SizeType (signed int)</code> is marked
      <code>explicit</code> is the same reason we rely on diagnostics
      to inform us of potential <em>narrowing conversions</em>: we rely on the
      compiler to tell us when we are doing something that might lead
      to unforeseen consequences.
    </p>
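    <p>
      A filled-in sketch of the wrapper, under the assumption that it simply
      stores the value. The members <code>value</code> and
      <code>from_signed</code> and the function <code>count</code> are
      hypothetical additions, there only to show which constructor ran.
    </p>

```cpp
#include <cassert>

struct SizeType {
  explicit SizeType(signed int v)
    : value(static_cast<unsigned int>(v)), from_signed(true) {}
  SizeType(unsigned int v)
    : value(v), from_signed(false) {}

  unsigned int value;
  bool         from_signed;  // hypothetical: records the constructor used
};

SizeType multiply(int x, int y) {
  // return { x * y };       // ill-formed today: explicit constructor chosen
  return SizeType{ x * y };  // the developer accepts the signed conversion
}

SizeType count(unsigned int n) {
  return { n };              // fine: the unsigned constructor is not explicit
}
```

    <p>
      The safe path stays terse, and the potentially unsafe one costs exactly
      one mention of the type.
    </p>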
    
    <h4>Implications if N4074 is approved:</h4>
    <p>
      Since C++11 the use of <code>return { expr }</code> has become almost
      synonymous with <em>"safe initialization of any return-type"</em>; if N4074 is
      approved this will no longer be true. This would be one of the scarier
      forms of a <em>breaking change</em>: one that cannot be caught by
      anything other than a watchful eye.
    </p>

    <hr />

    <a name="return-statement"></a>
    <h2>The <code>return</code>-statement</h2>

    <pre>
T func1 () {
  return <em>expression-or-braced-init-list</em>;
}</pre>

    <p>
    As the name implies, a <em>return-statement</em> is used to <em>return</em>
    a value to the caller of a function. However, it is of the utmost importance
    that we understand that we never directly return the value of the
    <em><code>expression-or-braced-init-list</code></em> associated with the
    statement; we merely say that it is to be used as the initializer for the
    returned value.
    </p>

    <p>
      The <em>return-type</em> of a function is by definition a distant type;
      one cannot know the actual <em>return-type</em> by only interpreting the
      <em>expression-or-braced-init-list</em> used to initialize it.
      The opposite also applies; one cannot know the initializers for the
      <em>return-value</em> by only inspecting the <em>return-type</em>.
    </p>
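    <p>
      As a small illustration of this distance, the three hypothetical
      functions below contain the very same <em>return-statement</em>; what it
      initializes is decided entirely by a <em>return-type</em> written
      elsewhere. <code>Range</code> is a made-up aggregate.
    </p>

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct Range { int first; int last; };  // hypothetical aggregate

std::vector<int>    as_vector() { return { 2, 3 }; }  // two elements: 2 and 3
std::pair<int, int> as_pair()   { return { 2, 3 }; }  // first = 2, second = 3
Range               as_range()  { return { 2, 3 }; }  // first = 2, last = 3
```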

    <p>
      With the mentioned relation between the <em>return-type</em> and its
      initializer(s), there are side-effects that one has to properly consider:
    </p>

    <ul>
      <li>
        <p>
          A developer should be allowed to change the <em>return-type</em> of a
          function without having to review every <em>return-statement</em> in its body.
          The expected behavior is that such a change results in a diagnostic
          unless every initialization of the new <em>return-type</em> follows
          the rules of strict type-safety (meaning that a potentially dangerous
          initialization should not implicitly apply).
        </p>

        <p>
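          In the example below, a developer incorrectly thought that
          <em>"ms"</em> was the SI symbol for microseconds; it is the symbol
          for milliseconds (microseconds are written <em>"&micro;s"</em>). The
          error is nevertheless caught during compilation, since the
          constructor of <code>std::chrono::duration</code> that takes a raw
          count is <code>explicit</code>.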
        </p>


        <pre>
// return-statement.example.1

/*!
 *  \brief  Benchmark `f()`
 *  \return The duration in ms spent evaluating `f()`
 * */

unsigned long benchmark (std::function&lt;void()&gt; f) {
  &hellip;
}</pre>
<pre>
<b>commit message</b>:
  * updating codebase to C++11, `benchmark` now returns the appropriate
    duration type from &lt;chrono&gt;

<b>commit diff</b>:
  --- benchmark.cpp  2014-07-28 03:56:32.255764544 +0200
  +++ benchmark.cpp  2014-07-28 03:56:53.175682956 +0200
  @@ -5,6 +5,6 @@
    *  \return The duration in ms spent evaluating `f()`
    * */
   
 <del>- unsigned long benchmark (std::function&lt;void()&gt; f) {</del>
 <ins>+ std::chrono::microseconds benchmark (std::function&lt;void()&gt; f) {</ins>
     &hellip;
   }</pre>

      </li>

      <li>
        <p>
          A developer might not know the <em>return-type</em> of a function when
          he writes his <em>return-statement</em>; therefore he should have a
          mechanism to disable initializations that potentially do something
          which was never intended - no matter if such initialization makes use
          of one, or several, arguments.
        </p>


        <pre>
// return-statement.example.2

template&lt;class T&gt;
struct Vector {
  explicit Vector (int size, int capacity = 0);
           Vector (std::initializer_list&lt;T&gt; data);
};

template&lt;class T, class... Ts&gt;
Vector&lt;T&gt; make_vector (Ts... args) {
  return { args... };
}</pre>
<pre>
int main () {
  using secs = std::chrono::seconds;

  auto x = make_vector&lt; int&gt; (1,5,10);
  auto y = make_vector&lt;secs&gt; (10, 20); // error: chosen constructor is explicit in copy-initialization
}</pre>
      </li>
    </ul>
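    <p>
      The body of <code>benchmark</code> is elided in the first example above,
      so the following is only a sketch of how a corrected version might look,
      assuming <code>std::chrono::steady_clock</code> is used for timing. It
      also records, as a <code>static_assert</code>, the property that makes
      the compiler catch the unit mix-up today.
    </p>

```cpp
#include <cassert>
#include <chrono>
#include <functional>
#include <type_traits>

// The std::chrono::duration constructor taking a raw tick count is explicit,
// so a plain number never implicitly becomes a duration.
static_assert(!std::is_convertible<unsigned long, std::chrono::microseconds>::value,
              "a raw count does not implicitly convert to microseconds");

std::chrono::microseconds benchmark(std::function<void()> f) {
  using clock = std::chrono::steady_clock;  // assumption: any steady clock will do

  const clock::time_point start = clock::now();
  f();
  const clock::duration elapsed = clock::now() - start;

  // unsigned long ms = ...; return { ms };  // ill-formed today; would be
  //                                         // accepted under N4074 and read
  //                                         // milliseconds as microseconds
  return std::chrono::duration_cast<std::chrono::microseconds>(elapsed);
}
```

    <p>
      Here the unit conversion is spelled out with
      <code>std::chrono::duration_cast</code>, so a change of
      <em>return-type</em> cannot silently reinterpret the number.
    </p>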

    <h4>Implications if N4074 is approved:</h4>

    <p>
      Even though I agree with the opinion raised by N4074, that a developer
      <em>should</em> know the <em>return-type</em> and the <em>return-paths</em> of
      the function he is working on, I find it of higher value that the compiler
      is able to stop potential brainfarts from ever making it to runtime.
    </p>

    <p>
      Neither of the two previous examples would be caught during compilation if
      N4074 is approved. This means that these somewhat trivial errors would leak
      out into the world of runtime, something which the strict <em>type-safety</em>
      of C++ has saved us from in the past.
    </p>

    <hr />

    <a name="conclusion"></a>
    <h2>Conclusion</h2>

    <p>
      The changes proposed by N4074 are a violation of one of the fundamental
      type-safety philosophies of C++; if it's not clear that a potentially
      unsafe conversion can happen, we - as developers - would like the compiler
      to diagnose the potential error.  It doesn't make sense for the rules of
      <em>copy-list-initialization</em> to differ in <em>return-statements</em>
      since we are by definition initializing a distant type - and with that, a
      distant value.
    </p>

    <p>
      If N4074 is approved, there are other cases to which such a change would
      need to propagate for it to make sense. Following the philosophy expressed
      by N4074, <code>private</code> member functions of a class are maintained
      by the same developer who is calling them (as they are implementation
      details); should we then allow <code>explicit</code> constructors to be
      used for <em>copy-list-initialization</em> of the arguments when invoking
      such a function? After all, the developer <em>should</em> know what
      is going on.
    </p>
  </body>
</html>

