<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
  <head>
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    <title>User-defined erroneous behaviour</title>
    <style type="text/css">
      html { margin: 0; padding: 0; color: black; background-color: white; }
      body { margin: 0 auto; padding: 2em; font-size: medium; font-family: "DejaVu Serif", serif; line-height: 150%; max-width: 60em; }
      code { font-family: "DejaVu Sans Mono", monospace; color: #006; }

      h1, h2, h3 { margin: 1.5em 0 .75em 0; line-height: 125%; }
      h1 { clear: both; }

      div.code { white-space: pre-line; font-family: "DejaVu Sans Mono", monospace;
                 border: thin solid #E0E0E0; background-color: #F8F8F8; padding: 1em;
                 border-radius: 4px; }

      div.strictpre { white-space: pre; }

      sub, sup { margin: 0; padding: 0; line-height: 100%; }

      table { border-collapse: collapse; margin: 2em auto; }
      table caption { margin: 2ex 0 0 0; caption-side: bottom; font-family: "DejaVu Sans", sans-serif; font-size: small; }

      th, td { text-align: left; padding: .5ex 1em; margin: 0; }
      th {  vertical-align: middle; }
      td {  vertical-align: top; }

      td.new { background-color: #EFE; }
      td.new:after { content: "new!"; font-family: "DejaVu Sans", sans-serif; font-weight: bold; font-size: xx-small;
                     vertical-align: top; top: -1em; right: -1em; position: relative; float: right; color: #090; }

      thead th { border-top: 2px solid #333; border-bottom: 2px solid #333; }
      tbody.lined td, tr.line td { border-bottom: 1px solid #333; }
      tr.final td { border-bottom: 2px solid #333; }

      .code .note { font-family: "DejaVu Sans", sans-serif; font-size: small; padding: 0; margin: 0; color: #333; }

      .docinfo { float: right }
      .docinfo p { margin: 0; text-align:right; }
      .docinfo address { font-style: normal; margin-bottom: 2em; }

      .quote { display: inline-block; clear: both; margin-left: 1ex;
                 border: thin solid #E0E0E0; background-color: #F8F8F8; padding: 1ex; }

      .modify { border-left: thick solid #999; border-right: thick solid #999; padding: 0 1em; }
      .insert { border-left: thick solid #0A0; border-right: thick solid #0A0; padding: 0 1em; }
      .insert h3, .insert h4, .insert p { text-decoration: underline; color: #0A0; }
      .comment { color: #456; }
      .inclassit { font-family: "DejaVu Serif", serif; font-style: italic; }
      .insinline { border-bottom: 2px solid #0A0; }

      ins { color: #090; }
      del { color: #A00; }
      ins code, del code, .insert code { color: inherit; }

      ul.wide li { margin-bottom: 1em; }
      ul.wide li div.code { padding: 0.25ex 1ex; margin: 1ex 0; }
    </style>
  </head>
  <body>
    <div class="docinfo">
      <p>ISO/IEC JTC1 SC22 WG21 P3232R1</p>
      <p>Date: 2024-11-18</p>
      <p>To: EWG, LWG, CWG (formerly also SG12, SG23)</p>
      <address>
        Thomas K&ouml;ppe &lt;<a href="mailto:tkoeppe@google.com">tkoeppe@google.com</a>&gt;
      </address>
    </div>

    <h1>User-defined erroneous behaviour</h1>

    <h2>Contents</h2>
    <!-- fgrep -e "<h2 id=" user_def_erroneous.html | sed -e 's/.*id="\(.*\)">\(.*\)<\/h2>/<li><a href="#\1">\2<\/a><\/li>/g' -->
    <ol>
      <li><a href="#history">Revision history</a></li>
      <li><a href="#summary">Summary</a></li>
      <li><a href="#motivation">Motivation</a></li>
      <li><a href="#proposal">Proposal: <code>std::erroneous</code></a></li>
      <li><a href="#howto">How to use <code>std::erroneous</code></a></li>
      <li><a href="#impact">Impact and implementability</a></li>
      <li><a href="#alternatives">Alternatives</a></li>
      <li><a href="#wording">Proposed wording</a></li>
      <li><a href="#ack">Acknowledgements</a></li>
      <li><a href="#references">References</a></li>
    </ol>

    <h2 id="history">Revision history</h2>
    <ul>
      <li><a href="https://open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3232r0.html">P3232R0:</a> Initial revision.</li>
      <li>P3232R1: This revision; adds further motivation.</li>
    </ul>

    <h2 id="summary">Summary</h2>

    <p>
      We propose a language-support library function that has no effect other than
      to cause erroneous behaviour. This allows user-defined APIs to include erroneous behaviour.
    </p>

    <h2 id="motivation">Motivation</h2>

    <p>
      The purpose of introducing the novel erroneous behaviour in
      <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p2795r5.html">P2795R5</a>
      was to add to C++ the ability to express operations that have well-defined behaviour in
      the presence of certain programming errors, thereby mitigating the safety and security
      implications of said errors: Erroneous behaviour is well-defined and part of the
      observable behaviour of a program, and the compiler must not assume that it does not
      happen (unlike undefined behaviour).
    </p>
    <p>Recall the guiding observation:</p>
    <p style="text-align: center; font-style: italic; padding: 1em 0;"><span style="padding: 1em; border-radius: 1ex; border: medium solid #0A0;">Erroneous behaviour is well-defined behaviour.</span></p>
    <p>
      P2795R5 only prescribed erroneous behaviour for a particular operation (namely reading an
      uninitialized variable with automatic storage duration).
    </p>
    <p>But being able to declare parts of an API erroneous is also useful for user-defined APIs
      that have preconditions (i.e. a &ldquo;narrow contract&rdquo;). It is a programming error
      to invoke such an API when preconditions are not met, but currently, attempts to make this
      API &ldquo;safe&rdquo;, that is, to limit the damage caused by calling it out of contract,
      suffer various shortcomings:
    </p>
    <ul>
      <li>We can leave the behaviour out of contract undefined. This is easy to specify and
        follows the Standard Library convention, but has poor safety implications, as basically
        anything could happen. The library implementer is not constrained in their implementation,
        which may or may not encounter actual undefined behaviour, or take some reasonable precautions.</li>
      <li>
        We can fully specify the behaviour out of contract (e.g. terminating or throwing an
        exception). This makes the API technically have a wide contract (as there are no
        technical preconditions), and the intended preconditions have to be communicated
        out-of-band, e.g. just in natural language.</li>
      <li>
        We can check the preconditions in a subset of cases (e.g. with <code>assert</code>
        when <code>NDEBUG</code> is not defined). This retains the narrow contract, but leaves
        the safety implications somewhat opaque.</li>
      <li>A hypothetical &ldquo;contracts&rdquo; facility in the language (e.g. see
        <a href="https://wg21.link/P2900R10">P2900R10</a>
        for a recent iteration) would allow a programmatic statement of the precondition, but there
        is not yet an agreed-upon definition of the precise semantics.</li>
    </ul>

    <p>With erroneous behaviour, we can get the best of both worlds: we can keep the narrow
      contract, still have well-defined, i.e. &ldquo;safe&rdquo;, behaviour in case of violation
      (because remember, erroneous behaviour is well-defined behaviour), but we make it
      unambiguously clear that a precondition violation is a programming error. Production code
      will not misbehave arbitrarily, but instead behave predictably, and debugging tools can
      actually diagnose the programming error confidently.</p>

    <p>
      To illustrate this with a simple example, let us consider a small function that computes the
      quotient of two floating-point numbers and has the precondition that the denominator not
      be zero:
    </p>

    <table>
      <col><col>
      <tbody>
        <tr>
          <th>Undefined behaviour</th>
          <td><div class="code"><span class="comment">// Precondition: `den` must not be zero</span>
              float quotient(float num, float den) {
              &nbsp; return num / den;
              }</div></td>
        </tr>
        <tr>
          <th>Checked with <code>assert</code></th>
          <td><div class="code"><span class="comment">// Precondition: `den` must not be zero</span>
              float quotient(float num, float den) {
              &nbsp; <span style="color: #00B;">assert(den != 0);</span>
              &nbsp; return num / den;
              }</div></td>
        </tr>
        <tr>
          <th>Well-defined violation</th>
          <td><div class="code"><span class="comment">// Precondition: `den` must not be zero,</span>
              <span class="comment">// terminates otherwise</span>
              float quotient(float num, float den) {
              &nbsp; <span style="color: #00B;">if (den == 0) { std::abort(); }</span>
              &nbsp; return num / den;
              }</div></td>
        </tr>
        <tr>
          <th>With a &ldquo;contracts&rdquo; facility,<br>precondition:</th>
          <td><div class="code"><span class="comment">// Precondition and contract: `den` must not be zero.</span>
              float quotient(float num, float den) <span style="color: #00B;">pre(den != 0)</span> {
              &nbsp; <span class="comment">// Option #1: undefined behaviour on violation</span>
              &nbsp; /* <span class="comment">nothing</span> */

              &nbsp; <span class="comment">// Option #2: well-defined behaviour on violation</span>
              &nbsp; <span style="color: #00B;">if (den == 0) { return -2; }</span>

              &nbsp; return num / den;
              }</div></td>
        </tr>
        <tr>
          <th>With a &ldquo;contracts&rdquo; facility,<br><code>contract_assert</code>:</th>
          <td><div class="code"><span class="comment">// Precondition: `den` must not be zero.</span>
              float quotient(float num, float den) {
              &nbsp; <span style="color: #00B;">contract_assert(den != 0);</span>

              &nbsp; <span class="comment">// Option #1: undefined behaviour on violation</span>
              &nbsp; /* <span class="comment">nothing</span> */

              &nbsp; <span class="comment">// Option #2: well-defined behaviour on violation</span>
              &nbsp; <span style="color: #00B;">if (den == 0) { return -2; }</span>

              &nbsp; return num / den;
              }</div></td>
        </tr>
        <tr>
          <th><span style="color: #090;">Proposal: violation is erroneous</span></th>
          <td><div class="code"><span class="comment">// Returns the quotient `num`/`den`;</span>
              <span class="comment">// if `den` is zero, returns -2 erroneously.</span>
              float quotient(float num, float den) {
              &nbsp; if (den == 0) { <span style="color: #090;">std::erroneous()</span>; return -2; }
              &nbsp; return quotient(unsafe_unchecked, num, den);
              }

              <span class="comment">// As above, but precondition violation is undefined</span>
              float quotient(unsafe_unchecked_t, float num, float den) {
              &nbsp; return num / den;
              }</div></td>
        </tr>
      </tbody>
    </table>
    <p>
      The final row demonstrates the proposed feature: by allowing user-defined code to be
      erroneous, we can offer a safe-by-default API that has just the same preconditions as an
      unsafe API would have had, but with well-defined behaviour (or termination; see P2795R5)
      in case the user makes a mistake. The original, unchecked API can be provided as a
      separate, explicitly annotated overload.
    </p>

    <h2 id="proposal">Proposal: <code>std::erroneous</code></h2>

    <p>
      We propose a language-support function <code>std::erroneous</code> that has no effect
      other than to have erroneous behaviour.
    </p>

    <p>
      The proposed function is to <code>std::unreachable</code> as
      erroneous behaviour is to undefined behaviour:
    </p>

    <table>
      <thead>
        <tr><th>Function</th><th>Behaviour if invoked</th></tr>
      </thead>
      <tbody>
        <tr>
          <td><code>std::unreachable</code></td>
          <td>undefined behaviour</td>
        </tr>
        <tr class="final">
          <td><code>std::erroneous</code></td>
          <td>erroneous behaviour; no effect</td>
        </tr>
      </tbody>
    </table>

    <h2 id="howto">How to use <code>std::erroneous</code></h2>

    <p><strong>Step 1: Identify an opportunity.</strong> Erroneous behaviour could be used
    in operations that have preconditions. To use erroneous behaviour, two conditions have
    to be met:</p>
    <ul>
      <li>The precondition must be testable programmatically by the called code.</li>
      <li>You have to have some implementable behaviour in mind that the operation will
      exhibit when the precondition is not met. This could be something like producing
      some fixed value, some fixed control flow, throwing an exception, or terminating.</li>
    </ul>

    <p><strong>Step 2: Consider API alternatives.</strong> Consider how you would like the
    operation to incorporate the precondition. Is it definitely always a user error for the
    precondition to be violated, or should the operation handle the precondition violation
    in a specified way that a user is allowed to use and depend on? In the latter case, the
    precondition actually becomes part of the normal operation, and the normal operation
    becomes more complex. In the former case, we tell users &ldquo;not to do that&rdquo;,
    and we can use erroneous behaviour to safeguard against misuse.</p>

    <p><strong>Step 3: Create an opt-out.</strong> If you place the operation with precondition
    into a separate function, which should be clearly labelled as something like &ldquo;unsafe&rdquo;
    or &ldquo;unchecked&rdquo;, then callers who are certain that the preconditions are met can
    choose to call this implementation and not incur any performance penalty for the precondition
    check. The main, safe function can then be implemented in terms of the unsafe one.</p>

    <p><strong>Step 4: Implement.</strong> To make the operation with preconditions use
    erroneous behaviour, check the condition, and if it does not hold, call <code>std::erroneous</code>
    and then perform the fallback behaviour identified in Step&nbsp;1.</p>
    <div class="code">if (precondition) {
      &nbsp; /* call unsafe implementation */
      } else {
      &nbsp; std::erroneous();
      &nbsp; /* fallback behaviour */
    }</div>

    <p>In fact, let us have a concrete example and consider a function with preconditions.
    We have already renamed the function with the &ldquo;Unchecked&rdquo; suffix in anticipation
    of the next step.</p>
    <div class="code">// Requires: request_type must be either kType1 or kType2
    std::uint64_t FetchRequestKeyUnchecked(int request_type, std::string_view user_name);</div>
    <p>It is always a user error to pass an invalid <code>request_type</code>. We could widen
    the contract and specify a return value, but that would complicate the API only to allow
    something that is not useful in the first place (see Step&nbsp;2). But if we decide to keep
    the API narrow, we need to pick a behaviour: terminating (with some form of debug assertion)
    is a plausible option, or perhaps we have an invalid value that we know will be rejected
    elsewhere in the system and that we can return here, erroneously:</p>
    <div class="code">// Requires: request_type must be either kType1 or kType2;
    // violation is erroneous behaviour and results in an invalid value.
    std::uint64_t FetchRequestKey(int request_type, std::string_view user_name) {
    &nbsp; switch (request_type) {
    &nbsp; &nbsp; case kType1:
    &nbsp; &nbsp; case kType2:
    &nbsp; &nbsp; &nbsp; return FetchRequestKeyUnchecked(request_type, user_name);
    &nbsp; &nbsp; default:
    &nbsp; &nbsp; &nbsp; std::erroneous();
    &nbsp; &nbsp; &nbsp; return kInvalidKey;
    &nbsp; }
    }</div>
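    <p>
      The example above leaves several names undefined. A self-contained sketch might look as
      follows, assuming placeholder values for <code>kType1</code>, <code>kType2</code> and
      <code>kInvalidKey</code>, a toy body for <code>FetchRequestKeyUnchecked</code>, and a
      hypothetical no-op stand-in <code>sketch::erroneous</code> for the proposed function.
      None of these definitions come from a real system; they only serve to make the pattern
      compile.
    </p>

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

namespace sketch {
// Hypothetical stand-in for the proposed std::erroneous(); assumed to be a no-op.
inline void erroneous() noexcept {}
}  // namespace sketch

// Placeholder constants (assumptions made for this sketch).
inline constexpr int kType1 = 1;
inline constexpr int kType2 = 2;
inline constexpr std::uint64_t kInvalidKey = 0;

// Requires: request_type must be either kType1 or kType2.
// Toy implementation: packs the request type and the name length into a key.
std::uint64_t FetchRequestKeyUnchecked(int request_type, std::string_view user_name) {
  return (static_cast<std::uint64_t>(request_type) << 32) |
         static_cast<std::uint64_t>(user_name.size() & 0xFFFFFFFFu);
}

// Requires: request_type must be either kType1 or kType2;
// violation is erroneous behaviour and results in an invalid value.
std::uint64_t FetchRequestKey(int request_type, std::string_view user_name) {
  switch (request_type) {
    case kType1:
    case kType2:
      return FetchRequestKeyUnchecked(request_type, user_name);
    default:
      sketch::erroneous();
      return kInvalidKey;
  }
}

// Small usage example: a valid and an invalid request type.
void fetch_request_key_usage() {
  assert(FetchRequestKey(kType1, "alice") != kInvalidKey);
  assert(FetchRequestKey(99, "alice") == kInvalidKey);
}
```

    <p>
      With these placeholder definitions, a valid request type never yields
      <code>kInvalidKey</code>, so the fallback value stays distinguishable from every
      in-contract result.
    </p>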

    <p><strong>Effects.</strong></p>
    <ul>
      <li>The operation is safe in the sense that precondition violation does not result in
        undefined behaviour. It either behaves as specified, or is diagnosed by a suitable tool.
        Practically, the operation cannot run into problematic compiler optimisations, since the
        compiler cannot assume that the erroneous behaviour does not happen. This is the main
        distinction from leaving the case entirely unhandled, and effectively telling the user
        that the behaviour is undefined.
      </li>
      <li>The programming error is detectable systematically by appropriate tools such as
        runtime sanitizers, since erroneous behaviour is allowed to be diagnosed. The manner of
        diagnosis is up to the tool; e.g. it is common to terminate on the first detected
        occurrence, or to continue and collect multiple detections.</li>
      <li>A &ldquo;production&rdquo; platform would likely just perform the specified fallback
        behaviour (see also the
        <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p2795r5.html#tooling">P2795R5 Tooling</a>
        section on suggested platform profiles). A high-performance, safety-uncritical platform
        could <em>assume</em> that the erroneous behaviour is not reached. On neither platform
        would calling <code>std::erroneous</code> cause termination, any more than reading an
        uninitialized variable would.
      </li>
    </ul>

    <h2 id="impact">Impact and implementability</h2>

    <p>
      Platforms that diagnose erroneous behaviour will presumably provide some builtin hook with
      which the feature can be implemented. Otherwise, it is always conforming to implement this
      feature as a no-op.
    </p>

    <h2 id="alternatives">Alternatives</h2>

    <p>An <code>[[erroneous]]</code> attribute was suggested during informal discussions, but it
      does not seem compelling: For example, it does not seem useful to annotate an entire
      function as always having erroneous behaviour. Instead, it is more composable to express
      this concern separately and once and for all, namely in the proposed
      function <code>std::erroneous</code>.</p>

    <h2 id="wording">Proposed wording</h2>

    <p>In either [support] or [diagnostics], in a header to be determined, add a new function:</p>
    <div class="insert">
      <div class="code"><ins>// User-defined erroneous behavior</ins>
        <ins>void erroneous();</ins></div>
    </div>
    <p>Add the specification:</p>
    <div class="insert">
      <p><strong>User-defined erroneous behavior</strong></p>
      <div class="code"><ins>void erroneous();</ins></div>
      <p><ins><em>Effects</em>: The behavior is erroneous; calling this function has no effect otherwise.</ins></p>
    </div>

    <h2 id="ack">Acknowledgements</h2>

    <p>Many thanks to Barry Revzin and Oliver Hunt for helpful questions and discussion, to Alisdair Meredith for review,
      and to members of SG21 for feedback on the connection to contracts.</p>

    <h2 id="references">References</h2>
    <ul>
      <li>
        Thomas K&ouml;ppe,
        <em><a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p2795r5.html">P2795R5</a>:
        Erroneous behaviour for uninitialized reads</em>.
      </li>
      <li>
        Thomas K&ouml;ppe,
        <em><a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/n5001.pdf">N5001</a>:
        Working Draft, Programming Languages &mdash; C++</em>.
      </li>
      <li>
        <span title="who is being watched">Joshua Berne</span> et al.,
        <em><a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p2900r10.pdf">P2900R10</a>:
          Contracts for C++</em>.
      </li>
    </ul>

  </body>
</html>
