<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="generator" content="pandoc" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
  <title>P3565R1: Virtual floating-point values</title>
  <style>
    html {
      color: #1a1a1a;
      background-color: #fdfdfd;
    }
    body {
      margin: 0 auto;
      max-width: 36em;
      padding-left: 50px;
      padding-right: 50px;
      padding-top: 50px;
      padding-bottom: 50px;
      hyphens: auto;
      overflow-wrap: break-word;
      text-rendering: optimizeLegibility;
      font-kerning: normal;
    }
    @media (max-width: 600px) {
      body {
        font-size: 0.9em;
        padding: 12px;
      }
      h1 {
        font-size: 1.8em;
      }
    }
    @media print {
      html {
        background-color: white;
      }
      body {
        background-color: transparent;
        color: black;
        font-size: 12pt;
      }
      p, h2, h3 {
        orphans: 3;
        widows: 3;
      }
      h2, h3, h4 {
        page-break-after: avoid;
      }
    }
    p {
      margin: 1em 0;
    }
    a {
      color: #0645ad;
    }
    a:visited {
      color: #0645ad;
    }
    img {
      max-width: 100%;
    }
    svg {
      height: auto;
      max-width: 100%;
    }
    h1, h2, h3, h4, h5, h6 {
      margin-top: 1.4em;
    }
    h5, h6 {
      font-size: 1em;
      font-style: italic;
    }
    h6 {
      font-weight: normal;
    }
    ol, ul {
      padding-left: 1.7em;
      margin-top: 1em;
    }
    li > ol, li > ul {
      margin-top: 0;
    }
    blockquote {
      margin: 1em 0 1em 1.7em;
      padding-left: 1em;
      border-left: 2px solid #e6e6e6;
      color: #606060;
    }
    code {
      font-family: Menlo, Monaco, Consolas, 'Lucida Console', monospace;
      font-size: 85%;
      margin: 0;
      hyphens: manual;
    }
    pre {
      margin: 1em 0;
      overflow: auto;
    }
    pre code {
      padding: 0;
      overflow: visible;
      overflow-wrap: normal;
    }
    .sourceCode {
     background-color: transparent;
     overflow: visible;
    }
    hr {
      border: none;
      border-top: 1px solid #1a1a1a;
      height: 1px;
      margin: 1em 0;
    }
    table {
      margin: 1em 0;
      border-collapse: collapse;
      width: 100%;
      overflow-x: auto;
      display: block;
      font-variant-numeric: lining-nums tabular-nums;
    }
    table caption {
      margin-bottom: 0.75em;
    }
    tbody {
      margin-top: 0.5em;
      border-top: 1px solid #1a1a1a;
      border-bottom: 1px solid #1a1a1a;
    }
    th {
      border-top: 1px solid #1a1a1a;
      padding: 0.25em 0.5em 0.25em 0.5em;
    }
    td {
      padding: 0.125em 0.5em 0.25em 0.5em;
    }
    header {
      margin-bottom: 4em;
      text-align: center;
    }
    #TOC li {
      list-style: none;
    }
    #TOC ul {
      padding-left: 1.3em;
    }
    #TOC > ul {
      padding-left: 0;
    }
    #TOC a:not(:hover) {
      text-decoration: none;
    }
    code{white-space: pre-wrap;}
    span.smallcaps{font-variant: small-caps;}
    div.columns{display: flex; gap: min(4vw, 1.5em);}
    div.column{flex: auto; overflow-x: auto;}
    div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
    /* The extra [class] is a hack that increases specificity enough to
       override a similar rule in reveal.js */
    ul.task-list[class]{list-style: none;}
    ul.task-list li input[type="checkbox"] {
      font-size: inherit;
      width: 0.8em;
      margin: 0 0.8em 0.2em -1.6em;
      vertical-align: middle;
    }
    .display.math{display: block; text-align: center; margin: 0.5rem auto;}
    /* CSS for syntax highlighting */
    html { -webkit-text-size-adjust: 100%; }
    pre > code.sourceCode { white-space: pre; position: relative; }
    pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
    pre > code.sourceCode > span:empty { height: 1.2em; }
    .sourceCode { overflow: visible; }
    code.sourceCode > span { color: inherit; text-decoration: inherit; }
    div.sourceCode { margin: 1em 0; }
    pre.sourceCode { margin: 0; }
    @media screen {
    div.sourceCode { overflow: auto; }
    }
    @media print {
    pre > code.sourceCode { white-space: pre-wrap; }
    pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
    }
    pre.numberSource code
      { counter-reset: source-line 0; }
    pre.numberSource code > span
      { position: relative; left: -4em; counter-increment: source-line; }
    pre.numberSource code > span > a:first-child::before
      { content: counter(source-line);
        position: relative; left: -1em; text-align: right; vertical-align: baseline;
        border: none; display: inline-block;
        -webkit-touch-callout: none; -webkit-user-select: none;
        -khtml-user-select: none; -moz-user-select: none;
        -ms-user-select: none; user-select: none;
        padding: 0 4px; width: 4em;
        color: #aaaaaa;
      }
    pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa;  padding-left: 4px; }
    div.sourceCode
      {   }
    @media screen {
    pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
    }
    code span.al { color: #ff0000; font-weight: bold; } /* Alert */
    code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
    code span.at { color: #7d9029; } /* Attribute */
    code span.bn { color: #40a070; } /* BaseN */
    code span.bu { color: #008000; } /* BuiltIn */
    code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
    code span.ch { color: #4070a0; } /* Char */
    code span.cn { color: #880000; } /* Constant */
    code span.co { color: #60a0b0; font-style: italic; } /* Comment */
    code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
    code span.do { color: #ba2121; font-style: italic; } /* Documentation */
    code span.dt { color: #902000; } /* DataType */
    code span.dv { color: #40a070; } /* DecVal */
    code span.er { color: #ff0000; font-weight: bold; } /* Error */
    code span.ex { } /* Extension */
    code span.fl { color: #40a070; } /* Float */
    code span.fu { color: #06287e; } /* Function */
    code span.im { color: #008000; font-weight: bold; } /* Import */
    code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
    code span.kw { color: #007020; font-weight: bold; } /* Keyword */
    code span.op { color: #666666; } /* Operator */
    code span.ot { color: #007020; } /* Other */
    code span.pp { color: #bc7a00; } /* Preprocessor */
    code span.sc { color: #4070a0; } /* SpecialChar */
    code span.ss { color: #bb6688; } /* SpecialString */
    code span.st { color: #4070a0; } /* String */
    code span.va { color: #19177c; } /* Variable */
    code span.vs { color: #4070a0; } /* VerbatimString */
    code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
  </style>
  <style type="text/css">
  div.del, section.del { background: #fdd; }
  div.ins, section.ins { background: #cfb; }
  .mv { background: #ddf; }
  s, .del { background: #faa; margin-right: 0.15ch; }
  u, .ins { background: #afa; }
  /* from https://stackoverflow.com/a/32456613 */
  div > blockquote, body > blockquote {
      display: list-item;
      list-style-type: "- ";
  }
  /* With a 3 em gutter and two columns, ANSI letter is 127 characters wide. */
  pre {
  		margin-left: 1.2em;
  }
  .tony {
    border-collapse: collapse;
  }
  .tony > tbody > tr {
    vertical-align: top;
  }
  tr.hr, tr.hr {
    border-bottom: thin solid #60a0b0;  /* relies on collapse */
  }
  </style>
</head>
<body>
<header id="title-block-header">
<h1 class="title">P3565R1: Virtual floating-point
values</h1>
</header>
<p><em>Audience</em>: SG6<br />
S. Davis Herring &lt;<a href="mailto:herring@lanl.gov"
class="email">herring@lanl.gov</a>&gt;<br />
Los Alamos National Laboratory<br />
February 15, 2025</p>
<h1 id="history">History</h1>
<p>Since <a
href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2025/p3565r0.html">r0</a>:</p>
<ul>
<li>Added FMA example with <code>+</code> explanation</li>
<li>Discussed relationship with <code>FLT_EVAL_METHOD</code></li>
<li>Discussed effect on CWG2752</li>
<li>Discussed implementation requirements</li>
</ul>
<h1 id="introduction">Introduction</h1>
<p>The standard says very little about the actual results of
floating-point evaluations. [basic.fundamental]/12 says the “accuracy of
operations” is implementation-defined; [expr.pre]/6 makes the situation
even less clear by suggesting that floating-point operands and results
are somehow not even values of their types. Indeed, it is of practical
value that implementations often interpret an expression involving
floating-point types as a mathematical expression in order to improve the
performance and accuracy of the computation of its overall result.
Common techniques include fusing multiplications and additions,
discarding canceling terms, and temporarily using extra precision
(<em>e.g.</em>, using x87 registers). Strict application of well-defined
floating-point operations is of course critical to other numerical
algorithms; the footnote to [expr.pre]/6 suggests that “The cast and
assignment operators” may be used for that purpose.</p>
<p>These ideas are derived from C, which additionally defines the
<code>FLT_EVAL_METHOD</code> macro to describe the implementation’s
choices about such transformations. Matthias Kretz presented <a
href="https://open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3488r1.pdf">Floating-Point
Excess Precision</a> to SG6 and EWG seeking guidance on how to most
consistently interpret these ideas (in particular about floating-point
literals, the subject of <a
href="https://cplusplus.github.io/CWG/issues/2752">CWG2752</a>) in the
context of C++’s stronger type system, constant evaluation, and the
larger set of contemporary floating-point types. No clear direction has
yet been reached, suggesting that further research may be needed.</p>
<p>This paper proposes a model for the use of extra precision (when
available) for floating-point calculations that is more appropriate for
C++ than <code>FLT_EVAL_METHOD</code> and resolves CWG2752. It does not,
however, suggest mandating any particular arithmetic specification like
IEC 60559.</p>
<h1 id="discussion">Discussion</h1>
<p>The idea that an operator result of a type does not have one of the
values of that type is obviously problematic from the perspective of
defining semantics for such an operator. Moreover, the idea that
assigning to a variable forces extended precision to be discarded is
problematic in C++ because of the routine use of wrapper class types in
mathematical expressions. The creation of every such object involves the
initialization of its member variable, which seems to be just as strong
as assignment in terms of incompatibility with extralinguistic extended
precision.</p>
<p>An alternative approach is to extend the set of values for a
floating-point type beyond those that can even theoretically be stored
in the memory that an object of that type occupies. The result of the
subexpression <code>a * b</code> in <code>a * b + c</code> (all
<code>double</code>s) might
then have a value outside the set of values that can be stored in a
<code>double</code>’s space in memory (typically the
<code>binary64</code> set); the choice of that value conveys the
additional information needed to obtain the correctly rounded result of
the overall expression, as is of course implemented by an FMA instruction.
Similar careful choices of value from a larger set might capture the
bits (or finiteness) lost in <code>a + b - b</code>; for x87
implementations, the larger set is simply the values supported by the
register format. <code>FLT_EVAL_METHOD==1</code> corresponds to using
the normal set of <code>double</code>s as the extended set for
<code>float</code>.</p>
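<p>The difference between one rounding and two can be observed directly.
The following is a minimal sketch, assuming that <code>double</code> is
<code>binary64</code> with round-to-nearest; the function names are
hypothetical, and the <code>volatile</code> store is merely one way to
force a product into its in-memory format on current implementations.</p>
<div class="sourceCode"><pre
class="sourceCode cpp"><code class="sourceCode cpp">#include &lt;cmath&gt;

// Hypothetical helper: storing through a volatile object forces the
// product to be rounded to the in-memory format of double.
double rounded_product(double a, double b) {
  volatile double p = a * b;
  return p;
}

// Two roundings: the product is rounded, then the sum is rounded.
double two_roundings(double a, double b, double c) {
  return rounded_product(a, b) + c;
}

// One rounding: std::fma rounds the exact value of a * b + c once.
double one_rounding(double a, double b, double c) {
  return std::fma(a, b, c);
}</code></pre></div>
<p>Under those assumptions, for <code>a</code> = 1 + 2<sup>-27</sup>,
<code>b</code> = 1 - 2<sup>-27</sup>, and <code>c</code> = -1, the exact
product 1 - 2<sup>-54</sup> rounds to 1, so <code>two_roundings</code>
yields 0 whereas <code>one_rounding</code> yields
-2<sup>-54</sup>.</p>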
<p>The crucial specification technology is the same as used for <a
href="https://open-std.org/jtc1/sc22/wg21/docs/papers/2024/p2434r2.html">pointer
provenance</a>: the values are put in a many-to-one correspondence with
the value representations of the type. (The presence of multiple
rounding modes might require the formal duplication of values based on
the representable value to which they round, but this matters only if
the value representation is examined.) Note that every operation and
object still has a <em>single</em> value. Aside from making the
semantics tractable, the stability of values prevents unfortunate
practical results like taking both branches in</p>
<div class="sourceCode" id="cb1"><pre
class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="at">const</span> <span class="dt">double</span> x <span class="op">=</span> <span class="co">/* ... */</span><span class="op">,</span> y <span class="op">=</span> x <span class="op">+</span> epsilon <span class="op">/</span> <span class="dv">4</span><span class="op">;</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span><span class="op">(</span>x <span class="op">&lt;</span> y<span class="op">)</span> <span class="op">{</span>     <span class="co">// x87 comparison</span></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a>  <span class="co">// ...</span></span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a><span class="co">// further operations that cause spilling...</span></span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="cf">if</span><span class="op">(</span>x <span class="op">==</span> y<span class="op">)</span> <span class="op">{</span>    <span class="co">// binary64 comparison</span></span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>  <span class="co">// ...</span></span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>or failing the assert in</p>
<div class="sourceCode" id="cb2"><pre
class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="dt">float</span> id<span class="op">(</span><span class="dt">float</span> f<span class="op">)</span> <span class="op">{</span><span class="cf">return</span> f<span class="op">;}</span>  <span class="co">// no computations</span></span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> call<span class="op">()</span> <span class="op">{</span></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a>  <span class="at">const</span> <span class="dt">float</span> x <span class="op">=</span> <span class="co">/* ... */</span><span class="op">;</span></span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a>  <span class="ot">assert</span><span class="op">(</span><span class="dv">3</span> <span class="op">*</span> x <span class="op">==</span> <span class="dv">3</span> <span class="op">*</span> id<span class="op">(</span>x<span class="op">));</span></span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>Note the implication that, if <code>id</code> is not inlined,
<code>x</code> must be given a representable value for consistency. When
accuracy is more important than consistency, the unary <code>+</code>
may be used to perform a trivial calculation whose result may be
(re)rounded:</p>
<div class="sourceCode" id="cb3"><pre
class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> sink<span class="op">(</span><span class="dt">double</span><span class="op">);</span>  <span class="co">// not inlined</span></span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="dt">double</span> tee<span class="op">(</span><span class="dt">double</span> x<span class="op">,</span> <span class="dt">double</span> y<span class="op">,</span> <span class="dt">double</span> a<span class="op">)</span> <span class="op">{</span></span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a>  <span class="at">const</span> <span class="dt">double</span> m <span class="op">=</span> x <span class="op">*</span> y<span class="op">;</span></span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a>  sink<span class="op">(+</span>m<span class="op">);</span></span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a>  <span class="cf">return</span> m <span class="op">+</span> a<span class="op">;</span></span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>Because <code>sink</code> need not be provided the exact value of
<code>m</code>, an FMA may still be used for the return value. An
unusually aggressive implementation might exploit the unspecified
accuracy of even the <code>+</code> operation to perform the same
optimization if the <code>+</code> appeared instead in the
<code>return</code> statement, but that would not be consistent with
advertising arithmetic conforming to (say) IEC 60559.</p>
<p>For obvious practical reasons, a value that escapes past an
optimization frontier cannot actually store information beyond its bit
pattern. The stability requirement implies that any such value must be
normalized to its “memory type” upon computation. However, even the
member variables of wrapper objects can have an extended value if the
optimizer sees their entire lifetime (as is typical for temporaries in
the sorts of expressions we want to optimize freely), because extended
values truly are values of the type. Similarly, assignment does not need the
normalization effect described in the [expr.pre]/6 footnote; even a
value updated in a loop may be extended so long as its intermediate
values do not escape. Values passed to and returned from
standard-library mathematical functions can also be extended.</p>
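<p>As an illustration of the wrapper case, consider the following sketch.
The type <code>Scalar</code> and the function <code>axpy</code> are
hypothetical names chosen for this example, and the comments describe
what the proposed model would permit, not what any particular
implementation does.</p>
<div class="sourceCode"><pre
class="sourceCode cpp"><code class="sourceCode cpp">// Hypothetical wrapper type of the kind routinely used in C++ numerics.
struct Scalar {
  double value;
};

// Each operator initializes the member of a new temporary object.
constexpr Scalar operator*(Scalar a, Scalar b) { return {a.value * b.value}; }
constexpr Scalar operator+(Scalar a, Scalar b) { return {a.value + b.value}; }

// The temporary holding a * x never escapes this expression, so under the
// proposed model its member may keep an extended value and the sum may be
// computed as a single FMA.
double axpy(double a, double x, double y) {
  return (Scalar{a} * Scalar{x} + Scalar{y}).value;
}</code></pre></div>
<p>If initialization of <code>value</code> were instead treated like the
assignment of the [expr.pre]/6 footnote, every such temporary would force
a rounding and the contraction would be forbidden.</p>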
<p>Even literals are subject to the same rules: those that appear as
operands or that are used only locally can preserve extra digits with
any available extra precision, while those that “escape” must be
represented in the ordinary fashion for their type for consistency. One
means of such escape is a constant variable that might be used in
multiple translation units (because it is inline, appears in an
importable module unit, or is used in a template argument), unless
whole-program optimization can arrange for some particular, more
accurate value to be used in all cases.</p>
<p>As there are no opaque functions (merely insufficiently clever
optimizers), it is only prudent to retain an explicit means of requiring
normalization; <code>static_cast</code> is the obvious candidate (the
other part of the footnote), although <code>std::memcpy</code> would
also have the effect of selecting the canonical value associated with a
value representation. (For pointers, <code>std::memcpy</code> needs to
be able to preserve the abstract-machine information to prevent
undefined behavior, but here it would be unnecessary and difficult to
specify since it does affect observable behavior.)</p>
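<p>The two means of normalization just described might be spelled as
follows. This is a sketch under the proposed model only: the function
names are hypothetical, and existing implementations do not necessarily
treat the cast as a barrier to contraction.</p>
<div class="sourceCode"><pre
class="sourceCode cpp"><code class="sourceCode cpp">#include &lt;cstring&gt;

// Under the proposal, static_cast rounds the product to the canonical
// value for its value representation, so the addition sees a rounded
// operand and cannot be fused with the multiplication.
double unfused(double a, double b, double c) {
  return static_cast&lt;double&gt;(a * b) + c;
}

// Copying the object representation can carry only the bit pattern, so
// the copy has the canonical value for that representation.
double normalize_by_memcpy(double x) {
  double r;
  std::memcpy(&amp;r, &amp;x, sizeof r);
  return r;
}</code></pre></div>
<p>On an implementation that never retains extended values, both
functions behave exactly as they would without the cast or the copy.</p>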
<h2 id="implementation-issues">Implementation issues</h2>
<p>While it would be a conforming implementation of the existing
standard and of this proposal to spill all values immediately and to
never use fused multiply-adds, generating performant code with this
proposal requires the implementation to track the usage of existing
values rather than to merely round, and avoid FMA contraction, when
encountering casts and assignments. There is an analogy to escape
analysis of pointers: when a use of a value requires that it be
representable in the in-memory format (perhaps because it crosses beyond
the current optimization frontier), the implementation must arrange for
that value to be appropriately rounded when it is computed or for it to
be excluded from FMA contraction. (In a sufficiently long function, it
might be difficult to maintain that information about all values; it is
permissible to conservatively apply the treatment for values that escape
to any that have not been proven not to escape.)</p>
<p>In the SG6 discussion of this paper in Hagenberg, it was pointed out
that some existing implementations do not maintain in their intermediate
representations adequate information to identify when a value is spilled
from an x87 register (and thus rounded to its memory representation) or
involved in an FMA contraction. However, existing implementations do not
conform to the current wording (which is both inadequate and questionably
normative) by default, even with options such as
<code>-std=c++… -pedantic</code>. In that regard, this proposal can be
said to improve the alignment with existing implementations, in that
assignment (or initialization) is no longer considered to necessitate the
loss of any excess precision. (This comparison sets aside special options
like <code>-ffp-contract=on</code>, which despite its name actually
reduces the use of FMA.)</p>
<h1 id="proposal">Proposal</h1>
<p>Modify the definition of trivially-copyable types to allow
floating-point types to have multiple values (one of which is canonical)
per value representation. Specify that acquiring a value representation
gives a floating-point object the corresponding canonical value.</p>
<p>Replace the “greater precision and range” provision ([expr.pre]/6)
with a note about the contextual dependence of rounding. Specify that
unary <code>+</code> can, and that <code>static_cast</code> does, round
floating-point values to their corresponding canonical values.</p>
</body>
</html>
