<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Issue 3968: std::endian::native value should be more specific about object representations</title>
<meta property="og:title" content="Issue 3968: std::endian::native value should be more specific about object representations">
<meta property="og:description" content="C++ library issue. Status: New">
<meta property="og:url" content="https://cplusplus.github.io/LWG/issue3968.html">
<meta property="og:type" content="website">
<meta property="og:image" content="http://cplusplus.github.io/LWG/images/cpp_logo.png">
<meta property="og:image:alt" content="C++ logo">
<style>
  p {text-align:justify}
  li {text-align:justify}
  pre code.backtick::before { content: "`" }
  pre code.backtick::after { content: "`" }
  blockquote.note
  {
    background-color:#E0E0E0;
    padding-left: 15px;
    padding-right: 15px;
    padding-top: 1px;
    padding-bottom: 1px;
  }
  ins {background-color:#A0FFA0}
  del {background-color:#FFA0A0}
  table.issues-index { border: 1px solid; border-collapse: collapse; }
  table.issues-index th { text-align: center; padding: 4px; border: 1px solid; }
  table.issues-index td { padding: 4px; border: 1px solid; }
  table.issues-index td:nth-child(1) { text-align: right; }
  table.issues-index td:nth-child(2) { text-align: left; }
  table.issues-index td:nth-child(3) { text-align: left; }
  table.issues-index td:nth-child(4) { text-align: left; }
  table.issues-index td:nth-child(5) { text-align: center; }
  table.issues-index td:nth-child(6) { text-align: center; }
  table.issues-index td:nth-child(7) { text-align: left; }
  table.issues-index td:nth-child(5) span.no-pr { color: red; }
  @media (prefers-color-scheme: dark) {
     html {
        color: #ddd;
        background-color: black;
     }
     ins {
        background-color: #225522
     }
     del {
        background-color: #662222
     }
     a {
        color: #6af
     }
     a:visited {
        color: #6af
     }
     blockquote.note
     {
        background-color: rgba(255, 255, 255, .10)
     }
  }
</style>
</head>
<body>
<hr>
<p><em>This page is a snapshot from the LWG issues list; see the <a href="lwg-active.html">Library Active Issues List</a> for more information and the meaning of <a href="lwg-active.html#New">New</a> status.</em></p>
<h3 id="3968"><a href="lwg-active.html#3968">3968</a>. <code>std::endian::native</code> value should be more specific about object representations</h3>
<p><b>Section:</b> 22.11.8 <a href="https://wg21.link/bit.endian">[bit.endian]</a> <b>Status:</b> <a href="lwg-active.html#New">New</a>
 <b>Submitter:</b> Brian Bi <b>Opened:</b> 2023-08-06 <b>Last modified:</b> 2024-02-22</p>
<p><b>Priority: </b>4
</p>
<p><b>View all issues with</b> <a href="lwg-status.html#New">New</a> status.</p>
<p><b>Discussion:</b></p>
<p>
22.11.8 <a href="https://wg21.link/bit.endian">[bit.endian]</a> says that "big-endian" and "little-endian" refer to whether bytes are stored 
in descending or ascending order of significance. In other words, when <code>std::endian::native</code> is either 
<code>std::endian::big</code> or <code>std::endian::little</code>, we are told something about the object representations 
of multi-byte scalar types. However, the guarantee provided in this case is not strong enough to fully specify 
the object representation, even in the common situation where padding bits are not present. It would be more 
useful to provide a stronger guarantee.
<p/>
Consider, for example, a platform where <code>char</code> is 8 bits wide and a <code>uint32_t</code> type exists. 
If <code>std::endian::native</code> is <code>std::endian::little</code>, then the program should be able to rely on the 
fact that if a <code>uint32_t</code> object is copied into an array of 4 <code>unsigned char</code>, then the value of 
the first element of that array actually equals the original value modulo 256. However, because 
<a href="https://wg21.link/P1236R1" title=" Alternative Wording for P0907R4 Signed Integers are Two's Complement">P1236R1</a> removed the core language specification of the value representation of unsigned integer 
types, the program cannot actually rely on this. It is conceivable (though unlikely), for example, that 
<code>std::endian::native</code> could be <code>std::endian::little</code> but the first byte in a <code>uint32_t</code> 
object is actually the least significant 8 bits flipped, or the least significant 8 bits permuted, or something 
like that.
</p>

<p><i>[2024-02-22; Reflector poll]</i></p>

<p>
Set priority to 4 after reflector poll in August 2023.
</p>
<p><i>[Jonathan expressed shock that <a href="https://wg21.link/P1236R1" title=" Alternative Wording for P0907R4 Signed Integers are Two's Complement">P1236R1</a> removed portability guarantees that were previously present.]</i></p>

<p><i>[Jens explained that no observable guarantees were ever present anyway, which is why Core removed the wording.]</i></p>

<p>
I agree with the thrust of the issue (i.e. the special values for
<code>std::endian</code> should permit reliance on a particular object
representation), but I disagree with the wording chosen.  The
"pure binary" phrasing that is sort-of defined in a footnote
is bad.  I think we want to say that all scalar types have no
padding bits and that the base-2 representation of
an unsigned integer type is formed by the bit concatenation
of the base-2 representations of the "unsigned char" values that
comprise the object representation of that unsigned integer type.
"bit concatenation" should best be phrased in math, e.g.
given a value <em>x</em> of some unsigned integer type and the
sequence of unsigned char values c<sup>j</sup> (each having width M)
comprising the object representation of x,
the coefficients of the base-2 representation of x are
 x<sub>i</sub> = c<sup>&lfloor;i/M&rfloor;</sup><sub>i mod M</sub>
or somesuch.  See 7.6.11 <a href="https://wg21.link/expr.bit.and">[expr.bit.and]</a> for some phrasing in this area.
</p>



<p id="res-3968"><b>Proposed resolution:</b></p>
<p>
This wording is relative to <a href="https://wg21.link/N4950" title=" Working Draft, Standard for Programming Language C++">N4950</a>.
</p>

<ol>

<li><p>Modify 22.11.8 <a href="https://wg21.link/bit.endian">[bit.endian]</a> as indicated, using 
<a href="https://timsong-cpp.github.io/cppwp/n4659/basic.fundamental#7">removed wording from C++17</a>:</p>

<blockquote>
<p>
-2- <del>If all scalar types have size 1 byte, then all of <code>endian::little</code>, <code>endian::big</code>, 
and <code>endian::native</code> have the same value. Otherwise, <code>endian::little</code> is not equal to 
<code>endian::big</code>. If all scalar types are big-endian, <code>endian::native</code> is equal to 
<code>endian::big</code>. If all scalar types are little-endian, <code>endian::native</code> is equal to 
<code>endian::little</code>. Otherwise, <code>endian::native</code> is not equal to either <code>endian::big</code> or
<code>endian::little</code>.</del><ins><code>endian::little</code> is equal to <code>endian::big</code> if and only if 
all scalar types have size 1 byte. If the value representation (6.9 <a href="https://wg21.link/basic.types">[basic.types]</a>) of every 
unsigned integer type uses a pure binary numeration system<sup>footnote ?</sup>, then:</ins>
</p>
<ul>
<li><p><ins>If all scalar types have size 1 byte, then <code>endian::native</code> is equal to the common value 
of <code>endian::little</code> and <code>endian::big</code>.</ins></p></li>
<li><p><ins>Otherwise, if all scalar types are big-endian, <code>endian::native</code> is equal to <code>endian::big</code>.</ins></p></li>
<li><p><ins>Otherwise, if all scalar types are little-endian, <code>endian::native</code> is equal to <code>endian::little</code>.</ins></p></li>
<li><p><ins>Otherwise, <code>endian::native</code> is not equal to either <code>endian::big</code> or <code>endian::little</code>.</ins></p></li>
</ul>
<p>
<ins>Otherwise, <code>endian::native</code> is not equal to either <code>endian::big</code> or <code>endian::little</code>.</ins>
</p>
<blockquote><p>
<ins>footnote ?) A positional representation for integers that uses the binary digits 0 and 1, in which the 
values represented by successive bits are additive, begin with 1, and are multiplied by successive integral 
powers of 2, except perhaps for the bit with the highest position. (Adapted from the American National 
Dictionary for Information Processing Systems.)</ins>
</p></blockquote>
</blockquote>
</li>
</ol>





</body>
</html>
