<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Issue 3456: Pattern used by std::from_chars is underspecified</title>
<meta property="og:title" content="Issue 3456: Pattern used by std::from_chars is underspecified">
<meta property="og:description" content="C++ library issue. Status: New">
<meta property="og:url" content="https://cplusplus.github.io/LWG/issue3456.html">
<meta property="og:type" content="website">
<meta property="og:image" content="http://cplusplus.github.io/LWG/images/cpp_logo.png">
<meta property="og:image:alt" content="C++ logo">
<style>
  p {text-align:justify}
  li {text-align:justify}
  pre code.backtick::before { content: "`" }
  pre code.backtick::after { content: "`" }
  blockquote.note
  {
    background-color:#E0E0E0;
    padding-left: 15px;
    padding-right: 15px;
    padding-top: 1px;
    padding-bottom: 1px;
  }
  ins {background-color:#A0FFA0}
  del {background-color:#FFA0A0}
  table.issues-index { border: 1px solid; border-collapse: collapse; }
  table.issues-index th { text-align: center; padding: 4px; border: 1px solid; }
  table.issues-index td { padding: 4px; border: 1px solid; }
  table.issues-index td:nth-child(1) { text-align: right; }
  table.issues-index td:nth-child(2) { text-align: left; }
  table.issues-index td:nth-child(3) { text-align: left; }
  table.issues-index td:nth-child(4) { text-align: left; }
  table.issues-index td:nth-child(5) { text-align: center; }
  table.issues-index td:nth-child(6) { text-align: center; }
  table.issues-index td:nth-child(7) { text-align: left; }
  table.issues-index td:nth-child(5) span.no-pr { color: red; }
  @media (prefers-color-scheme: dark) {
     html {
        color: #ddd;
        background-color: black;
     }
     ins {
        background-color: #225522
     }
     del {
        background-color: #662222
     }
     a {
        color: #6af
     }
     a:visited {
        color: #6af
     }
     blockquote.note
     {
        background-color: rgba(255, 255, 255, .10)
     }
  }
</style>
</head>
<body>
<hr>
<p><em>This page is a snapshot from the LWG issues list, see the <a href="lwg-active.html">Library Active Issues List</a> for more information and the meaning of <a href="lwg-active.html#New">New</a> status.</em></p>
<h3 id="3456"><a href="lwg-active.html#3456">3456</a>. Pattern used by <code>std::from_chars</code> is underspecified</h3>
<p><b>Section:</b> 28.2.3 <a href="https://wg21.link/charconv.from.chars">[charconv.from.chars]</a> <b>Status:</b> <a href="lwg-active.html#New">New</a>
 <b>Submitter:</b> Jonathan Wakely <b>Opened:</b> 2020-06-23 <b>Last modified:</b> 2020-09-06</p>
<p><b>Priority: </b>3
</p>
<p><b>View other</b> <a href="lwg-index-open.html#charconv.from.chars">active issues</a> in [charconv.from.chars].</p>
<p><b>View all other</b> <a href="lwg-index.html#charconv.from.chars">issues</a> in [charconv.from.chars].</p>
<p><b>View all issues with</b> <a href="lwg-status.html#New">New</a> status.</p>
<p><b>Discussion:</b></p>
<p>
The intention of 28.2.3 <a href="https://wg21.link/charconv.from.chars">[charconv.from.chars]</a> p7 is that the <code>fmt</code> argument modifies 
the expected pattern, so that only a specific subset of valid <code>strtod</code> patterns is recognized 
for each format. This is not clear from the wording.
<p/>
When <code>fmt == chars_format::fixed</code> no exponent is to be used, so any trailing characters that match 
the form of a <code>strtod</code> exponent are ignored. For example, <code>"1.23e4"</code> should produce the result 
<code>1.23</code> for the fixed format. The current wording says "the optional exponent part shall not appear" 
which can be interpreted to mean that <code>"1.23e4"</code> violates a precondition and so has undefined behaviour!
<p/>
When <code>fmt != chars_format::hex</code> only decimal numbers should be recognized. This means that for any
format except scientific, <code>"0x123"</code> produces <code>0.0</code> (it's invalid when 
<code>fmt == chars_format::scientific</code> because there's no exponent). The current wording only says that 
when <code>hex</code> is used the string has an assumed <code>"0x"</code> prefix, and so is interpreted as a hexadecimal 
float. It doesn't say that when <code>fmt != hex</code> the string is <em>not</em> interpreted as a 
hexadecimal float.
<p/>
Two alternative resolutions are provided: one is a minimal fix, and the other attempts to make the wording clearer by 
not referring to a modified version of the C rules.
</p>
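<p>
The intended results described above can be sketched as a small C++ check. This is an illustration only: 
<code>ParseResult</code>, <code>parse_double</code> and <code>check_intended_results</code> are hypothetical names, 
not part of the standard or of this issue, and the assertions record the intended reading of p7 rather than 
behaviour that the current wording clearly guarantees. It assumes an implementation that provides the 
floating-point overloads of <code>std::from_chars</code>.
</p>

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical helper for illustration; not part of the standard or of this issue.
struct ParseResult {
    double value;          // left at 0.0 when nothing matches
    std::size_t consumed;  // number of characters matched by the pattern
    std::errc ec;          // std::errc{} on success
};

inline ParseResult parse_double(std::string_view text, std::chars_format fmt) {
    double value = 0.0;
    const char* first = text.data();
    const char* last = first + text.size();
    const std::from_chars_result r = std::from_chars(first, last, value, fmt);
    return {value, static_cast<std::size_t>(r.ptr - first), r.ec};
}

// Records the intended results from the discussion as assertions.
inline void check_intended_results() {
    using std::chars_format;

    // fixed: the trailing "e4" is not part of the pattern.
    const ParseResult fixed = parse_double("1.23e4", chars_format::fixed);
    assert(fixed.ec == std::errc{} && fixed.value == 1.23 && fixed.consumed == 4);

    // general: the exponent is optional and is consumed when present.
    const ParseResult general = parse_double("1.23e4", chars_format::general);
    assert(general.ec == std::errc{} && general.value == 12300.0 && general.consumed == 6);

    // Not hex: only the leading "0" matches, leaving "x123" unparsed.
    const ParseResult prefixed = parse_double("0x123", chars_format::general);
    assert(prefixed.ec == std::errc{} && prefixed.value == 0.0 && prefixed.consumed == 1);

    // scientific: "0" has no exponent part, so nothing matches.
    const ParseResult sci = parse_double("0x123", chars_format::scientific);
    assert(sci.ec == std::errc::invalid_argument && sci.consumed == 0);
}
```

<p>
Calling <code>check_intended_results()</code> from a test program is enough to exercise all four cases. An 
implementation that reads the current wording differently could fail one of the assertions.
</p>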

<p><i>[2020-07-14; Jonathan fixes the <code>strtod</code> call in Option B]</i></p>


<p><i>[2020-07-17; Priority set to 3 in telecon]</i></p>



<p id="res-3456"><b>Proposed resolution:</b></p>
<p>
This wording is relative to <a href="https://wg21.link/n4861">N4861</a>. 
</p>

<b>Option A:</b>
<ol>
<li><p>Modify 28.2.3 <a href="https://wg21.link/charconv.from.chars">[charconv.from.chars]</a> as indicated:</p>

<blockquote>
<pre>
from_chars_result from_chars(const char* first, const char* last, float&amp; value,
                             chars_format fmt = chars_format::general);
from_chars_result from_chars(const char* first, const char* last, double&amp; value,
                             chars_format fmt = chars_format::general);
from_chars_result from_chars(const char* first, const char* last, long double&amp; value,
                             chars_format fmt = chars_format::general);
</pre>
<blockquote>
<p>
-6- <i>Preconditions:</i> <code>fmt</code> has the value of one of the enumerators of <code>chars_format</code>.
<p/>
-7- <i>Effects:</i> The pattern is the expected form of the subject sequence in the "C" locale, as described for
<code>strtod</code>, except that
</p>
<ol style="list-style-type: none">
<li><p>(7.1) &mdash; the sign <code>'+'</code> may only appear in the exponent part;</p></li>
<li><p>(7.2) &mdash; if <code>fmt</code> has <code>chars_format::scientific</code> set but not <code>chars_format::fixed</code>, 
the <del>otherwise optional exponent part shall appear</del><ins>exponent part is not optional</ins>;</p></li>
<li><p>(7.3) &mdash; if <code>fmt</code> has <code>chars_format::fixed</code> set but not <code>chars_format::scientific</code>, 
<del>the optional exponent part shall not appear; and</del><ins>there is no exponent part;</ins></p></li>
<li><p><ins>(?.?) &mdash; if <code>fmt</code> is not <code>chars_format::hex</code>, only decimal digits and an 
optional <code>'.'</code> appear before the exponent part (if any); and</ins></p></li>
<li><p>(7.4) &mdash; if <code>fmt</code> is <code>chars_format::hex</code>, the prefix <code>"0x"</code> or <code>"0X"</code> is 
assumed. [<i>Example:</i> The string <code>0x123</code> is parsed to have the value <code>0</code> with remaining characters 
<code>x123</code>. &mdash; <i>end example</i>]</p></li>
</ol>
<p>
In any case, the resulting <code>value</code> is one of at most two floating-point values closest to the value of the
string matching the pattern.
</p>
</blockquote>
</blockquote>
</li>
</ol>
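<p>
The example in (7.4) can be tried with a short sketch like the following. <code>HexParse</code>, 
<code>parse_hex</code> and <code>check_hex_examples</code> are hypothetical names used only for illustration, and 
the expected values assume the reading of <code>chars_format::hex</code> given in that example, where the 
prefix is assumed and so is never consumed from the input.
</p>

```cpp
#include <cassert>
#include <charconv>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical helper for illustration; not part of the proposed wording.
struct HexParse {
    double value;           // left at 0.0 when nothing matches
    std::string_view rest;  // characters remaining after the matched pattern
    std::errc ec;           // std::errc{} on success
};

inline HexParse parse_hex(std::string_view text) {
    double value = 0.0;
    const char* first = text.data();
    const char* last = first + text.size();
    const std::from_chars_result r =
        std::from_chars(first, last, value, std::chars_format::hex);
    return {value, std::string_view(r.ptr, static_cast<std::size_t>(last - r.ptr)), r.ec};
}

inline void check_hex_examples() {
    // The "0x" prefix is assumed, so only "0" is matched and "x123" remains.
    const HexParse prefixed = parse_hex("0x123");
    assert(prefixed.ec == std::errc{} && prefixed.value == 0.0 && prefixed.rest == "x123");

    // Without a prefix, the digits are read as hexadecimal: 0x123 == 291.
    const HexParse plain = parse_hex("123");
    assert(plain.ec == std::errc{} && plain.value == 291.0 && plain.rest.empty());

    // Hexadecimal fraction with a binary exponent: 0x1.8p1 == 1.5 * 2 == 3.
    const HexParse with_exp = parse_hex("1.8p1");
    assert(with_exp.ec == std::errc{} && with_exp.value == 3.0 && with_exp.rest.empty());
}
```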

<b>Option B:</b>
<ol>
<li><p>Modify 28.2.3 <a href="https://wg21.link/charconv.from.chars">[charconv.from.chars]</a> as indicated:</p>

<blockquote>
<pre>
from_chars_result from_chars(const char* first, const char* last, float&amp; value,
                             chars_format fmt = chars_format::general);
from_chars_result from_chars(const char* first, const char* last, double&amp; value,
                             chars_format fmt = chars_format::general);
from_chars_result from_chars(const char* first, const char* last, long double&amp; value,
                             chars_format fmt = chars_format::general);
</pre>
<blockquote>
<p>
-6- <i>Preconditions:</i> <code>fmt</code> has the value of one of the enumerators of <code>chars_format</code>.
<p/>
-7- <i>Effects:</i> <del>The pattern is the expected form of the subject sequence in the "C" locale, as described for
<code>strtod</code>, except that</del><ins>The pattern is an optional <code>'-'</code> sign followed by one of:</ins>
</p>
<ol style="list-style-type: none">
<li><p>(7.1) &mdash; <del>the sign <code>'+'</code> may only appear in the exponent part</del><ins><code>INF</code> or 
<code>INFINITY</code>, ignoring case</ins>;</p></li>
<li><p>(7.2) &mdash; <del>if <code>fmt</code> has <code>chars_format::scientific</code> set but not <code>chars_format::fixed</code>, 
the otherwise optional exponent part shall appear</del><ins>if <code>numeric_limits&lt;T&gt;::has_quiet_NaN</code> is 
<code>true</code>, <code>NAN</code> or <code>NAN(</code><i>n-char-sequence<sub>opt</sub></i><code>)</code>, ignoring case in the 
<code>NAN</code> part, where:</ins></p>
<blockquote>
<pre>
<ins><i>n-char-sequence:</i>
       <i>digit</i>
       <i>nondigit</i>
       <i>n-char-sequence digit</i>
       <i>n-char-sequence nondigit</i></ins>
</pre>
</blockquote>
<p>;</p></li>
<li><p>(7.3) &mdash; <del>if <code>fmt</code> has <code>chars_format::fixed</code> set but not <code>chars_format::scientific</code>, 
the optional exponent part shall not appear; and</del><ins>if <code>fmt</code> is equal to <code>chars_format::scientific</code>, 
a sequence of characters matching <i>chars-format-dec exponent-part</i>, where:</ins></p>
<blockquote>
<pre>
<ins><i>chars-format-dec:</i>
         <i>fractional-constant</i>
         <i>digit-sequence</i></ins>
</pre>
</blockquote>
<p><ins>;</ins></p></li>
<li><p>(7.4) &mdash; <del>if <code>fmt</code> is <code>chars_format::hex</code>, the prefix <code>"0x"</code> or <code>"0X"</code> is 
assumed. [<i>Example:</i> The string <code>0x123</code> is parsed to have the value <code>0</code> with remaining characters 
<code>x123</code>. &mdash; <i>end example</i>]</del><ins>if <code>fmt</code> is equal to <code>chars_format::fixed</code>, a 
sequence of characters matching <i>chars-format-dec</i>;</ins></p></li>
<li><p><ins>(?.?) &mdash; if <code>fmt</code> is equal to <code>chars_format::general</code>, a sequence of characters matching 
<i>chars-format-dec exponent-part<sub>opt</sub></i>; or</ins></p></li>
<li><p><ins>(?.?) &mdash; if <code>fmt</code> is equal to <code>chars_format::hex</code>, a sequence of characters matching 
<i>chars-format-hex binary-exponent-part<sub>opt</sub></i>, where:</ins></p>
<blockquote>
<pre>
<ins><i>chars-format-hex:</i>
         <i>hexadecimal-fractional-constant</i>
         <i>hexadecimal-digit-sequence</i></ins>
</pre>
</blockquote>
<p><ins>[<i>Note:</i> The pattern is derived from the subject sequence in the <code>"C"</code> locale for <code>strtod</code>, 
with the value of <code>fmt</code> limiting which forms of the subject sequence are recognized, and with no <code>0x</code> 
or <code>0X</code> prefix recognized. &mdash; <i>end note</i>]</ins></p></li>
</ol>
<p>
<ins>For a character sequence <code>INF</code>, <code>INFINITY</code>, <code>NAN</code>, or 
<code>NAN(</code><i>n-char-sequence<sub>opt</sub></i><code>)</code> the resulting value is obtained as if by evaluating 
<code>strtod(string(first, last).c_str(), nullptr)</code> in the <code>"C"</code> locale.
In all other cases</ins><del>In any case</del>,
the resulting <code>value</code> is one of at most two floating-point values closest to the value of the
string matching the pattern.
</p>
</blockquote>
</blockquote>
</li>
</ol>
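<p>
The infinity and NaN forms listed in Option B can be exercised with a sketch along these lines. 
<code>SpecialParse</code>, <code>parse_general</code> and <code>check_special_values</code> are hypothetical names, 
and the assertions assume an implementation that already follows the pattern spelled out in Option B (optional 
<code>'-'</code> only, case ignored) for <code>double</code>.
</p>

```cpp
#include <cassert>
#include <charconv>
#include <cmath>
#include <cstddef>
#include <string_view>
#include <system_error>

// Hypothetical helper for illustration; not part of the proposed wording.
struct SpecialParse {
    double value;          // left at 0.0 when nothing matches
    std::size_t consumed;  // number of characters matched by the pattern
    std::errc ec;          // std::errc{} on success
};

inline SpecialParse parse_general(std::string_view text) {
    double value = 0.0;
    const char* first = text.data();
    const char* last = first + text.size();
    const std::from_chars_result r =
        std::from_chars(first, last, value, std::chars_format::general);
    return {value, static_cast<std::size_t>(r.ptr - first), r.ec};
}

inline void check_special_values() {
    // INF is matched ignoring case.
    const SpecialParse pos_inf = parse_general("inf");
    assert(pos_inf.ec == std::errc{} && std::isinf(pos_inf.value) && pos_inf.value > 0);
    assert(pos_inf.consumed == 3);

    // An optional '-' sign may precede INFINITY.
    const SpecialParse neg_inf = parse_general("-Infinity");
    assert(neg_inf.ec == std::errc{} && std::isinf(neg_inf.value) && neg_inf.value < 0);
    assert(neg_inf.consumed == 9);

    // NAN is matched ignoring case.
    const SpecialParse quiet = parse_general("NaN");
    assert(quiet.ec == std::errc{} && std::isnan(quiet.value) && quiet.consumed == 3);

    // A leading '+' is not part of the pattern, so nothing matches.
    const SpecialParse plus = parse_general("+inf");
    assert(plus.ec == std::errc::invalid_argument && plus.consumed == 0);
}
```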





</body>
</html>
