<html>
<head>
<title>Digit Separators</title>
</head>

<body>
<h1>Digit Separators</h1>

<p>ISO/IEC JTC1 SC22 WG21 N2281 = 07-0141 - 2007-05-02

<p>Lawrence Crowl

<h2>Problem</h2>

<p>Numeric literals of more than a few digits are hard to read.
For example,
compare 237498123 with 237499123 for equality
and 237499123 with 20249472 for relative magnitude.

<h2>Solution</h2>

<p>We propose to add
the underscore character as a digit separator in numeric literals.
The previous examples become clearer as
237_498_123 with 237_499_123 and 
237_499_123 with 20_249_472.

<h2>Alternate Solutions</h2>

<p>Bjarne Stroustrup has suggested using a space as a separator.
While this approach is consistent with some presentation styles,
it would likely make editing tools that grab "words" less reliable.
Furthermore, the preprocessor syntax would need to change
(section 2.9 below), with potential unforeseen risk.

<h2>Unaddressed Issues</h2>

<p>This proposal does not address binary literals
or hexadecimal floating-point literals.
We believe those should be addressed in a separate paper.

<h2>Implementation</h2>

<p>The Ada programming language already takes this approach:
its numeric literals permit an underscore between adjacent digits.

<h2>Changes to the C++ Standard</h2>

<p>The changes to the standard are minimal
and introduce no incompatibilities for correct code.
The lack of incompatibilities arises
because there is no place in the grammar
where a numeric literal is followed by an identifier.

<h3>2.9 Preprocessing numbers [lex.ppnumber]</h3>

<p>To the grammar, no changes are necessary.
The preprocessing number tokens already admit an underscore
via the non-terminal <var>nondigit</var>.

<dl>
<dt><var>pp-number:</var></dt>
<dd><var>digit</var></dd>
<dd><code>.</code> <var>digit</var></dd>
<dd><var>pp-number digit</var></dd>
<dd><var>pp-number nondigit</var></dd>
<dd><var>pp-number</var> <code>e</code> <var>sign</var></dd>
<dd><var>pp-number</var> <code>E</code> <var>sign</var></dd>
<dd><var>pp-number</var> <code>.</code></dd>
</dl>
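
<p>As a rough illustration of why no grammar change is needed here,
the sketch below recognizes a <var>pp-number</var> as defined above.
The function names are hypothetical and not part of the proposal,
and the character tests assume an ASCII-compatible character set.
It suggests that a token containing underscores is already
a single preprocessing number, whereas one containing spaces is not.

```cpp
// Sketch only: hypothetical helpers, not text from the standard or the proposal.
// Assumes ASCII-compatible encoding, so letter and digit ranges are contiguous.

constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }

// nondigit per 2.10: a letter or the underscore.
constexpr bool is_nondigit(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
}

// True when the whole null-terminated string s is a single pp-number.
constexpr bool is_pp_number(const char* s) {
    if (*s == '.') ++s;               // optional leading '.'
    if (!is_digit(*s)) return false;  // followed by a mandatory digit
    ++s;
    while (*s != '\0') {
        const char prev = s[-1];      // safe: at least one character precedes s
        const bool sign_after_e =
            (*s == '+' || *s == '-') && (prev == 'e' || prev == 'E');
        if (!(is_digit(*s) || is_nondigit(*s) || *s == '.' || sign_after_e))
            return false;
        ++s;
    }
    return true;
}
```

<p>Under this sketch, <code>237_498_123</code> is one preprocessing number
and <code>237 498 123</code> is not.
Note that <code>0x_FF</code> is also accepted at this stage;
it is the literal grammars in 2.13 that decide
which placements of the underscore are valid.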

<h3>2.10 Identifiers [lex.name]</h3>

<p>To the grammar, no changes are necessary.
This section defines non-terminals used elsewhere.

<dl>
<dt><var>nondigit:</var> one of</dt>
<dd><code>a b c d e f g h i j k l m</code></dd>
<dd><code>n o p q r s t u v w x y z</code></dd>
<dd><code>A B C D E F G H I J K L M</code></dd>
<dd><code>N O P Q R S T U V W X Y Z _</code></dd>
<dt><var>digit:</var> one of</dt>
<dd><code>0 1 2 3 4 5 6 7 8 9</code></dd>
</dl>

<h3>2.13.1 Integer literals [lex.icon]</h3>

<p>To the grammar, edit as follows,
permitting underscores between digits.
(Note that some renderings of HTML underline inserted text,
which can obscure the underscore.
Look for "more than usual" spacing in that case.)

<dl>
<dt><var>integer-literal:</var></dt>
<dd><var>decimal-literal integer-suffix<sub>opt</sub></var></dd>
<dd><var>octal-literal integer-suffix<sub>opt</sub></var></dd>
<dd><var>hexadecimal-literal integer-suffix<sub>opt</sub></var></dd>
<dt><var>decimal-literal:</var></dt>
<dd><var>nonzero-digit</var></dd>
<dd><var>decimal-literal digit</var></dd>
<dd><ins><var>decimal-literal</var> <code>_</code> <var>digit</var></ins></dd>
<dt><var>octal-literal:</var></dt>
<dd><code>0</code></dd>
<dd><var>octal-literal octal-digit</var></dd>
<dd><ins><var>octal-literal</var> <code>_</code> <var>octal-digit</var></ins></dd>
<dt><var>hexadecimal-literal:</var></dt>
<dd><code>0x</code> <var>hexadecimal-digit</var></dd>
<dd><code>0X</code> <var>hexadecimal-digit</var></dd>
<dd><var>hexadecimal-literal hexadecimal-digit</var></dd>
<dd><ins><var>hexadecimal-literal</var> <code>_</code> <var>hexadecimal-digit</var></ins></dd>
<dt><var>nonzero-digit:</var> one of</dt>
<dd><code>1 2 3 4 5 6 7 8 9</code></dd>
<dt><var>octal-digit:</var> one of</dt>
<dd><code>0 1 2 3 4 5 6 7</code></dd>
<dt><var>hexadecimal-digit:</var> one of</dt>
<dd><code>0 1 2 3 4 5 6 7 8 9</code></dd>
<dd><code>a b c d e f</code></dd>
<dd><code>A B C D E F</code></dd>
</dl>
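
<p>One way to read the edited productions is that
every underscore must sit between two digits of the literal's base.
The following is a minimal sketch of recognizers for the three productions,
with the <var>integer-suffix</var> omitted.
All names are hypothetical, and an ASCII-compatible character set is assumed.

```cpp
// Sketch only: hypothetical recognizers for the proposed grammar, no suffixes.
// Assumes ASCII-compatible encoding ('a'..'f' and 'A'..'F' contiguous).

constexpr bool is_digit_in_base(char c, int base) {
    int value = -1;
    if (c >= '0' && c <= '9') value = c - '0';
    else if (c >= 'a' && c <= 'f') value = c - 'a' + 10;
    else if (c >= 'A' && c <= 'F') value = c - 'A' + 10;
    return value >= 0 && value < base;
}

// Matches the tail shared by all three productions: zero or more of
// `digit` or `_ digit`, up to the terminating '\0'.
constexpr bool is_separated_tail(const char* s, int base) {
    while (*s != '\0') {
        if (*s == '_') ++s;  // a separator must be followed by a digit
        if (!is_digit_in_base(*s, base)) return false;
        ++s;
    }
    return true;
}

// decimal-literal: nonzero-digit, then the separated tail.
constexpr bool is_decimal_literal(const char* s) {
    return *s >= '1' && *s <= '9' && is_separated_tail(s + 1, 10);
}

// octal-literal: 0, then the separated tail of octal digits.
constexpr bool is_octal_literal(const char* s) {
    return *s == '0' && is_separated_tail(s + 1, 8);
}

// hexadecimal-literal: 0x or 0X, one hexadecimal digit, then the tail.
constexpr bool is_hexadecimal_literal(const char* s) {
    return s[0] == '0' && (s[1] == 'x' || s[1] == 'X')
        && is_digit_in_base(s[2], 16) && is_separated_tail(s + 3, 16);
}
```

<p>Under this reading, <code>237_498_123</code>, <code>0_17</code>
and <code>0xFF_FF</code> are accepted,
while a trailing underscore (<code>237_</code>),
a doubled underscore (<code>2__3</code>)
and an underscore directly after the prefix (<code>0x_FF</code>) are not.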

<h3>2.13.3 Floating literals [lex.fcon]</h3>

<p>To the grammar, edit as follows,
permitting underscores between digits.
(Note that some renderings of HTML underline inserted text,
which can obscure the underscore.
Look for "more than usual" spacing in that case.)

<dl>
<dt><var>floating-literal:</var></dt>
<dd><var>fractional-constant exponent-part<sub>opt</sub> floating-suffix<sub>opt</sub></var></dd>
<dd><var>digit-sequence exponent-part floating-suffix<sub>opt</sub></var></dd>
<dt><var>fractional-constant:</var></dt>
<dd><var>digit-sequence<sub>opt</sub></var> <code>.</code> <var>digit-sequence</var></dd>
<dd><var>digit-sequence</var> <code>.</code></dd>
<dt><var>exponent-part:</var></dt>
<dd><code>e</code> <var>sign<sub>opt</sub> digit-sequence</var></dd>
<dd><code>E</code> <var>sign<sub>opt</sub> digit-sequence</var></dd>
<dt><var>sign:</var> one of</dt>
<dd><code>+ -</code></dd>
<dt><var>digit-sequence:</var></dt>
<dd><var>digit</var></dd>
<dd><var>digit-sequence digit</var></dd>
<dd><ins><var>digit-sequence</var> <code>_</code> <var>digit</var></ins></dd>
</dl>
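
<p>Because only <var>digit-sequence</var> changes,
an underscore may appear inside the integer part, the fraction, or the exponent,
but never next to the decimal point, the <code>e</code>, or the sign.
The sketch below is one possible recognizer for the edited grammar,
with the <var>floating-suffix</var> omitted.
The function names are hypothetical and an ASCII-compatible character set is assumed.

```cpp
// Sketch only: hypothetical recognizer for the proposed floating-literal
// grammar, without floating-suffix. Assumes ASCII-compatible encoding.

constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }

// Consumes the longest digit-sequence starting at s and returns a pointer
// just past it, or s itself when no digit-sequence starts there.
constexpr const char* skip_digit_sequence(const char* s) {
    if (!is_digit(*s)) return s;
    ++s;
    while (is_digit(*s) || (*s == '_' && is_digit(s[1]))) {
        s += (*s == '_') ? 2 : 1;  // `_ digit` or a plain digit
    }
    return s;
}

constexpr bool is_floating_literal(const char* s) {
    const char* p = skip_digit_sequence(s);
    const bool has_int = p != s;
    bool has_point = false;
    bool has_frac = false;
    if (*p == '.') {
        has_point = true;
        const char* q = skip_digit_sequence(p + 1);
        has_frac = q != p + 1;
        p = q;
    }
    // fractional-constant needs digits on at least one side of the point
    if (has_point && !has_int && !has_frac) return false;
    // without a point, a leading digit-sequence is required
    if (!has_point && !has_int) return false;
    bool has_exp = false;
    if (*p == 'e' || *p == 'E') {
        ++p;
        if (*p == '+' || *p == '-') ++p;
        const char* q = skip_digit_sequence(p);
        if (q == p) return false;  // exponent-part requires a digit-sequence
        has_exp = true;
        p = q;
    }
    // without a point, the exponent-part is mandatory
    if (!has_point && !has_exp) return false;
    return *p == '\0';
}
```

<p>For example, this sketch accepts <code>1_000.000_1</code>
and <code>6.022_140e2_3</code>,
and rejects <code>1_.0</code>, <code>1._0</code> and <code>1e_5</code>.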

</body>
</html>
