<html>

<head>
  <meta http-equiv="content-type" content="text/html;">
  <title>Oo... adding a coherent character sequence to start octal-literals</title>
  <style type="text/css">
    p {
      text-align: justify
    }

    li {
      text-align: justify
    }

    blockquote.note {
      background-color: #E0E0E0;
      padding-left: 15px;
      padding-right: 15px;
      padding-top: 1px;
      padding-bottom: 1px;
    }

    ins {
      background-color: #A0FFA0
    }

    del {
      background-color: #FFA0A0
    }
  </style>
</head>

<body>
  <table>
    <tbody>
      <tr>
        <td align="left">Doc. no.</td>
        <td align="left">P0085R2</td>
      </tr>
      <tr>
        <td align="left">Audience:</td>
        <td align="left">EWG, LEWG</td>
      </tr>
      <tr>
        <td align="left">Date:</td>
        <td align="left">05-06-2025</td>
      </tr>
      <tr>
        <td align="left">Project:</td>
        <td align="left">ISO JTC1/SC22/WG21: Programming Language C++, evolution group</td>
      </tr>
      <tr>
        <td align="left">Reply to:</td>
        <td align="left">
          Jolly Chen &lt;<a href="mailto:Jolly.Chen@cern.ch">Jolly.Chen@cern.ch</a>&gt;,
          Axel Naumann &lt;<a href="mailto:Axel.Naumann@cern.ch">Axel.Naumann@cern.ch</a>&gt;,
          Michael Jonker &lt;<a href="mailto:Michael.Jonker@cern.ch">Michael.Jonker@cern.ch</a>&gt;
        </td>
      </tr>
    </tbody>
  </table>


  <h1 align="center">Oo... adding a coherent character sequence to begin octal-literals</h1>
  <!-- style and layout copied from: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4340.html -->
  <!-- for submission instructions: https://isocpp.org/std/submit-a-proposal -->

  <a name="Revision History"></a>
  <h2>Revision History</h2>
  <h3>Since <a href="https://wg21.link/p0085r1">P0085R1</a></h3>
  <ul>
    <li>Fix mismatch in technical specification of integer literals and lexical conventions</li>
    <li>Extend examples with common traps where octal-literals are assumed to be decimals by the programmer</li>
    <li>Extend effects on existing code</li>
    <li>Fixed encoding of ’</li>
  </ul>
  <h3>Since <a href="https://wg21.link/p0085r0">P0085R0</a></h3>
  <ul>
    <li>Rebased Technical Specification onto <a href="https://wg21.link/n5008">N5008</a></li>
    <li>Align grammar with C's <a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3353.htm">N3353</a></li>
    <li>Propose deprecation of original octal literal</li>
    <li>Removed proposed change "Technical Specification C run time library functions: strtol, scanf". This has been
      handled by C in
      <a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3353.htm">N3353</a>.
    </li>
    <li>Removed proposed change to Addendum, character literals. Octals in escape sequences have been added by
      <a href="https://wg21.link/p2290">P2290R3</a>
    </li>
    <li>Added open question on <tt>std::format</tt> with the <tt>#</tt> option for octal literals</li>
  </ul>

  <a name="Proposal"></a>
  <h2>Proposal</h2>
  <p>Proposal to <b>add <tt>0o</tt></b> and <b><tt>0O</tt></b> as an alternative (and preferred) sequence to introduce
    octal-literals and <b>deprecate use of the old prefix <tt>0</tt></b> for non-zero octal-literals.
  </p>
  <p>
    The syntax rule to interpret integer literals starting with a zero as
    octal-literals might be called a 'historical mistake'.
    It can be easily misunderstood by novice programmers and can lead to surprising errors.
  </p>
  <p>
    To allow future generations (of developers if not compilers) to correct this feature,
    we propose to add the character sequence <tt>0o</tt> and <tt>0O</tt> as preferred sequences to
    introduce an octal-literal. The prefix <tt>0o</tt> follows the model set by the prefix <tt>0x</tt>
    to introduce a hex-literal, and (since C++14) <tt>0b</tt> to introduce a binary-literal.
  </p>
  <p>
    Additionally, we propose to deprecate the use of octal integer literals with the <tt>0</tt> prefix, <i>
      apart from <tt>0</tt> itself</i>; this is similar to C declaring the use of nonzero octal integer literals
    without the prefix <tt>0o</tt> or <tt>0O</tt> an obsolescent feature (see 6.11.5 in <a
      href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3550.pdf">C2y draft</a>). This opens the door for
    compilers to warn about the use of the deprecated <tt>0</tt> prefix, and eventually (on the scale of a few
    releases), potentially interpret numbers with leading zeroes as decimals.
  </p>

  <p>
    From <a
      href="http://en.wikipedia.org/wiki/Octal">http://en.wikipedia.org/wiki/Octal</a>:
    <i>
      "Newer languages have been abandoning the prefix 0, as decimal numbers
      are often represented with leading zeroes.
      The prefix q was introduced to avoid the prefix o being mistaken for a
      zero,
      while the prefix 0o was introduced to avoid starting a numerical literal
      with an alphabetic character (like o or q),
      since these might cause the literal to be confused with a variable name.
      The prefix 0o also follows the model set by the prefix 0x used for
      hexadecimal literals in the C language;
      it is supported by Haskell,[11] OCaml,[12] Perl 6,[13] Python as of
      version 3.0,[14] Ruby,[15] Tcl as of version 9,[16] and it is intended
      to be supported by ECMAScript 6[17] (the prefix 0 has been discouraged
      in ECMAScript 3 and dropped in ECMAScript 5[18])."
    </i>
  </p>

  <p>
    This proposal now reflects a recent corresponding change in C
    (<a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3353.htm">N3353</a>)
    that was triggered by an earlier version of this proposal.
  </p>


  <a name="Examples"></a>
  <h2>Examples</h2>
  <h3>Padding</h3>
  <p>Leading zeros are often used as padding to align numbers. This leads
    to surprising results when the programmer expects the padded number to be interpreted as a decimal. For
    example:
  </p>

  <ul>
    <li>
      <p>Table of numbers:
      <pre>
      std::array&lt;std::array&lt;int, 2&gt;, 2&gt; table = {{
        { 100, 042 },
        { 107, 000 }
      }};</pre>
      In this case, the programmer intended to write the decimal numbers <tt>100, 42, 107</tt>, and <tt>0</tt>. However,
      the values <tt>000</tt> and <tt>042</tt> are interpreted as octal-literals. While the octal-literal <tt>000</tt>
      is equivalent to the intended decimal number <tt>0</tt>, the octal-literal
      <tt>042</tt> is equivalent to the decimal number <tt>34</tt>, triggering bugs that are hard to find by the
      octal-unaware
      programmer or reader of code.
      </p>
    </li>

    <li>
      <p>Examples of issues caused by padding version numbers:</p>
      <ul>
        <li><a
            href="https://github.com/AcademySoftwareFoundation/OpenColorIO/issues/1818">https://github.com/AcademySoftwareFoundation/OpenColorIO/issues/1818</a>
        </li>
        <li><a href="https://github.com/ThePhD/sol2/issues/265">https://github.com/ThePhD/sol2/issues/265</a>
        </li>
      </ul>
    </li>
  </ul>


  <h3>After this proposal</h3>
  <p>Following this proposal, these literals all specify the same number:</p>
  <!-- beautify ignore:start -->
  <table border="1" style="border-collapse:collapse; margin: 1em">
    <tr><th>Literal</th><th>Before</th><th>After</th></tr>
    <tr><td>Hex</td><td>0x2A</td><td>0x2A</td></tr>
    <tr><td>Binary</td><td>0b00101010</td><td>0b00101010</td></tr>
    <tr><td>Octal</td><td>052</td><td>0o52</td></tr>
  </table>
  <p>The old octal literal <tt>052</tt> will remain valid but deprecated.</p>
  <!-- beautify ignore:end -->

  <a name="Effects on existing code"></a>
  <h2>Effects on existing code</h2>
  <p>
    This proposal introduces <tt>0o</tt> and <tt>0O</tt> as new prefixes for octal-literals.
    Under the current standard, any sequence starting with <tt>0o</tt> or <tt>0O</tt> is illegal.
    Consequently, the proposed additions <tt>0o</tt> and <tt>0O</tt> will not break existing code.
  </p>
  <p>
    Additionally, this proposal deprecates the existing error-prone syntax rule for non-zero integer literals starting
    with a zero, following the example of C (<a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3353.htm">N3353</a>).
    This would affect existing, traditional octal-literals (i.e. with a leading <tt>0</tt>), for instance when defining
    POSIX file permissions (sometimes padded with multiple leading zeros), as seen in popular
    repositories (&gt;500 stars) e.g., <a
      href="https://github.com/root-project/root/blob/b690e5c5f91a2131ef8595d85644f72f87221cdf/core/base/inc/TSystem.h#L96">ROOT</a>,
    <a
      href="https://github.com/qt-creator/qt-creator/blob/ec06e808a00049352519d11dc5552ed3ec71d8e1/src/libs/gocmdbridge/client/bridgedfileaccess.cpp#L266">Qt
      Creator</a>,
    <a
      href="https://github.com/InstantWebP2P/node-android/blob/e4652cc09233c35393bf1089d7871e31f2cb1941/app/src/main/jni/constants.cpp#L60">node-android</a>,
    <a
      href="https://github.com/KDE/kdiff3/blob/2cfe5cfd2bbf79a5562099de39da27b4100578e7/src/fileaccess.cpp#L440">KDiff3</a>.
  </p>
  <p>
    We have received concerns that this deprecation leads to warnings for octal-literals that have the same
    value whether interpreted as octal or decimal. Examples of such literals are:
    <tt>01, 02, 03, 04, 05, 06, 07</tt>.
    To keep the door open for a future proposal introducing <tt>01</tt> to <tt>09</tt> as decimal literals (typical
    use case: <tt>1970y/January/01d</tt>), implementations might choose not to diagnose <tt>01</tt> to <tt>07</tt>.
  </p>

  <h2><a name="6.0">Technical Specification</a></h2>

  <p>
    To match <a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3353.htm">N3353</a>,
    make the following edits (relative to
    <a href="https://wg21.link/n5008">N5008</a>),
    highlighting the <ins>insertions</ins> and <del>removals</del>:
  </p>

  <h3><a name="5.13.2">5.13.2 Integer literals [lex.icon]</a></h3>
  <pre>
    <i>octal-literal:
      <del>0</del>
      <del>octal-literal ’<sub>opt</sub> octal-digit</del>
      <ins>prefixed-octal-literal</ins>
      <ins>unprefixed-octal-literal</ins>
    </i>
    <i><ins>unprefixed-octal-literal:</ins>
      <ins>0</ins>
      <ins>0 ’<sub>opt</sub> octal-digit-sequence</ins>
    </i>
    <i><ins>prefixed-octal-literal:</ins>
      <ins>octal-prefix octal-digit-sequence</ins></i>
  </pre>

  <p>Before <i>hexadecimal-prefix</i> insert</p>
  <pre>
    <i><ins>octal-prefix: one of</ins>
      <ins>0o 0O</ins></i></pre>

  <p>Before <i>hexadecimal-digit-sequence</i> insert</p>
  <pre>
    <i><ins>octal-digit-sequence:</ins>
      <ins>octal-digit</ins>
      <ins>octal-digit-sequence ’<sub>opt</sub> octal-digit</ins></i>
  </pre>
  <p>
    <i>
      2 [Example 1 : The number twelve can be written <tt>12, <del>014</del><ins>0o14</ins>, 0XC</tt>, or
      <tt>0b1100</tt>. The
      integer-literals 1048576, 1’048’576,
      0X100000, 0x10’0000, and <del>0’004’000’000</del> <ins>0o0’004’000’000</ins> all have
      the same value. — end example]
    </i>
  </p>
  <h3><a name="31.12.8.4">31.12.8.4 Enum class <tt>perms</tt> [fs.enum.perms]</a></h3>
  <!-- beautify ignore:start -->
  <p>Table 149 — Enum class perms [tab:fs.enum.perms]
  <table border="1">
    <tr><th>Name</th><th>Value (octal)</th><th>POSIX macro</th><th>Definition or notes</th></tr>
    <tr><td><tt>none</tt></td><td><tt>0</tt></td><td></td><td>There are no permissions set for the file.</td></tr>
    <tr><td><tt>owner_read</tt></td><td><tt>0<ins>o</ins>400</tt></td><td><tt>S_IRUSR</tt></td><td>Read permission, owner</td></tr>
    <tr><td><tt>owner_write</tt></td><td><tt>0<ins>o</ins>200</tt></td><td><tt>S_IWUSR</tt></td><td>Write permission, owner</td></tr>
    <tr><td><tt>owner_exec</tt></td><td><tt>0<ins>o</ins>100</tt></td><td><tt>S_IXUSR</tt></td><td>Execute/search permission, owner</td></tr>
    <tr><td><tt>owner_all</tt></td><td><tt>0<ins>o</ins>700</tt></td><td><tt>S_IRWXU</tt></td><td>Read, write, execute/search by owner; <tt>owner_read | owner_write | owner_exec</tt></td></tr>
    <tr><td><tt>group_read</tt></td><td><tt>0<ins>o</ins>40</tt></td><td><tt>S_IRGRP</tt></td><td>Read permission, group</td></tr>
    <tr><td><tt>group_write</tt></td><td><tt>0<ins>o</ins>20</tt></td><td><tt>S_IWGRP</tt></td><td>Write permission, group</td></tr>
    <tr><td><tt>group_exec</tt></td><td><tt>0<ins>o</ins>10</tt></td><td><tt>S_IXGRP</tt></td><td>Execute/search permission, group</td></tr>
    <tr><td><tt>group_all</tt></td><td><tt>0<ins>o</ins>70</tt></td><td><tt>S_IRWXG</tt></td><td>Read, write, execute/search by group; <tt>group_read | group_write | group_exec</tt></td></tr>
    <tr><td><tt>others_read</tt></td><td><tt>0<ins>o</ins>4</tt></td><td><tt>S_IROTH</tt></td><td>Read permission, others</td></tr>
    <tr><td><tt>others_write</tt></td><td><tt>0<ins>o</ins>2</tt></td><td><tt>S_IWOTH</tt></td><td>Write permission, others</td></tr>
    <tr><td><tt>others_exec</tt></td><td><tt>0<ins>o</ins>1</tt></td><td><tt>S_IXOTH</tt></td><td>Execute/search permission, others</td></tr>
    <tr><td><tt>others_all</tt></td><td><tt>0<ins>o</ins>7</tt></td><td><tt>S_IRWXO</tt></td><td>Read, write, execute/search by others; <tt>others_read | others_write | others_exec</tt></td></tr>
    <tr><td><tt>all</tt></td><td><tt>0<ins>o</ins>777</tt></td><td></td><td><tt>owner_all | group_all | others_all</tt></td></tr>
    <tr><td><tt>set_uid</tt></td><td><tt>0<ins>o</ins>4000</tt></td><td><tt>S_ISUID</tt></td><td>Set-user-ID on execution</td></tr>
    <tr><td><tt>set_gid</tt></td><td><tt>0<ins>o</ins>2000</tt></td><td><tt>S_ISGID</tt></td><td>Set-group-ID on execution</td></tr>
    <tr><td><tt>sticky_bit</tt></td><td><tt>0<ins>o</ins>1000</tt></td><td><tt>S_ISVTX</tt></td><td>Operating system dependent.</td></tr>
    <tr><td><tt>mask</tt></td><td><tt>0<ins>o</ins>7777</tt></td><td></td><td><tt>all | set_uid | set_gid | sticky_bit</tt></td></tr>
    <tr><td><tt>unknown</tt></td><td><tt>0xFFFF</tt></td><td></td><td>The permissions are not known, such as when a <tt>file_status</tt> object is created without specifying the permissions</td></tr>
  </table>
  <!-- beautify ignore:end -->

  <h3><a name="A.3">A.3 Lexical conventions [gram.lex]</a></h3>
  <pre>
    <i>octal-literal:
      <del>0</del>
      <del>octal-literal ’<sub>opt</sub> octal-digit</del>
      <ins>prefixed-octal-literal</ins>
      <ins>unprefixed-octal-literal</ins>
    </i>
    <i><ins>unprefixed-octal-literal:</ins>
      <ins>0</ins>
      <ins>0 ’<sub>opt</sub> octal-digit-sequence</ins>
    </i>
    <i><ins>prefixed-octal-literal:</ins>
      <ins>octal-prefix octal-digit-sequence</ins></i>
  </pre>

  <p>Before <i>hexadecimal-prefix</i> insert</p>
  <pre>
    <i><ins>octal-prefix: one of</ins>
      <ins>0o 0O</ins></i>
  </pre>

  <p>Before <i>hexadecimal-digit-sequence</i> insert</p>
  <pre>
    <i><ins>octal-digit-sequence:</ins>
      <ins>octal-digit</ins>
      <ins>octal-digit-sequence ’<sub>opt</sub> octal-digit</ins></i>
  </pre>

  <h3><a name="D">Annex D (normative) Compatibility features [depr]</a></h3>
  <p>Add a new section</p>

  <p>
  <h4><ins>D??. Deprecated unprefixed octal literal [depr.oct]</ins></h4>
  </p>
  <p><ins>1 A non-zero octal literal ([lex.icon], [gram.lex]) of the form</ins>
  <pre>
    <ins><i>unprefixed-octal-literal</i></ins></pre>
  <ins>is deprecated.</ins></p>

  <a name="Open Question"></a>
  <h2>Open Question: <tt>std::format</tt></h2>
  <p>In 28.5.2.2 Standard format specifiers [format.string.std], we have the following example:</p>
  <p><i>
      21 The available integer presentation types for integral types other than <tt>bool</tt> and <tt>charT</tt> are
      specified in
      Table 102.
      [Example 4 :
    </i>
  <pre>
    string s0 = format("{}", 42);                      // <i>value of</i> s0 <i>is</i> "42"
    string s1 = format("{0:b} {0:d} {0:o} {0:x}", 42); // <i>value of</i> s1 <i>is</i> "101010 42 52 2a"
    string s2 = format("{0:#x} {0:#X}", 42);           // <i>value of</i> s2 <i>is</i> "0x2a 0X2A"
    string s3 = format("{:L}", 1234);                  // <i>value of</i> s3 <i>can be</i> "1,234"
                                                       // <i>(depending on the locale)</i></pre><i>— end example]</i>
  </p>

  <p>The example shows that <tt>std::format</tt> returns a consistent base prefix for the alternate form <tt>#</tt>
    option with a
    hexadecimal type: <tt>0x</tt> for <tt>#x</tt> and <tt>0X</tt> for <tt>#X</tt>.
    However, the same is not true for the <tt>#</tt> option with an octal type, where we would have:
  </p>

  <pre>
    std::string s1 = std::format("{:#o}", 042);  // <i>value of</i> s1 <i>is</i> "042"
    std::string s2 = std::format("{:#o}", 0o42); // <i>value of</i> s2 <i>is</i> "042"
    // The option #O does not exist
  </pre>

  <p>
    To follow the deprecation of the non-zero <i>unprefixed-octal-literal</i>, we would prefer a behavior change in the
    output
    of <tt>std::format</tt> with <tt>#o</tt> to use the prefix <tt>0o</tt>. By using the search term
    "<tt>std::format #o language:C++</tt>" on GitHub, we found that the format specifier <tt>#o</tt> was used in only
    23 files and out of those, 13 are educational examples and 5 are test suites. This could indicate that a behavior
    change will have minimal impact on existing code. Given the improved clarity of the new prefix (with potential
    security implications), we recommend this backward-incompatible change.
  </p>
  <p>
    <a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3353.htm">N3353</a> calls out this problem, but does not
    attempt to solve it.
    As backward-compatible alternatives, we propose several options:
  </p>

  <h3>Option 1</h3>
  <p><i>Add the <tt>#O</tt> option, to return <tt>0o</tt></i></p>
  <ul>
    <li>Con: inconsistent with the convention that e.g., <tt>#X</tt> returns upper-case <tt>0X</tt></li>
    <li>Con: can be confused with the option <tt>#0</tt></li>
  </ul>

  <h3>Option 2</h3>
  <p><i>Add the <tt>#O</tt> option, to return <tt>0O</tt></i></p>
  <ul>
    <li>Pro: consistent with the convention that e.g., <tt>#X</tt> returns upper-case <tt>0X</tt></li>
    <li>Con: <tt>0O</tt> is less readable than <tt>0o</tt> because the letter <tt>O</tt> is easily confused with
      the digit <tt>0</tt>, so we do not want to encourage its use</li>
  </ul>

  <h3>Option 3</h3>
  <p><i>Add the <tt>##o</tt> and <tt>##O</tt> options, to return <tt>0o</tt> and <tt>0O</tt> respectively</i></p>
  <ul>
    <li>Rationale: <tt>#o</tt> asks for a single-character prefix and <tt>##o</tt> asks for a two-character prefix</li>
    <li>Pro: can format for both <tt>0o</tt> and <tt>0O</tt></li>
    <li>Con: maybe a bit obscure</li>
  </ul>

  <h3>Option 4</h3>
  <p><i>Do Nothing</i></p>
  <ul>
    <li>Con: <tt>std::format</tt> only outputs the deprecated octal-literal</li>
  </ul>

  <a name="Acknowledgements"></a>
  <h2>Acknowledgements</h2>
  <p>
    Thanks to Erich Keane for reviewing the draft.
    The document style was borrowed from Doc. no. <a
      href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4340.html">N4340</a>.
  </p>

</body>

</html>