<html>
<head>
  <meta http-equiv="content-type" content="text/html; charset=windows-1252">
  <title>Oo... adding a coherent character sequence to start octal-literals</title>
  <style type="text/css">
    p {
      text-align: justify
    }

    li {
      text-align: justify
    }

    blockquote.note {
      background-color: #E0E0E0;
      padding-left: 15px;
      padding-right: 15px;
      padding-top: 1px;
      padding-bottom: 1px;
    }

    ins {
      background-color: #A0FFA0
    }

    del {
      background-color: #FFA0A0
    }
  </style>
</head>

<body>
  <table>
    <tbody>
      <tr>
        <td align="left">Doc. no.</td>
        <td align="left">P0085R1</td>
      </tr>
      <tr>
        <td align="left">Audience:</td>
        <td align="left">EWG, LEWG</td>
      </tr>
      <tr>
        <td align="left">Date:</td>
        <td align="left">12-05-2025</td>
      </tr>
      <tr>
        <td align="left">Project:</td>
        <td align="left">ISO JTC1/SC22/WG21: Programming Language C++, evolution group</td>
      </tr>
      <tr>
        <td align="left">Reply to:</td>
        <td align="left">
          Jolly Chen &lt;<a href="mailto:Jolly.Chen@cern.ch">Jolly.Chen@cern.ch</a>&gt;,
          Axel Naumann &lt;<a href="mailto:Axel.Naumann@cern.ch">Axel.Naumann@cern.ch</a>&gt;,
          Michael Jonker &lt;<a href="mailto:Michael.Jonker@cern.ch">Michael.Jonker@cern.ch</a>&gt;
        </td>
      </tr>
    </tbody>
  </table>


  <h1 align="center">Oo... adding a coherent character sequence to begin octal-literals</h1>
  <!-- style and layout copied from: http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4340.html -->
  <!-- for submission instructions: https://isocpp.org/std/submit-a-proposal -->

  <a name="Revision History"></a>
  <h2>Revision History</h2>
  <h3>Since <a href="https://wg21.link/p0085r0">P0085R0</a></h3>
  <ul>
    <li>Rebased Technical Specification onto <a href="https://wg21.link/n5008">N5008</a></li>
    <li>Aligned grammar with C's <a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3353.htm">N3353</a></li>
    <li>Proposed deprecation of the original octal literal</li>
    <li>Removed proposed change "Technical Specification C run time library functions: strtol, scanf". This has been
      handled by C in
      <a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3353.htm">N3353</a>.
    </li>
    <li>Removed proposed change to Addendum, character literals. Octals in escape sequences have been added by
      <a href="https://wg21.link/p2290">P2290R3</a>.
    </li>
    <li>Added open question on <tt>std::format</tt> with the <tt>#</tt> option for octal literals</li>
  </ul>

  <a name="Proposal"></a>
  <h2>Proposal</h2>
  <p>We propose adding <tt>0o</tt> and <tt>0O</tt> as an alternative (and preferred) prefix to introduce octal-literals
    and deprecating the old prefix <tt>0</tt>.
  </p>
  <p>
    The syntax rule that interprets integer literals starting with a zero as
    octal-literals might be called a 'historical mistake'.
    It is easily misunderstood by novice programmers and can lead to surprising errors.
  </p>
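  <p>
    As an illustration of the kind of surprise meant here, the following sketch pads decimal values with leading
    zeros to align them. The array and function names are hypothetical and exist only for this example; the
    checks hold under the current rules for integer literals.
  </p>

```cpp
#include <cassert>

// Padding decimal values with leading zeros silently changes their meaning:
// every entry that starts with 0 is read as an octal-literal.
constexpr int kPaddedCodes[] = {
    100,  // decimal 100
    010,  // octal-literal: decimal 8, not 10
    001,  // octal-literal: decimal 1 (same value only by coincidence)
};

// Adds up the table above (hypothetical helper for this example).
constexpr int sum_padded_codes() {
    int total = 0;
    for (int code : kPaddedCodes) total += code;
    return total;
}

static_assert(kPaddedCodes[1] == 8, "010 is octal and equals decimal 8");
static_assert(sum_padded_codes() == 109, "not 111, as the layout suggests");
```

  <p>
    With the proposed prefix, an intentional octal value would be spelled <tt>0o10</tt>, and a compiler could
    warn about the unprefixed <tt>010</tt>.
  </p>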
  <p>
    To allow future generations (of developers if not compilers) to correct this feature,
    we propose to add the character sequences <tt>0o</tt> and <tt>0O</tt> as preferred sequences to
    introduce an octal-literal. The prefix <tt>0o</tt> follows the model set by the prefix <tt>0x</tt>
    to introduce a hex-literal, and (since C++14) <tt>0b</tt> to introduce a binary-literal.
    Compilers will be able to warn about the use of the deprecated <tt>0</tt> prefix,
    and eventually (on the scale of a few releases) the old syntax can be removed.
  </p>

  <p>
    From <a
      href="http://en.wikipedia.org/wiki/Octal">http://en.wikipedia.org/wiki/Octal</a>:
    <i>
      "Newer languages have been abandoning the prefix 0, as decimal numbers
      are often represented with leading zeroes.
      The prefix q was introduced to avoid the prefix o being mistaken for a
      zero,
      while the prefix 0o was introduced to avoid starting a numerical literal
      with an alphabetic character (like o or q),
      since these might cause the literal to be confused with a variable name.
      The prefix 0o also follows the model set by the prefix 0x used for
      hexadecimal literals in the C language;
      it is supported by Haskell,[11] OCaml,[12] Perl 6,[13] Python as of
      version 3.0,[14] Ruby,[15] Tcl as of version 9,[16] and it is intended
      to be supported by ECMAScript 6[17] (the prefix 0 has been discouraged
      in ECMAScript 3 and dropped in ECMAScript 5[18])."
    </i>
  </p>


  <a name="Examples"></a>
  <h2>Examples</h2>
  <p>The following literals all specify the same number.</p>
  <table border="1">
    <tr><th>Literal</th><th>Before</th><th>After</th></tr>
    <tr><td>Hex</td><td>0x2A</td><td>0x2A</td></tr>
    <tr><td>Binary</td><td>0b00101010</td><td>0b00101010</td></tr>
    <tr><td>Octal</td><td>052</td><td>0o52</td></tr>
  </table>
  <p>The old octal literal <tt>052</tt> remains valid but is deprecated.</p>

  <a name="Effects on existing code"></a>
  <h2>Effects on existing code</h2>
  <p>
    Following the example of C (<a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3353.htm">N3353</a>),
    this proposal introduces <tt>0o</tt> and <tt>0O</tt> as new prefixes for octal-literals and deprecates the existing
    error-prone syntax rule for integer literals starting with a zero. Non-zero octal integer literals written without
    the prefix <tt>0o</tt> or <tt>0O</tt> become deprecated, similar to C declaring them an obsolescent feature.
  </p>
  <p>
    Under the current standard, any sequence starting with 0o is illegal.
    Consequently, the proposed additions <tt>0o</tt> and <tt>0O</tt> will not break existing code.
  </p>

  <h2><a name="6.0">Technical Specification</a></h2>

  <p>
    To match <a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3353.htm">N3353</a>,
    make the following edits (relative to
    <a href="https://wg21.link/n5008">N5008</a>)
    with <ins>insertions</ins> and <del>removals</del> marked like so:
  </p>

  <h3><a name="5.13.2">5.13.2 Integer literals [lex.icon]</a></h3>
  <pre>
    <i>octal-literal:
      <del>0</del>
      <del>octal-literal ’<sub>opt</sub> octal-digit</del>
      <ins>prefixed-octal-literal</ins>
      <ins>unprefixed-octal-literal</ins>
    </i>
    <i><ins>unprefixed-octal-literal:</ins>
      <ins>0</ins>
      <ins>0 ’<sub>opt</sub> octal-digit-sequence</ins>
    </i>
    <i><ins>prefixed-octal-literal:</ins>
      <ins>octal-prefix octal-digit-sequence</ins></i>
  </pre>

  <p>Before <i>hexadecimal-prefix</i> insert</p>
  <pre>
    <i><ins>octal-prefix: one of</ins>
      <ins>0o 0O</ins></i></pre>

  <p>Before <i>hexadecimal-digit-sequence</i> insert</p>
  <pre>
    <i><ins>octal-digit-sequence:</ins>
      <ins>octal-digit</ins>
      <ins>octal-digit-sequence ’<sub>opt</sub> octal-digit</ins></i>
  </pre>
  <p>
    <i>
      2 [Example 1 : The number twelve can be written <tt>12, <del>014</del>, <ins>0o14,</ins> 0XC</tt>, or
      <tt>0b1100</tt>. The
      integer-literals 1048576, 1’048’576,
      0X100000, 0x10’0000, and <del>0’004’000’000</del> <ins>0o0’004’000’000</ins> all have
      the same value. — end example]
    </i>
  </p>
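  <p>
    The following non-normative sketch is not part of the proposed wording. It shows one way the productions
    of 5.13.2 above could be checked against a spelling. The names <tt>OctalKind</tt>,
    <tt>is_octal_digit_sequence</tt> and <tt>classify_octal</tt> are hypothetical, and
    <i>integer-suffix</i> handling is deliberately left out.
  </p>

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

enum class OctalKind { not_octal, unprefixed, prefixed };

constexpr bool is_octal_digit(char c) { return c >= '0' && c <= '7'; }

// octal-digit-sequence:
//   octal-digit
//   octal-digit-sequence 'opt octal-digit
// That is: octal digits, with at most one single quote between two digits.
constexpr bool is_octal_digit_sequence(std::string_view s) {
    if (s.empty() || !is_octal_digit(s.front())) return false;
    for (std::size_t i = 1; i < s.size(); ++i) {
        if (s[i] == '\'') {
            // A separator must be followed by an octal digit.
            if (i + 1 == s.size() || !is_octal_digit(s[i + 1])) return false;
        } else if (!is_octal_digit(s[i])) {
            return false;
        }
    }
    return true;
}

// Classifies a spelling (without integer-suffix) per the proposed grammar.
constexpr OctalKind classify_octal(std::string_view s) {
    if (s.empty() || s.front() != '0') return OctalKind::not_octal;
    if (s.size() == 1) return OctalKind::unprefixed;  // the literal 0
    if (s[1] == 'o' || s[1] == 'O') {
        // prefixed-octal-literal: octal-prefix octal-digit-sequence
        return is_octal_digit_sequence(s.substr(2)) ? OctalKind::prefixed
                                                    : OctalKind::not_octal;
    }
    // unprefixed-octal-literal: 0 'opt octal-digit-sequence
    std::string_view rest = s.substr(1);
    if (rest.front() == '\'') rest.remove_prefix(1);
    return is_octal_digit_sequence(rest) ? OctalKind::unprefixed
                                         : OctalKind::not_octal;
}

static_assert(classify_octal("0o14") == OctalKind::prefixed);
static_assert(classify_octal("014") == OctalKind::unprefixed);
static_assert(classify_octal("0o0'004'000'000") == OctalKind::prefixed);
static_assert(classify_octal("0o") == OctalKind::not_octal);  // prefix needs a digit
static_assert(classify_octal("0x1F") == OctalKind::not_octal);
```

  <p>
    In this sketch, only spellings classified as <tt>OctalKind::unprefixed</tt> would fall under the proposed
    deprecation.
  </p>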
  <h3><a name="31.12.8.4">31.12.8.4 Enum class <tt>perms</tt> [fs.enum.perms]</a></h3>
  <p>Table 149 — Enum class perms [tab:fs.enum.perms]
  <table border="1">
    <tr><th>Name</th><th>Value (octal)</th><th>POSIX macro</th><th>Definition or notes</th></tr>
    <tr><td><tt>none</tt></td><td><tt>0<ins>o0</ins></tt></td><td></td><td>There are no permissions set for the file.</td></tr>
    <tr><td><tt>owner_read</tt></td><td><tt>0<ins>o</ins>400</tt></td><td><tt>S_IRUSR</tt></td><td>Read permission, owner</td></tr>
    <tr><td><tt>owner_write</tt></td><td><tt>0<ins>o</ins>200</tt></td><td><tt>S_IWUSR</tt></td><td>Write permission, owner</td></tr>
    <tr><td><tt>owner_exec</tt></td><td><tt>0<ins>o</ins>100</tt></td><td><tt>S_IXUSR</tt></td><td>Execute/search permission, owner</td></tr>
    <tr><td><tt>owner_all</tt></td><td><tt>0<ins>o</ins>700</tt></td><td><tt>S_IRWXU</tt></td><td>Read, write, execute/search by owner; <tt>owner_read | owner_write | owner_exec</tt></td></tr>
    <tr><td><tt>group_read</tt></td><td><tt>0<ins>o</ins>40</tt></td><td><tt>S_IRGRP</tt></td><td>Read permission, group</td></tr>
    <tr><td><tt>group_write</tt></td><td><tt>0<ins>o</ins>20</tt></td><td><tt>S_IWGRP</tt></td><td>Write permission, group</td></tr>
    <tr><td><tt>group_exec</tt></td><td><tt>0<ins>o</ins>10</tt></td><td><tt>S_IXGRP</tt></td><td>Execute/search permission, group</td></tr>
    <tr><td><tt>group_all</tt></td><td><tt>0<ins>o</ins>70</tt></td><td><tt>S_IRWXG</tt></td><td>Read, write, execute/search by group; <tt>group_read | group_write | group_exec</tt></td></tr>
    <tr><td><tt>others_read</tt></td><td><tt>0<ins>o</ins>4</tt></td><td><tt>S_IROTH</tt></td><td>Read permission, others</td></tr>
    <tr><td><tt>others_write</tt></td><td><tt>0<ins>o</ins>2</tt></td><td><tt>S_IWOTH</tt></td><td>Write permission, others</td></tr>
    <tr><td><tt>others_exec</tt></td><td><tt>0<ins>o</ins>1</tt></td><td><tt>S_IXOTH</tt></td><td>Execute/search permission, others</td></tr>
    <tr><td><tt>others_all</tt></td><td><tt>0<ins>o</ins>7</tt></td><td><tt>S_IRWXO</tt></td><td>Read, write, execute/search by others; <tt>others_read | others_write | others_exec</tt></td></tr>
    <tr><td><tt>all</tt></td><td><tt>0<ins>o</ins>777</tt></td><td></td><td><tt>owner_all | group_all | others_all</tt></td></tr>
    <tr><td><tt>set_uid</tt></td><td><tt>0<ins>o</ins>4000</tt></td><td><tt>S_ISUID</tt></td><td>Set-user-ID on execution</td></tr>
    <tr><td><tt>set_gid</tt></td><td><tt>0<ins>o</ins>2000</tt></td><td><tt>S_ISGID</tt></td><td>Set-group-ID on execution</td></tr>
    <tr><td><tt>sticky_bit</tt></td><td><tt>0<ins>o</ins>1000</tt></td><td><tt>S_ISVTX</tt></td><td>Operating system dependent.</td></tr>
    <tr><td><tt>mask</tt></td><td><tt>0<ins>o</ins>7777</tt></td><td></td><td><tt>all | set_uid | set_gid | sticky_bit</tt></td></tr>
    <tr><td><tt>unknown</tt></td><td><tt>0xFFFF</tt></td><td></td><td>The permissions are not known, such as when a <tt>file_status</tt> object is created without specifying the permissions</td></tr>
  </table>
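  <p>
    As a non-normative illustration (not part of the proposed wording), the values in the table can be checked
    against <tt>std::filesystem::perms</tt> with today's unprefixed literals. The helper <tt>bits</tt> is
    hypothetical. Under this proposal the same constants would be spelled <tt>0o400</tt>, <tt>0o700</tt>,
    <tt>0o777</tt> and <tt>0o7777</tt>.
  </p>

```cpp
#include <cassert>
#include <filesystem>

namespace fs = std::filesystem;

// Hypothetical helper: exposes the numeric value of a perms enumerator.
constexpr unsigned bits(fs::perms p) { return static_cast<unsigned>(p); }

// Current spelling with the unprefixed (to-be-deprecated) octal-literal.
static_assert(bits(fs::perms::owner_read) == 0400);
static_assert(bits(fs::perms::owner_all) == 0700);
static_assert(bits(fs::perms::all) == 0777);
static_assert(bits(fs::perms::mask) == 07777);

// all is the union of the three *_all masks, as the table states.
static_assert((bits(fs::perms::owner_all) | bits(fs::perms::group_all) |
               bits(fs::perms::others_all)) == bits(fs::perms::all));
```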

  <h3><a name="A.3">A.3 Lexical conventions [gram.lex]</a></h3>
  <pre>
    <i>octal-literal:
      <del>0</del>
      <del>octal-literal ’<sub>opt</sub> octal-digit</del>
      <ins>prefixed-octal-literal</ins>
      <ins>unprefixed-octal-literal</ins>
    </i>
    <i><ins>unprefixed-octal-literal:</ins>
      <ins>0</ins>
      <ins>0 ’<sub>opt</sub> octal-digit-sequence</ins>
    </i>
    <i><ins>prefixed-octal-literal:</ins>
      <ins>octal-prefix octal-digit-sequence</ins></i>
  </pre>

  <p>Before <i>hexadecimal-prefix</i> insert</p>
  <pre>
    <i><ins>octal-prefix: one of</ins>
      <ins>0o</ins>
      <ins>0O</ins></i>
  </pre>

  <p>Before <i>hexadecimal-digit-sequence</i> insert</p>
  <pre>
    <i><ins>octal-digit-sequence:</ins>
      <ins>octal-digit</ins>
      <ins>octal-digit-sequence ’<sub>opt</sub> octal-digit</ins></i>
  </pre>

  <h3><a name="D">Annex D (normative) Compatibility features [depr]</a></h3>
  <p>Add a new section:</p>

  <h4><ins>D??. Deprecated unprefixed octal literal [depr.oct]</ins></h4>
  <p><ins>1 An octal literal ([lex.icon], [gram.lex]) of the form</ins></p>
  <pre>
    <ins><i>unprefixed-octal-literal</i></ins></pre>
  <p><ins>is deprecated.</ins></p>

  <a name="Open Question"></a>
  <h2>Open Question: <tt>std::format</tt></h2>
  <p>In 28.5.2.2 Standard format specifiers [format.string.std], we have the following example:</p>
  <p><i>
      21 The available integer presentation types for integral types other than <tt>bool</tt> and <tt>charT</tt> are
      specified in
      Table 102.
      [Example 4 :
    </i>
  <pre>
    string s0 = format("{}", 42);                      // <i>value of</i> s0 <i>is</i> "42"
    string s1 = format("{0:b} {0:d} {0:o} {0:x}", 42); // <i>value of</i> s1 <i>is</i> "101010 42 52 2a"
    string s2 = format("{0:#x} {0:#X}", 42);           // <i>value of</i> s2 <i>is</i> "0x2a 0X2A"
    string s3 = format("{:L}", 1234);                  // <i>value of</i> s3 <i>can be</i> "1,234"
                                                       // <i>(depending on the locale)</i></pre><i>— end example]</i>
  </p>

  <p>The example shows that, for a hexadecimal type, the alternate form <tt>#</tt> option of <tt>std::format</tt>
    produces a base prefix whose case follows the type: <tt>0x</tt> for <tt>#x</tt> and <tt>0X</tt> for <tt>#X</tt>.
    The same is not true for the <tt>#</tt> option with an octal type, where the prefix is always <tt>0</tt>:
  </p>

  <pre>
    std::string s1 = std::format("{:#o}", 042);  // <i>value of</i> s1 <i>is</i> "042"
    std::string s2 = std::format("{:#o}", 0o42); // <i>value of</i> s2 <i>is</i> "042"
    // The option #O does not exist
  </pre>

  <p>
    To follow the deprecation of the <i>unprefixed-octal-literal</i>, we would prefer a behavior change in the output
    of <tt>std::format</tt> with <tt>#o</tt> to use the prefix <tt>0o</tt>. Using the search term
    "<tt>std::format #o language:C++</tt>" on GitHub, we found that the format specifier <tt>#o</tt> was used in only
    23 files, of which 13 are educational examples and 5 are test suites. This suggests that a behavior
    change would have minimal impact on existing code. Given the improved clarity of the new prefix (with potential
    security implications), we recommend this backward-incompatible change.
  <p>
    <a href="https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3353.htm">N3353</a> calls out this problem, but does not attempt to solve it.
    As backward-compatible alternatives, we propose several options:
  </p>

  <h3>Option 1</h3>
  <p><i>Add the <tt>#O</tt> option, to return <tt>0o</tt></i></p>
  <ul>
    <li>Con: inconsistent with the convention that e.g., <tt>#X</tt> returns upper-case <tt>0X</tt></li>
    <li>Con: can be confused with the option <tt>#0</tt></li>
  </ul>

  <h3>Option 2</h3>
  <p><i>Add the <tt>#O</tt> option, to return <tt>0O</tt></i></p>
  <ul>
    <li>Pro: consistent with the convention that e.g., <tt>#X</tt> returns upper-case <tt>0X</tt></li>
    <li>Con: <tt>0O</tt> is less readable than <tt>0o</tt> because the letter <tt>O</tt> is easily confused with the
      digit <tt>0</tt>, so we do not want to encourage its use</li>
  </ul>

  <h3>Option 3</h3>
  <p><i>Add the <tt>##o</tt> and <tt>##O</tt> options, to return <tt>0o</tt> and <tt>0O</tt> respectively</i></p>
  <ul>
    <li>Rationale: <tt>#o</tt> asks for a single-character prefix and <tt>##o</tt> asks for a two-character prefix</li>
    <li>Pro: can format for both <tt>0o</tt> and <tt>0O</tt></li>
    <li>Con: somewhat obscure</li>
  </ul>

  <h3>Option 4</h3>
  <p><i>Do Nothing</i></p>
  <ul>
    <li>Con: <tt>std::format</tt> only outputs the deprecated octal-literal</li>
  </ul>
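  <p>
    To make the output discussed in Options 1 to 3 concrete, the sketch below builds the prefixed spelling by hand.
    The function <tt>to_prefixed_octal</tt> is hypothetical: it is neither part of the standard library nor proposed
    here, and it avoids <tt>std::format</tt> so that it does not presume any of the options.
  </p>

```cpp
#include <cassert>
#include <string>

// Hypothetical helper, not a standard facility: renders value in base 8
// with the prefix 0o (or 0O when upper_prefix is true).
std::string to_prefixed_octal(unsigned long long value, bool upper_prefix = false) {
    std::string digits;
    do {
        // Take the lowest three bits as one octal digit and prepend it.
        digits.insert(digits.begin(), static_cast<char>('0' + (value & 7u)));
        value >>= 3;
    } while (value != 0);
    return (upper_prefix ? "0O" : "0o") + digits;
}
```

  <p>
    For the value <tt>042</tt> used above, this helper yields <tt>"0o42"</tt>, where <tt>std::format</tt> with
    <tt>#o</tt> currently yields <tt>"042"</tt>.
  </p>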

  <a name="Acknowledgements"></a>
  <h2>Acknowledgements</h2>
  <p>
    Thanks to Erich Keane for reviewing the draft.
    The document style was borrowed from Doc. no. <a
      href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2015/n4340.html">N4340</a>.
  </p>

</body>

</html>