<HTML>
<HEAD>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<TITLE>CWG Issue 1999</TITLE>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<STYLE TYPE="text/css">
  INS { text-decoration:none; font-weight:bold; background-color:#A0FFA0 }
  .INS { text-decoration:none; background-color:#D0FFD0 }
  DEL { text-decoration:line-through; background-color:#FFA0A0 }
  .DEL { text-decoration:line-through; background-color: #FFD0D0 }
  @media (prefers-color-scheme: dark) {
    HTML { background-color:#202020; color:#f0f0f0; }
    A { color:#5bc0ff; }
    A:visited { color:#c6a8ff; }
    A:hover, a:focus { color:#afd7ff; }
    INS { background-color:#033a16; color:#aff5b4; }
    .INS { background-color: #033a16; }
    DEL { background-color:#67060c; color:#ffdcd7; }
    .DEL { background-color:#67060c; }
  }
  SPAN.cmnt { font-family:Times; font-style:italic }
</STYLE>
</HEAD>
<BODY>
<P><EM>This is an unofficial snapshot of the ISO/IEC JTC1 SC22 WG21
  Core Issues List revision 118b.
  See http://www.open-std.org/jtc1/sc22/wg21/ for the official
  list.</EM></P>
<P>2025-09-28</P>
<HR>
<A NAME="1999"></A><H4>1999. Representation of source characters as universal-character-names</H4>
<B>Section: </B>5.2&#160; [<A href="https://wg21.link/lex.phases">lex.phases</A>]
 &#160;&#160;&#160;

 <B>Status: </B>CD4
 &#160;&#160;&#160;

 <B>Submitter: </B>Richard Smith
 &#160;&#160;&#160;

 <B>Date: </B>2014-09-09<BR>


<P>[Moved to DR at the May, 2015 meeting.]</P>



<P>According to 5.2 [<A href="https://wg21.link/lex.phases#1">lex.phases</A>] paragraph 1, first phase,</P>

<BLOCKQUOTE>

Any source file character not in the basic source character
set (5.3.1 [<A href="https://wg21.link/lex.charset">lex.charset</A>]) is replaced by the
universal-character-name that designates that character. (An
implementation may use any internal encoding, so long as an
actual extended character encountered in the source file,
and the same extended character expressed in the source file
as a universal-character-name (i.e., using
the <TT>\uXXXX</TT> notation), are handled equivalently
except where this replacement is reverted in a raw string
literal.)

</BLOCKQUOTE>

<P>This wording is clearly not intended to exclude characters
with code points larger than <TT>0xffff</TT>. However, because the
parenthetical is introduced by &#8220;i.e.&#8221;, the reference to
&#8220;the <TT>\uXXXX</TT> notation&#8221; could be read as exhaustive,
suggesting that the <TT>\UXXXXXXXX</TT> form is not allowed.</P>
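<P>As a minimal sketch of the two notations, assuming a C++11 or later
compiler (the variable names are hypothetical and not part of the issue):
both forms can designate a character in the four-hex-digit range, while a
code point above <TT>0xffff</TT> needs the eight-hex-digit form.</P>

```cpp
// Hypothetical names, for illustration only.
constexpr char32_t four_digit  = U'\u00E9';      // U+00E9, four-hex-digit form
constexpr char32_t eight_digit = U'\U000000E9';  // U+00E9, eight-hex-digit form
constexpr char32_t beyond_bmp  = U'\U0001F600';  // U+1F600, only expressible with eight digits

// Both notations designate the same character.
static_assert(four_digit == eight_digit, "both forms name U+00E9");
// The eight-digit form reaches code points above 0xFFFF.
static_assert(beyond_bmp == 0x1F600, "code point above 0xFFFF");
```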

<P><B>Proposed resolution (April, 2015):</B></P>

<P>Change 5.2 [<A href="https://wg21.link/lex.phases#1">lex.phases</A>] paragraph 1, phase 1, as follows:</P>

<BLOCKQUOTE>

...(An implementation may use any internal encoding, so long as
an actual extended character encountered in the source file, and
the same extended character expressed in the source file as a
universal-character-name (<DEL>i.e.</DEL> <INS>e.g.</INS>, using
the <TT>\uXXXX</TT> notation), are handled equivalently except
where this replacement is reverted in a raw string literal.)

</BLOCKQUOTE>
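<P>The raw string exception in the wording above can be sketched as
follows, assuming a C++11 or later compiler (the names are hypothetical):
inside a raw string literal the six source characters of the escape stay
as written, whereas in an ordinary <TT>char32_t</TT> string literal they
designate a single character.</P>

```cpp
// Hypothetical names, for illustration only.
constexpr char     raw[]    = R"(\u00E9)";  // six characters as written, plus the terminator
constexpr char32_t cooked[] = U"\u00E9";    // one code point, plus the terminator

// The raw literal keeps the backslash and the letter u as separate characters.
static_assert(sizeof(raw) == 7, "six characters and a terminator");
static_assert(raw[0] == '\\' && raw[1] == 'u', "escape is not interpreted");

// The ordinary literal holds the single designated character.
static_assert(sizeof(cooked) / sizeof(cooked[0]) == 2, "one code point and a terminator");
static_assert(cooked[0] == 0xE9, "designates U+00E9");
```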

<BR><BR>
</BODY>
</HTML>
