<HTML>
<HEAD>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<TITLE>
    CWG Issue 578</TITLE>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<STYLE TYPE="text/css">
  INS { text-decoration:none; font-weight:bold; background-color:#A0FFA0 }
  .INS { text-decoration:none; background-color:#D0FFD0 }
  DEL { text-decoration:line-through; background-color:#FFA0A0 }
  .DEL { text-decoration:line-through; background-color: #FFD0D0 }
  @media (prefers-color-scheme: dark) {
    HTML { background-color:#202020; color:#f0f0f0; }
    A { color:#5bc0ff; }
    A:visited { color:#c6a8ff; }
    A:hover, a:focus { color:#afd7ff; }
    INS { background-color:#033a16; color:#aff5b4; }
    .INS { background-color: #033a16; }
    DEL { background-color:#67060c; color:#ffdcd7; }
    .DEL { background-color:#67060c; }
  }
  SPAN.cmnt { font-family:Times; font-style:italic }
</STYLE>
</HEAD>
<BODY>
<P><EM>This is an unofficial snapshot of the ISO/IEC JTC1 SC22 WG21
  Core Issues List revision 118b.
  See http://www.open-std.org/jtc1/sc22/wg21/ for the official
  list.</EM></P>
<P>2025-09-28</P>
<HR>
<A NAME="578"></A><H4>578.
  
Phase 1 replacement of characters with <I>universal-character-name</I>s
</H4>
<B>Section: </B>5.2&#160; [<A href="https://wg21.link/lex.phases">lex.phases</A>]
 &#160;&#160;&#160;

 <B>Status: </B>CD6
 &#160;&#160;&#160;

 <B>Submitter: </B>Martin Vejn&#225;r
 &#160;&#160;&#160;

 <B>Date: </B>7 May 2006<BR>


<P>[Accepted at the October, 2021 meeting as part of paper P2314R4.]</P>

<P>According to 5.2 [<A href="https://wg21.link/lex.phases#1">lex.phases</A>] paragraph 1, in translation
phase 1,</P>

<BLOCKQUOTE>

Any source file character not in the basic source character set
(5.3.1 [<A href="https://wg21.link/lex.charset">lex.charset</A>]) is replaced by the
universal-character-name that designates that character.

</BLOCKQUOTE>

<P>If a character that is not in the basic source character set is
preceded by a backslash character, for example</P>

<PRE>
    "\&#225;"
</PRE>

<P>the result is equivalent to</P>

<PRE>
    "\\u00e1"
</PRE>

<P>that is, a backslash character followed by the spelling of the
universal-character-name.  This is different from the result in C99,
which accepts characters from the extended source character set without
replacing them with universal-character-names.</P>
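
<P>The replacement described above can be modelled with a minimal C++
sketch. It is an illustration only, not how any compiler implements
phase 1. The names <I>in_basic_source_charset</I> and
<I>phase1_replace</I> are hypothetical, the input is assumed to be
already decoded to UTF-32, and lowercase hexadecimal digits are chosen
to match the example above.</P>

```cpp
#include <cstdio>
#include <string>

// Basic source character set as specified before P2314R4 (96 characters):
// five whitespace characters plus 91 graphical characters.
bool in_basic_source_charset(char32_t c) {
    static const std::u32string basic =
        U" \t\v\f\n"
        U"abcdefghijklmnopqrstuvwxyz"
        U"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        U"0123456789"
        U"_{}[]#()<>%:;.?*+-/^&|~!=,\\\"'";
    return basic.find(c) != std::u32string::npos;
}

// Hypothetical helper: models the former phase 1 rule by spelling every
// character outside the basic source character set as a
// universal-character-name. A preceding backslash is left untouched,
// which is exactly what produces the result discussed in this issue.
std::string phase1_replace(const std::u32string& source) {
    std::string out;
    for (char32_t c : source) {
        if (in_basic_source_charset(c)) {
            out += static_cast<char>(c);
        } else {
            char buf[11];  // "\U" + 8 hex digits + terminating NUL
            // \uXXXX for code points up to U+FFFF, \UXXXXXXXX otherwise.
            std::snprintf(buf, sizeof buf,
                          c <= 0xFFFF ? "\\u%04lx" : "\\U%08lx",
                          static_cast<unsigned long>(c));
            out += buf;
        }
    }
    return out;
}
```

<P>Applied to the four source characters quotation mark, backslash,
U+00E1, quotation mark, this sketch yields a quotation mark, two
backslash-led pieces (the original backslash, then the spelling
\u00e1), and a closing quotation mark, matching the second example
above.</P>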

<P>See also <A HREF="1335.html">issue 1335</A>.</P>

<P><B>Additional note (February, 2022):</B></P>

<P>P2314R4, "Character sets and encodings" (approved in October, 2021),
changed phase 1 so that extended characters are no longer translated
to <I>universal-character-name</I>s.</P>

<BR><BR>
</BODY>
</HTML>
