<HTML>
<HEAD>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<TITLE>
    CWG Issue 558</TITLE>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<STYLE TYPE="text/css">
  INS { text-decoration:none; font-weight:bold; background-color:#A0FFA0 }
  .INS { text-decoration:none; background-color:#D0FFD0 }
  DEL { text-decoration:line-through; background-color:#FFA0A0 }
  .DEL { text-decoration:line-through; background-color: #FFD0D0 }
  @media (prefers-color-scheme: dark) {
    HTML { background-color:#202020; color:#f0f0f0; }
    A { color:#5bc0ff; }
    A:visited { color:#c6a8ff; }
    A:hover, a:focus { color:#afd7ff; }
    INS { background-color:#033a16; color:#aff5b4; }
    .INS { background-color: #033a16; }
    DEL { background-color:#67060c; color:#ffdcd7; }
    .DEL { background-color:#67060c; }
  }
  SPAN.cmnt { font-family:Times; font-style:italic }
</STYLE>
</HEAD>
<BODY>
<P><EM>This is an unofficial snapshot of the ISO/IEC JTC1 SC22 WG21
  Core Issues List revision 118b.
  See http://www.open-std.org/jtc1/sc22/wg21/ for the official
  list.</EM></P>
<P>2025-09-28</P>
<HR>
<A NAME="558"></A><H4>558.
  
Excluded characters in universal character names
</H4>
<B>Section: </B>5.3.1&#160; [<A href="https://wg21.link/lex.charset">lex.charset</A>]
 &#160;&#160;&#160;

 <B>Status: </B>CD1
 &#160;&#160;&#160;

 <B>Submitter: </B>Daveed Vandevoorde
 &#160;&#160;&#160;

 <B>Date: </B>8 February 2006<BR>


<P>[Moved to DR at October 2007 meeting.]</P>

<P>C99 and C++ differ in their approach to universal character
names (UCNs).</P>

<P>
<A HREF="248.html">Issue 248</A> already covers the differences
in UCNs allowed for identifiers, but a more fundamental issue is that
of UCNs that correspond to codes reserved by ISO/IEC 10646 for surrogate
pair forms.</P>

<P>Specifically, C99 does not allow UCNs whose short identifiers
are in the range 0xD800 to 0xDFFF, inclusive.  I think C++ should
have the same constraint.  If someone really wants to
place such a code in a character or string literal,
they should use a hexadecimal escape sequence instead,
for example:</P>

<PRE>
    wchar_t  w1 = L'\xD900'; // Okay.
    wchar_t  w2 = L'\uD900'; // Error, not a valid character.
</PRE>

<P>(Compare 6.4.3 paragraph 2 in ISO/IEC 9899:1999 with
5.3.1 [<A href="https://wg21.link/lex.charset#2">lex.charset</A>] paragraph 2 in the C++
standard.)</P>
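
<P>The range check behind the proposed constraint can be sketched as
follows. This is only an illustration: the helper name
<TT>is_surrogate_code_point</TT> is hypothetical and appears in neither
standard, and the sketch covers only the surrogate range discussed in this
issue, not the other restrictions C99 places on UCNs.</P>

```cpp
// Hypothetical helper (an assumption for illustration, not standard text):
// returns true when a UCN's value falls in the range reserved for
// surrogate pair forms, 0xD800 through 0xDFFF inclusive.
constexpr bool is_surrogate_code_point(unsigned long value) {
    return value >= 0xD800UL && value <= 0xDFFFUL;
}

// The value used in the example above lies inside the excluded range.
static_assert(is_surrogate_code_point(0xD900UL), "0xD900 is a surrogate code");

// Both endpoints of the range are excluded as well.
static_assert(is_surrogate_code_point(0xD800UL), "lower bound is excluded");
static_assert(is_surrogate_code_point(0xDFFFUL), "upper bound is excluded");

// The neighbouring values just outside the range are not affected.
static_assert(!is_surrogate_code_point(0xD7FFUL), "0xD7FF is below the range");
static_assert(!is_surrogate_code_point(0xE000UL), "0xE000 is above the range");
```

<P>Under the constraint suggested here, a translator would presumably
reject any UCN for which such a check holds, while leaving hexadecimal
escape sequences with the same value untouched.</P>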

<P><B>Proposed resolution (October, 2007):</B></P>

<P>This issue is resolved by the adoption of paper
J16/07-0030 = WG21 N2170.</P>

<BR><BR>
</BODY>
</HTML>
