<HTML>
<HEAD>
<meta http-equiv="Content-Type" content="text/html; charset=us-ascii">
<TITLE>
    CWG Issue 933</TITLE>
<meta name="viewport" content="width=device-width, initial-scale=1.0">
<STYLE TYPE="text/css">
  INS { text-decoration:none; font-weight:bold; background-color:#A0FFA0 }
  .INS { text-decoration:none; background-color:#D0FFD0 }
  DEL { text-decoration:line-through; background-color:#FFA0A0 }
  .DEL { text-decoration:line-through; background-color: #FFD0D0 }
  @media (prefers-color-scheme: dark) {
    HTML { background-color:#202020; color:#f0f0f0; }
    A { color:#5bc0ff; }
    A:visited { color:#c6a8ff; }
    A:hover, a:focus { color:#afd7ff; }
    INS { background-color:#033a16; color:#aff5b4; }
    .INS { background-color: #033a16; }
    DEL { background-color:#67060c; color:#ffdcd7; }
    .DEL { background-color:#67060c; }
  }
  SPAN.cmnt { font-family:Times; font-style:italic }
</STYLE>
</HEAD>
<BODY>
<P><EM>This is an unofficial snapshot of the ISO/IEC JTC1 SC22 WG21
  Core Issues List revision 118b.
  See http://www.open-std.org/jtc1/sc22/wg21/ for the official
  list.</EM></P>
<P>2025-09-28</P>
<HR>
<A NAME="933"></A><H4>933.
  
32-bit UCNs with 16-bit <TT>wchar_t</TT>
</H4>
<B>Section: </B>5.13.3&#160; [<A href="https://wg21.link/lex.ccon">lex.ccon</A>]
 &#160;&#160;&#160;

 <B>Status: </B>CD2
 &#160;&#160;&#160;

 <B>Submitter: </B>Alisdair Meredith
 &#160;&#160;&#160;

 <B>Date: </B>7 July, 2009<BR>


<P>[Voted into the WP at the October, 2009 meeting.]</P>



<P>According to 5.13.3 [<A href="https://wg21.link/lex.ccon#2">lex.ccon</A>] paragraph 2,</P>

<BLOCKQUOTE>

A character literal that begins with the letter <TT>L</TT>, such as
<TT>L'x'</TT>, is a wide-character literal.  A wide-character literal
has type <TT>wchar_t</TT>. The value of a wide-character literal
containing a single <I>c-char</I> has value equal to the numerical
value of the encoding of the <I>c-char</I> in the execution
wide-character set.

</BLOCKQUOTE>

<P>A <I>c-char</I> that is a universal-character-name might, when
translated to the execution character set, result in a multi-character
sequence that is too large to be represented in a single <TT>wchar_t</TT>.
There is wording that prevents this for <TT>char16_t</TT> literals, but
not for <TT>wchar_t</TT> literals.  This seems undesirable.</P>
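<P>To make the concern concrete, the sketch below assumes an
implementation with a 16-bit <TT>wchar_t</TT> that encodes wide characters
as UTF-16. Both are assumptions for illustration; the Standard requires
neither. The helper names <TT>utf16_code_units</TT> and
<TT>fits_in_bits</TT> are hypothetical and appear nowhere in the issue.</P>

```cpp
#include <cstddef>

// Hypothetical helper: number of UTF-16 code units needed for a Unicode
// code point. Assumes cp is a valid scalar value (not a surrogate).
constexpr std::size_t utf16_code_units(char32_t cp) {
    return cp < 0x10000 ? 1 : 2;
}

// Hypothetical helper: true if an unsigned object with `bits` value bits
// can hold cp. The short-circuit avoids an out-of-range shift for bits >= 32.
constexpr bool fits_in_bits(char32_t cp, unsigned bits) {
    return bits >= 32 || cp < (char32_t{1} << bits);
}

// U+00E9 lies in the Basic Multilingual Plane: one code unit suffices.
static_assert(utf16_code_units(U'\u00E9') == 1, "BMP character");

// U+1F600 lies outside the BMP: under the UTF-16 assumption it becomes a
// two-unit sequence, which a single 16-bit wchar_t cannot hold.
static_assert(utf16_code_units(U'\U0001F600') == 2, "needs a surrogate pair");
static_assert(!fits_in_bits(U'\U0001F600', 16), "too large for 16 bits");
static_assert(fits_in_bits(U'\U0001F600', 32), "fits in 32 bits");
```

<P>Under these assumptions, a literal such as <TT>L'\U0001F600'</TT> is
the kind of case the issue describes: a single <I>c-char</I> whose
encoding does not fit in one <TT>wchar_t</TT>.</P>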

<P><B>Proposed resolution (July, 2009):</B></P>

<OL>
<LI><P>Change 5.13.3 [<A href="https://wg21.link/lex.ccon#2">lex.ccon</A>] paragraph 2 as follows:</P>

<BLOCKQUOTE>

...The value of a wide-character literal containing a single
<I>c-char</I> has value equal to the numerical value of the encoding
of the <I>c-char</I> in the execution wide-character set<INS>, unless
the <I>c-char</I> has no representation in the execution
wide-character set, in which case the value is
implementation-defined. [<I>Note:</I> The type <TT>wchar_t</TT> is
able to represent all members of the execution wide-character set, see
6.9.2 [<A href="https://wg21.link/basic.fundamental">basic.fundamental</A>]. &#8212;<I>end note</I>]</INS>.  The value
of a wide-character literal containing multiple <I>c-char</I>s is
implementation-defined.

</BLOCKQUOTE></LI>

<LI><P>Change 5.13.3 [<A href="https://wg21.link/lex.ccon#5">lex.ccon</A>] paragraph 5 as follows:</P>

<BLOCKQUOTE>

A universal-character-name is translated to the encoding, in the
<INS>appropriate</INS> execution character set, of the character
named...

</BLOCKQUOTE></LI>

</OL>
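<P>Because the resolution makes the value implementation-defined when the
<I>c-char</I> has no representation in the execution wide-character set,
portable code may prefer to guard such literals. The following is a
minimal sketch, not a recommended idiom from the issue. The names
<TT>kWideHoldsAllOfUnicode</TT> and <TT>kSample</TT> are hypothetical, and
the check on <TT>__STDC_ISO_10646__</TT> assumes the implementation uses
that macro to advertise a Unicode wide-character encoding.</P>

```cpp
#include <cwchar>        // WCHAR_MAX
#include <type_traits>

// A wide-character literal has type wchar_t.
static_assert(std::is_same<decltype(L'x'), wchar_t>::value,
              "wide-character literal has type wchar_t");

// Only spell a non-BMP UCN as a wide literal when wchar_t is wide enough
// to hold every Unicode code point; otherwise fall back to L'?'.
#if WCHAR_MAX >= 0x10FFFF
constexpr bool kWideHoldsAllOfUnicode = true;
constexpr wchar_t kSample = L'\U0001F600';
#else
constexpr bool kWideHoldsAllOfUnicode = false;
constexpr wchar_t kSample = L'?';
#endif

// Assumption: where __STDC_ISO_10646__ is defined, wchar_t values are
// ISO 10646 code points, so a representable UCN maps to its code point.
#ifdef __STDC_ISO_10646__
static_assert(L'\u00E9' == 0xE9, "UCN maps to its code point");
#endif
```

<P>On an implementation with a 16-bit <TT>wchar_t</TT> the fallback branch
is selected, so the program never relies on the implementation-defined
value.</P>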

<BR><BR>
</BODY>
</HTML>
