<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Issue 4043: "ASCII" is not a registered character encoding</title>
<meta property="og:title" content="Issue 4043: &quot;ASCII&quot; is not a registered character encoding">
<meta property="og:description" content="C++ library issue. Status: WP">
<meta property="og:url" content="https://cplusplus.github.io/LWG/issue4043.html">
<meta property="og:type" content="website">
<meta property="og:image" content="http://cplusplus.github.io/LWG/images/cpp_logo.png">
<meta property="og:image:alt" content="C++ logo">
<style>
  p {text-align:justify}
  li {text-align:justify}
  pre code.backtick::before { content: "`" }
  pre code.backtick::after { content: "`" }
  blockquote.note
  {
    background-color:#E0E0E0;
    padding-left: 15px;
    padding-right: 15px;
    padding-top: 1px;
    padding-bottom: 1px;
  }
  ins {background-color:#A0FFA0}
  del {background-color:#FFA0A0}
  table.issues-index { border: 1px solid; border-collapse: collapse; }
  table.issues-index th { text-align: center; padding: 4px; border: 1px solid; }
  table.issues-index td { padding: 4px; border: 1px solid; }
  table.issues-index td:nth-child(1) { text-align: right; }
  table.issues-index td:nth-child(2) { text-align: left; }
  table.issues-index td:nth-child(3) { text-align: left; }
  table.issues-index td:nth-child(4) { text-align: left; }
  table.issues-index td:nth-child(5) { text-align: center; }
  table.issues-index td:nth-child(6) { text-align: center; }
  table.issues-index td:nth-child(7) { text-align: left; }
  table.issues-index td:nth-child(5) span.no-pr { color: red; }
  @media (prefers-color-scheme: dark) {
     html {
        color: #ddd;
        background-color: black;
     }
     ins {
        background-color: #225522
     }
     del {
        background-color: #662222
     }
     a {
        color: #6af
     }
     a:visited {
        color: #6af
     }
     blockquote.note
     {
        background-color: rgba(255, 255, 255, .10)
     }
  }
</style>
</head>
<body>
<hr>
<p><em>This page is a snapshot from the LWG issues list; see the <a href="lwg-active.html">Library Active Issues List</a> for more information and the meaning of <a href="lwg-active.html#WP">WP</a> status.</em></p>
<h3 id="4043"><a href="lwg-defects.html#4043">4043</a>. <code>"ASCII"</code> is not a registered character encoding</h3>
<p><b>Section:</b> 28.4.2.2 <a href="https://wg21.link/text.encoding.general">[text.encoding.general]</a> <b>Status:</b> <a href="lwg-active.html#WP">WP</a>
 <b>Submitter:</b> Jonathan Wakely <b>Opened:</b> 2024-01-23 <b>Last modified:</b> 2024-04-02</p>
<p><b>Priority: </b>Not Prioritized
</p>
<p><b>View all issues with</b> <a href="lwg-status.html#WP">WP</a> status.</p>
<p><b>Discussion:</b></p>
<p>
The IANA Character Sets registry does not contain "ASCII" as an alias of the
"US-ASCII" encoding. This is apparently for historical reasons, because there
used to be some ambiguity about exactly what "ASCII" meant. I don't think
those historical reasons are relevant to C++26, but the absence of "ASCII"
in the IANA registry means that it's not a registered character encoding
as defined by 28.4.2.2 <a href="https://wg21.link/text.encoding.general">[text.encoding.general]</a>.
</p>

<p>
This means that the encoding referred to by notes in the C++ standard
(31.12.6.2 <a href="https://wg21.link/fs.path.generic">[fs.path.generic]</a>, 28.3.4.4.1.3 <a href="https://wg21.link/facet.numpunct.virtuals">[facet.numpunct.virtuals]</a>)
and by an example in the <code>std::text_encoding</code> proposal
(<a href="https://wg21.link/P1885" title=" Naming Text Encodings to Demystify Them">P1885</a>) isn't actually usable in portable code.
So <code>std::text_encoding("ASCII")</code> creates an object with
<code>mib() == std::text_encoding::other</code>, which is not the same
encoding as <code>std::text_encoding("US-ASCII")</code>.
This seems surprising.
</p>
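
<p>
The lookup problem described above can be sketched in C++ without relying on a
<code>std::text_encoding</code> implementation. The enumeration <code>mib_id</code>,
the table <code>iana_us_ascii_names</code> and the function <code>lookup_registry_only</code>
are hypothetical names introduced only for this illustration, and the exact-match
comparison is a deliberate simplification of the real alias matching. The list of names
assumes the US-ASCII entry of the IANA registry as I understand it.
</p>

```cpp
#include <string_view>

// Hypothetical stand-in for a few std::text_encoding::id values.
// The numbers are the IANA MIBenum values.
enum class mib_id { other = 1, unknown = 2, ascii = 3 };

// Names assumed to be listed by the IANA Character Sets registry for
// US-ASCII (MIBenum 3). Plain "ASCII" is not among them.
constexpr std::string_view iana_us_ascii_names[] = {
    "US-ASCII", "iso-ir-6", "ANSI_X3.4-1968", "ANSI_X3.4-1986",
    "ISO_646.irv:1991", "ISO646-US", "us", "IBM367", "cp367", "csASCII"};

// Simplified lookup: exact string match against the registry names only.
// The real comparison is looser (case and punctuation are ignored), but
// that does not change the outcome for "ASCII".
constexpr mib_id lookup_registry_only(std::string_view name) {
    for (std::string_view known : iana_us_ascii_names) {
        if (known == name) {
            return mib_id::ascii;
        }
    }
    return mib_id::other;  // not a registered name or alias
}

// The primary name and a registered alias are both recognised.
static_assert(lookup_registry_only("US-ASCII") == mib_id::ascii);
static_assert(lookup_registry_only("csASCII") == mib_id::ascii);

// "ASCII" alone is not registered, so it maps to "other".
static_assert(lookup_registry_only("ASCII") == mib_id::other);
```

<p>
Under these assumptions, a name that most programmers would expect to work falls
through to <code>other</code> purely because of what the registry happens to list.
</p>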

<p><i>[2024-03-12; Reflector poll]</i></p>

<p>
SG16 approved the proposed resolution.
Set status to Tentatively Ready after seven votes in favour during reflector poll.
</p>

<p><i>[Tokyo 2024-03-23; Status changed: Voting &rarr; WP.]</i></p>



<p id="res-4043"><b>Proposed resolution:</b></p>
<p>
This wording is relative to <a href="https://wg21.link/N4971" title=" Working Draft, Programming Languages — C++">N4971</a>.
</p>

<ol>
<li><p>Modify 28.4.2.2 <a href="https://wg21.link/text.encoding.general">[text.encoding.general]</a> as indicated:</p>

<blockquote>
<p>-1-
A <i>registered character encoding</i> is a character encoding scheme
in the IANA Character Sets registry.
</p>
<p>
[<i>Note 1</i>:
The IANA Character Sets registry uses the term “character sets” to refer to character encodings.
&mdash; <i>end note</i>]
</p>
<p>
The primary name of a registered character encoding is the name
of that encoding specified in the IANA Character Sets registry.
</p>

<p>-2-
The set of known registered character encodings contains every
registered character encoding specified in the IANA Character Sets registry
except for the following:
</p>
<ol style="list-style-type: none">
<li>(2.1) &ndash; NATS-DANO (33)</li>
<li>(2.2) &ndash; NATS-DANO-ADD (34)</li>
</ol>

<p>-3-
Each known registered character encoding is identified by an enumerator in
<code>text_encoding::id</code>, and has a set of zero or more <i>aliases</i>.
</p>

<p>-4-
The set of aliases of a known registered character encoding is an
implementation-defined superset of the aliases specified in the
IANA Character Sets registry.
<ins>The set of aliases for US-ASCII includes <code>"ASCII"</code>.</ins>
No two aliases or primary names of distinct registered character encodings
are equivalent when compared by <code>text_encoding::<i>comp-name</i></code>.
</p>

</blockquote>
</li>
</ol>
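
<p>
To illustrate the effect of the added sentence, here is a minimal C++ sketch of a
name comparison in the spirit of <code>text_encoding::<i>comp-name</i></code>, which
as I understand it follows the charset alias matching of Unicode Technical Standard 22:
non-alphanumeric characters and case are ignored, as are zeros that do not follow a
nonzero digit. The names <code>comp_name_sketch</code>, <code>next_significant</code>
and <code>names_us_ascii</code> are hypothetical and introduced only for this
illustration. This is not the wording or any implementation's code, and it assumes an
ASCII-compatible ordinary literal encoding.
</p>

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical character helpers; they assume an ASCII-compatible
// ordinary literal encoding with contiguous letters.
constexpr bool is_digit(char c) { return c >= '0' && c <= '9'; }
constexpr bool is_alpha(char c) {
    return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z');
}
constexpr char to_lower(char c) {
    return (c >= 'A' && c <= 'Z') ? static_cast<char>(c - 'A' + 'a') : c;
}

// Returns the next significant character of `s` (lowercased) starting at
// `pos`, or '\0' when the string is exhausted. Skips non-alphanumeric
// characters and any '0' not preceded by a number that began with 1-9.
constexpr char next_significant(std::string_view s, std::size_t& pos,
                                bool& in_number) {
    while (pos < s.size()) {
        const char c = s[pos++];
        if (is_alpha(c)) {
            in_number = false;  // a letter ends any numeric prefix
            return to_lower(c);
        }
        if (is_digit(c)) {
            if (c == '0' && !in_number) {
                continue;  // leading zero: ignored
            }
            in_number = true;
            return c;
        }
        // Other characters are ignored and do not end a numeric prefix.
    }
    return '\0';
}

// Sketch of the comparison: walk both names in lockstep over their
// significant characters.
constexpr bool comp_name_sketch(std::string_view a, std::string_view b) {
    std::size_t i = 0, j = 0;
    bool num_a = false, num_b = false;
    while (true) {
        const char ca = next_significant(a, i, num_a);
        const char cb = next_significant(b, j, num_b);
        if (ca != cb) return false;
        if (ca == '\0') return true;  // both exhausted together
    }
}

// Names for US-ASCII: the assumed IANA list plus the alias "ASCII"
// that the proposed resolution adds.
constexpr std::string_view us_ascii_names[] = {
    "US-ASCII", "iso-ir-6", "ANSI_X3.4-1968", "ANSI_X3.4-1986",
    "ISO_646.irv:1991", "ISO646-US", "us", "IBM367", "cp367", "csASCII",
    "ASCII"};

constexpr bool names_us_ascii(std::string_view name) {
    for (std::string_view known : us_ascii_names) {
        if (comp_name_sketch(known, name)) return true;
    }
    return false;
}

static_assert(names_us_ascii("ASCII"));
static_assert(names_us_ascii("ascii"));
static_assert(names_us_ascii("us_ascii"));
static_assert(!comp_name_sketch("ASCII", "US-ASCII"));  // distinct names
```

<p>
Because <code>"ASCII"</code> and <code>"US-ASCII"</code> are not equivalent under such a
comparison, listing <code>"ASCII"</code> as an explicit alias is what makes both
spellings name the same encoding in this sketch.
</p>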






</body>
</html>
