<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Issue 4156: error_category messages have unspecified encoding</title>
<meta property="og:title" content="Issue 4156: error_category messages have unspecified encoding">
<meta property="og:description" content="C++ library issue. Status: SG16">
<meta property="og:url" content="https://cplusplus.github.io/LWG/issue4156.html">
<meta property="og:type" content="website">
<meta property="og:image" content="http://cplusplus.github.io/LWG/images/cpp_logo.png">
<meta property="og:image:alt" content="C++ logo">
<style>
  p {text-align:justify}
  li {text-align:justify}
  pre code.backtick::before { content: "`" }
  pre code.backtick::after { content: "`" }
  blockquote.note
  {
    background-color:#E0E0E0;
    padding-left: 15px;
    padding-right: 15px;
    padding-top: 1px;
    padding-bottom: 1px;
  }
  ins {background-color:#A0FFA0}
  del {background-color:#FFA0A0}
  table.issues-index { border: 1px solid; border-collapse: collapse; }
  table.issues-index th { text-align: center; padding: 4px; border: 1px solid; }
  table.issues-index td { padding: 4px; border: 1px solid; }
  table.issues-index td:nth-child(1) { text-align: right; }
  table.issues-index td:nth-child(2) { text-align: left; }
  table.issues-index td:nth-child(3) { text-align: left; }
  table.issues-index td:nth-child(4) { text-align: left; }
  table.issues-index td:nth-child(5) { text-align: center; }
  table.issues-index td:nth-child(6) { text-align: center; }
  table.issues-index td:nth-child(7) { text-align: left; }
  table.issues-index td:nth-child(5) span.no-pr { color: red; }
  @media (prefers-color-scheme: dark) {
     html {
        color: #ddd;
        background-color: black;
     }
     ins {
        background-color: #225522
     }
     del {
        background-color: #662222
     }
     a {
        color: #6af
     }
     a:visited {
        color: #6af
     }
     blockquote.note
     {
        background-color: rgba(255, 255, 255, .10)
     }
  }
</style>
</head>
<body>
<hr>
<p><em>This page is a snapshot from the LWG issues list, see the <a href="lwg-active.html">Library Active Issues List</a> for more information and the meaning of <a href="lwg-active.html#SG16">SG16</a> status.</em></p>
<h3 id="4156"><a href="lwg-active.html#4156">4156</a>. <code>error_category</code> messages have unspecified encoding</h3>
<p><b>Section:</b> 19.5.3.2 <a href="https://wg21.link/syserr.errcat.virtuals">[syserr.errcat.virtuals]</a> <b>Status:</b> <a href="lwg-active.html#SG16">SG16</a>
 <b>Submitter:</b> Victor Zverovich <b>Opened:</b> 2024-09-18 <b>Last modified:</b> 2025-03-12</p>
<p><b>Priority: </b>3
</p>
<p><b>View all issues with</b> <a href="lwg-status.html#SG16">SG16</a> status.</p>
<p><b>Discussion:</b></p>
<p>
19.5.3.1 <a href="https://wg21.link/syserr.errcat.overview">[syserr.errcat.overview]</a> says:
<blockquote>
The class <code class='backtick'>error_category</code> serves as a base class for types used to identify
the source and encoding of a particular category of error code.
</blockquote>
</p>
<p>
However, this doesn't seem to be referring to a character encoding,
just something about how an error is encoded into an integer value.
The definition of <code class='backtick'>error_category::message</code>
(19.5.3.2 <a href="https://wg21.link/syserr.errcat.virtuals">[syserr.errcat.virtuals]</a> p5) just says:
<blockquote>
<pre><code>virtual string message(int ev) const = 0;</code></pre>
<p>
<i>Returns</i>:
A string that describes the error condition denoted by <code class='backtick'>ev</code>.
</p>
</blockquote>
This says nothing about character encoding either.
</p>
<p>
There is also implementation divergence:
some implementations use variants of <code class='backtick'>strerror</code> which return messages
in the current C locale encoding,
but at least one major implementation doesn't use the current C locale:
<a href="https://github.com/microsoft/STL/issues/4711">MSVC STL issue 4711</a>.
</p>
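<p>
The divergence can be observed with a small probe. The following is a minimal sketch, not taken from any implementation:
the helper name <code class='backtick'>messages_across_locales</code> is hypothetical, and whether the two returned strings
differ (in language or in encoding) depends on the implementation and on the environment in which the program runs.
</p>
<blockquote>
<pre><code>#include &lt;cerrno&gt;
#include &lt;clocale&gt;
#include &lt;string&gt;
#include &lt;system_error&gt;
#include &lt;utility&gt;

// Hypothetical helper: returns the message for ENOENT as seen under the
// current C locale and again under the environment's C locale.
std::pair&lt;std::string, std::string&gt; messages_across_locales() {
    // Remember the current global C locale so it can be restored afterwards.
    const char* current = std::setlocale(LC_ALL, nullptr);
    std::string saved = current ? current : "C";

    std::string before = std::generic_category().message(ENOENT);

    // Adopt the locale named by the environment (for example LANG or LC_ALL).
    // An implementation built on strerror may now return different bytes.
    std::setlocale(LC_ALL, "");
    std::string after = std::generic_category().message(ENOENT);

    // Restore the original global state.
    std::setlocale(LC_ALL, saved.c_str());
    return {before, after};
}</code></pre>
</blockquote>
<p>
In both cases the caller receives only a <code class='backtick'>std::string</code>; nothing in the return value
identifies the encoding of its bytes.
</p>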
<p>
Using the current C locale is problematic for two reasons.
First, it is inconsistent with other C++ APIs, which normally use C++ locales.
Second, because the C locale is global state, it may change
(possibly from another thread)
between the time the message is obtained and the time it is consumed,
which may lead to mojibake.
If the locale encoding is used, there should at the very least be a mechanism
that captures the encoding information in a race-free manner
and communicates it to the caller,
although it is better not to use the locale encoding in the first place.
</p>
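<p>
To illustrate what capturing the encoding information together with the message could look like, here is a rough sketch.
Everything in it is an assumption made for illustration: <code class='backtick'>tagged_message</code>,
<code class='backtick'>locale_mutex</code> and <code class='backtick'>message_with_locale</code> are hypothetical names,
the C locale name stands in for real encoding information, and the approach is race-free only if every call to
<code class='backtick'>setlocale</code> in the program holds the same lock.
</p>
<blockquote>
<pre><code>#include &lt;clocale&gt;
#include &lt;mutex&gt;
#include &lt;string&gt;
#include &lt;system_error&gt;

// Hypothetical: a message paired with the C locale active when it was produced.
struct tagged_message {
    std::string text;
    std::string c_locale;  // result of setlocale(LC_CTYPE, nullptr)
};

// Hypothetical: a lock that all code changing the C locale is assumed to take.
inline std::mutex&amp; locale_mutex() {
    static std::mutex m;
    return m;
}

// Obtains the message and the locale name under the same lock, so that a
// cooperating thread cannot change the locale between the two steps.
tagged_message message_with_locale(const std::error_code&amp; ec) {
    std::lock_guard&lt;std::mutex&gt; lock(locale_mutex());
    const char* name = std::setlocale(LC_CTYPE, nullptr);
    return {ec.message(), name ? name : ""};
}</code></pre>
</blockquote>
<p>
The reliance on program-wide cooperation is a large part of why avoiding the locale encoding altogether is the simpler option.
</p>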
<p>
This is somewhat related to LWG <a href="lwg-active.html#4087" title="Standard exception messages have unspecified encoding (Status: SG16)">4087</a><sup><a href="https://cplusplus.github.io/LWG/issue4087" title="Latest snapshot">(i)</a></sup>
but should probably be addressed first because it may affect
how some exceptions are defined.
</p>
<p>
The proposed resolution is similar to that of LWG <a href="lwg-active.html#4087" title="Standard exception messages have unspecified encoding (Status: SG16)">4087</a><sup><a href="https://cplusplus.github.io/LWG/issue4087" title="Latest snapshot">(i)</a></sup>.
</p>

<p><i>[2024-09-18; Jonathan comments]</i></p>

<p>It might make sense to stop using the word "encoding" in
19.5.3.1 <a href="https://wg21.link/syserr.errcat.overview">[syserr.errcat.overview]</a>.
</p>


<p><i>[2025-02-07; Reflector poll]</i></p>

<p>
Set priority to 3 after reflector poll.
</p>
<p>
"Do we need to say something about <code class='backtick'>name()</code> too? Does this requirement apply
to user overrides? If it does, what's the consequence of a violation? UB?
This 'encoding for strings returned by the library' question feels like it
should be comprehensively addressed in a paper rather than as a patchwork of
individual issues."
</p>

<p><i>[2025-03-12; update from SG16]</i></p>

<p>Would be resolved by <a href="https://wg21.link/P3395R1" title=" Fix encoding issues and add a formatter for std::error_code">P3395R1</a>.</p>



<p id="res-4156"><b>Proposed resolution:</b></p>
<p>
This wording is relative to <a href="https://wg21.link/N4988" title=" Working Draft, Programming Languages — C++">N4988</a>.
</p>

<ol>
<li><p>Modify 19.5.3.2 <a href="https://wg21.link/syserr.errcat.virtuals">[syserr.errcat.virtuals]</a> as indicated:</p>
<blockquote>
<pre><code>virtual string message(int ev) const = 0;</code></pre>
<p>-5-
<i>Returns</i>:
A string <ins>in the ordinary literal encoding</ins>
that describes the error condition denoted by <code class='backtick'>ev</code>.
</p>
</blockquote>
</li>
</ol>
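<p>
As an illustration only, one way a category could return messages in the ordinary literal encoding is to build them
solely from ordinary string literals, independent of any locale. The enumeration <code class='backtick'>parse_errc</code>
and the category below are hypothetical and are not part of the proposed wording; whether the requirement would apply
to user overrides is one of the open questions noted above.
</p>
<blockquote>
<pre><code>#include &lt;string&gt;
#include &lt;system_error&gt;

// Hypothetical error enumeration, used only for this example.
enum class parse_errc { ok = 0, unexpected_eof = 1, bad_digit = 2 };

class parse_category_impl final : public std::error_category {
public:
    const char* name() const noexcept override { return "parse"; }

    // Every returned string originates from an ordinary string literal,
    // so its bytes are in the ordinary literal encoding and do not depend
    // on the C locale or on any C++ locale.
    std::string message(int ev) const override {
        switch (static_cast&lt;parse_errc&gt;(ev)) {
            case parse_errc::ok:             return "no error";
            case parse_errc::unexpected_eof: return "unexpected end of input";
            case parse_errc::bad_digit:      return "invalid digit";
        }
        return "unknown parse error";
    }
};

// Accessor for the single category object.
const std::error_category&amp; parse_category() noexcept {
    static const parse_category_impl instance;
    return instance;
}</code></pre>
</blockquote>
<p>
A call such as <code class='backtick'>parse_category().message(1)</code> then yields the same bytes regardless of the
global locale in effect at the time of the call.
</p>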





</body>
</html>
