<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang="en" xml:lang="en">
<head>
  <meta charset="utf-8" />
  <meta name="generator" content="mpark/wg21" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
  <meta name="dcterms.date" content="2023-05-04" />
  <title>Unicode in the Library, Part 1: UTF Transcoding</title>
  <style>
      code{white-space: pre-wrap;}
      span.smallcaps{font-variant: small-caps;}
      span.underline{text-decoration: underline;}
      div.column{display: inline-block; vertical-align: top; width: 50%;}
      div.csl-block{margin-left: 1.5em;}
      ul.task-list{list-style: none;}
      pre > code.sourceCode { white-space: pre; position: relative; }
      pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
      pre > code.sourceCode > span:empty { height: 1.2em; }
      .sourceCode { overflow: visible; }
      code.sourceCode > span { color: inherit; text-decoration: inherit; }
      div.sourceCode { margin: 1em 0; }
      pre.sourceCode { margin: 0; }
      @media screen {
      div.sourceCode { overflow: auto; }
      }
      @media print {
      pre > code.sourceCode { white-space: pre-wrap; }
      pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
      }
      pre.numberSource code
        { counter-reset: source-line 0; }
      pre.numberSource code > span
        { position: relative; left: -4em; counter-increment: source-line; }
      pre.numberSource code > span > a:first-child::before
        { content: counter(source-line);
          position: relative; left: -1em; text-align: right; vertical-align: baseline;
          border: none; display: inline-block;
          -webkit-touch-callout: none; -webkit-user-select: none;
          -khtml-user-select: none; -moz-user-select: none;
          -ms-user-select: none; user-select: none;
          padding: 0 4px; width: 4em;
          color: #aaaaaa;
        }
      pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa;  padding-left: 4px; }
      div.sourceCode
        {  background-color: #f6f8fa; }
      @media screen {
      pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
      }
      code span { } /* Normal */
      code span.al { color: #ff0000; } /* Alert */
      code span.an { } /* Annotation */
      code span.at { } /* Attribute */
      code span.bn { color: #9f6807; } /* BaseN */
      code span.bu { color: #9f6807; } /* BuiltIn */
      code span.cf { color: #00607c; } /* ControlFlow */
      code span.ch { color: #9f6807; } /* Char */
      code span.cn { } /* Constant */
      code span.co { color: #008000; font-style: italic; } /* Comment */
      code span.cv { color: #008000; font-style: italic; } /* CommentVar */
      code span.do { color: #008000; } /* Documentation */
      code span.dt { color: #00607c; } /* DataType */
      code span.dv { color: #9f6807; } /* DecVal */
      code span.er { color: #ff0000; font-weight: bold; } /* Error */
      code span.ex { } /* Extension */
      code span.fl { color: #9f6807; } /* Float */
      code span.fu { } /* Function */
      code span.im { } /* Import */
      code span.in { color: #008000; } /* Information */
      code span.kw { color: #00607c; } /* Keyword */
      code span.op { color: #af1915; } /* Operator */
      code span.ot { } /* Other */
      code span.pp { color: #6f4e37; } /* Preprocessor */
      code span.re { } /* RegionMarker */
      code span.sc { color: #9f6807; } /* SpecialChar */
      code span.ss { color: #9f6807; } /* SpecialString */
      code span.st { color: #9f6807; } /* String */
      code span.va { } /* Variable */
      code span.vs { color: #9f6807; } /* VerbatimString */
      code span.wa { color: #008000; font-weight: bold; } /* Warning */
      code.diff {color: #898887}
      code.diff span.va {color: #006e28}
      code.diff span.st {color: #bf0303}
  </style>
  <style type="text/css">
body {
margin: 5em;
font-family: serif;

hyphens: auto;
line-height: 1.35;
text-align: justify;
}
@media screen and (max-width: 30em) {
body {
margin: 1.5em;
}
}
div.wrapper {
max-width: 60em;
margin: auto;
}
ul {
list-style-type: none;
padding-left: 2em;
margin-top: -0.2em;
margin-bottom: -0.2em;
}
a {
text-decoration: none;
color: #4183C4;
}
a.hidden_link {
text-decoration: none;
color: inherit;
}
li {
margin-top: 0.6em;
margin-bottom: 0.6em;
}
h1, h2, h3, h4 {
position: relative;
line-height: 1;
}
a.self-link {
position: absolute;
top: 0;
left: calc(-1 * (3.5rem - 26px));
width: calc(3.5rem - 26px);
height: 2em;
text-align: center;
border: none;
transition: opacity .2s;
opacity: .5;
font-family: sans-serif;
font-weight: normal;
font-size: 83%;
}
a.self-link:hover { opacity: 1; }
a.self-link::before { content: "§"; }
ul > li:before {
content: "\2014";
position: absolute;
margin-left: -1.5em;
}
:target { background-color: #C9FBC9; }
:target .codeblock { background-color: #C9FBC9; }
:target ul { background-color: #C9FBC9; }
.abbr_ref { float: right; }
.folded_abbr_ref { float: right; }
:target .folded_abbr_ref { display: none; }
:target .unfolded_abbr_ref { float: right; display: inherit; }
.unfolded_abbr_ref { display: none; }
.secnum { display: inline-block; min-width: 35pt; }
.header-section-number { display: inline-block; min-width: 35pt; }
.annexnum { display: block; }
div.sourceLinkParent {
float: right;
}
a.sourceLink {
position: absolute;
opacity: 0;
margin-left: 10pt;
}
a.sourceLink:hover {
opacity: 1;
}
a.itemDeclLink {
position: absolute;
font-size: 75%;
text-align: right;
width: 5em;
opacity: 0;
}
a.itemDeclLink:hover { opacity: 1; }
span.marginalizedparent {
position: relative;
left: -5em;
}
li span.marginalizedparent { left: -7em; }
li ul > li span.marginalizedparent { left: -9em; }
li ul > li ul > li span.marginalizedparent { left: -11em; }
li ul > li ul > li ul > li span.marginalizedparent { left: -13em; }
div.footnoteNumberParent {
position: relative;
left: -4.7em;
}
a.marginalized {
position: absolute;
font-size: 75%;
text-align: right;
width: 5em;
}
a.enumerated_item_num {
position: relative;
left: -3.5em;
display: inline-block;
margin-right: -3em;
text-align: right;
width: 3em;
}
div.para { margin-bottom: 0.6em; margin-top: 0.6em; text-align: justify; }
div.section { text-align: justify; }
div.sentence { display: inline; }
span.indexparent {
display: inline;
position: relative;
float: right;
right: -1em;
}
a.index {
position: absolute;
display: none;
}
a.index:before { content: "⟵"; }

a.index:target {
display: inline;
}
.indexitems {
margin-left: 2em;
text-indent: -2em;
}
div.itemdescr {
margin-left: 3em;
}
.bnf {
font-family: serif;
margin-left: 40pt;
margin-top: 0.5em;
margin-bottom: 0.5em;
}
.ncbnf {
font-family: serif;
margin-top: 0.5em;
margin-bottom: 0.5em;
margin-left: 40pt;
}
.ncsimplebnf {
font-family: serif;
font-style: italic;
margin-top: 0.5em;
margin-bottom: 0.5em;
margin-left: 40pt;
background: inherit; 
}
span.textnormal {
font-style: normal;
font-family: serif;
white-space: normal;
display: inline-block;
}
span.rlap {
display: inline-block;
width: 0px;
}
span.descr { font-style: normal; font-family: serif; }
span.grammarterm { font-style: italic; }
span.term { font-style: italic; }
span.terminal { font-family: monospace; font-style: normal; }
span.nonterminal { font-style: italic; }
span.tcode { font-family: monospace; font-style: normal; }
span.textbf { font-weight: bold; }
span.textsc { font-variant: small-caps; }
a.nontermdef { font-style: italic; font-family: serif; }
span.emph { font-style: italic; }
span.techterm { font-style: italic; }
span.mathit { font-style: italic; }
span.mathsf { font-family: sans-serif; }
span.mathrm { font-family: serif; font-style: normal; }
span.textrm { font-family: serif; }
span.textsl { font-style: italic; }
span.mathtt { font-family: monospace; font-style: normal; }
span.mbox { font-family: serif; font-style: normal; }
span.ungap { display: inline-block; width: 2pt; }
span.textit { font-style: italic; }
span.texttt { font-family: monospace; }
span.tcode_in_codeblock { font-family: monospace; font-style: normal; }
span.phantom { color: white; }

span.math { font-style: normal; }
span.mathblock {
display: block;
margin-left: auto;
margin-right: auto;
margin-top: 1.2em;
margin-bottom: 1.2em;
text-align: center;
}
span.mathalpha {
font-style: italic;
}
span.synopsis {
font-weight: bold;
margin-top: 0.5em;
display: block;
}
span.definition {
font-weight: bold;
display: block;
}
.codeblock {
margin-left: 1.2em;
line-height: 127%;
}
.outputblock {
margin-left: 1.2em;
line-height: 127%;
}
div.itemdecl {
margin-top: 2ex;
}
code.itemdeclcode {
white-space: pre;
display: block;
}
span.textsuperscript {
vertical-align: super;
font-size: smaller;
line-height: 0;
}
.footnotenum { vertical-align: super; font-size: smaller; line-height: 0; }
.footnote {
font-size: small;
margin-left: 2em;
margin-right: 2em;
margin-top: 0.6em;
margin-bottom: 0.6em;
}
div.minipage {
display: inline-block;
margin-right: 3em;
}
div.numberedTable {
text-align: center;
margin: 2em;
}
div.figure {
text-align: center;
margin: 2em;
}
table {
border: 1px solid black;
border-collapse: collapse;
margin-left: auto;
margin-right: auto;
margin-top: 0.8em;
text-align: left;
hyphens: none; 
}
td, th {
padding-left: 1em;
padding-right: 1em;
vertical-align: top;
}
td.empty {
padding: 0px;
padding-left: 1px;
}
td.left {
text-align: left;
}
td.right {
text-align: right;
}
td.center {
text-align: center;
}
td.justify {
text-align: justify;
}
td.border {
border-left: 1px solid black;
}
tr.rowsep, td.cline {
border-top: 1px solid black;
}
tr.even, tr.odd {
border-bottom: 1px solid black;
}
tr.capsep {
border-top: 3px solid black;
border-top-style: double;
}
tr.header {
border-bottom: 3px solid black;
border-bottom-style: double;
}
th {
border-bottom: 1px solid black;
}
span.centry {
font-weight: bold;
}
div.table {
display: block;
margin-left: auto;
margin-right: auto;
text-align: center;
width: 90%;
}
span.indented {
display: block;
margin-left: 2em;
margin-bottom: 1em;
margin-top: 1em;
}
ol.enumeratea { list-style-type: none; background: inherit; }
ol.enumerate { list-style-type: none; background: inherit; }

code.sourceCode > span { display: inline; }
</style>
  
  <!--[if lt IE 9]>
    <script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
  <![endif]-->
</head>
<body>
<div class="wrapper">
<header id="title-block-header">
<h1 class="title" style="text-align:center">Unicode in the Library, Part
1: UTF Transcoding</h1>
<table style="border:none;float:right">
  <tr>
    <td>Document #:</td>
    <td>P2728R3</td>
  </tr>
  <tr>
    <td>Date:</td>
    <td>2023-05-04</td>
  </tr>
  <tr>
    <td style="vertical-align:top">Project:</td>
    <td>Programming Language C++</td>
  </tr>
  <tr>
    <td style="vertical-align:top">Audience:</td>
    <td>
      SG-16 Unicode<br>
      LEWG-I<br>
      LEWG<br>
    </td>
  </tr>
  <tr>
    <td style="vertical-align:top">Reply-to:</td>
    <td>
      Zach Laine<br>&lt;<a href="mailto:whatwasthataddress@gmail.com" class="email">whatwasthataddress@gmail.com</a>&gt;<br>
    </td>
  </tr>
</table>
</header>
<div style="clear:both">
<div id="TOC" role="doc-toc">
<h1 id="toctitle">Contents</h1>
<ul>
<li><a href="#changelog"><span class="toc-section-number">1</span>
Changelog<span></span></a>
<ul>
<li><a href="#changes-since-r0"><span class="toc-section-number">1.1</span> Changes since
R0<span></span></a></li>
<li><a href="#changes-since-r1"><span class="toc-section-number">1.2</span> Changes since
R1<span></span></a></li>
<li><a href="#changes-since-r2"><span class="toc-section-number">1.3</span> Changes since
R2<span></span></a></li>
</ul></li>
<li><a href="#motivation"><span class="toc-section-number">2</span>
Motivation<span></span></a>
<ul>
<li><a href="#a-note-about-p1629"><span class="toc-section-number">2.1</span> A note about
P1629<span></span></a></li>
</ul></li>
<li><a href="#the-shortest-unicode-primer-imaginable"><span class="toc-section-number">3</span> The shortest Unicode primer
imaginable<span></span></a></li>
<li><a href="#use-cases"><span class="toc-section-number">4</span> Use
cases<span></span></a>
<ul>
<li><a href="#case-1-adapt-to-an-existing-range-interface-taking-a-different-utf"><span class="toc-section-number">4.1</span> Case 1: Adapt to an existing range
interface taking a different UTF<span></span></a></li>
<li><a href="#case-2-adapt-to-an-existing-iterator-interface-taking-a-different-utf"><span class="toc-section-number">4.2</span> Case 2: Adapt to an existing
iterator interface taking a different UTF<span></span></a></li>
<li><a href="#case-3-transcode-data-as-it-is-read-into-a-buffer"><span class="toc-section-number">4.3</span> Case 3: Transcode data as it is
read into a buffer<span></span></a></li>
<li><a href="#case-4-print-the-results-of-transcoding"><span class="toc-section-number">4.4</span> Case 4: Print the results of
transcoding<span></span></a></li>
</ul></li>
<li><a href="#proposed-design"><span class="toc-section-number">5</span>
Proposed design<span></span></a>
<ul>
<li><a href="#dependencies"><span class="toc-section-number">5.1</span>
Dependencies<span></span></a></li>
<li><a href="#add-concepts-that-describe-parameters-to-transcoding-apis"><span class="toc-section-number">5.2</span> Add concepts that describe
parameters to transcoding APIs<span></span></a></li>
<li><a href="#add-a-standard-null-terminated-sequence-sentinel"><span class="toc-section-number">5.3</span> Add a standard null-terminated
sequence sentinel<span></span></a></li>
<li><a href="#add-constants-and-utility-functions-that-query-the-state-of-utf-sequences-well-formedness-etc."><span class="toc-section-number">5.4</span> Add constants and utility
functions that query the state of UTF sequences (well-formedness,
etc.)<span></span></a></li>
<li><a href="#add-the-transcoding-iterators"><span class="toc-section-number">5.5</span> Add the transcoding
iterators<span></span></a></li>
<li><a href="#add-a-transcoding-view"><span class="toc-section-number">5.6</span> Add a transcoding
view<span></span></a>
<ul>
<li><a href="#add-the-view-proper"><span class="toc-section-number">5.6.1</span> Add the view
proper<span></span></a></li>
<li><a href="#add-as_utfn-view-adaptors"><span class="toc-section-number">5.6.2</span> Add
<code class="sourceCode default">as_utfN</code> view
adaptors<span></span></a></li>
<li><a href="#add-utf_view-specialization-of-formatter"><span class="toc-section-number">5.6.3</span> Add
<code class="sourceCode default">utf_view</code> specialization of
<code class="sourceCode default">formatter</code><span></span></a></li>
<li><a href="#add-unpack_iterator_and_sentinel-cpo-for-iterator-unpacking"><span class="toc-section-number">5.6.4</span> Add
<code class="sourceCode default">unpack_iterator_and_sentinel</code> CPO
for iterator “unpacking”<span></span></a></li>
</ul></li>
<li><a href="#add-a-feature-test-macro"><span class="toc-section-number">5.7</span> Add a feature test
macro<span></span></a></li>
<li><a href="#design-notes"><span class="toc-section-number">5.8</span>
Design notes<span></span></a></li>
</ul></li>
<li><a href="#implementation-experience"><span class="toc-section-number">6</span> Implementation
experience<span></span></a></li>
<li><a href="#bibliography"><span class="toc-section-number">7</span>
References<span></span></a></li>
</ul>
</div>
<h1 data-number="1" id="changelog"><span class="header-section-number">1</span> Changelog<a href="#changelog" class="self-link"></a></h1>
<h2 data-number="1.1" id="changes-since-r0"><span class="header-section-number">1.1</span> Changes since R0<a href="#changes-since-r0" class="self-link"></a></h2>
<ul>
<li>When naming code points in interfaces, use
<code class="sourceCode default">char32_t</code>.</li>
<li>When naming code units in interfaces, use
<code class="sourceCode default">charN_t</code>.</li>
<li>Remove each eager algorithm, keeping only its corresponding view.</li>
<li>Remove all the output iterators.</li>
<li>Change the template parameters of
<code class="sourceCode default">utfN_view</code> to the types of the
from-range, instead of the types of the transcoding iterators used to
implement the view.</li>
<li>Remove all make-functions.</li>
<li>Replace the misbegotten
<code class="sourceCode default">as_utfN()</code> functions with the
<code class="sourceCode default">as_utfN</code> view adaptors that
should have been there all along.</li>
<li>Add missing
<code class="sourceCode default">transcoding_error_handler</code>
concept.</li>
<li>Turn
<code class="sourceCode default">unpack_iterator_and_sentinel</code>
into a CPO.</li>
<li>Lower the UTF iterator concepts from bidirectional to input.</li>
</ul>
<h2 data-number="1.2" id="changes-since-r1"><span class="header-section-number">1.2</span> Changes since R1<a href="#changes-since-r1" class="self-link"></a></h2>
<ul>
<li>Reintroduce the transcoding-from-a-buffer example.</li>
<li>Generalize <code class="sourceCode default">null_sentinel_t</code>
to a non-Unicode-specific facility.</li>
<li>In utility functions that search for ill-formed encoding, take a
range argument instead of a pair of iterator arguments.</li>
<li>Replace <code class="sourceCode default">utf{8,16,32}_view</code>
with a single <code class="sourceCode default">utf_view</code>.</li>
</ul>
<h2 data-number="1.3" id="changes-since-r2"><span class="header-section-number">1.3</span> Changes since R2<a href="#changes-since-r2" class="self-link"></a></h2>
<ul>
<li>Add <code class="sourceCode default">noexcept</code> where
appropriate.</li>
<li>Remove non-essential constants and utility functions, and elaborate
on the usage of the ones that remain.</li>
<li>Note differences from similar elements proposed in <span class="citation" data-cites="P1629R1">[<a href="#ref-P1629R1" role="doc-biblioref">P1629R1</a>]</span>.</li>
<li>Extend the examples slightly.</li>
<li>Correct an error in the description of the view adaptors’ semantics,
and provide several examples of their use.</li>
</ul>
<h1 data-number="2" id="motivation"><span class="header-section-number">2</span> Motivation<a href="#motivation" class="self-link"></a></h1>
<p>Unicode is important to many, many users in everyday software. It is
not exotic or weird. Well, it’s weird, but it’s not weird to see it
used. C and C++ are the only major production languages with essentially
no support for Unicode.</p>
<p>Let’s fix.</p>
<p>To fix, first we start with the most basic representations of strings
in Unicode: UTF. You might get a UTF string from anywhere; on Windows
you often get them from the OS, in UTF-16. In web-adjacent applications,
strings are most commonly in UTF-8. In ASCII-only applications,
everything is in UTF-8, by its definition as a superset of ASCII.</p>
<p>Often, an application needs to switch between UTFs: 8 -&gt; 16, 32
-&gt; 16, etc. In SG-16 we’ve taken to calling such UTF-N -&gt; UTF-M
operations “transcoding”.</p>
<p>I’m proposing interfaces to do transcoding that meet certain design
requirements that I think are important; I hope you’ll agree:</p>
<ul>
<li>Ranges are the future. We should have range-friendly ways of doing
transcoding. This includes support for sentinels and lazy views.</li>
<li>Iterators are the present. We should support generic programming,
whether it is done in terms of pointers, a particular iterator, or an
iterator type specified as a template parameter.</li>
<li>Transcoding cannot be a black box; sometimes you need to be able to
find where there is a break in the encoding, or to detect whether a
sequence has any broken encodings in it. We should provide utility
functions that let users investigate these states.</li>
<li>A null-terminated string should not be treated as a special case.
The ubiquity of such strings means that they should be treated as
first-class strings.</li>
<li>It is common to want to view the same text as code points and code
units at different times. It is therefore important that transcoding
iterators have a convenient way to access the underlying sequence of
code units being transcoded.</li>
</ul>
<h2 data-number="2.1" id="a-note-about-p1629"><span class="header-section-number">2.1</span> A note about P1629<a href="#a-note-about-p1629" class="self-link"></a></h2>
<p><span class="citation" data-cites="P1629R1">[<a href="#ref-P1629R1" role="doc-biblioref">P1629R1</a>]</span> from JeanHeyd Meneide is a much
more ambitious proposal that aims to standardize a general-purpose text
encoding conversion mechanism. This proposal is not at odds with P1629;
the two proposals have largely orthogonal aims. This proposal only
concerns itself with UTF interconversions, which is all that is required
for Unicode support. P1629 is concerned with those conversions, plus a
lot more. Accepting both proposals would not cause problems; in fact,
the APIs proposed here could be used to implement parts of the P1629
design.</p>
<p>The transcode views and iterators from <span class="citation" data-cites="P1629R1">[<a href="#ref-P1629R1" role="doc-biblioref">P1629R1</a>]</span> differ
from the transcoding view and iterators in this paper in three ways.
First, <code class="sourceCode default">std::text::transcode_view</code> has no
direct support for null-terminated strings. Second, it does not do the
unpacking described in this paper. Third, the P1629 views are neither
printable nor streamable.</p>
<h1 data-number="3" id="the-shortest-unicode-primer-imaginable"><span class="header-section-number">3</span> The shortest Unicode primer
imaginable<a href="#the-shortest-unicode-primer-imaginable" class="self-link"></a></h1>
<p>There are multiple encoding types defined in Unicode: UTF-8, UTF-16,
and UTF-32.</p>
<p>A <em>code unit</em> is the lowest-level datum-type in your Unicode
data. Examples are a <code class="sourceCode default">char</code> in
UTF-8 and a <code class="sourceCode default">char32_t</code> in
UTF-32.</p>
<p>A <em>code point</em> is a 32-bit integral value that represents a
single Unicode value. Examples are U+0041 “A” “LATIN CAPITAL LETTER A”
and U+0308 “¨” “COMBINING DIAERESIS”.</p>
<p>A code point may be encoded as multiple code units. For instance, 3
UTF-8 code units in sequence may encode a particular code point.</p>
<h1 data-number="4" id="use-cases"><span class="header-section-number">4</span> Use cases<a href="#use-cases" class="self-link"></a></h1>
<h2 data-number="4.1" id="case-1-adapt-to-an-existing-range-interface-taking-a-different-utf"><span class="header-section-number">4.1</span> Case 1: Adapt to an existing
range interface taking a different UTF<a href="#case-1-adapt-to-an-existing-range-interface-taking-a-different-utf" class="self-link"></a></h2>
<p>In this case, we have a generic range interface to transcode into, so
we use a transcoding view.</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="co">// A generic function that accepts sequences of UTF-16.</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>utf16_range R<span class="op">&gt;</span></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> process_input<span class="op">(</span>R r<span class="op">)</span>;</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> process_input_again<span class="op">(</span>std<span class="op">::</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf16, std<span class="op">::</span>ranges<span class="op">::</span>ref_view<span class="op">&lt;</span>std<span class="op">::</span>u8string<span class="op">&gt;&gt;</span> r<span class="op">)</span>;</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>u8string input <span class="op">=</span> get_utf8_input<span class="op">()</span>;</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> input_utf16 <span class="op">=</span> std<span class="op">::</span>views<span class="op">::</span>all<span class="op">(</span>input<span class="op">)</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16;</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a>process_input<span class="op">(</span>input_utf16<span class="op">)</span>;</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>process_input_again<span class="op">(</span>input_utf16<span class="op">)</span>;</span></code></pre></div>
<h2 data-number="4.2" id="case-2-adapt-to-an-existing-iterator-interface-taking-a-different-utf"><span class="header-section-number">4.2</span> Case 2: Adapt to an existing
iterator interface taking a different UTF<a href="#case-2-adapt-to-an-existing-iterator-interface-taking-a-different-utf" class="self-link"></a></h2>
<p>This time, we have a generic iterator interface we want to transcode
into, so we want to use the transcoding iterators.</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="co">// A generic function that accepts sequences of UTF-16.</span></span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>utf16_iter I<span class="op">&gt;</span></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> process_input<span class="op">(</span>I first, I last<span class="op">)</span>;</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>u8string input <span class="op">=</span> get_utf8_input<span class="op">()</span>;</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>process_input<span class="op">(</span>std<span class="op">::</span>uc<span class="op">::</span>utf_8_to_16_iterator<span class="op">(</span>input<span class="op">.</span>begin<span class="op">()</span>, input<span class="op">.</span>begin<span class="op">()</span>, input<span class="op">.</span>end<span class="op">())</span>,</span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a>              std<span class="op">::</span>uc<span class="op">::</span>utf_8_to_16_iterator<span class="op">(</span>input<span class="op">.</span>begin<span class="op">()</span>, input<span class="op">.</span>end<span class="op">()</span>, input<span class="op">.</span>end<span class="op">()))</span>;</span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a><span class="co">// Even more conveniently:</span></span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> <span class="kw">const</span> utf16_view <span class="op">=</span> std<span class="op">::</span>views<span class="op">::</span>all<span class="op">(</span>input<span class="op">)</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16;</span>
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a>process_input<span class="op">(</span>utf16_view<span class="op">.</span>begin<span class="op">()</span>, utf16_view<span class="op">.</span>end<span class="op">())</span>;</span></code></pre></div>
<h2 data-number="4.3" id="case-3-transcode-data-as-it-is-read-into-a-buffer"><span class="header-section-number">4.3</span> Case 3: Transcode data as it is
read into a buffer<a href="#case-3-transcode-data-as-it-is-read-into-a-buffer" class="self-link"></a></h2>
<p>Let’s say we have a wire-communications layer that knows nothing
about the UTFs, and we need to use some of the utility functions to make
sure we don’t process partially-received UTF-8 sequences.</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="co">// Using same size to ensure the transcode operation always has room.</span></span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="dt">char</span> utf8_buf<span class="op">[</span>buf_size<span class="op">]</span>;</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a><span class="dt">char16_t</span> utf16_buf<span class="op">[</span>buf_size<span class="op">]</span>;</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a><span class="dt">char</span> <span class="op">*</span> read_first <span class="op">=</span> utf8_buf;</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a><span class="cf">while</span> <span class="op">(</span><span class="kw">true</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a>    <span class="co">// Reads off a wire; may contain partial UTF-8 sequences at the ends of</span></span>
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a>    <span class="co">// some reads.</span></span>
<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a>    <span class="dt">char</span> <span class="op">*</span> buf_last <span class="op">=</span> read_into_utf8_buffer<span class="op">(</span>read_first, utf8_buf <span class="op">+</span> buf_size<span class="op">)</span>;</span>
<span id="cb3-10"><a href="#cb3-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-11"><a href="#cb3-11" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(</span>buf_last <span class="op">==</span> read_first<span class="op">)</span></span>
<span id="cb3-12"><a href="#cb3-12" aria-hidden="true" tabindex="-1"></a>        <span class="cf">continue</span>;</span>
<span id="cb3-13"><a href="#cb3-13" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-14"><a href="#cb3-14" aria-hidden="true" tabindex="-1"></a>    <span class="co">// Find the end of the last whole UTF-8 sequence, so we don&#39;t feed partial sequences</span></span>
<span id="cb3-15"><a href="#cb3-15" aria-hidden="true" tabindex="-1"></a>    <span class="co">// to the algorithm below.</span></span>
<span id="cb3-16"><a href="#cb3-16" aria-hidden="true" tabindex="-1"></a>    <span class="dt">char</span> <span class="op">*</span> last <span class="op">=</span> buf_last;</span>
<span id="cb3-17"><a href="#cb3-17" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> <span class="kw">const</span> last_lead <span class="op">=</span> std<span class="op">::</span>ranges<span class="op">::</span>find_last_if<span class="op">(</span></span>
<span id="cb3-18"><a href="#cb3-18" aria-hidden="true" tabindex="-1"></a>        utf8_buf, buf_last, std<span class="op">::</span>uc<span class="op">::</span>is_lead_code_unit<span class="op">)</span>;</span>
<span id="cb3-19"><a href="#cb3-19" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> <span class="op">(!</span>last_lead<span class="op">.</span>empty<span class="op">())</span> <span class="op">{</span></span>
<span id="cb3-20"><a href="#cb3-20" aria-hidden="true" tabindex="-1"></a>        <span class="kw">auto</span> <span class="kw">const</span> dist_from_end <span class="op">=</span> buf_last <span class="op">-</span> last_lead<span class="op">.</span>begin<span class="op">()</span>;</span>
<span id="cb3-21"><a href="#cb3-21" aria-hidden="true" tabindex="-1"></a>        <span class="ot">assert</span><span class="op">(</span>dist_from_end <span class="op">&lt;=</span> <span class="dv">4</span><span class="op">)</span>;</span>
<span id="cb3-22"><a href="#cb3-22" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>std<span class="op">::</span>uc<span class="op">::</span>utf8_code_units<span class="op">(*</span>last_lead<span class="op">.</span>begin<span class="op">())</span> <span class="op">!=</span> dist_from_end<span class="op">)</span></span>
<span id="cb3-23"><a href="#cb3-23" aria-hidden="true" tabindex="-1"></a>            last <span class="op">=</span> last_lead<span class="op">.</span>begin<span class="op">()</span>;</span>
<span id="cb3-24"><a href="#cb3-24" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb3-25"><a href="#cb3-25" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-26"><a href="#cb3-26" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> <span class="kw">const</span> result <span class="op">=</span> std<span class="op">::</span>ranges<span class="op">::</span>copy<span class="op">(</span></span>
<span id="cb3-27"><a href="#cb3-27" aria-hidden="true" tabindex="-1"></a>        std<span class="op">::</span>ranges<span class="op">::</span>subrange<span class="op">(</span>utf8_buf, last<span class="op">)</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16,</span>
<span id="cb3-28"><a href="#cb3-28" aria-hidden="true" tabindex="-1"></a>        utf16_buf<span class="op">)</span>;</span>
<span id="cb3-29"><a href="#cb3-29" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-30"><a href="#cb3-30" aria-hidden="true" tabindex="-1"></a>    <span class="co">// Do something with the resulting UTF-16 buffer contents.</span></span>
<span id="cb3-31"><a href="#cb3-31" aria-hidden="true" tabindex="-1"></a>    send_utf16_somewhere<span class="op">(</span>utf16_buf, result<span class="op">.</span>out<span class="op">)</span>;</span>
<span id="cb3-32"><a href="#cb3-32" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-33"><a href="#cb3-33" aria-hidden="true" tabindex="-1"></a>    <span class="co">// Copy partial UTF-8 sequence to start of buffer.</span></span>
<span id="cb3-34"><a href="#cb3-34" aria-hidden="true" tabindex="-1"></a>    read_first <span class="op">=</span> std<span class="op">::</span>ranges<span class="op">::</span>copy<span class="op">(</span>last, buf_last, utf8_buf<span class="op">).</span>out;</span>
<span id="cb3-35"><a href="#cb3-35" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
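<p>The trimming step in the loop above can be illustrated without the proposed library. The following is a minimal sketch, assuming only the standard library; the helpers <code class="sourceCode default">starts_sequence</code>, <code class="sourceCode default">sequence_length</code>, and <code class="sourceCode default">complete_prefix_length</code> are hypothetical names introduced for illustration and are not part of this proposal. It reports how many bytes at the front of a buffer consist only of complete UTF-8 sequences, so that the remainder can be carried over to the next read.</p>

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not part of the proposal): true if c begins a UTF-8
// sequence, i.e. it is not a continuation byte of the form 10xxxxxx.
constexpr bool starts_sequence(unsigned char c) { return (c & 0xC0) != 0x80; }

// Hypothetical helper: the sequence length implied by a starting byte, or 1
// for a byte that cannot start a well-formed sequence.
constexpr std::size_t sequence_length(unsigned char c)
{
    if ((c & 0x80) == 0x00) return 1; // 0xxxxxxx
    if ((c & 0xE0) == 0xC0) return 2; // 110xxxxx
    if ((c & 0xF0) == 0xE0) return 3; // 1110xxxx
    if ((c & 0xF8) == 0xF0) return 4; // 11110xxx
    return 1;
}

// Returns the number of leading bytes of buf that contain only complete
// sequences. Any bytes after that point are an unfinished trailing sequence.
constexpr std::size_t complete_prefix_length(std::string_view buf)
{
    // The final sequence starts at most 4 bytes from the end.
    for (std::size_t back = 1; back <= 4 && back <= buf.size(); ++back) {
        std::size_t const i = buf.size() - back;
        unsigned char const c = static_cast<unsigned char>(buf[i]);
        if (starts_sequence(c))
            return sequence_length(c) > back ? i : buf.size();
    }
    // No starting byte nearby: pass everything on and let the transcoder's
    // error handling deal with it.
    return buf.size();
}

// "a" followed by the first byte of a two-byte sequence: only "a" is complete.
static_assert(complete_prefix_length("a\xC3") == 1);
// The same sequence once its second byte has arrived.
static_assert(complete_prefix_length("a\xC3\xA9") == 3);
```

<p>In this sketch an ASCII byte counts as the start of a one-byte sequence. Whether the proposed <code class="sourceCode default">is_lead_code_unit</code> treats ASCII the same way is not stated in this section, so that equivalence is an assumption.</p>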
<h2 data-number="4.4" id="case-4-print-the-results-of-transcoding"><span class="header-section-number">4.4</span> Case 4: Print the results of
transcoding<a href="#case-4-print-the-results-of-transcoding" class="self-link"></a></h2>
<p>Text processing is pretty useless without I/O. All of the Unicode
algorithms operate on code points, and so the output of any of those
algorithms will be in code points/UTF-32. It should be easy to print the
results to a <code class="sourceCode default">std::ostream</code>, to a
<code class="sourceCode default">std::wostream</code> on Windows, or
using <code class="sourceCode default">std::print</code>.
<code class="sourceCode default">utf_view</code> is therefore printable
and streamable.</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> double_print<span class="op">(</span><span class="dt">char32_t</span> <span class="kw">const</span> <span class="op">*</span> str<span class="op">)</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> utf8 <span class="op">=</span> str <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf8;</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>print<span class="op">(</span><span class="st">&quot;{}&quot;</span>, utf8<span class="op">)</span>;</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>cerr <span class="op">&lt;&lt;</span> utf8;</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<h1 data-number="5" id="proposed-design"><span class="header-section-number">5</span> Proposed design<a href="#proposed-design" class="self-link"></a></h1>
<h2 data-number="5.1" id="dependencies"><span class="header-section-number">5.1</span> Dependencies<a href="#dependencies" class="self-link"></a></h2>
<p>This proposal depends on the existence of <a href="https://isocpp.org/files/papers/P2727R0.html">P2727</a>
“std::iterator_interface”.</p>
<h2 data-number="5.2" id="add-concepts-that-describe-parameters-to-transcoding-apis"><span class="header-section-number">5.2</span> Add concepts that describe
parameters to transcoding APIs<a href="#add-concepts-that-describe-parameters-to-transcoding-apis" class="self-link"></a></h2>
<div class="sourceCode" id="cb5"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a>  <span class="kw">enum</span> <span class="kw">class</span> format <span class="op">{</span> utf8 <span class="op">=</span> <span class="dv">1</span>, utf16 <span class="op">=</span> <span class="dv">2</span>, utf32 <span class="op">=</span> <span class="dv">4</span> <span class="op">}</span>;</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T, format F<span class="op">&gt;</span></span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> code_unit <span class="op">=</span> integral<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> <span class="kw">sizeof</span><span class="op">(</span>T<span class="op">)</span> <span class="op">==</span> <span class="op">(</span><span class="dt">int</span><span class="op">)</span>F;</span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf8_code_unit <span class="op">=</span> code_unit<span class="op">&lt;</span>T, format<span class="op">::</span>utf8<span class="op">&gt;</span>;</span>
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf16_code_unit <span class="op">=</span> code_unit<span class="op">&lt;</span>T, format<span class="op">::</span>utf16<span class="op">&gt;</span>;</span>
<span id="cb5-13"><a href="#cb5-13" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-14"><a href="#cb5-14" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-15"><a href="#cb5-15" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf32_code_unit <span class="op">=</span> code_unit<span class="op">&lt;</span>T, format<span class="op">::</span>utf32<span class="op">&gt;</span>;</span>
<span id="cb5-16"><a href="#cb5-16" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-17"><a href="#cb5-17" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-18"><a href="#cb5-18" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf_code_unit <span class="op">=</span> utf8_code_unit<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf16_code_unit<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf32_code_unit<span class="op">&lt;</span>T<span class="op">&gt;</span>;</span>
<span id="cb5-19"><a href="#cb5-19" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-20"><a href="#cb5-20" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T, format F<span class="op">&gt;</span></span>
<span id="cb5-21"><a href="#cb5-21" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> code_unit_iter <span class="op">=</span></span>
<span id="cb5-22"><a href="#cb5-22" aria-hidden="true" tabindex="-1"></a>      input_iterator<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>T<span class="op">&gt;</span>, F<span class="op">&gt;</span>;</span>
<span id="cb5-23"><a href="#cb5-23" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T, format F<span class="op">&gt;</span></span>
<span id="cb5-24"><a href="#cb5-24" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> code_unit_pointer <span class="op">=</span></span>
<span id="cb5-25"><a href="#cb5-25" aria-hidden="true" tabindex="-1"></a>      is_pointer_v<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>T<span class="op">&gt;</span>, F<span class="op">&gt;</span>;</span>
<span id="cb5-26"><a href="#cb5-26" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T, format F<span class="op">&gt;</span></span>
<span id="cb5-27"><a href="#cb5-27" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> code_unit_range <span class="op">=</span> ranges<span class="op">::</span>input_range<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">&amp;&amp;</span></span>
<span id="cb5-28"><a href="#cb5-28" aria-hidden="true" tabindex="-1"></a>      code_unit<span class="op">&lt;</span>ranges<span class="op">::</span>range_value_t<span class="op">&lt;</span>T<span class="op">&gt;</span>, F<span class="op">&gt;</span>;</span>
<span id="cb5-29"><a href="#cb5-29" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-30"><a href="#cb5-30" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-31"><a href="#cb5-31" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf8_iter <span class="op">=</span> code_unit_iter<span class="op">&lt;</span>T, format<span class="op">::</span>utf8<span class="op">&gt;</span>;</span>
<span id="cb5-32"><a href="#cb5-32" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-33"><a href="#cb5-33" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf8_pointer <span class="op">=</span> code_unit_pointer<span class="op">&lt;</span>T, format<span class="op">::</span>utf8<span class="op">&gt;</span>;</span>
<span id="cb5-34"><a href="#cb5-34" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-35"><a href="#cb5-35" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf8_range <span class="op">=</span> code_unit_range<span class="op">&lt;</span>T, format<span class="op">::</span>utf8<span class="op">&gt;</span>;</span>
<span id="cb5-36"><a href="#cb5-36" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-37"><a href="#cb5-37" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-38"><a href="#cb5-38" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf16_iter <span class="op">=</span> code_unit_iter<span class="op">&lt;</span>T, format<span class="op">::</span>utf16<span class="op">&gt;</span>;</span>
<span id="cb5-39"><a href="#cb5-39" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-40"><a href="#cb5-40" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf16_pointer <span class="op">=</span> code_unit_pointer<span class="op">&lt;</span>T, format<span class="op">::</span>utf16<span class="op">&gt;</span>;</span>
<span id="cb5-41"><a href="#cb5-41" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-42"><a href="#cb5-42" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf16_range <span class="op">=</span> code_unit_range<span class="op">&lt;</span>T, format<span class="op">::</span>utf16<span class="op">&gt;</span>;</span>
<span id="cb5-43"><a href="#cb5-43" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-44"><a href="#cb5-44" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-45"><a href="#cb5-45" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf32_iter <span class="op">=</span> code_unit_iter<span class="op">&lt;</span>T, format<span class="op">::</span>utf32<span class="op">&gt;</span>;</span>
<span id="cb5-46"><a href="#cb5-46" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-47"><a href="#cb5-47" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf32_pointer <span class="op">=</span> code_unit_pointer<span class="op">&lt;</span>T, format<span class="op">::</span>utf32<span class="op">&gt;</span>;</span>
<span id="cb5-48"><a href="#cb5-48" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-49"><a href="#cb5-49" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf32_range <span class="op">=</span> code_unit_range<span class="op">&lt;</span>T, format<span class="op">::</span>utf32<span class="op">&gt;</span>;</span>
<span id="cb5-50"><a href="#cb5-50" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-51"><a href="#cb5-51" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-52"><a href="#cb5-52" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf_iter <span class="op">=</span> utf8_iter<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf16_iter<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf32_iter<span class="op">&lt;</span>T<span class="op">&gt;</span>;</span>
<span id="cb5-53"><a href="#cb5-53" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-54"><a href="#cb5-54" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf_pointer <span class="op">=</span> utf8_pointer<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf16_pointer<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf32_pointer<span class="op">&lt;</span>T<span class="op">&gt;</span>;</span>
<span id="cb5-55"><a href="#cb5-55" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-56"><a href="#cb5-56" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf_range <span class="op">=</span> utf8_range<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf16_range<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf32_range<span class="op">&lt;</span>T<span class="op">&gt;</span>;</span>
<span id="cb5-57"><a href="#cb5-57" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-58"><a href="#cb5-58" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-59"><a href="#cb5-59" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf_range_like <span class="op">=</span></span>
<span id="cb5-60"><a href="#cb5-60" aria-hidden="true" tabindex="-1"></a>      utf_range<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span> <span class="op">||</span> utf_pointer<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span>;</span>
<span id="cb5-61"><a href="#cb5-61" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-62"><a href="#cb5-62" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-63"><a href="#cb5-63" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf8_input_range_like <span class="op">=</span></span>
<span id="cb5-64"><a href="#cb5-64" aria-hidden="true" tabindex="-1"></a>        <span class="op">(</span>ranges<span class="op">::</span>input_range<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span> <span class="op">&amp;&amp;</span> utf8_code_unit<span class="op">&lt;</span>ranges<span class="op">::</span>range_value_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;)</span> <span class="op">||</span></span>
<span id="cb5-65"><a href="#cb5-65" aria-hidden="true" tabindex="-1"></a>        utf8_pointer<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span>;</span>
<span id="cb5-66"><a href="#cb5-66" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-67"><a href="#cb5-67" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf16_input_range_like <span class="op">=</span></span>
<span id="cb5-68"><a href="#cb5-68" aria-hidden="true" tabindex="-1"></a>        <span class="op">(</span>ranges<span class="op">::</span>input_range<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span> <span class="op">&amp;&amp;</span> utf16_code_unit<span class="op">&lt;</span>ranges<span class="op">::</span>range_value_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;)</span> <span class="op">||</span></span>
<span id="cb5-69"><a href="#cb5-69" aria-hidden="true" tabindex="-1"></a>        utf16_pointer<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span>;</span>
<span id="cb5-70"><a href="#cb5-70" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-71"><a href="#cb5-71" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf32_input_range_like <span class="op">=</span></span>
<span id="cb5-72"><a href="#cb5-72" aria-hidden="true" tabindex="-1"></a>        <span class="op">(</span>ranges<span class="op">::</span>input_range<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span> <span class="op">&amp;&amp;</span> utf32_code_unit<span class="op">&lt;</span>ranges<span class="op">::</span>range_value_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;)</span> <span class="op">||</span></span>
<span id="cb5-73"><a href="#cb5-73" aria-hidden="true" tabindex="-1"></a>        utf32_pointer<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span>;</span>
<span id="cb5-74"><a href="#cb5-74" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-75"><a href="#cb5-75" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-76"><a href="#cb5-76" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf_input_range_like <span class="op">=</span></span>
<span id="cb5-77"><a href="#cb5-77" aria-hidden="true" tabindex="-1"></a>        utf8_input_range_like<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf16_input_range_like<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf32_input_range_like<span class="op">&lt;</span>T<span class="op">&gt;</span>;</span>
<span id="cb5-78"><a href="#cb5-78" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-79"><a href="#cb5-79" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-80"><a href="#cb5-80" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> transcoding_error_handler <span class="op">=</span></span>
<span id="cb5-81"><a href="#cb5-81" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span><span class="op">(</span>T t, <span class="dt">char</span> <span class="kw">const</span> <span class="op">*</span> msg<span class="op">)</span> <span class="op">{</span> <span class="op">{</span> t<span class="op">(</span>msg<span class="op">)</span> <span class="op">}</span> <span class="op">-&gt;</span> code_point; <span class="op">}</span>;</span>
<span id="cb5-82"><a href="#cb5-82" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-83"><a href="#cb5-83" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<h2 data-number="5.3" id="add-a-standard-null-terminated-sequence-sentinel"><span class="header-section-number">5.3</span> Add a standard null-terminated
sequence sentinel<a href="#add-a-standard-null-terminated-sequence-sentinel" class="self-link"></a></h2>
<div class="sourceCode" id="cb6"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std <span class="op">{</span></span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> null_sentinel_t <span class="op">{</span></span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> null_sentinel_t base<span class="op">()</span> <span class="kw">const</span> <span class="kw">noexcept</span> <span class="op">{</span> <span class="cf">return</span> <span class="op">{}</span>; <span class="op">}</span></span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a>      <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span><span class="kw">const</span> T<span class="op">*</span> p, null_sentinel_t<span class="op">)</span></span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a>        <span class="op">{</span> <span class="cf">return</span> <span class="op">*</span>p <span class="op">==</span> T<span class="op">{}</span>; <span class="op">}</span></span>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <span class="kw">constexpr</span> null_sentinel_t null_sentinel;</span>
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>The <code class="sourceCode default">base()</code> member bears
explanation. It is there to make iterator/sentinel pairs easy to use in
a generic context. Consider a range
<code class="sourceCode default">r1</code> of code points delimited by a
pair of <code class="sourceCode default">utf_8_to_32_iterator&lt;char const *&gt;</code>
transcoding iterators (defined later in this paper). The range of
underlying UTF-8 code units is
[<code class="sourceCode default">r1.begin().base()</code>,
<code class="sourceCode default">r1.end().base()</code>).</p>
<p>Now consider a range <code class="sourceCode default">r2</code> of
code points that is delimited by a <code class="sourceCode default">utf_8_to_32_iterator&lt;char const *&gt;</code>
transcoding iterator and a
<code class="sourceCode default">null_sentinel</code>. Now the
underlying range of UTF-8 code units is
[<code class="sourceCode default">r2.begin().base()</code>,
<code class="sourceCode default">null_sentinel</code>).</p>
<p>Instead of making people who write generic code special-case
the use of <code class="sourceCode default">null_sentinel</code>,
<code class="sourceCode default">null_sentinel</code> has a
<code class="sourceCode default">base()</code> member that lets us write
<code class="sourceCode default">r2.end().base()</code> instead of
<code class="sourceCode default">null_sentinel</code>. This means that
for a range <code class="sourceCode default">r</code> of either kind
(<code class="sourceCode default">r1</code> or
<code class="sourceCode default">r2</code>), the underlying range of
UTF-8 code units is just
[<code class="sourceCode default">r.begin().base()</code>,
<code class="sourceCode default">r.end().base()</code>).</p>
<p>Note that this is a general-interest utility, and as such, it is in
<code class="sourceCode default">std</code>, not
<code class="sourceCode default">std::uc</code>.</p>
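<p>The following is a minimal sketch of why <code class="sourceCode default">base()</code> on the sentinel helps generic code. It is an illustration under stated assumptions, not the proposed implementation: the namespace <code class="sourceCode default">sketch</code>, the type <code class="sourceCode default">wrapped_iter</code> (a stand-in for a transcoding iterator that exposes only <code class="sourceCode default">base()</code>), the array <code class="sourceCode default">text</code>, and the function <code class="sourceCode default">underlying_length</code> are all hypothetical names introduced here.</p>

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {
  // Local copy of the sentinel shown above, so the example does not depend
  // on a standard library that provides it.
  struct null_sentinel_t {
    constexpr null_sentinel_t base() const noexcept { return {}; }

    template<class T>
      friend constexpr bool operator==(const T* p, null_sentinel_t)
        { return *p == T{}; }
  };
  inline constexpr null_sentinel_t null_sentinel;

  // Hypothetical stand-in for a transcoding iterator: it only exposes the
  // underlying code unit pointer through base().
  struct wrapped_iter {
    const char* p;
    constexpr const char* base() const noexcept { return p; }
  };

  // Hypothetical sample data: three code units plus the null terminator.
  inline constexpr char text[] = "abc";

  // Generic code: calls base() on both ends of the range without caring
  // whether the end is an iterator or a null_sentinel_t.
  template<class I, class S>
  constexpr std::size_t underlying_length(I first, S last) {
    auto it = first.base();
    auto end = last.base();
    std::size_t n = 0;
    for (; !(it == end); ++it) ++n;
    return n;
  }
}
```

<p>Both <code class="sourceCode default">underlying_length(wrapped_iter{text}, wrapped_iter{text + 3})</code> and <code class="sourceCode default">underlying_length(wrapped_iter{text}, null_sentinel)</code> go through the same code path; no special case for the sentinel is needed.</p>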
<h2 data-number="5.4" id="add-constants-and-utility-functions-that-query-the-state-of-utf-sequences-well-formedness-etc."><span class="header-section-number">5.4</span> Add constants and utility
functions that query the state of UTF sequences (well-formedness,
etc.)<a href="#add-constants-and-utility-functions-that-query-the-state-of-utf-sequences-well-formedness-etc." class="self-link"></a></h2>
<div class="sourceCode" id="cb7"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <span class="kw">constexpr</span> <span class="dt">char32_t</span> replacement_character <span class="op">=</span> <span class="bn">0xfffd</span>;</span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Given the first (and possibly only) code unit of a UTF-8-encoded code</span></span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a>  <span class="co">// point, returns the number of bytes occupied by that code point (in the</span></span>
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a>  <span class="co">// range [1, 4]).  Returns a value &lt; 0 if first_unit is not a valid</span></span>
<span id="cb7-7"><a href="#cb7-7" aria-hidden="true" tabindex="-1"></a>  <span class="co">// initial UTF-8 code unit.</span></span>
<span id="cb7-8"><a href="#cb7-8" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="dt">int</span> utf8_code_units<span class="op">(</span><span class="dt">char8_t</span> first_unit<span class="op">)</span> <span class="kw">noexcept</span>;</span>
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-10"><a href="#cb7-10" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Returns true iff c is a UTF-8 continuation (non-lead) code unit.</span></span>
<span id="cb7-11"><a href="#cb7-11" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="dt">bool</span> is_continuation<span class="op">(</span><span class="dt">char8_t</span> c<span class="op">)</span> <span class="kw">noexcept</span>;</span>
<span id="cb7-12"><a href="#cb7-12" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-13"><a href="#cb7-13" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Given the first (and possibly only) code unit of a UTF-16-encoded code</span></span>
<span id="cb7-14"><a href="#cb7-14" aria-hidden="true" tabindex="-1"></a>  <span class="co">// point, returns the number of code units occupied by that code point</span></span>
<span id="cb7-15"><a href="#cb7-15" aria-hidden="true" tabindex="-1"></a>  <span class="co">// (in the range [1, 2]).  Returns a value &lt; 0 if first_unit is</span></span>
<span id="cb7-16"><a href="#cb7-16" aria-hidden="true" tabindex="-1"></a>  <span class="co">// not a valid initial UTF-16 code unit.</span></span>
<span id="cb7-17"><a href="#cb7-17" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="dt">int</span> utf16_code_units<span class="op">(</span><span class="dt">char16_t</span> first_unit<span class="op">)</span> <span class="kw">noexcept</span>;</span>
<span id="cb7-18"><a href="#cb7-18" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-19"><a href="#cb7-19" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Returns true iff c is a Unicode low (non-lead) surrogate.</span></span>
<span id="cb7-20"><a href="#cb7-20" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="dt">bool</span> is_low_surrogate<span class="op">(</span><span class="dt">char32_t</span> c<span class="op">)</span> <span class="kw">noexcept</span>;</span>
<span id="cb7-21"><a href="#cb7-21" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-22"><a href="#cb7-22" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Returns an iterator to the first code unit in [ranges::begin(r), ranges::end(r))</span></span>
<span id="cb7-23"><a href="#cb7-23" aria-hidden="true" tabindex="-1"></a>  <span class="co">// that is not properly UTF-8 encoded, or ranges::begin(r) + ranges::distance(r) if</span></span>
<span id="cb7-24"><a href="#cb7-24" aria-hidden="true" tabindex="-1"></a>  <span class="co">// no such code unit is found.</span></span>
<span id="cb7-25"><a href="#cb7-25" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>utf8_range R<span class="op">&gt;</span></span>
<span id="cb7-26"><a href="#cb7-26" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>forward_range<span class="op">&lt;</span>R<span class="op">&gt;</span></span>
<span id="cb7-27"><a href="#cb7-27" aria-hidden="true" tabindex="-1"></a>      <span class="kw">constexpr</span> ranges<span class="op">::</span>borrowed_iterator_t<span class="op">&lt;</span>R<span class="op">&gt;</span> find_invalid_encoding<span class="op">(</span>R <span class="op">&amp;&amp;</span> r<span class="op">)</span> <span class="kw">noexcept</span>;</span>
<span id="cb7-28"><a href="#cb7-28" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-29"><a href="#cb7-29" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Returns an iterator to the first code unit in [ranges::begin(r), ranges::end(r))</span></span>
<span id="cb7-30"><a href="#cb7-30" aria-hidden="true" tabindex="-1"></a>  <span class="co">// that is not properly UTF-16 encoded, or ranges::begin(r) + ranges::distance(r) if</span></span>
<span id="cb7-31"><a href="#cb7-31" aria-hidden="true" tabindex="-1"></a>  <span class="co">// no such code unit is found.</span></span>
<span id="cb7-32"><a href="#cb7-32" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>utf16_range R<span class="op">&gt;</span></span>
<span id="cb7-33"><a href="#cb7-33" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>forward_range<span class="op">&lt;</span>R<span class="op">&gt;</span></span>
<span id="cb7-34"><a href="#cb7-34" aria-hidden="true" tabindex="-1"></a>      <span class="kw">constexpr</span> ranges<span class="op">::</span>borrowed_iterator_t<span class="op">&lt;</span>R<span class="op">&gt;</span> find_invalid_encoding<span class="op">(</span>R <span class="op">&amp;&amp;</span> r<span class="op">)</span> <span class="kw">noexcept</span>;</span>
<span id="cb7-35"><a href="#cb7-35" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-36"><a href="#cb7-36" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Returns true iff r is empty or the initial UTF-8 code units in r form a valid</span></span>
<span id="cb7-37"><a href="#cb7-37" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Unicode code point.</span></span>
<span id="cb7-38"><a href="#cb7-38" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>utf8_range R<span class="op">&gt;</span></span>
<span id="cb7-39"><a href="#cb7-39" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>forward_range<span class="op">&lt;</span>R<span class="op">&gt;</span></span>
<span id="cb7-40"><a href="#cb7-40" aria-hidden="true" tabindex="-1"></a>      <span class="kw">constexpr</span> <span class="dt">bool</span> starts_encoded<span class="op">(</span>R <span class="op">&amp;&amp;</span> r<span class="op">)</span> <span class="kw">noexcept</span>;</span>
<span id="cb7-41"><a href="#cb7-41" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-42"><a href="#cb7-42" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Returns true iff r is empty or the initial UTF-16 code units in r form a valid</span></span>
<span id="cb7-43"><a href="#cb7-43" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Unicode code point.</span></span>
<span id="cb7-44"><a href="#cb7-44" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>utf16_range R<span class="op">&gt;</span></span>
<span id="cb7-45"><a href="#cb7-45" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>forward_range<span class="op">&lt;</span>R<span class="op">&gt;</span></span>
<span id="cb7-46"><a href="#cb7-46" aria-hidden="true" tabindex="-1"></a>      <span class="kw">constexpr</span> <span class="dt">bool</span> starts_encoded<span class="op">(</span>R <span class="op">&amp;&amp;</span> r<span class="op">)</span> <span class="kw">noexcept</span>;</span>
<span id="cb7-47"><a href="#cb7-47" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-48"><a href="#cb7-48" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Returns true iff r is empty or the final UTF-8 code units in r form a valid</span></span>
<span id="cb7-49"><a href="#cb7-49" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Unicode code point.</span></span>
<span id="cb7-50"><a href="#cb7-50" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>utf8_range R<span class="op">&gt;</span></span>
<span id="cb7-51"><a href="#cb7-51" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>bidirectional_range<span class="op">&lt;</span>R<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> ranges<span class="op">::</span>common_range<span class="op">&lt;</span>R<span class="op">&gt;</span></span>
<span id="cb7-52"><a href="#cb7-52" aria-hidden="true" tabindex="-1"></a>      <span class="kw">constexpr</span> <span class="dt">bool</span> ends_encoded<span class="op">(</span>R <span class="op">&amp;&amp;</span> r<span class="op">)</span> <span class="kw">noexcept</span>;</span>
<span id="cb7-53"><a href="#cb7-53" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-54"><a href="#cb7-54" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Returns true iff r is empty or the final UTF-16 code units in r form a valid</span></span>
<span id="cb7-55"><a href="#cb7-55" aria-hidden="true" tabindex="-1"></a>  <span class="co">// Unicode code point.</span></span>
<span id="cb7-56"><a href="#cb7-56" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>utf16_range R<span class="op">&gt;</span></span>
<span id="cb7-57"><a href="#cb7-57" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>bidirectional_range<span class="op">&lt;</span>R<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> ranges<span class="op">::</span>common_range<span class="op">&lt;</span>R<span class="op">&gt;</span></span>
<span id="cb7-58"><a href="#cb7-58" aria-hidden="true" tabindex="-1"></a>      <span class="kw">constexpr</span> <span class="dt">bool</span> ends_encoded<span class="op">(</span>R <span class="op">&amp;&amp;</span> r<span class="op">)</span> <span class="kw">noexcept</span>;</span>
<span id="cb7-59"><a href="#cb7-59" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>These utility functions are useful for finding encoding breakages in
UTF ranges.</p>
<p><code class="sourceCode default">utf8_code_units</code> can be used
to determine whether a UTF-8 code unit is an initial code unit within a
code point sequence, and if so, how many continuation code units are to
follow (one fewer than the returned count).
<code class="sourceCode default">is_continuation</code> can then
be used to verify that the code units expected to follow are actually
continuation code units. This sort of inquiry is
useful in cases like the Case 3 example near the top of the paper.
<code class="sourceCode default">utf16_code_units</code> and
<code class="sourceCode default">is_low_surrogate</code> form a similar
pair for UTF-16.</p>
<p>The other functions can be used to check if a given range is properly
UTF-8 or -16 encoded, either entirely, or at the beginning or end of the
range.</p>
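<p>To illustrate the kind of scan that a whole-range check implies, here is a simplified sketch for UTF-16. It is not the proposed <code class="sourceCode default">find_invalid_encoding</code>: it takes a <code class="sourceCode default">std::u16string_view</code> and returns an index rather than a borrowed iterator, and the names <code class="sourceCode default">sketch</code>, <code class="sourceCode default">is_high_surrogate</code>, and <code class="sourceCode default">first_invalid_utf16</code> are hypothetical.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

namespace sketch {
  // Hypothetical helper: lead (high) surrogates are U+D800..U+DBFF.
  constexpr bool is_high_surrogate(char32_t c) noexcept {
    return 0xD800 <= c && c <= 0xDBFF;
  }

  // Trail (low) surrogates are U+DC00..U+DFFF.
  constexpr bool is_low_surrogate(char32_t c) noexcept {
    return 0xDC00 <= c && c <= 0xDFFF;
  }

  // Returns the index of the first code unit that breaks UTF-16
  // well-formedness, or s.size() if there is none.
  constexpr std::size_t first_invalid_utf16(std::u16string_view s) noexcept {
    std::size_t i = 0;
    while (i < s.size()) {
      if (is_low_surrogate(s[i])) return i;   // low surrogate with no lead before it
      if (is_high_surrogate(s[i])) {
        // A high surrogate must be followed immediately by a low surrogate.
        if (i + 1 == s.size() || !is_low_surrogate(s[i + 1])) return i;
        ++i;                                  // step over the low surrogate
      }
      ++i;
    }
    return s.size();
  }
}
```

<p>With this sketch, a string containing a complete surrogate pair scans to the end, while a high surrogate followed by an ordinary code unit is reported at the position of the high surrogate.</p>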
<h2 data-number="5.5" id="add-the-transcoding-iterators"><span class="header-section-number">5.5</span> Add the transcoding iterators<a href="#add-the-transcoding-iterators" class="self-link"></a></h2>
<p>I’m using <a href="https://isocpp.org/files/papers/P2727R0.html">P2727</a>’s
<code class="sourceCode default">iterator_interface</code> here for
simplicity.</p>
<div class="sourceCode" id="cb8"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a>  <span class="co">// An error handler type that can be used with the converting iterators;</span></span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a>  <span class="co">// provides the Unicode replacement character on errors.</span></span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> use_replacement_character <span class="op">{</span></span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">char32_t</span> <span class="kw">operator</span><span class="op">()(</span><span class="kw">const</span> <span class="dt">char</span><span class="op">*)</span> <span class="kw">const</span> <span class="kw">noexcept</span> <span class="op">{</span> <span class="cf">return</span> replacement_character; <span class="op">}</span></span>
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb8-7"><a href="#cb8-7" aria-hidden="true" tabindex="-1"></a>  </span>
<span id="cb8-8"><a href="#cb8-8" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I<span class="op">&gt;</span></span>
<span id="cb8-9"><a href="#cb8-9" aria-hidden="true" tabindex="-1"></a>  <span class="kw">auto</span> <em>bidirectional-at-most</em><span class="op">()</span> <span class="op">{</span>  <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-10"><a href="#cb8-10" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>bidirectional_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb8-11"><a href="#cb8-11" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> bidirectional_iterator_tag<span class="op">{}</span>;</span>
<span id="cb8-12"><a href="#cb8-12" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>forward_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb8-13"><a href="#cb8-13" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> forward_iterator_tag<span class="op">{}</span>;</span>
<span id="cb8-14"><a href="#cb8-14" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>input_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb8-15"><a href="#cb8-15" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> input_iterator_tag<span class="op">{}</span>;</span>
<span id="cb8-16"><a href="#cb8-16" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb8-17"><a href="#cb8-17" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span></span>
<span id="cb8-18"><a href="#cb8-18" aria-hidden="true" tabindex="-1"></a>  </span>
<span id="cb8-19"><a href="#cb8-19" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I<span class="op">&gt;</span></span>
<span id="cb8-20"><a href="#cb8-20" aria-hidden="true" tabindex="-1"></a>  <span class="kw">using</span> <em>bidirectional-at-most-t</em> <span class="op">=</span> <span class="kw">decltype</span><span class="op">(</span><em>bidirectional-at-most</em><span class="op">&lt;</span>I<span class="op">&gt;())</span>; <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-21"><a href="#cb8-21" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-22"><a href="#cb8-22" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb8-23"><a href="#cb8-23" aria-hidden="true" tabindex="-1"></a>    utf32_iter I,</span>
<span id="cb8-24"><a href="#cb8-24" aria-hidden="true" tabindex="-1"></a>    sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S <span class="op">=</span> I,</span>
<span id="cb8-25"><a href="#cb8-25" aria-hidden="true" tabindex="-1"></a>    transcoding_error_handler ErrorHandler <span class="op">=</span> use_replacement_character<span class="op">&gt;</span></span>
<span id="cb8-26"><a href="#cb8-26" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> utf_32_to_8_iterator</span>
<span id="cb8-27"><a href="#cb8-27" aria-hidden="true" tabindex="-1"></a>    <span class="op">:</span> iterator_interface<span class="op">&lt;</span>utf_32_to_8_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span>, <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>, <span class="dt">char8_t</span>, <span class="dt">char8_t</span><span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb8-28"><a href="#cb8-28" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_32_to_8_iterator<span class="op">()</span>;</span>
<span id="cb8-29"><a href="#cb8-29" aria-hidden="true" tabindex="-1"></a>    <span class="kw">explicit</span> <span class="kw">constexpr</span> utf_32_to_8_iterator<span class="op">(</span>I first, I it, S last<span class="op">)</span>;</span>
<span id="cb8-30"><a href="#cb8-30" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I2, <span class="kw">class</span> S2<span class="op">&gt;</span></span>
<span id="cb8-31"><a href="#cb8-31" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> convertible_to<span class="op">&lt;</span>I2, I<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> convertible_to<span class="op">&lt;</span>S2, S<span class="op">&gt;</span></span>
<span id="cb8-32"><a href="#cb8-32" aria-hidden="true" tabindex="-1"></a>        <span class="kw">constexpr</span> utf_32_to_8_iterator<span class="op">(</span></span>
<span id="cb8-33"><a href="#cb8-33" aria-hidden="true" tabindex="-1"></a>          <span class="kw">const</span> utf_32_to_8_iterator<span class="op">&lt;</span>I2, S2, ErrorHandler<span class="op">&gt;&amp;</span> other<span class="op">)</span>;</span>
<span id="cb8-34"><a href="#cb8-34" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-35"><a href="#cb8-35" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I begin<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> first_; <span class="op">}</span></span>
<span id="cb8-36"><a href="#cb8-36" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> S end<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> last_; <span class="op">}</span></span>
<span id="cb8-37"><a href="#cb8-37" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-38"><a href="#cb8-38" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">char8_t</span> <span class="kw">operator</span><span class="op">*()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> buf_<span class="op">[</span>index_<span class="op">]</span>; <span class="op">}</span></span>
<span id="cb8-39"><a href="#cb8-39" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-40"><a href="#cb8-40" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I base<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> it_; <span class="op">}</span></span>
<span id="cb8-41"><a href="#cb8-41" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-42"><a href="#cb8-42" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_32_to_8_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">++()</span>;</span>
<span id="cb8-43"><a href="#cb8-43" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_32_to_8_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">--()</span>;</span>
<span id="cb8-44"><a href="#cb8-44" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-45"><a href="#cb8-45" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I1, <span class="kw">class</span> S1, <span class="kw">class</span> I2, <span class="kw">class</span> S2, <span class="kw">class</span> ErrorHandler2<span class="op">&gt;</span></span>
<span id="cb8-46"><a href="#cb8-46" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span></span>
<span id="cb8-47"><a href="#cb8-47" aria-hidden="true" tabindex="-1"></a>      <span class="kw">const</span> utf_32_to_8_iterator<span class="op">&lt;</span>I1, S1, ErrorHandler2<span class="op">&gt;&amp;</span> lhs,</span>
<span id="cb8-48"><a href="#cb8-48" aria-hidden="true" tabindex="-1"></a>      <span class="kw">const</span> utf_32_to_8_iterator<span class="op">&lt;</span>I2, S2, ErrorHandler2<span class="op">&gt;&amp;</span> rhs<span class="op">)</span></span>
<span id="cb8-49"><a href="#cb8-49" aria-hidden="true" tabindex="-1"></a>        <span class="kw">requires</span> <span class="kw">requires</span> <span class="op">{</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span>; <span class="op">}</span></span>
<span id="cb8-50"><a href="#cb8-50" aria-hidden="true" tabindex="-1"></a>          <span class="op">{</span> <span class="cf">return</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span> <span class="op">&amp;&amp;</span> lhs<span class="op">.</span>index_ <span class="op">==</span> rhs<span class="op">.</span>index_; <span class="op">}</span></span>
<span id="cb8-51"><a href="#cb8-51" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-52"><a href="#cb8-52" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span>utf_32_to_8_iterator lhs, utf_32_to_8_iterator rhs<span class="op">)</span></span>
<span id="cb8-53"><a href="#cb8-53" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span> <span class="op">&amp;&amp;</span> lhs<span class="op">.</span>index_ <span class="op">==</span> rhs<span class="op">.</span>index_; <span class="op">}</span></span>
<span id="cb8-54"><a href="#cb8-54" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-55"><a href="#cb8-55" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em> <span class="op">=</span>         <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-56"><a href="#cb8-56" aria-hidden="true" tabindex="-1"></a>      iterator_interface<span class="op">&lt;</span>utf_32_to_8_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span>,</span>
<span id="cb8-57"><a href="#cb8-57" aria-hidden="true" tabindex="-1"></a>                         <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>,</span>
<span id="cb8-58"><a href="#cb8-58" aria-hidden="true" tabindex="-1"></a>                         <span class="dt">char8_t</span>,</span>
<span id="cb8-59"><a href="#cb8-59" aria-hidden="true" tabindex="-1"></a>                         <span class="dt">char8_t</span><span class="op">&gt;</span>;</span>
<span id="cb8-60"><a href="#cb8-60" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em><span class="op">::</span><span class="kw">operator</span><span class="op">++</span>;</span>
<span id="cb8-61"><a href="#cb8-61" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em><span class="op">::</span><span class="kw">operator</span><span class="op">--</span>;</span>
<span id="cb8-62"><a href="#cb8-62" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-63"><a href="#cb8-63" aria-hidden="true" tabindex="-1"></a>  <span class="kw">private</span><span class="op">:</span></span>
<span id="cb8-64"><a href="#cb8-64" aria-hidden="true" tabindex="-1"></a>    I first_;                 <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-65"><a href="#cb8-65" aria-hidden="true" tabindex="-1"></a>    I it_;                    <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-66"><a href="#cb8-66" aria-hidden="true" tabindex="-1"></a>    S last_;                  <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-67"><a href="#cb8-67" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span> index_;               <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-68"><a href="#cb8-68" aria-hidden="true" tabindex="-1"></a>    array<span class="op">&lt;</span><span class="dt">char8_t</span>, <span class="dv">5</span><span class="op">&gt;</span> buf_;   <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-69"><a href="#cb8-69" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-70"><a href="#cb8-70" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span>utf32_iter I2, sentinel_for<span class="op">&lt;</span>I2<span class="op">&gt;</span> S2, transcoding_error_handler ErrorHandler2<span class="op">&gt;</span></span>
<span id="cb8-71"><a href="#cb8-71" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">struct</span> utf_32_to_8_iterator;</span>
<span id="cb8-72"><a href="#cb8-72" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb8-73"><a href="#cb8-73" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-74"><a href="#cb8-74" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I, <span class="kw">class</span> S, <span class="kw">class</span> ErrorHandler<span class="op">&gt;</span></span>
<span id="cb8-75"><a href="#cb8-75" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span></span>
<span id="cb8-76"><a href="#cb8-76" aria-hidden="true" tabindex="-1"></a>      utf_32_to_8_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span> lhs, S rhs<span class="op">)</span></span>
<span id="cb8-77"><a href="#cb8-77" aria-hidden="true" tabindex="-1"></a>        <span class="kw">requires</span> <span class="kw">requires</span> <span class="op">{</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs; <span class="op">}</span>;</span>
<span id="cb8-78"><a href="#cb8-78" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-79"><a href="#cb8-79" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb8-80"><a href="#cb8-80" aria-hidden="true" tabindex="-1"></a>    utf8_iter I,</span>
<span id="cb8-81"><a href="#cb8-81" aria-hidden="true" tabindex="-1"></a>    sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S <span class="op">=</span> I,</span>
<span id="cb8-82"><a href="#cb8-82" aria-hidden="true" tabindex="-1"></a>    transcoding_error_handler ErrorHandler <span class="op">=</span> use_replacement_character<span class="op">&gt;</span></span>
<span id="cb8-83"><a href="#cb8-83" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> utf_8_to_32_iterator</span>
<span id="cb8-84"><a href="#cb8-84" aria-hidden="true" tabindex="-1"></a>    <span class="op">:</span> iterator_interface<span class="op">&lt;</span>utf_8_to_32_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span>, <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>, <span class="dt">char32_t</span>, <span class="dt">char32_t</span><span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb8-85"><a href="#cb8-85" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_8_to_32_iterator<span class="op">()</span>;</span>
<span id="cb8-86"><a href="#cb8-86" aria-hidden="true" tabindex="-1"></a>    <span class="kw">explicit</span> <span class="kw">constexpr</span> utf_8_to_32_iterator<span class="op">(</span>I first, I it, S last<span class="op">)</span>;</span>
<span id="cb8-87"><a href="#cb8-87" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I2, <span class="kw">class</span> S2<span class="op">&gt;</span></span>
<span id="cb8-88"><a href="#cb8-88" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> convertible_to<span class="op">&lt;</span>I2, I<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> convertible_to<span class="op">&lt;</span>S2, S<span class="op">&gt;</span></span>
<span id="cb8-89"><a href="#cb8-89" aria-hidden="true" tabindex="-1"></a>        <span class="kw">constexpr</span> utf_8_to_32_iterator<span class="op">(</span></span>
<span id="cb8-90"><a href="#cb8-90" aria-hidden="true" tabindex="-1"></a>          <span class="kw">const</span> utf_8_to_32_iterator<span class="op">&lt;</span>I2, S2, ErrorHandler<span class="op">&gt;&amp;</span> other<span class="op">)</span>;</span>
<span id="cb8-91"><a href="#cb8-91" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-92"><a href="#cb8-92" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I begin<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> first_; <span class="op">}</span></span>
<span id="cb8-93"><a href="#cb8-93" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> S end<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> last_; <span class="op">}</span></span>
<span id="cb8-94"><a href="#cb8-94" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-95"><a href="#cb8-95" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">char32_t</span> <span class="kw">operator</span><span class="op">*()</span> <span class="kw">const</span>;</span>
<span id="cb8-96"><a href="#cb8-96" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-97"><a href="#cb8-97" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I base<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> it_; <span class="op">}</span></span>
<span id="cb8-98"><a href="#cb8-98" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-99"><a href="#cb8-99" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_8_to_32_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">++()</span>;</span>
<span id="cb8-100"><a href="#cb8-100" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_8_to_32_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">--()</span>;</span>
<span id="cb8-101"><a href="#cb8-101" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-102"><a href="#cb8-102" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span>utf_8_to_32_iterator lhs, utf_8_to_32_iterator rhs<span class="op">)</span></span>
<span id="cb8-103"><a href="#cb8-103" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span>; <span class="op">}</span></span>
<span id="cb8-104"><a href="#cb8-104" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-105"><a href="#cb8-105" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em> <span class="op">=</span>         <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-106"><a href="#cb8-106" aria-hidden="true" tabindex="-1"></a>      iterator_interface<span class="op">&lt;</span>utf_8_to_32_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span>,</span>
<span id="cb8-107"><a href="#cb8-107" aria-hidden="true" tabindex="-1"></a>                         <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>,</span>
<span id="cb8-108"><a href="#cb8-108" aria-hidden="true" tabindex="-1"></a>                         <span class="dt">char32_t</span>,</span>
<span id="cb8-109"><a href="#cb8-109" aria-hidden="true" tabindex="-1"></a>                         <span class="dt">char32_t</span><span class="op">&gt;</span>;</span>
<span id="cb8-110"><a href="#cb8-110" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em><span class="op">::</span><span class="kw">operator</span><span class="op">++</span>;</span>
<span id="cb8-111"><a href="#cb8-111" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em><span class="op">::</span><span class="kw">operator</span><span class="op">--</span>;</span>
<span id="cb8-112"><a href="#cb8-112" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-113"><a href="#cb8-113" aria-hidden="true" tabindex="-1"></a>  <span class="kw">private</span><span class="op">:</span></span>
<span id="cb8-114"><a href="#cb8-114" aria-hidden="true" tabindex="-1"></a>    I first_;                 <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-115"><a href="#cb8-115" aria-hidden="true" tabindex="-1"></a>    I it_;                    <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-116"><a href="#cb8-116" aria-hidden="true" tabindex="-1"></a>    S last_;                  <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-117"><a href="#cb8-117" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-118"><a href="#cb8-118" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span>utf8_iter I2, sentinel_for<span class="op">&lt;</span>I2<span class="op">&gt;</span> S2, transcoding_error_handler ErrorHandler2<span class="op">&gt;</span></span>
<span id="cb8-119"><a href="#cb8-119" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">struct</span> utf_8_to_16_iterator;</span>
<span id="cb8-120"><a href="#cb8-120" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-121"><a href="#cb8-121" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span>utf8_iter I2, sentinel_for<span class="op">&lt;</span>I2<span class="op">&gt;</span> S2, transcoding_error_handler ErrorHandler2<span class="op">&gt;</span></span>
<span id="cb8-122"><a href="#cb8-122" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">struct</span> utf_8_to_32_iterator;</span>
<span id="cb8-123"><a href="#cb8-123" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb8-124"><a href="#cb8-124" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-125"><a href="#cb8-125" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I, <span class="kw">class</span> S, <span class="kw">class</span> ErrorHandler<span class="op">&gt;</span></span>
<span id="cb8-126"><a href="#cb8-126" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span></span>
<span id="cb8-127"><a href="#cb8-127" aria-hidden="true" tabindex="-1"></a>    <span class="kw">const</span> utf_8_to_32_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;&amp;</span> lhs, S rhs<span class="op">)</span></span>
<span id="cb8-128"><a href="#cb8-128" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> <span class="kw">requires</span> <span class="op">{</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs; <span class="op">}</span>;</span>
<span id="cb8-129"><a href="#cb8-129" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-130"><a href="#cb8-130" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I1, <span class="kw">class</span> S1, <span class="kw">class</span> I2, <span class="kw">class</span> S2, <span class="kw">class</span> ErrorHandler<span class="op">&gt;</span></span>
<span id="cb8-131"><a href="#cb8-131" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span></span>
<span id="cb8-132"><a href="#cb8-132" aria-hidden="true" tabindex="-1"></a>    <span class="kw">const</span> utf_8_to_32_iterator<span class="op">&lt;</span>I1, S1, ErrorHandler<span class="op">&gt;&amp;</span> lhs,</span>
<span id="cb8-133"><a href="#cb8-133" aria-hidden="true" tabindex="-1"></a>    <span class="kw">const</span> utf_8_to_32_iterator<span class="op">&lt;</span>I2, S2, ErrorHandler<span class="op">&gt;&amp;</span> rhs<span class="op">)</span></span>
<span id="cb8-134"><a href="#cb8-134" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> <span class="kw">requires</span> <span class="op">{</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span>; <span class="op">}</span>;</span>
<span id="cb8-135"><a href="#cb8-135" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-136"><a href="#cb8-136" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb8-137"><a href="#cb8-137" aria-hidden="true" tabindex="-1"></a>    utf32_iter I,</span>
<span id="cb8-138"><a href="#cb8-138" aria-hidden="true" tabindex="-1"></a>    sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S <span class="op">=</span> I,</span>
<span id="cb8-139"><a href="#cb8-139" aria-hidden="true" tabindex="-1"></a>    transcoding_error_handler ErrorHandler <span class="op">=</span> use_replacement_character<span class="op">&gt;</span></span>
<span id="cb8-140"><a href="#cb8-140" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> utf_32_to_16_iterator</span>
<span id="cb8-141"><a href="#cb8-141" aria-hidden="true" tabindex="-1"></a>    <span class="op">:</span> iterator_interface<span class="op">&lt;</span>utf_32_to_16_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span>, <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>, <span class="dt">char16_t</span>, <span class="dt">char16_t</span><span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb8-142"><a href="#cb8-142" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_32_to_16_iterator<span class="op">()</span>;</span>
<span id="cb8-143"><a href="#cb8-143" aria-hidden="true" tabindex="-1"></a>    <span class="kw">explicit</span> <span class="kw">constexpr</span> utf_32_to_16_iterator<span class="op">(</span>I first, I it, S last<span class="op">)</span>;</span>
<span id="cb8-144"><a href="#cb8-144" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I2, <span class="kw">class</span> S2<span class="op">&gt;</span></span>
<span id="cb8-145"><a href="#cb8-145" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> convertible_to<span class="op">&lt;</span>I2, I<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> convertible_to<span class="op">&lt;</span>S2, S<span class="op">&gt;</span></span>
<span id="cb8-146"><a href="#cb8-146" aria-hidden="true" tabindex="-1"></a>        <span class="kw">constexpr</span> utf_32_to_16_iterator<span class="op">(</span></span>
<span id="cb8-147"><a href="#cb8-147" aria-hidden="true" tabindex="-1"></a>          <span class="kw">const</span> utf_32_to_16_iterator<span class="op">&lt;</span>I2, S2, ErrorHandler<span class="op">&gt;&amp;</span> other<span class="op">)</span>;</span>
<span id="cb8-148"><a href="#cb8-148" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-149"><a href="#cb8-149" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I begin<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> first_; <span class="op">}</span></span>
<span id="cb8-150"><a href="#cb8-150" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> S end<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> last_; <span class="op">}</span></span>
<span id="cb8-151"><a href="#cb8-151" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-152"><a href="#cb8-152" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">char16_t</span> <span class="kw">operator</span><span class="op">*()</span> <span class="kw">const</span></span>
<span id="cb8-153"><a href="#cb8-153" aria-hidden="true" tabindex="-1"></a>    <span class="op">{</span> <span class="cf">return</span> buf_<span class="op">[</span>index_<span class="op">]</span>; <span class="op">}</span></span>
<span id="cb8-154"><a href="#cb8-154" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-155"><a href="#cb8-155" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I base<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> it_; <span class="op">}</span></span>
<span id="cb8-156"><a href="#cb8-156" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-157"><a href="#cb8-157" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_32_to_16_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">++()</span>;</span>
<span id="cb8-158"><a href="#cb8-158" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_32_to_16_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">--()</span>;</span>
<span id="cb8-159"><a href="#cb8-159" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-160"><a href="#cb8-160" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I1, <span class="kw">class</span> S1, <span class="kw">class</span> I2, <span class="kw">class</span> S2, <span class="kw">class</span> ErrorHandler2<span class="op">&gt;</span></span>
<span id="cb8-161"><a href="#cb8-161" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span></span>
<span id="cb8-162"><a href="#cb8-162" aria-hidden="true" tabindex="-1"></a>      <span class="kw">const</span> utf_32_to_16_iterator<span class="op">&lt;</span>I1, S1, ErrorHandler2<span class="op">&gt;&amp;</span> lhs,</span>
<span id="cb8-163"><a href="#cb8-163" aria-hidden="true" tabindex="-1"></a>      <span class="kw">const</span> utf_32_to_16_iterator<span class="op">&lt;</span>I2, S2, ErrorHandler2<span class="op">&gt;&amp;</span> rhs<span class="op">)</span></span>
<span id="cb8-164"><a href="#cb8-164" aria-hidden="true" tabindex="-1"></a>        <span class="kw">requires</span> <span class="kw">requires</span> <span class="op">{</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span>; <span class="op">}</span></span>
<span id="cb8-165"><a href="#cb8-165" aria-hidden="true" tabindex="-1"></a>          <span class="op">{</span> <span class="cf">return</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span> <span class="op">&amp;&amp;</span> lhs<span class="op">.</span>index_ <span class="op">==</span> rhs<span class="op">.</span>index_; <span class="op">}</span></span>
<span id="cb8-166"><a href="#cb8-166" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-167"><a href="#cb8-167" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span>utf_32_to_16_iterator lhs, utf_32_to_16_iterator rhs<span class="op">)</span></span>
<span id="cb8-168"><a href="#cb8-168" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span> <span class="op">&amp;&amp;</span> lhs<span class="op">.</span>index_ <span class="op">==</span> rhs<span class="op">.</span>index_; <span class="op">}</span></span>
<span id="cb8-169"><a href="#cb8-169" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-170"><a href="#cb8-170" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em> <span class="op">=</span>         <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-171"><a href="#cb8-171" aria-hidden="true" tabindex="-1"></a>      iterator_interface<span class="op">&lt;</span>utf_32_to_16_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span>,</span>
<span id="cb8-172"><a href="#cb8-172" aria-hidden="true" tabindex="-1"></a>                         <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>,</span>
<span id="cb8-173"><a href="#cb8-173" aria-hidden="true" tabindex="-1"></a>                         <span class="dt">char16_t</span>,</span>
<span id="cb8-174"><a href="#cb8-174" aria-hidden="true" tabindex="-1"></a>                         <span class="dt">char16_t</span><span class="op">&gt;</span>;</span>
<span id="cb8-175"><a href="#cb8-175" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em><span class="op">::</span><span class="kw">operator</span><span class="op">++</span>;</span>
<span id="cb8-176"><a href="#cb8-176" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em><span class="op">::</span><span class="kw">operator</span><span class="op">--</span>;</span>
<span id="cb8-177"><a href="#cb8-177" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-178"><a href="#cb8-178" aria-hidden="true" tabindex="-1"></a>  <span class="kw">private</span><span class="op">:</span></span>
<span id="cb8-179"><a href="#cb8-179" aria-hidden="true" tabindex="-1"></a>    I first_;                 <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-180"><a href="#cb8-180" aria-hidden="true" tabindex="-1"></a>    I it_;                    <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-181"><a href="#cb8-181" aria-hidden="true" tabindex="-1"></a>    S last_;                  <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-182"><a href="#cb8-182" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span> index_;               <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-183"><a href="#cb8-183" aria-hidden="true" tabindex="-1"></a>    array<span class="op">&lt;</span><span class="dt">char16_t</span>, <span class="dv">4</span><span class="op">&gt;</span> buf_;  <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-184"><a href="#cb8-184" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-185"><a href="#cb8-185" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span>utf32_iter I2, sentinel_for<span class="op">&lt;</span>I2<span class="op">&gt;</span> S2, transcoding_error_handler ErrorHandler2<span class="op">&gt;</span></span>
<span id="cb8-186"><a href="#cb8-186" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">struct</span> utf_32_to_16_iterator;</span>
<span id="cb8-187"><a href="#cb8-187" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb8-188"><a href="#cb8-188" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-189"><a href="#cb8-189" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I, <span class="kw">class</span> S, <span class="kw">class</span> ErrorHandler<span class="op">&gt;</span></span>
<span id="cb8-190"><a href="#cb8-190" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span></span>
<span id="cb8-191"><a href="#cb8-191" aria-hidden="true" tabindex="-1"></a>    <span class="kw">const</span> utf_32_to_16_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;&amp;</span> lhs, S rhs<span class="op">)</span></span>
<span id="cb8-192"><a href="#cb8-192" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> <span class="kw">requires</span> <span class="op">{</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs; <span class="op">}</span>;</span>
<span id="cb8-193"><a href="#cb8-193" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-194"><a href="#cb8-194" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb8-195"><a href="#cb8-195" aria-hidden="true" tabindex="-1"></a>    utf16_iter I,</span>
<span id="cb8-196"><a href="#cb8-196" aria-hidden="true" tabindex="-1"></a>    sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S <span class="op">=</span> I,</span>
<span id="cb8-197"><a href="#cb8-197" aria-hidden="true" tabindex="-1"></a>    transcoding_error_handler ErrorHandler <span class="op">=</span> use_replacement_character<span class="op">&gt;</span></span>
<span id="cb8-198"><a href="#cb8-198" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> utf_16_to_32_iterator</span>
<span id="cb8-199"><a href="#cb8-199" aria-hidden="true" tabindex="-1"></a>    <span class="op">:</span> iterator_interface<span class="op">&lt;</span>utf_16_to_32_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span>, <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>, <span class="dt">char32_t</span>, <span class="dt">char32_t</span><span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb8-200"><a href="#cb8-200" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_16_to_32_iterator<span class="op">()</span>;</span>
<span id="cb8-201"><a href="#cb8-201" aria-hidden="true" tabindex="-1"></a>    <span class="kw">explicit</span> <span class="kw">constexpr</span> utf_16_to_32_iterator<span class="op">(</span>I first, I it, S last<span class="op">)</span>;</span>
<span id="cb8-202"><a href="#cb8-202" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I2, <span class="kw">class</span> S2<span class="op">&gt;</span></span>
<span id="cb8-203"><a href="#cb8-203" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> convertible_to<span class="op">&lt;</span>I2, I<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> convertible_to<span class="op">&lt;</span>S2, S<span class="op">&gt;</span></span>
<span id="cb8-204"><a href="#cb8-204" aria-hidden="true" tabindex="-1"></a>        <span class="kw">constexpr</span> utf_16_to_32_iterator<span class="op">(</span></span>
<span id="cb8-205"><a href="#cb8-205" aria-hidden="true" tabindex="-1"></a>          <span class="kw">const</span> utf_16_to_32_iterator<span class="op">&lt;</span>I2, S2, ErrorHandler<span class="op">&gt;&amp;</span> other<span class="op">)</span>;</span>
<span id="cb8-206"><a href="#cb8-206" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-207"><a href="#cb8-207" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I begin<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> first_; <span class="op">}</span></span>
<span id="cb8-208"><a href="#cb8-208" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> S end<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> last_; <span class="op">}</span></span>
<span id="cb8-209"><a href="#cb8-209" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-210"><a href="#cb8-210" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">char32_t</span> <span class="kw">operator</span><span class="op">*()</span> <span class="kw">const</span>;</span>
<span id="cb8-211"><a href="#cb8-211" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-212"><a href="#cb8-212" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I base<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> it_; <span class="op">}</span></span>
<span id="cb8-213"><a href="#cb8-213" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-214"><a href="#cb8-214" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_16_to_32_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">++()</span>;</span>
<span id="cb8-215"><a href="#cb8-215" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_16_to_32_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">--()</span>;</span>
<span id="cb8-216"><a href="#cb8-216" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-217"><a href="#cb8-217" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span>utf_16_to_32_iterator lhs, utf_16_to_32_iterator rhs<span class="op">)</span></span>
<span id="cb8-218"><a href="#cb8-218" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span>; <span class="op">}</span></span>
<span id="cb8-219"><a href="#cb8-219" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-220"><a href="#cb8-220" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em> <span class="op">=</span>         <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-221"><a href="#cb8-221" aria-hidden="true" tabindex="-1"></a>      iterator_interface<span class="op">&lt;</span>utf_16_to_32_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span>,</span>
<span id="cb8-222"><a href="#cb8-222" aria-hidden="true" tabindex="-1"></a>                         <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>,</span>
<span id="cb8-223"><a href="#cb8-223" aria-hidden="true" tabindex="-1"></a>                         <span class="dt">char32_t</span>,</span>
<span id="cb8-224"><a href="#cb8-224" aria-hidden="true" tabindex="-1"></a>                         <span class="dt">char32_t</span><span class="op">&gt;</span>;</span>
<span id="cb8-225"><a href="#cb8-225" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em><span class="op">::</span><span class="kw">operator</span><span class="op">++</span>;</span>
<span id="cb8-226"><a href="#cb8-226" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em><span class="op">::</span><span class="kw">operator</span><span class="op">--</span>;</span>
<span id="cb8-227"><a href="#cb8-227" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-228"><a href="#cb8-228" aria-hidden="true" tabindex="-1"></a>  <span class="kw">private</span><span class="op">:</span></span>
<span id="cb8-229"><a href="#cb8-229" aria-hidden="true" tabindex="-1"></a>    I first_;                 <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-230"><a href="#cb8-230" aria-hidden="true" tabindex="-1"></a>    I it_;                    <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-231"><a href="#cb8-231" aria-hidden="true" tabindex="-1"></a>    S last_;                  <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-232"><a href="#cb8-232" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-233"><a href="#cb8-233" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span>utf32_iter I2, sentinel_for<span class="op">&lt;</span>I2<span class="op">&gt;</span> S2, transcoding_error_handler ErrorHandler2<span class="op">&gt;</span></span>
<span id="cb8-234"><a href="#cb8-234" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">struct</span> utf_32_to_16_iterator;</span>
<span id="cb8-235"><a href="#cb8-235" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-236"><a href="#cb8-236" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span>utf16_iter I2, sentinel_for<span class="op">&lt;</span>I2<span class="op">&gt;</span> S2, transcoding_error_handler ErrorHandler2<span class="op">&gt;</span></span>
<span id="cb8-237"><a href="#cb8-237" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">struct</span> utf_16_to_32_iterator;</span>
<span id="cb8-238"><a href="#cb8-238" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb8-239"><a href="#cb8-239" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-240"><a href="#cb8-240" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I, <span class="kw">class</span> S, <span class="kw">class</span> ErrorHandler<span class="op">&gt;</span></span>
<span id="cb8-241"><a href="#cb8-241" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span></span>
<span id="cb8-242"><a href="#cb8-242" aria-hidden="true" tabindex="-1"></a>    <span class="kw">const</span> utf_16_to_32_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;&amp;</span> lhs, S rhs<span class="op">)</span></span>
<span id="cb8-243"><a href="#cb8-243" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> <span class="kw">requires</span> <span class="op">{</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs; <span class="op">}</span>;</span>
<span id="cb8-244"><a href="#cb8-244" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-245"><a href="#cb8-245" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb8-246"><a href="#cb8-246" aria-hidden="true" tabindex="-1"></a>    <span class="kw">class</span> I1, <span class="kw">class</span> S1,</span>
<span id="cb8-247"><a href="#cb8-247" aria-hidden="true" tabindex="-1"></a>    <span class="kw">class</span> I2, <span class="kw">class</span> S2,</span>
<span id="cb8-248"><a href="#cb8-248" aria-hidden="true" tabindex="-1"></a>    <span class="kw">class</span> ErrorHandler<span class="op">&gt;</span></span>
<span id="cb8-249"><a href="#cb8-249" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span></span>
<span id="cb8-250"><a href="#cb8-250" aria-hidden="true" tabindex="-1"></a>    <span class="kw">const</span> utf_16_to_32_iterator<span class="op">&lt;</span>I1, S1, ErrorHandler<span class="op">&gt;&amp;</span> lhs,</span>
<span id="cb8-251"><a href="#cb8-251" aria-hidden="true" tabindex="-1"></a>    <span class="kw">const</span> utf_16_to_32_iterator<span class="op">&lt;</span>I2, S2, ErrorHandler<span class="op">&gt;&amp;</span> rhs<span class="op">)</span></span>
<span id="cb8-252"><a href="#cb8-252" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> <span class="kw">requires</span> <span class="op">{</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span>; <span class="op">}</span>;</span>
<span id="cb8-253"><a href="#cb8-253" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-254"><a href="#cb8-254" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb8-255"><a href="#cb8-255" aria-hidden="true" tabindex="-1"></a>      utf16_iter I,</span>
<span id="cb8-256"><a href="#cb8-256" aria-hidden="true" tabindex="-1"></a>      sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S <span class="op">=</span> I,</span>
<span id="cb8-257"><a href="#cb8-257" aria-hidden="true" tabindex="-1"></a>      transcoding_error_handler ErrorHandler <span class="op">=</span> use_replacement_character<span class="op">&gt;</span></span>
<span id="cb8-258"><a href="#cb8-258" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> utf_16_to_8_iterator</span>
<span id="cb8-259"><a href="#cb8-259" aria-hidden="true" tabindex="-1"></a>    <span class="op">:</span> iterator_interface<span class="op">&lt;</span>utf_16_to_8_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span>, <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>, <span class="dt">char8_t</span>, <span class="dt">char8_t</span><span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb8-260"><a href="#cb8-260" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_16_to_8_iterator<span class="op">()</span>;</span>
<span id="cb8-261"><a href="#cb8-261" aria-hidden="true" tabindex="-1"></a>    <span class="kw">explicit</span> <span class="kw">constexpr</span> utf_16_to_8_iterator<span class="op">(</span>I first, I it, S last<span class="op">)</span>;</span>
<span id="cb8-262"><a href="#cb8-262" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I2, <span class="kw">class</span> S2<span class="op">&gt;</span></span>
<span id="cb8-263"><a href="#cb8-263" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> convertible_to<span class="op">&lt;</span>I2, I<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> convertible_to<span class="op">&lt;</span>S2, S<span class="op">&gt;</span></span>
<span id="cb8-264"><a href="#cb8-264" aria-hidden="true" tabindex="-1"></a>        <span class="kw">constexpr</span> utf_16_to_8_iterator<span class="op">(</span><span class="kw">const</span> utf_16_to_8_iterator<span class="op">&lt;</span>I2, S2, ErrorHandler<span class="op">&gt;&amp;</span> other<span class="op">)</span>;</span>
<span id="cb8-265"><a href="#cb8-265" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-266"><a href="#cb8-266" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I begin<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> first_; <span class="op">}</span></span>
<span id="cb8-267"><a href="#cb8-267" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> S end<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> last_; <span class="op">}</span></span>
<span id="cb8-268"><a href="#cb8-268" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-269"><a href="#cb8-269" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">char8_t</span> <span class="kw">operator</span><span class="op">*()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> buf_<span class="op">[</span>index_<span class="op">]</span>; <span class="op">}</span></span>
<span id="cb8-270"><a href="#cb8-270" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-271"><a href="#cb8-271" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I base<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> it_; <span class="op">}</span></span>
<span id="cb8-272"><a href="#cb8-272" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-273"><a href="#cb8-273" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_16_to_8_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">++()</span>;</span>
<span id="cb8-274"><a href="#cb8-274" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_16_to_8_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">--()</span>;</span>
<span id="cb8-275"><a href="#cb8-275" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-276"><a href="#cb8-276" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I1, <span class="kw">class</span> S1, <span class="kw">class</span> I2, <span class="kw">class</span> S2, <span class="kw">class</span> ErrorHandler2<span class="op">&gt;</span></span>
<span id="cb8-277"><a href="#cb8-277" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span></span>
<span id="cb8-278"><a href="#cb8-278" aria-hidden="true" tabindex="-1"></a>      <span class="kw">const</span> utf_16_to_8_iterator<span class="op">&lt;</span>I1, S1, ErrorHandler2<span class="op">&gt;&amp;</span> lhs,</span>
<span id="cb8-279"><a href="#cb8-279" aria-hidden="true" tabindex="-1"></a>      <span class="kw">const</span> utf_16_to_8_iterator<span class="op">&lt;</span>I2, S2, ErrorHandler2<span class="op">&gt;&amp;</span> rhs<span class="op">)</span></span>
<span id="cb8-280"><a href="#cb8-280" aria-hidden="true" tabindex="-1"></a>        <span class="kw">requires</span> <span class="kw">requires</span> <span class="op">{</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span>; <span class="op">}</span></span>
<span id="cb8-281"><a href="#cb8-281" aria-hidden="true" tabindex="-1"></a>          <span class="op">{</span> <span class="cf">return</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span> <span class="op">&amp;&amp;</span> lhs<span class="op">.</span>index_ <span class="op">==</span> rhs<span class="op">.</span>index_; <span class="op">}</span></span>
<span id="cb8-282"><a href="#cb8-282" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-283"><a href="#cb8-283" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span>utf_16_to_8_iterator lhs, utf_16_to_8_iterator rhs<span class="op">)</span></span>
<span id="cb8-284"><a href="#cb8-284" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span> <span class="op">&amp;&amp;</span> lhs<span class="op">.</span>index_ <span class="op">==</span> rhs<span class="op">.</span>index_; <span class="op">}</span></span>
<span id="cb8-285"><a href="#cb8-285" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-286"><a href="#cb8-286" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em> <span class="op">=</span>         <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-287"><a href="#cb8-287" aria-hidden="true" tabindex="-1"></a>      iterator_interface<span class="op">&lt;</span>utf_16_to_8_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span>,</span>
<span id="cb8-288"><a href="#cb8-288" aria-hidden="true" tabindex="-1"></a>                         <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>,</span>
<span id="cb8-289"><a href="#cb8-289" aria-hidden="true" tabindex="-1"></a>                         <span class="dt">char8_t</span>,</span>
<span id="cb8-290"><a href="#cb8-290" aria-hidden="true" tabindex="-1"></a>                         <span class="dt">char8_t</span><span class="op">&gt;</span>;</span>
<span id="cb8-291"><a href="#cb8-291" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em><span class="op">::</span><span class="kw">operator</span><span class="op">++</span>;</span>
<span id="cb8-292"><a href="#cb8-292" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em><span class="op">::</span><span class="kw">operator</span><span class="op">--</span>;</span>
<span id="cb8-293"><a href="#cb8-293" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-294"><a href="#cb8-294" aria-hidden="true" tabindex="-1"></a>  <span class="kw">private</span><span class="op">:</span></span>
<span id="cb8-295"><a href="#cb8-295" aria-hidden="true" tabindex="-1"></a>    I first_;                 <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-296"><a href="#cb8-296" aria-hidden="true" tabindex="-1"></a>    I it_;                    <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-297"><a href="#cb8-297" aria-hidden="true" tabindex="-1"></a>    S last_;                  <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-298"><a href="#cb8-298" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span> index_;               <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-299"><a href="#cb8-299" aria-hidden="true" tabindex="-1"></a>    array<span class="op">&lt;</span><span class="dt">char8_t</span>, <span class="dv">5</span><span class="op">&gt;</span> buf_;   <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-300"><a href="#cb8-300" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-301"><a href="#cb8-301" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span>utf16_iter I2, sentinel_for<span class="op">&lt;</span>I2<span class="op">&gt;</span> S2, transcoding_error_handler ErrorHandler2<span class="op">&gt;</span></span>
<span id="cb8-302"><a href="#cb8-302" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">struct</span> utf_16_to_8_iterator;</span>
<span id="cb8-303"><a href="#cb8-303" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb8-304"><a href="#cb8-304" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-305"><a href="#cb8-305" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I, <span class="kw">class</span> S, <span class="kw">class</span> ErrorHandler<span class="op">&gt;</span></span>
<span id="cb8-306"><a href="#cb8-306" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span></span>
<span id="cb8-307"><a href="#cb8-307" aria-hidden="true" tabindex="-1"></a>    <span class="kw">const</span> utf_16_to_8_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;&amp;</span> lhs, S rhs<span class="op">)</span></span>
<span id="cb8-308"><a href="#cb8-308" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> <span class="kw">requires</span> <span class="op">{</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs; <span class="op">}</span>;</span>
<span id="cb8-309"><a href="#cb8-309" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-310"><a href="#cb8-310" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I1, <span class="kw">class</span> S1, <span class="kw">class</span> I2, <span class="kw">class</span> S2, <span class="kw">class</span> ErrorHandler<span class="op">&gt;</span></span>
<span id="cb8-311"><a href="#cb8-311" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span></span>
<span id="cb8-312"><a href="#cb8-312" aria-hidden="true" tabindex="-1"></a>    <span class="kw">const</span> utf_16_to_8_iterator<span class="op">&lt;</span>I1, S1, ErrorHandler<span class="op">&gt;&amp;</span> lhs,</span>
<span id="cb8-313"><a href="#cb8-313" aria-hidden="true" tabindex="-1"></a>    <span class="kw">const</span> utf_16_to_8_iterator<span class="op">&lt;</span>I2, S2, ErrorHandler<span class="op">&gt;&amp;</span> rhs<span class="op">)</span></span>
<span id="cb8-314"><a href="#cb8-314" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> <span class="kw">requires</span> <span class="op">{</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span>; <span class="op">}</span>;</span>
<span id="cb8-315"><a href="#cb8-315" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-316"><a href="#cb8-316" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb8-317"><a href="#cb8-317" aria-hidden="true" tabindex="-1"></a>    utf8_iter I,</span>
<span id="cb8-318"><a href="#cb8-318" aria-hidden="true" tabindex="-1"></a>    sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S <span class="op">=</span> I,</span>
<span id="cb8-319"><a href="#cb8-319" aria-hidden="true" tabindex="-1"></a>    transcoding_error_handler ErrorHandler <span class="op">=</span> use_replacement_character<span class="op">&gt;</span></span>
<span id="cb8-320"><a href="#cb8-320" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> utf_8_to_16_iterator</span>
<span id="cb8-321"><a href="#cb8-321" aria-hidden="true" tabindex="-1"></a>    <span class="op">:</span> iterator_interface<span class="op">&lt;</span>utf_8_to_16_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span>, <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>, <span class="dt">char16_t</span>, <span class="dt">char16_t</span><span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb8-322"><a href="#cb8-322" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_8_to_16_iterator<span class="op">()</span>;</span>
<span id="cb8-323"><a href="#cb8-323" aria-hidden="true" tabindex="-1"></a>    <span class="kw">explicit</span> <span class="kw">constexpr</span> utf_8_to_16_iterator<span class="op">(</span>I first, I it, S last<span class="op">)</span>;</span>
<span id="cb8-324"><a href="#cb8-324" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I2, <span class="kw">class</span> S2<span class="op">&gt;</span></span>
<span id="cb8-325"><a href="#cb8-325" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> convertible_to<span class="op">&lt;</span>I2, I<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> convertible_to<span class="op">&lt;</span>S2, S<span class="op">&gt;</span></span>
<span id="cb8-326"><a href="#cb8-326" aria-hidden="true" tabindex="-1"></a>        <span class="kw">constexpr</span> utf_8_to_16_iterator<span class="op">(</span></span>
<span id="cb8-327"><a href="#cb8-327" aria-hidden="true" tabindex="-1"></a>          <span class="kw">const</span> utf_8_to_16_iterator<span class="op">&lt;</span>I2, S2, ErrorHandler<span class="op">&gt;&amp;</span> other<span class="op">)</span>;</span>
<span id="cb8-328"><a href="#cb8-328" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-329"><a href="#cb8-329" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I begin<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> it_<span class="op">.</span>begin<span class="op">()</span>; <span class="op">}</span></span>
<span id="cb8-330"><a href="#cb8-330" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> S end<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> it_<span class="op">.</span>end<span class="op">()</span>; <span class="op">}</span></span>
<span id="cb8-331"><a href="#cb8-331" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-332"><a href="#cb8-332" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">char16_t</span> <span class="kw">operator</span><span class="op">*()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> buf_<span class="op">[</span>index_<span class="op">]</span>; <span class="op">}</span></span>
<span id="cb8-333"><a href="#cb8-333" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-334"><a href="#cb8-334" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I base<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> it_<span class="op">.</span>base<span class="op">()</span>; <span class="op">}</span></span>
<span id="cb8-335"><a href="#cb8-335" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-336"><a href="#cb8-336" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_8_to_16_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">++()</span>;</span>
<span id="cb8-337"><a href="#cb8-337" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_8_to_16_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">--()</span>;</span>
<span id="cb8-338"><a href="#cb8-338" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-339"><a href="#cb8-339" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I1, <span class="kw">class</span> S1, <span class="kw">class</span> I2, <span class="kw">class</span> S2, <span class="kw">class</span> ErrorHandler2<span class="op">&gt;</span></span>
<span id="cb8-340"><a href="#cb8-340" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span></span>
<span id="cb8-341"><a href="#cb8-341" aria-hidden="true" tabindex="-1"></a>      <span class="kw">const</span> utf_8_to_16_iterator<span class="op">&lt;</span>I1, S1, ErrorHandler2<span class="op">&gt;&amp;</span> lhs,</span>
<span id="cb8-342"><a href="#cb8-342" aria-hidden="true" tabindex="-1"></a>      <span class="kw">const</span> utf_8_to_16_iterator<span class="op">&lt;</span>I2, S2, ErrorHandler2<span class="op">&gt;&amp;</span> rhs<span class="op">)</span></span>
<span id="cb8-343"><a href="#cb8-343" aria-hidden="true" tabindex="-1"></a>        <span class="kw">requires</span> <span class="kw">requires</span> <span class="op">{</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span>; <span class="op">}</span></span>
<span id="cb8-344"><a href="#cb8-344" aria-hidden="true" tabindex="-1"></a>          <span class="op">{</span> <span class="cf">return</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span> <span class="op">&amp;&amp;</span> lhs<span class="op">.</span>index_ <span class="op">==</span> rhs<span class="op">.</span>index_; <span class="op">}</span></span>
<span id="cb8-345"><a href="#cb8-345" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-346"><a href="#cb8-346" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span>utf_8_to_16_iterator lhs, utf_8_to_16_iterator rhs<span class="op">)</span></span>
<span id="cb8-347"><a href="#cb8-347" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>base<span class="op">()</span> <span class="op">&amp;&amp;</span> lhs<span class="op">.</span>index_ <span class="op">==</span> rhs<span class="op">.</span>index_; <span class="op">}</span></span>
<span id="cb8-348"><a href="#cb8-348" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-349"><a href="#cb8-349" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em> <span class="op">=</span>                <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-350"><a href="#cb8-350" aria-hidden="true" tabindex="-1"></a>      iterator_interface<span class="op">&lt;</span>utf_8_to_16_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span>,</span>
<span id="cb8-351"><a href="#cb8-351" aria-hidden="true" tabindex="-1"></a>                         <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>,</span>
<span id="cb8-352"><a href="#cb8-352" aria-hidden="true" tabindex="-1"></a>                         <span class="dt">char16_t</span>,</span>
<span id="cb8-353"><a href="#cb8-353" aria-hidden="true" tabindex="-1"></a>                         <span class="dt">char16_t</span><span class="op">&gt;</span>;</span>
<span id="cb8-354"><a href="#cb8-354" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em><span class="op">::</span><span class="kw">operator</span><span class="op">++</span>;</span>
<span id="cb8-355"><a href="#cb8-355" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>base-type</em><span class="op">::</span><span class="kw">operator</span><span class="op">--</span>;</span>
<span id="cb8-356"><a href="#cb8-356" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-357"><a href="#cb8-357" aria-hidden="true" tabindex="-1"></a>  <span class="kw">private</span><span class="op">:</span></span>
<span id="cb8-358"><a href="#cb8-358" aria-hidden="true" tabindex="-1"></a>    utf_8_to_32_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;</span> it_;  <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-359"><a href="#cb8-359" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span> index_;                                    <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-360"><a href="#cb8-360" aria-hidden="true" tabindex="-1"></a>    array<span class="op">&lt;</span><span class="dt">char16_t</span>, <span class="dv">4</span><span class="op">&gt;</span> buf_;                       <span class="co">// <em>exposition only</em></span></span>
<span id="cb8-361"><a href="#cb8-361" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-362"><a href="#cb8-362" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span>utf8_iter I2, sentinel_for<span class="op">&lt;</span>I2<span class="op">&gt;</span> S2, transcoding_error_handler ErrorHandler2<span class="op">&gt;</span></span>
<span id="cb8-363"><a href="#cb8-363" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">struct</span> utf_8_to_16_iterator;</span>
<span id="cb8-364"><a href="#cb8-364" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb8-365"><a href="#cb8-365" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-366"><a href="#cb8-366" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I, <span class="kw">class</span> S, <span class="kw">class</span> ErrorHandler<span class="op">&gt;</span></span>
<span id="cb8-367"><a href="#cb8-367" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span></span>
<span id="cb8-368"><a href="#cb8-368" aria-hidden="true" tabindex="-1"></a>    <span class="kw">const</span> utf_8_to_16_iterator<span class="op">&lt;</span>I, S, ErrorHandler<span class="op">&gt;&amp;</span> lhs, S rhs<span class="op">)</span></span>
<span id="cb8-369"><a href="#cb8-369" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> <span class="kw">requires</span> <span class="op">{</span> lhs<span class="op">.</span>base<span class="op">()</span> <span class="op">==</span> rhs; <span class="op">}</span>;</span>
<span id="cb8-370"><a href="#cb8-370" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<h2 data-number="5.6" id="add-a-transcoding-view"><span class="header-section-number">5.6</span> Add a transcoding view<a href="#add-a-transcoding-view" class="self-link"></a></h2>
<h3 data-number="5.6.1" id="add-the-view-proper"><span class="header-section-number">5.6.1</span> Add the view proper<a href="#add-the-view-proper" class="self-link"></a></h3>
<div class="sourceCode" id="cb9"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V<span class="op">&gt;</span></span>
<span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>utf-view-iter-t</em> <span class="op">=</span> <em>see below</em>;                    <span class="co">// <em>exposition only</em></span></span>
<span id="cb9-4"><a href="#cb9-4" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V<span class="op">&gt;</span></span>
<span id="cb9-5"><a href="#cb9-5" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>utf-view-sent-t</em> <span class="op">=</span> <em>see below</em>;                    <span class="co">// <em>exposition only</em></span></span>
<span id="cb9-6"><a href="#cb9-6" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>format Format, <span class="kw">class</span> Unpacked<span class="op">&gt;</span></span>
<span id="cb9-7"><a href="#cb9-7" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> <em>make-utf-view-iter</em><span class="op">(</span>Unpacked unpacked<span class="op">)</span>; <span class="co">// <em>exposition only</em></span></span>
<span id="cb9-8"><a href="#cb9-8" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>format Format, <span class="kw">class</span> Unpacked<span class="op">&gt;</span></span>
<span id="cb9-9"><a href="#cb9-9" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> <em>make-utf-view-sent</em><span class="op">(</span>Unpacked unpacked<span class="op">)</span>; <span class="co">// <em>exposition only</em></span></span>
<span id="cb9-10"><a href="#cb9-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-11"><a href="#cb9-11" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>format Format, utf_range_like V<span class="op">&gt;</span></span>
<span id="cb9-12"><a href="#cb9-12" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>view<span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">||</span> utf_pointer<span class="op">&lt;</span>V<span class="op">&gt;</span></span>
<span id="cb9-13"><a href="#cb9-13" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> utf_view <span class="op">:</span> ranges<span class="op">::</span>view_interface<span class="op">&lt;</span>utf_view<span class="op">&lt;</span>Format, V<span class="op">&gt;&gt;</span></span>
<span id="cb9-14"><a href="#cb9-14" aria-hidden="true" tabindex="-1"></a>  <span class="op">{</span></span>
<span id="cb9-15"><a href="#cb9-15" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> from_iterator <span class="op">=</span> <em>utf-view-iter-t</em><span class="op">&lt;</span>V<span class="op">&gt;</span>;</span>
<span id="cb9-16"><a href="#cb9-16" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> from_sentinel <span class="op">=</span> <em>utf-view-sent-t</em><span class="op">&lt;</span>V<span class="op">&gt;</span>;</span>
<span id="cb9-17"><a href="#cb9-17" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-18"><a href="#cb9-18" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> iterator <span class="op">=</span> <span class="kw">decltype</span><span class="op">(</span><em>make-utf-view-iter</em><span class="op">&lt;</span>Format<span class="op">&gt;(</span></span>
<span id="cb9-19"><a href="#cb9-19" aria-hidden="true" tabindex="-1"></a>      uc<span class="op">::</span>unpack_iterator_and_sentinel<span class="op">(</span>declval<span class="op">&lt;</span>from_iterator<span class="op">&gt;()</span>, declval<span class="op">&lt;</span>from_sentinel<span class="op">&gt;())))</span>;</span>
<span id="cb9-20"><a href="#cb9-20" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> sentinel <span class="op">=</span> <span class="kw">decltype</span><span class="op">(</span><em>make-utf-view-sent</em><span class="op">&lt;</span>Format<span class="op">&gt;(</span></span>
<span id="cb9-21"><a href="#cb9-21" aria-hidden="true" tabindex="-1"></a>      uc<span class="op">::</span>unpack_iterator_and_sentinel<span class="op">(</span>declval<span class="op">&lt;</span>from_iterator<span class="op">&gt;()</span>, declval<span class="op">&lt;</span>from_sentinel<span class="op">&gt;())))</span>;</span>
<span id="cb9-22"><a href="#cb9-22" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-23"><a href="#cb9-23" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_view<span class="op">()</span> <span class="op">{}</span></span>
<span id="cb9-24"><a href="#cb9-24" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_view<span class="op">(</span>V base<span class="op">)</span>;</span>
<span id="cb9-25"><a href="#cb9-25" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-26"><a href="#cb9-26" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> iterator begin<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> first_; <span class="op">}</span></span>
<span id="cb9-27"><a href="#cb9-27" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> sentinel end<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> last_; <span class="op">}</span></span>
<span id="cb9-28"><a href="#cb9-28" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-29"><a href="#cb9-29" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> ostream<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">&lt;&lt;(</span>ostream<span class="op">&amp;</span> os, utf_view v<span class="op">)</span>;</span>
<span id="cb9-30"><a href="#cb9-30" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> wostream<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">&lt;&lt;(</span>wostream<span class="op">&amp;</span> os, utf_view v<span class="op">)</span>;</span>
<span id="cb9-31"><a href="#cb9-31" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-32"><a href="#cb9-32" aria-hidden="true" tabindex="-1"></a>  <span class="kw">private</span><span class="op">:</span></span>
<span id="cb9-33"><a href="#cb9-33" aria-hidden="true" tabindex="-1"></a>    iterator first_;</span>
<span id="cb9-34"><a href="#cb9-34" aria-hidden="true" tabindex="-1"></a>    <span class="op">[[</span><span class="at">no_unique_address</span><span class="op">]]</span> sentinel last_;</span>
<span id="cb9-35"><a href="#cb9-35" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb9-36"><a href="#cb9-36" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb9-37"><a href="#cb9-37" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb9-38"><a href="#cb9-38" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>ranges <span class="op">{</span></span>
<span id="cb9-39"><a href="#cb9-39" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>uc<span class="op">::</span>format Format, <span class="kw">class</span> V<span class="op">&gt;</span></span>
<span id="cb9-40"><a href="#cb9-40" aria-hidden="true" tabindex="-1"></a>    <span class="kw">inline</span> <span class="kw">constexpr</span> <span class="dt">bool</span> enable_borrowed_range<span class="op">&lt;</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>Format, V<span class="op">&gt;&gt;</span> <span class="op">=</span> <span class="kw">true</span>;</span>
<span id="cb9-41"><a href="#cb9-41" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p><code class="sourceCode default"><em>utf-view-iter-t</em></code>
evaluates to <code class="sourceCode default">V</code> if
<code class="sourceCode default">V</code> is a pointer, and <code class="sourceCode default">decltype(std::ranges::begin(std::declval&lt;V&gt;()))</code>
otherwise.
<code class="sourceCode default"><em>utf-view-sent-t</em></code>
evaluates to <code class="sourceCode default">null_sentinel_t</code> if
<code class="sourceCode default">V</code> is a pointer, and <code class="sourceCode default">decltype(std::ranges::end(std::declval&lt;V&gt;()))</code>
otherwise.</p>
<p><code class="sourceCode default"><em>make-utf-view-iter</em></code>
makes a transcoding iterator that produces the UTF format
<code class="sourceCode default">Format</code> from the result of a call
to <code class="sourceCode default">std::uc::unpack_iterator_and_sentinel()</code>,
and similarly
<code class="sourceCode default"><em>make-utf-view-sent</em></code>
makes the matching sentinel from the result of a call to <code class="sourceCode default">std::uc::unpack_iterator_and_sentinel()</code>.</p>
<p>The <code class="sourceCode default">ostream</code> and
<code class="sourceCode default">wostream</code> stream operators
transcode the <code class="sourceCode default">utf_view</code> to UTF-8
and UTF-16 respectively (if transcoding is needed), and the
<code class="sourceCode default">wostream</code> overload is only
defined on Windows.</p>
<h3 data-number="5.6.2" id="add-as_utfn-view-adaptors"><span class="header-section-number">5.6.2</span> Add
<code class="sourceCode default">as_utfN</code> view adaptors<a href="#add-as_utfn-view-adaptors" class="self-link"></a></h3>
<p>Each <code class="sourceCode default">as_utfN</code> view adaptor
adapts a <code class="sourceCode default">utf_range_like</code> (meaning
a range or a null-terminated pointer), and returns either a
<code class="sourceCode default">utf_view</code> that transcodes (if the
input is not already UTF-N) or a view of the given input unchanged (if
the input is already UTF-N).</p>
<div class="sourceCode" id="cb10"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <em>unspecified</em> as_utf8;</span>
<span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <em>unspecified</em> as_utf16;</span>
<span id="cb10-4"><a href="#cb10-4" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <em>unspecified</em> as_utf32;</span>
<span id="cb10-5"><a href="#cb10-5" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>Here is some pseudo-wording for
<code class="sourceCode default">as_utfN</code> that hopefully
clarifies.</p>
<p>Let <code class="sourceCode default">E</code> be an expression, and
let <code class="sourceCode default">T</code> be <code class="sourceCode default">remove_cvref_t&lt;decltype((E))&gt;</code>.
The expression <code class="sourceCode default">as_utfN(E)</code> is
expression-equivalent to:</p>
<ul>
<li><p>If <code class="sourceCode default">T</code> is a specialization
of <code class="sourceCode default">empty_view</code>
([range.empty.view]), then
<code class="sourceCode default">decay-copy(E)</code>.</p></li>
<li><p>Otherwise, if
<code class="sourceCode default">is_pointer_v&lt;T&gt;</code> is
<code class="sourceCode default">true</code>, and
<code class="sourceCode default">T</code> models <code class="sourceCode default">code_unit_iter&lt;format::utfN&gt;</code>,
then <code class="sourceCode default">ranges::subrange(E, uc::null_sentinel)</code>.</p></li>
<li><p>Otherwise, if
<code class="sourceCode default">ranges::iterator_t&lt;T&gt;</code>
models <code class="sourceCode default">code_unit_iter&lt;format::utfN&gt;</code>,
then <code class="sourceCode default">decay-copy(E)</code>.</p></li>
<li><p>Otherwise, <code class="sourceCode default">utf_view&lt;format::utfN, T&gt;(E)</code>.</p></li>
</ul>
<p>Examples:</p>
<div class="sourceCode" id="cb11"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a>    <span class="kw">decltype</span><span class="op">(</span>std<span class="op">::</span>views<span class="op">::</span>all<span class="op">(</span><span class="st">u8&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf16, std<span class="op">::</span>ranges<span class="op">::</span>ref_view<span class="op">&lt;</span><span class="kw">const</span> <span class="dt">char8_t</span> <span class="op">[</span><span class="dv">5</span><span class="op">]&gt;&gt;&gt;)</span>;</span>
<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-5"><a href="#cb11-5" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>u8string str <span class="op">=</span> <span class="st">u8&quot;text&quot;</span>;</span>
<span id="cb11-6"><a href="#cb11-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-7"><a href="#cb11-7" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb11-8"><a href="#cb11-8" aria-hidden="true" tabindex="-1"></a>    <span class="kw">decltype</span><span class="op">(</span>std<span class="op">::</span>views<span class="op">::</span>all<span class="op">(</span>str<span class="op">)</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb11-9"><a href="#cb11-9" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf16, std<span class="op">::</span>ranges<span class="op">::</span>ref_view<span class="op">&lt;</span>std<span class="op">::</span>u8string<span class="op">&gt;&gt;&gt;)</span>;</span>
<span id="cb11-10"><a href="#cb11-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-11"><a href="#cb11-11" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb11-12"><a href="#cb11-12" aria-hidden="true" tabindex="-1"></a>    <span class="kw">decltype</span><span class="op">(</span>str<span class="op">.</span>c_str<span class="op">()</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb11-13"><a href="#cb11-13" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf16, <span class="kw">const</span> <span class="dt">char8_t</span> <span class="op">*&gt;&gt;)</span>;</span>
<span id="cb11-14"><a href="#cb11-14" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-15"><a href="#cb11-15" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb11-16"><a href="#cb11-16" aria-hidden="true" tabindex="-1"></a>    <span class="kw">decltype</span><span class="op">(</span>std<span class="op">::</span>ranges<span class="op">::</span>empty_view<span class="op">&lt;</span><span class="dt">int</span><span class="op">&gt;{}</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb11-17"><a href="#cb11-17" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>ranges<span class="op">::</span>empty_view<span class="op">&lt;</span><span class="dt">int</span><span class="op">&gt;&gt;)</span>;</span>
<span id="cb11-18"><a href="#cb11-18" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-19"><a href="#cb11-19" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>u16string str2 <span class="op">=</span> <span class="st">u&quot;text&quot;</span>;</span>
<span id="cb11-20"><a href="#cb11-20" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-21"><a href="#cb11-21" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb11-22"><a href="#cb11-22" aria-hidden="true" tabindex="-1"></a>    <span class="kw">decltype</span><span class="op">(</span>std<span class="op">::</span>views<span class="op">::</span>all<span class="op">(</span>str2<span class="op">)</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb11-23"><a href="#cb11-23" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>ranges<span class="op">::</span>ref_view<span class="op">&lt;</span>std<span class="op">::</span>u16string<span class="op">&gt;&gt;)</span>;</span>
<span id="cb11-24"><a href="#cb11-24" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-25"><a href="#cb11-25" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb11-26"><a href="#cb11-26" aria-hidden="true" tabindex="-1"></a>    <span class="kw">decltype</span><span class="op">(</span>str2<span class="op">.</span>c_str<span class="op">()</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb11-27"><a href="#cb11-27" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>ranges<span class="op">::</span>subrange<span class="op">&lt;</span><span class="kw">const</span> <span class="dt">char16_t</span> <span class="op">*</span>, std<span class="op">::</span>uc<span class="op">::</span>null_sentinel_t<span class="op">&gt;&gt;)</span>;</span></code></pre></div>
<h3 data-number="5.6.3" id="add-utf_view-specialization-of-formatter"><span class="header-section-number">5.6.3</span> Add
<code class="sourceCode default">utf_view</code> specialization of
<code class="sourceCode default">formatter</code><a href="#add-utf_view-specialization-of-formatter" class="self-link"></a></h3>
<p>These should be added to the list of “the debug-enabled string type
specializations” in [format.formatter.spec]. This allows
<code class="sourceCode default">utf_view</code> to be used in
<code class="sourceCode default">std::format()</code> and
<code class="sourceCode default">std::print()</code>. The intention is
that the formatter will transcode to UTF-8 if the formatter’s
<code class="sourceCode default">charT</code> is
<code class="sourceCode default">char</code>, or to UTF-16 if the
formatter’s <code class="sourceCode default">charT</code> is
<code class="sourceCode default">wchar_t</code> – if transcoding is
necessary at all.</p>
<div class="sourceCode" id="cb12"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>uc<span class="op">::</span>format Format, <span class="kw">class</span> V, <span class="kw">class</span> charT<span class="op">&gt;</span></span>
<span id="cb12-2"><a href="#cb12-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> formatter<span class="op">&lt;</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>Format, V<span class="op">&gt;</span>, charT<span class="op">&gt;</span>;</span></code></pre></div>
<h3 data-number="5.6.4" id="add-unpack_iterator_and_sentinel-cpo-for-iterator-unpacking"><span class="header-section-number">5.6.4</span> Add
<code class="sourceCode default">unpack_iterator_and_sentinel</code> CPO
for iterator “unpacking”<a href="#add-unpack_iterator_and_sentinel-cpo-for-iterator-unpacking" class="self-link"></a></h3>
<div class="sourceCode" id="cb13"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> no_op_repacker <span class="op">{</span></span>
<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a>    T <span class="kw">operator</span><span class="op">()(</span>T x<span class="op">)</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> x; <span class="op">}</span></span>
<span id="cb13-4"><a href="#cb13-4" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span>
<span id="cb13-5"><a href="#cb13-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb13-6"><a href="#cb13-6" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> RepackedIterator, <span class="kw">class</span> I, <span class="kw">class</span> S, <span class="kw">class</span> Then<span class="op">&gt;</span></span>
<span id="cb13-7"><a href="#cb13-7" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> repacker <span class="op">{</span></span>
<span id="cb13-8"><a href="#cb13-8" aria-hidden="true" tabindex="-1"></a>  <span class="kw">auto</span> <span class="kw">operator</span><span class="op">()(</span>I it<span class="op">)</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> then<span class="op">(</span>RepackedIterator<span class="op">(</span>first, it, last<span class="op">))</span>; <span class="op">}</span></span>
<span id="cb13-9"><a href="#cb13-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb13-10"><a href="#cb13-10" aria-hidden="true" tabindex="-1"></a>  I first;</span>
<span id="cb13-11"><a href="#cb13-11" aria-hidden="true" tabindex="-1"></a>  <span class="op">[[</span><span class="at">no_unique_address</span><span class="op">]]</span> S last;</span>
<span id="cb13-12"><a href="#cb13-12" aria-hidden="true" tabindex="-1"></a>  <span class="op">[[</span><span class="at">no_unique_address</span><span class="op">]]</span> Then then;</span>
<span id="cb13-13"><a href="#cb13-13" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span>
<span id="cb13-14"><a href="#cb13-14" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb13-15"><a href="#cb13-15" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>format FormatTag, utf_iter I, sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S, <span class="kw">class</span> Repack<span class="op">&gt;</span></span>
<span id="cb13-16"><a href="#cb13-16" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> utf_tagged_range <span class="op">{</span></span>
<span id="cb13-17"><a href="#cb13-17" aria-hidden="true" tabindex="-1"></a>  <span class="kw">static</span> <span class="kw">constexpr</span> format format_tag <span class="op">=</span> FormatTag;</span>
<span id="cb13-18"><a href="#cb13-18" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb13-19"><a href="#cb13-19" aria-hidden="true" tabindex="-1"></a>  I first;</span>
<span id="cb13-20"><a href="#cb13-20" aria-hidden="true" tabindex="-1"></a>  <span class="op">[[</span><span class="at">no_unique_address</span><span class="op">]]</span> S last;</span>
<span id="cb13-21"><a href="#cb13-21" aria-hidden="true" tabindex="-1"></a>  <span class="op">[[</span><span class="at">no_unique_address</span><span class="op">]]</span> Repack repack;</span>
<span id="cb13-22"><a href="#cb13-22" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span>
<span id="cb13-23"><a href="#cb13-23" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb13-24"><a href="#cb13-24" aria-hidden="true" tabindex="-1"></a><span class="co">// CPO equivalent to:</span></span>
<span id="cb13-25"><a href="#cb13-25" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>utf_iter I, sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S, <span class="kw">class</span> Repack <span class="op">=</span> no_op_repacker<span class="op">&gt;</span></span>
<span id="cb13-26"><a href="#cb13-26" aria-hidden="true" tabindex="-1"></a><span class="kw">constexpr</span> <span class="kw">auto</span> unpack_iterator_and_sentinel<span class="op">(</span>I first, S last, Repack repack <span class="op">=</span> Repack<span class="op">())</span>;</span></code></pre></div>
<p>A simple way to represent a transcoding view is as a pair of
transcoding iterators. However, there is a problem with that approach,
since a <code class="sourceCode default">utf_view&lt;format::utf32, utf_8_to_32_iterator&lt;char const *&gt;&gt;</code>
would be a range the size of 6 pointers. Worse yet, a <code class="sourceCode default">utf_view&lt;format::utf32, utf_16_to_32_iterator&lt;utf_8_to_16_iterator&lt;char const *&gt;&gt;&gt;</code>
would be the size of 18 pointers! Further, such a view would do a UTF-8
to UTF-16 to UTF-32 conversion, when it could have done a direct UTF-8
to UTF-32 conversion instead.</p>
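<p>The pointer counts above follow from each transcoding iterator holding three underlying iterators (the start of the sequence, the current position, and the end). The sketch below reproduces that arithmetic with toy types; <code class="sourceCode default">toy_transcoding_iterator</code> and <code class="sourceCode default">naive_view</code> are assumptions made for illustration, and the exact sizes assume an ABI that adds no padding between same-typed members.</p>

```cpp
#include <cstddef>

// Toy stand-in for a transcoding iterator: it keeps the start of the
// underlying sequence, the current position, and the end.
template<class I>
struct toy_transcoding_iterator {
    I first;
    I it;
    I last;
};

// A view represented naively as a begin/end pair of iterators.
template<class I>
struct naive_view {
    I first;
    I last;
};

using ptr = const char*;
using one_level = toy_transcoding_iterator<ptr>;        // 3 pointers
using two_level = toy_transcoding_iterator<one_level>;  // 9 pointers

// A pair of single-level transcoding iterators: 2 * 3 = 6 pointers.
static_assert(sizeof(naive_view<one_level>) == 6 * sizeof(ptr));
// A pair of doubly nested transcoding iterators: 2 * 9 = 18 pointers.
static_assert(sizeof(naive_view<two_level>) == 18 * sizeof(ptr));
// Storing only the bottom-most pointers costs just 2 pointers.
static_assert(sizeof(naive_view<ptr>) == 2 * sizeof(ptr));
```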
<p>To solve these kinds of problems,
<code class="sourceCode default">utf_view</code> unpacks the iterators
it is given in the view it adapts, so that only the bottom-most
underlying pointer or iterator is stored:</p>
<div class="sourceCode" id="cb14"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>string str <span class="op">=</span> <span class="st">&quot;some text&quot;</span>;</span>
<span id="cb14-2"><a href="#cb14-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb14-3"><a href="#cb14-3" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> to_16_first <span class="op">=</span> std<span class="op">::</span>uc<span class="op">::</span>utf_8_to_16_iterator<span class="op">&lt;</span>std<span class="op">::</span>string<span class="op">::</span>iterator<span class="op">&gt;(</span></span>
<span id="cb14-4"><a href="#cb14-4" aria-hidden="true" tabindex="-1"></a>    str<span class="op">.</span>begin<span class="op">()</span>, str<span class="op">.</span>begin<span class="op">()</span>, str<span class="op">.</span>end<span class="op">())</span>;</span>
<span id="cb14-5"><a href="#cb14-5" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> to_16_last <span class="op">=</span> std<span class="op">::</span>uc<span class="op">::</span>utf_8_to_16_iterator<span class="op">&lt;</span>std<span class="op">::</span>string<span class="op">::</span>iterator<span class="op">&gt;(</span></span>
<span id="cb14-6"><a href="#cb14-6" aria-hidden="true" tabindex="-1"></a>    str<span class="op">.</span>begin<span class="op">()</span>, str<span class="op">.</span>end<span class="op">()</span>, str<span class="op">.</span>end<span class="op">())</span>;</span>
<span id="cb14-7"><a href="#cb14-7" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb14-8"><a href="#cb14-8" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> to_32_first <span class="op">=</span> std<span class="op">::</span>uc<span class="op">::</span>utf_16_to_32_iterator<span class="op">&lt;</span></span>
<span id="cb14-9"><a href="#cb14-9" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>uc<span class="op">::</span>utf_8_to_16_iterator<span class="op">&lt;</span>std<span class="op">::</span>string<span class="op">::</span>iterator<span class="op">&gt;</span></span>
<span id="cb14-10"><a href="#cb14-10" aria-hidden="true" tabindex="-1"></a><span class="op">&gt;(</span>to_16_first, to_16_first, to_16_last<span class="op">)</span>;</span>
<span id="cb14-11"><a href="#cb14-11" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> to_32_last <span class="op">=</span> std<span class="op">::</span>uc<span class="op">::</span>utf_16_to_32_iterator<span class="op">&lt;</span></span>
<span id="cb14-12"><a href="#cb14-12" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>uc<span class="op">::</span>utf_8_to_16_iterator<span class="op">&lt;</span>std<span class="op">::</span>string<span class="op">::</span>iterator<span class="op">&gt;</span></span>
<span id="cb14-13"><a href="#cb14-13" aria-hidden="true" tabindex="-1"></a><span class="op">&gt;(</span>to_16_first, to_16_last, to_16_last<span class="op">)</span>;</span>
<span id="cb14-14"><a href="#cb14-14" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb14-15"><a href="#cb14-15" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> range <span class="op">=</span> std<span class="op">::</span>ranges<span class="op">::</span>subrange<span class="op">(</span>to_32_first, to_32_last<span class="op">)</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf8;</span>
<span id="cb14-16"><a href="#cb14-16" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb14-17"><a href="#cb14-17" aria-hidden="true" tabindex="-1"></a><span class="co">// Poof!  The utf_16_to_32_iterators disappeared!</span></span>
<span id="cb14-18"><a href="#cb14-18" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same<span class="op">&lt;</span>std<span class="op">::</span>ranges<span class="op">::</span>iterator_t<span class="op">&lt;</span><span class="kw">decltype</span><span class="op">(</span>range<span class="op">)&gt;</span>, std<span class="op">::</span>string<span class="op">::</span>iterator<span class="op">&gt;::</span>value, <span class="st">&quot;&quot;</span><span class="op">)</span>;</span></code></pre></div>
<p>Each of these views stores only a single iterator and sentinel, so
each view is typically the size of two pointers, and possibly smaller if
a sentinel is used.</p>
<p>The same unpacking logic is used in the entire proposed API. This
allows you to write
<code class="sourceCode default">r | std::uc::as_utf32</code> in a
generic context, without caring whether
<code class="sourceCode default">r</code> is a range of UTF-8, UTF-16,
or UTF-32. You do not need to care about whether
<code class="sourceCode default">r</code> is a common range or not. You
can also ignore whether <code class="sourceCode default">r</code> is
composed of raw pointers, some other kind of iterator, or transcoding
iterators. For example, if
<code class="sourceCode default">r.begin()</code> is a
<code class="sourceCode default">utf_32_to_8_iterator</code>, the
resulting view will use
<code class="sourceCode default">r.begin().base()</code> for its
begin-iterator.</p>
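<p>The idea of reaching the bottom-most iterator through <code class="sourceCode default">base()</code> can be sketched as a small recursive function. This is a simplified model, not the proposed <code class="sourceCode default">unpack_iterator_and_sentinel</code>: the names <code class="sourceCode default">toy_transcoding_iterator</code>, <code class="sourceCode default">is_toy_transcoding_iterator</code>, and <code class="sourceCode default">unpack</code> are hypothetical, and the sketch ignores sentinels, format tags, and repacking.</p>

```cpp
#include <type_traits>
#include <utility>

// Toy stand-in for a transcoding iterator that exposes its current
// underlying position through base().
template<class I>
struct toy_transcoding_iterator {
    I first, it, last;
    constexpr I base() const { return it; }
};

// Trait that recognizes the toy wrapper.
template<class T>
inline constexpr bool is_toy_transcoding_iterator = false;
template<class I>
inline constexpr bool is_toy_transcoding_iterator<toy_transcoding_iterator<I>> = true;

// Recursively peels off wrapper layers until a non-wrapper iterator is left.
template<class I>
constexpr auto unpack(I it) {
    if constexpr (is_toy_transcoding_iterator<I>) {
        return unpack(it.base());
    } else {
        return it;
    }
}

using inner = toy_transcoding_iterator<const char*>;
using outer = toy_transcoding_iterator<inner>;

// Two layers of wrapping unpack to the raw pointer type.
static_assert(std::is_same_v<decltype(unpack(std::declval<outer>())), const char*>);

// The unpacked value is the position the innermost wrapper refers to.
constexpr bool unpack_reaches_pointer() {
    const char* s = "some text";
    inner first{s, s, s + 9};
    inner last{s, s + 9, s + 9};
    outer o{first, first, last};
    return unpack(o) == s;
}
static_assert(unpack_reaches_pointer());
```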
<p>Sometimes, an interface might accept any UTF-N iterator, and then
transcode internally to UTF-32:</p>
<div class="sourceCode" id="cb15"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>input_iterator I, sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S, output_iterator<span class="op">&lt;</span><span class="dt">char32_t</span><span class="op">&gt;</span> O<span class="op">&gt;</span></span>
<span id="cb15-2"><a href="#cb15-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">requires</span><span class="op">(</span>utf8_code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;&gt;</span> <span class="op">||</span> utf16_code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;&gt;)</span></span>
<span id="cb15-3"><a href="#cb15-3" aria-hidden="true" tabindex="-1"></a>transcode_result<span class="op">&lt;</span>I, O<span class="op">&gt;</span> transcode_to_utf32<span class="op">(</span>I first, S last, O out<span class="op">)</span>;</span></code></pre></div>
<p>For such interfaces, it can be difficult in the general case to form
an iterator of type <code class="sourceCode default">I</code> to return
to the user:</p>
<div class="sourceCode" id="cb16"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>input_iterator I, sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S, output_iterator<span class="op">&lt;</span><span class="dt">char32_t</span><span class="op">&gt;</span> O<span class="op">&gt;</span></span>
<span id="cb16-2"><a href="#cb16-2" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span><span class="op">(</span>utf8_code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;&gt;</span> <span class="op">||</span> utf16_code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;&gt;)</span></span>
<span id="cb16-3"><a href="#cb16-3" aria-hidden="true" tabindex="-1"></a>transcode_result<span class="op">&lt;</span>I, O<span class="op">&gt;</span> transcode_to_utf32<span class="op">(</span>I first, S last, O out<span class="op">)</span> <span class="op">{</span></span>
<span id="cb16-4"><a href="#cb16-4" aria-hidden="true" tabindex="-1"></a>    <span class="co">// Get the input as UTF-32.</span></span>
<span id="cb16-5"><a href="#cb16-5" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> r <span class="op">=</span> uc<span class="op">::</span>utf_view<span class="op">(</span>uc<span class="op">::</span>format<span class="op">::</span>utf32, first, last<span class="op">)</span>;</span>
<span id="cb16-6"><a href="#cb16-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb16-7"><a href="#cb16-7" aria-hidden="true" tabindex="-1"></a>    <span class="co">// Do transcoding.</span></span>
<span id="cb16-8"><a href="#cb16-8" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> copy_result <span class="op">=</span> ranges<span class="op">::</span>copy<span class="op">(</span>r, out<span class="op">)</span>;</span>
<span id="cb16-9"><a href="#cb16-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb16-10"><a href="#cb16-10" aria-hidden="true" tabindex="-1"></a>    <span class="co">// Return an in_out_result.</span></span>
<span id="cb16-11"><a href="#cb16-11" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> transcode_result<span class="op">&lt;</span>I, O<span class="op">&gt;{</span><span class="co">/* ??? */</span>, copy_result<span class="op">.</span>out<span class="op">}</span>;</span>
<span id="cb16-12"><a href="#cb16-12" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>What should we write for
<code class="sourceCode default">/* ??? */</code>? That is, how do we
get back from the UTF-32 iterator
<code class="sourceCode default">r.begin()</code> to an
<code class="sourceCode default">I</code> iterator? It’s harder than it
first seems; consider the case where
<code class="sourceCode default">I</code> is <code class="sourceCode default">std::uc::utf_16_to_32_iterator&lt;std::uc::utf_8_to_16_iterator&lt;std::string::iterator&gt;&gt;</code>.
The solution is for the unpacking algorithm to remember the structure of
whatever iterator it unpacks, and then rebuild the structure when
returning the result. To demonstrate, here is the implementation of
<code class="sourceCode default">transcode_to_utf32</code> from
Boost.Text:</p>
<div class="sourceCode" id="cb17"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>std<span class="op">::</span>input_iterator I, std<span class="op">::</span>sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S, std<span class="op">::</span>output_iterator<span class="op">&lt;</span><span class="dt">char32_t</span><span class="op">&gt;</span> O<span class="op">&gt;</span></span>
<span id="cb17-2"><a href="#cb17-2" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span><span class="op">(</span>utf8_code_unit<span class="op">&lt;</span>std<span class="op">::</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;&gt;</span> <span class="op">||</span> utf16_code_unit<span class="op">&lt;</span>std<span class="op">::</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;&gt;)</span></span>
<span id="cb17-3"><a href="#cb17-3" aria-hidden="true" tabindex="-1"></a>transcode_result<span class="op">&lt;</span>I, O<span class="op">&gt;</span> transcode_to_utf32<span class="op">(</span>I first, S last, O out<span class="op">)</span></span>
<span id="cb17-4"><a href="#cb17-4" aria-hidden="true" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb17-5"><a href="#cb17-5" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> <span class="kw">const</span> r <span class="op">=</span> boost<span class="op">::</span>text<span class="op">::</span>unpack_iterator_and_sentinel<span class="op">(</span>first, last<span class="op">)</span>;</span>
<span id="cb17-6"><a href="#cb17-6" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> unpacked <span class="op">=</span> detail<span class="op">::</span>transcode_to_32<span class="op">&lt;</span><span class="kw">false</span><span class="op">&gt;(</span></span>
<span id="cb17-7"><a href="#cb17-7" aria-hidden="true" tabindex="-1"></a>        detail<span class="op">::</span>tag_t<span class="op">&lt;</span>r<span class="op">.</span>format_tag<span class="op">&gt;</span>, r<span class="op">.</span>first, r<span class="op">.</span>last, <span class="op">-</span><span class="dv">1</span>, out<span class="op">)</span>;</span>
<span id="cb17-8"><a href="#cb17-8" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> <span class="op">{</span>r<span class="op">.</span>repack<span class="op">(</span>unpacked<span class="op">.</span>in<span class="op">)</span>, unpacked<span class="op">.</span>out<span class="op">}</span>;</span>
<span id="cb17-9"><a href="#cb17-9" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>If this all sounds way too complicated, it’s not that bad at all.
Here’s the unpacking/repacking implementation from Boost.Text: <a href="https://github.com/tzlaine/text/blob/develop/include/boost/text/unpack.hpp">unpack.hpp</a>.</p>
<p><code class="sourceCode default">unpack_iterator_and_sentinel</code>
is a CPO. It is intended to work with UDTs that provide their own
unpacking implementation. It returns a
<code class="sourceCode default">utf_tagged_range</code>.</p>
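<p>To make the unpack/repack idea concrete, here is a minimal sketch. It
assumes a hypothetical single-iterator
<code class="sourceCode default">adaptor</code> template standing in for
the real transcoding iterators; the names
<code class="sourceCode default">adaptor</code>,
<code class="sourceCode default">unpack</code>, and
<code class="sourceCode default">repack</code> are illustrative only and
are not the Boost.Text or proposed API. The real
<code class="sourceCode default">unpack_iterator_and_sentinel</code> also
tracks the UTF format and the bounds of the range, which this sketch
omits.</p>

```cpp
#include <string>
#include <type_traits>
#include <utility>

// Hypothetical adaptor standing in for a transcoding iterator.
// Only base() matters for this sketch.
template<class I>
struct adaptor {
    I it;
    I base() const { return it; }
};

template<class T> struct is_adaptor : std::false_type {};
template<class I> struct is_adaptor<adaptor<I>> : std::true_type {};

// Strip adaptor layers until a non-adaptor iterator is reached.
template<class I>
auto unpack(I it) {
    if constexpr (is_adaptor<I>::value)
        return unpack(it.base());
    else
        return it;
}

// Rebuild the layers of Original around an innermost iterator.
// The "remembered structure" here is simply the type Original.
template<class Original, class Inner>
Original repack(Inner inner) {
    if constexpr (is_adaptor<Original>::value) {
        using base_t = decltype(std::declval<Original>().base());
        return Original{repack<base_t>(inner)};
    } else {
        return inner;
    }
}
```

<p>An algorithm can then advance the unpacked iterator directly and call
<code class="sourceCode default">repack&lt;I&gt;(it)</code> to hand the
caller an iterator of the original type
<code class="sourceCode default">I</code>.</p>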
<h2 data-number="5.7" id="add-a-feature-test-macro"><span class="header-section-number">5.7</span> Add a feature test macro<a href="#add-a-feature-test-macro" class="self-link"></a></h2>
<p>Add the feature test macro
<code class="sourceCode default">__cpp_lib_unicode_transcoding</code>.</p>
<h2 data-number="5.8" id="design-notes"><span class="header-section-number">5.8</span> Design notes<a href="#design-notes" class="self-link"></a></h2>
<p>None of the proposed interfaces is subject to change in future
versions of Unicode; each relates to the guaranteed-stable subset. Just
sayin’.</p>
<p>None of the proposed interfaces allocates.</p>
<p>The proposed interfaces allow users to choose amongst multiple
convenience-vs-compatibility tradeoffs. Explicitly, they are:</p>
<ul>
<li>If you need compatibility with existing iterator-based algorithms
(such as the standard algorithms), use the transcoding iterators.</li>
<li>If you want streamability or the convenience of constructing ranges
with a single use of a <code class="sourceCode default">| as_utfN</code>
adaptor, use the transcoding views.</li>
</ul>
<p>All the transcoding iterators allow you access to the underlying
iterator via <code class="sourceCode default">.base()</code>, following
the convention of the iterator adaptors already in the standard.</p>
<p>The transcoding views are lazy, as you’d expect. They also compose
with the standard view adaptors, so producing at most 10 UTF-16 code
units from input in any UTF can be done with <code class="sourceCode default">foo | std::uc::as_utf16 | std::ranges::views::take(10)</code>.</p>
<p>Error handling is explicitly configurable in the transcoding
iterators. This gives complete control to those who want to do something
other than the default. The default, according to Unicode, is to produce
a replacement character (<code class="sourceCode default">0xfffd</code>)
in the output when broken UTF encoding is seen in the input. This is
what all these interfaces do, unless you configure one of the iterators
as mentioned above.</p>
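<p>The following sketch illustrates what a configurable error handler
with a replacement-character default could look like. It is a standalone
eager UTF-8 to UTF-32 function, not the proposed iterator interface; the
names <code class="sourceCode default">use_replacement</code>,
<code class="sourceCode default">throw_on_error</code>, and
<code class="sourceCode default">to_utf32</code> are assumptions made for
illustration. For some malformed runs it may emit a different number of
replacement characters than the practice recommended by Unicode.</p>

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>
#include <string_view>

// Default policy (hypothetical name): substitute U+FFFD.
struct use_replacement {
    char32_t operator()(std::string_view /*what*/) const { return 0xFFFD; }
};

// Alternative policy (hypothetical name): report the error by throwing.
struct throw_on_error {
    char32_t operator()(std::string_view what) const {
        throw std::runtime_error(std::string(what));
    }
};

template<class ErrorHandler = use_replacement>
std::u32string to_utf32(std::string_view s, ErrorHandler on_error = {}) {
    std::u32string out;
    std::size_t i = 0;
    auto byte = [&](std::size_t k) { return static_cast<unsigned char>(s[k]); };
    while (i < s.size()) {
        unsigned char b0 = byte(i++);
        int extra = 0;           // continuation code units expected
        char32_t cp = 0, min = 0; // min rejects overlong encodings
        if (b0 < 0x80) { out.push_back(b0); continue; }
        else if ((b0 & 0xE0) == 0xC0) { extra = 1; cp = b0 & 0x1F; min = 0x80; }
        else if ((b0 & 0xF0) == 0xE0) { extra = 2; cp = b0 & 0x0F; min = 0x800; }
        else if ((b0 & 0xF8) == 0xF0) { extra = 3; cp = b0 & 0x07; min = 0x10000; }
        else { out.push_back(on_error("invalid lead code unit")); continue; }

        bool ok = true;
        for (; extra > 0; --extra) {
            // Never read past the end of the input.
            if (i == s.size() || (byte(i) & 0xC0) != 0x80) { ok = false; break; }
            cp = (cp << 6) | (byte(i++) & 0x3F);
        }
        bool scalar = cp >= min && cp <= 0x10FFFF && !(cp >= 0xD800 && cp <= 0xDFFF);
        out.push_back(ok && scalar ? cp : on_error("malformed sequence"));
    }
    return out;
}
```

<p>With the default handler, a truncated input such as
<code class="sourceCode default">&quot;a\xE0&quot;</code> yields
<code class="sourceCode default">U&quot;a\uFFFD&quot;</code>; passing
<code class="sourceCode default">throw_on_error{}</code> turns the same
condition into an exception.</p>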
<p>The production of replacement characters as an error-handling strategy
is good for memory compactness and safety. It allows us to store all our
text as UTF-8 (or, less compactly, as UTF-16), and then process code
points as transcoding views. If an error occurs, the transcoding views
will simply produce a replacement character; there is no danger of
UB.</p>
<p>Code units are just numbers. All of these interfaces treat integral
types as code units of various sizes (at least the ones that are 8-,
16-, or 32-bit). Signedness is ignored.</p>
<p>A null-terminated pointer <code class="sourceCode default">p</code>
to an 8-, 16-, or 32-bit string of code units is considered the implicit
range <code class="sourceCode default">[p, null_sentinel)</code>. This
makes user code much more natural;
<code class="sourceCode default">&quot;foo&quot; | as_utf16</code>,
<code class="sourceCode default">&quot;foo&quot;sv | as_utf16</code>,
and <code class="sourceCode default">&quot;foo&quot;s | as_utf16</code>
are roughly equivalent (though the iterator type of the resulting view
may differ).</p>
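<p>A minimal sketch of how a null sentinel can turn a bare pointer into a
range follows. The type
<code class="sourceCode default">null_sentinel_t</code> below is a
simplified stand-in written for this example, and
<code class="sourceCode default">length</code> is a hypothetical helper;
neither is the proposed definition.</p>

```cpp
#include <cstddef>

// Simplified stand-in: compares equal to a pointer whose pointee is zero.
struct null_sentinel_t {
    template<class T>
    friend constexpr bool operator==(const T* p, null_sentinel_t) {
        return *p == T{};
    }
    template<class T>
    friend constexpr bool operator!=(const T* p, null_sentinel_t) {
        return *p != T{};
    }
};
inline constexpr null_sentinel_t null_sentinel{};

// Hypothetical helper: counts the code units in [p, null_sentinel).
template<class T>
constexpr std::size_t length(const T* p) {
    std::size_t n = 0;
    for (; p != null_sentinel; ++p)
        ++n;
    return n;
}
```

<p>The loop never needs to know the length up front; it stops at the
terminator, so <code class="sourceCode default">length(&quot;foo&quot;)</code>
is 3 for any of the 8-, 16-, or 32-bit code unit types.</p>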
<p>Iterators are constructed from more than one underlying iterator. To
do iteration in many text-handling contexts, you need to know the
beginning and the end of the range you are iterating over, just to be
able to do iteration correctly. Note that this is not a safety issue,
but a correctness one. For example, say we have a string
<code class="sourceCode default">s</code> of UTF-8 code units that we
would like to iterate over to produce UTF-32 code points. If the last
code unit in <code class="sourceCode default">s</code> is
<code class="sourceCode default">0xe0</code>, we should expect two more
code units to follow. They are not present, though, because
<code class="sourceCode default">0xe0</code> is the last code unit. Now
consider how you would implement
<code class="sourceCode default">operator++()</code> for an iterator
<code class="sourceCode default">iter</code> that transcodes from UTF-8
to UTF-32. If you advance far enough to get the next UTF-32 code point
in each call to <code class="sourceCode default">operator++()</code>,
you may run off the end of <code class="sourceCode default">s</code>
when you find <code class="sourceCode default">0xe0</code> and try to
read two more code units. Note that it does not matter that
<code class="sourceCode default">iter</code> probably comes from a range
with an end-iterator or sentinel as its mate; inside
<code class="sourceCode default">iter</code>’s
<code class="sourceCode default">operator++()</code> this is no help.
<code class="sourceCode default">iter</code> must therefore have the
end-iterator or sentinel as a data member. The same logic applies to the
other end of the range if <code class="sourceCode default">iter</code>
is bidirectional — it must also have the iterator to the start of the
underlying range as a data member. This unfortunate reality comes up
over and over in the proposed iterators, not just the ones that are UTF
transcoding iterators. This is why iterators in this proposal (and the
ones to come) usually consist of three underlying iterators.</p>
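<p>The sketch below shows the forward-only half of this argument. It is a
deliberately simplified, hypothetical
<code class="sourceCode default">utf8_to_32_iter</code> over a
<code class="sourceCode default">std::string</code> that stores the end
of the range so that decoding never reads past it. It does not reject
overlong or surrogate encodings, and a bidirectional version would
additionally store the start of the range.</p>

```cpp
#include <string>

// Hypothetical, simplified UTF-8 -> UTF-32 forward iterator.
class utf8_to_32_iter {
    using It = std::string::const_iterator;
    It cur_, last_;

    static int seq_len(unsigned char b) {
        if (b < 0x80) return 1;
        if ((b & 0xE0) == 0xC0) return 2;
        if ((b & 0xF0) == 0xE0) return 3;
        if ((b & 0xF8) == 0xF0) return 4;
        return 0; // not a valid lead code unit
    }

    // Decodes the code point at cur_ and sets next to one past the code
    // units consumed. Requires cur_ != last_; never dereferences last_.
    char32_t decode(It& next) const {
        It it = cur_;
        unsigned char b0 = static_cast<unsigned char>(*it++);
        int len = seq_len(b0);
        next = it;
        if (len == 1) return b0;
        if (len == 0) return 0xFFFD;
        char32_t cp = b0 & (0xFF >> (len + 1));
        for (int k = 1; k < len; ++k) {
            // This check is why the iterator must carry last_.
            if (it == last_ || (static_cast<unsigned char>(*it) & 0xC0) != 0x80) {
                next = it;
                return 0xFFFD;
            }
            cp = (cp << 6) | (static_cast<unsigned char>(*it++) & 0x3F);
        }
        next = it;
        return cp;
    }

public:
    utf8_to_32_iter(It cur, It last) : cur_(cur), last_(last) {}

    char32_t operator*() const { It next; return decode(next); }
    utf8_to_32_iter& operator++() { It next; decode(next); cur_ = next; return *this; }
    It base() const { return cur_; }

    friend bool operator==(const utf8_to_32_iter& a, const utf8_to_32_iter& b) {
        return a.cur_ == b.cur_;
    }
    friend bool operator!=(const utf8_to_32_iter& a, const utf8_to_32_iter& b) {
        return !(a == b);
    }
};
```

<p>For the string <code class="sourceCode default">&quot;a\xE0&quot;</code>,
this iterator yields <code class="sourceCode default">U'a'</code> and then
a replacement character, and incrementing past the trailing
<code class="sourceCode default">0xe0</code> lands exactly on the end of
the string rather than beyond it.</p>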
<h1 data-number="6" id="implementation-experience"><span class="header-section-number">6</span> Implementation experience<a href="#implementation-experience" class="self-link"></a></h1>
<p>All the interfaces proposed here have been implemented, and
re-implemented, several times over the last 5 years or so. They are part
of a proposed (but not yet accepted!) Boost library, <a href="https://github.com/tzlaine/text">Boost.Text</a>.</p>
<p>The library has hundreds of stars, though I’m not sure how many users
that equates to. All of the interfaces proposed here are among the
best-exercised in the library. There are comprehensive tests for all the
proposed entities, and those entities are used as the foundation upon
which all the other library entities are composed.</p>
<p>Though there are a lot of individual entities proposed here, at one
time or another I have needed each one of them, though maybe not in every
UTF-N -&gt; UTF-M permutation. Those transcoding permutations are there
mostly for completeness. I have only ever needed UTF-8 &lt;-&gt;
UTF-32 in any of my work that uses Unicode. Frequent Windows users
will also need to convert to and from UTF-16 sometimes, because that is
the UTF that the OS APIs use.</p>
<h1 data-number="7" id="bibliography"><span class="header-section-number">7</span> References<a href="#bibliography" class="self-link"></a></h1>
<div id="refs" class="references csl-bib-body hanging-indent" role="doc-bibliography">
<div id="ref-P1629R1" class="csl-entry" role="doc-biblioentry">
[P1629R1] JeanHeyd Meneide. 2020-03-02. Transcoding the world - Standard
Text Encoding. <a href="https://wg21.link/p1629r1"><div class="csl-block">https://wg21.link/p1629r1</div></a>
</div>
</div>
</div>
</div>
</body>
</html>
