<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang xml:lang>
<head>
  <meta charset="utf-8" />
  <meta name="generator" content="mpark/wg21" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
  <meta name="dcterms.date" content="2023-06-10" />
  <title>Unicode in the Library, Part 1: UTF Transcoding</title>
  <style>
      code{white-space: pre-wrap;}
      span.smallcaps{font-variant: small-caps;}
      span.underline{text-decoration: underline;}
      div.column{display: inline-block; vertical-align: top; width: 50%;}
      div.csl-block{margin-left: 1.5em;}
      ul.task-list{list-style: none;}
      pre > code.sourceCode { white-space: pre; position: relative; }
      pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
      pre > code.sourceCode > span:empty { height: 1.2em; }
      .sourceCode { overflow: visible; }
      code.sourceCode > span { color: inherit; text-decoration: inherit; }
      div.sourceCode { margin: 1em 0; }
      pre.sourceCode { margin: 0; }
      @media screen {
      div.sourceCode { overflow: auto; }
      }
      @media print {
      pre > code.sourceCode { white-space: pre-wrap; }
      pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
      }
      pre.numberSource code
        { counter-reset: source-line 0; }
      pre.numberSource code > span
        { position: relative; left: -4em; counter-increment: source-line; }
      pre.numberSource code > span > a:first-child::before
        { content: counter(source-line);
          position: relative; left: -1em; text-align: right; vertical-align: baseline;
          border: none; display: inline-block;
          -webkit-touch-callout: none; -webkit-user-select: none;
          -khtml-user-select: none; -moz-user-select: none;
          -ms-user-select: none; user-select: none;
          padding: 0 4px; width: 4em;
          color: #aaaaaa;
        }
      pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa;  padding-left: 4px; }
      div.sourceCode
        {  background-color: #f6f8fa; }
      @media screen {
      pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
      }
      code span { } /* Normal */
      code span.al { color: #ff0000; } /* Alert */
      code span.an { } /* Annotation */
      code span.at { } /* Attribute */
      code span.bn { color: #9f6807; } /* BaseN */
      code span.bu { color: #9f6807; } /* BuiltIn */
      code span.cf { color: #00607c; } /* ControlFlow */
      code span.ch { color: #9f6807; } /* Char */
      code span.cn { } /* Constant */
      code span.co { color: #008000; font-style: italic; } /* Comment */
      code span.cv { color: #008000; font-style: italic; } /* CommentVar */
      code span.do { color: #008000; } /* Documentation */
      code span.dt { color: #00607c; } /* DataType */
      code span.dv { color: #9f6807; } /* DecVal */
      code span.er { color: #ff0000; font-weight: bold; } /* Error */
      code span.ex { } /* Extension */
      code span.fl { color: #9f6807; } /* Float */
      code span.fu { } /* Function */
      code span.im { } /* Import */
      code span.in { color: #008000; } /* Information */
      code span.kw { color: #00607c; } /* Keyword */
      code span.op { color: #af1915; } /* Operator */
      code span.ot { } /* Other */
      code span.pp { color: #6f4e37; } /* Preprocessor */
      code span.re { } /* RegionMarker */
      code span.sc { color: #9f6807; } /* SpecialChar */
      code span.ss { color: #9f6807; } /* SpecialString */
      code span.st { color: #9f6807; } /* String */
      code span.va { } /* Variable */
      code span.vs { color: #9f6807; } /* VerbatimString */
      code span.wa { color: #008000; font-weight: bold; } /* Warning */
      code.diff {color: #898887}
      code.diff span.va {color: #006e28}
      code.diff span.st {color: #bf0303}
  </style>
  <style type="text/css">
body {
margin: 5em;
font-family: serif;

hyphens: auto;
line-height: 1.35;
text-align: justify;
}
@media screen and (max-width: 30em) {
body {
margin: 1.5em;
}
}
div.wrapper {
max-width: 60em;
margin: auto;
}
ul {
list-style-type: none;
padding-left: 2em;
margin-top: -0.2em;
margin-bottom: -0.2em;
}
a {
text-decoration: none;
color: #4183C4;
}
a.hidden_link {
text-decoration: none;
color: inherit;
}
li {
margin-top: 0.6em;
margin-bottom: 0.6em;
}
h1, h2, h3, h4 {
position: relative;
line-height: 1;
}
a.self-link {
position: absolute;
top: 0;
left: calc(-1 * (3.5rem - 26px));
width: calc(3.5rem - 26px);
height: 2em;
text-align: center;
border: none;
transition: opacity .2s;
opacity: .5;
font-family: sans-serif;
font-weight: normal;
font-size: 83%;
}
a.self-link:hover { opacity: 1; }
a.self-link::before { content: "§"; }
ul > li:before {
content: "\2014";
position: absolute;
margin-left: -1.5em;
}
:target { background-color: #C9FBC9; }
:target .codeblock { background-color: #C9FBC9; }
:target ul { background-color: #C9FBC9; }
.abbr_ref { float: right; }
.folded_abbr_ref { float: right; }
:target .folded_abbr_ref { display: none; }
:target .unfolded_abbr_ref { float: right; display: inherit; }
.unfolded_abbr_ref { display: none; }
.secnum { display: inline-block; min-width: 35pt; }
.header-section-number { display: inline-block; min-width: 35pt; }
.annexnum { display: block; }
div.sourceLinkParent {
float: right;
}
a.sourceLink {
position: absolute;
opacity: 0;
margin-left: 10pt;
}
a.sourceLink:hover {
opacity: 1;
}
a.itemDeclLink {
position: absolute;
font-size: 75%;
text-align: right;
width: 5em;
opacity: 0;
}
a.itemDeclLink:hover { opacity: 1; }
span.marginalizedparent {
position: relative;
left: -5em;
}
li span.marginalizedparent { left: -7em; }
li ul > li span.marginalizedparent { left: -9em; }
li ul > li ul > li span.marginalizedparent { left: -11em; }
li ul > li ul > li ul > li span.marginalizedparent { left: -13em; }
div.footnoteNumberParent {
position: relative;
left: -4.7em;
}
a.marginalized {
position: absolute;
font-size: 75%;
text-align: right;
width: 5em;
}
a.enumerated_item_num {
position: relative;
left: -3.5em;
display: inline-block;
margin-right: -3em;
text-align: right;
width: 3em;
}
div.para { margin-bottom: 0.6em; margin-top: 0.6em; text-align: justify; }
div.section { text-align: justify; }
div.sentence { display: inline; }
span.indexparent {
display: inline;
position: relative;
float: right;
right: -1em;
}
a.index {
position: absolute;
display: none;
}
a.index:before { content: "⟵"; }

a.index:target {
display: inline;
}
.indexitems {
margin-left: 2em;
text-indent: -2em;
}
div.itemdescr {
margin-left: 3em;
}
.bnf {
font-family: serif;
margin-left: 40pt;
margin-top: 0.5em;
margin-bottom: 0.5em;
}
.ncbnf {
font-family: serif;
margin-top: 0.5em;
margin-bottom: 0.5em;
margin-left: 40pt;
}
.ncsimplebnf {
font-family: serif;
font-style: italic;
margin-top: 0.5em;
margin-bottom: 0.5em;
margin-left: 40pt;
background: inherit; 
}
span.textnormal {
font-style: normal;
font-family: serif;
white-space: normal;
display: inline-block;
}
span.rlap {
display: inline-block;
width: 0px;
}
span.descr { font-style: normal; font-family: serif; }
span.grammarterm { font-style: italic; }
span.term { font-style: italic; }
span.terminal { font-family: monospace; font-style: normal; }
span.nonterminal { font-style: italic; }
span.tcode { font-family: monospace; font-style: normal; }
span.textbf { font-weight: bold; }
span.textsc { font-variant: small-caps; }
a.nontermdef { font-style: italic; font-family: serif; }
span.emph { font-style: italic; }
span.techterm { font-style: italic; }
span.mathit { font-style: italic; }
span.mathsf { font-family: sans-serif; }
span.mathrm { font-family: serif; font-style: normal; }
span.textrm { font-family: serif; }
span.textsl { font-style: italic; }
span.mathtt { font-family: monospace; font-style: normal; }
span.mbox { font-family: serif; font-style: normal; }
span.ungap { display: inline-block; width: 2pt; }
span.textit { font-style: italic; }
span.texttt { font-family: monospace; }
span.tcode_in_codeblock { font-family: monospace; font-style: normal; }
span.phantom { color: white; }

span.math { font-style: normal; }
span.mathblock {
display: block;
margin-left: auto;
margin-right: auto;
margin-top: 1.2em;
margin-bottom: 1.2em;
text-align: center;
}
span.mathalpha {
font-style: italic;
}
span.synopsis {
font-weight: bold;
margin-top: 0.5em;
display: block;
}
span.definition {
font-weight: bold;
display: block;
}
.codeblock {
margin-left: 1.2em;
line-height: 127%;
}
.outputblock {
margin-left: 1.2em;
line-height: 127%;
}
div.itemdecl {
margin-top: 2ex;
}
code.itemdeclcode {
white-space: pre;
display: block;
}
span.textsuperscript {
vertical-align: super;
font-size: smaller;
line-height: 0;
}
.footnotenum { vertical-align: super; font-size: smaller; line-height: 0; }
.footnote {
font-size: small;
margin-left: 2em;
margin-right: 2em;
margin-top: 0.6em;
margin-bottom: 0.6em;
}
div.minipage {
display: inline-block;
margin-right: 3em;
}
div.numberedTable {
text-align: center;
margin: 2em;
}
div.figure {
text-align: center;
margin: 2em;
}
table {
border: 1px solid black;
border-collapse: collapse;
margin-left: auto;
margin-right: auto;
margin-top: 0.8em;
text-align: left;
hyphens: none; 
}
td, th {
padding-left: 1em;
padding-right: 1em;
vertical-align: top;
}
td.empty {
padding: 0px;
padding-left: 1px;
}
td.left {
text-align: left;
}
td.right {
text-align: right;
}
td.center {
text-align: center;
}
td.justify {
text-align: justify;
}
td.border {
border-left: 1px solid black;
}
tr.rowsep, td.cline {
border-top: 1px solid black;
}
tr.even, tr.odd {
border-bottom: 1px solid black;
}
tr.capsep {
border-top: 3px solid black;
border-top-style: double;
}
tr.header {
border-bottom: 3px solid black;
border-bottom-style: double;
}
th {
border-bottom: 1px solid black;
}
span.centry {
font-weight: bold;
}
div.table {
display: block;
margin-left: auto;
margin-right: auto;
text-align: center;
width: 90%;
}
span.indented {
display: block;
margin-left: 2em;
margin-bottom: 1em;
margin-top: 1em;
}
ol.enumeratea { list-style-type: none; background: inherit; }
ol.enumerate { list-style-type: none; background: inherit; }

code.sourceCode > span { display: inline; }
</style>
  <link href="data:image/x-icon;base64,AAABAAIAEBAAAAEAIABoBAAAJgAAACAgAAABACAAqBAAAI4EAAAoAAAAEAAAACAAAAABACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA////AIJEAACCRAAAgkQAAIJEAACCRAAAgkQAVoJEAN6CRADegkQAWIJEAACCRAAAgkQAAIJEAACCRAAA////AP///wCCRAAAgkQAAIJEAACCRAAsgkQAvoJEAP+CRAD/gkQA/4JEAP+CRADAgkQALoJEAACCRAAAgkQAAP///wD///8AgkQAAIJEABSCRACSgkQA/IJEAP99PQD/dzMA/3czAP99PQD/gkQA/4JEAPyCRACUgkQAFIJEAAD///8A////AHw+AFiBQwDqgkQA/4BBAP9/PxP/uZd6/9rJtf/bybX/upd7/39AFP+AQQD/gkQA/4FDAOqAQgBc////AP///wDKklv4jlEa/3o7AP+PWC//8+3o///////////////////////z7un/kFox/35AAP+GRwD/mVYA+v///wD///8A0Zpk+NmibP+0d0T/8evj///////+/fv/1sKz/9bCs//9/fr//////+/m2/+NRwL/nloA/5xYAPj///8A////ANKaZPjRmGH/5cKh////////////k149/3UwAP91MQD/lmQ//86rhv+USg3/m1YA/5hSAP+bVgD4////AP///wDSmmT4zpJY/+/bx///////8+TV/8mLT/+TVx//gkIA/5lVAP+VTAD/x6B//7aEVv/JpH7/s39J+P///wD///8A0ppk+M6SWP/u2sf///////Pj1f/Nj1T/2KFs/8mOUv+eWhD/lEsA/8aee/+0glT/x6F7/7J8Rvj///8A////ANKaZPjRmGH/48Cf///////+/v7/2qt//82PVP/OkFX/37KJ/86siv+USg7/mVQA/5hRAP+bVgD4////AP///wDSmmT40ppk/9CVXP/69O////////7+/v/x4M//8d/P//7+/f//////9u7n/6tnJf+XUgD/nFgA+P///wD///8A0ppk+NKaZP/RmWL/1qNy//r07///////////////////////+vXw/9akdP/Wnmn/y5FY/6JfFvj///8A////ANKaZFTSmmTo0ppk/9GYYv/Ql1//5cWm//Hg0P/x4ND/5cWm/9GXYP/RmGH/0ppk/9KaZOjVnmpY////AP///wDSmmQA0ppkEtKaZI7SmmT60ppk/9CWX//OkVb/zpFW/9CWX//SmmT/0ppk/NKaZJDSmmQS0ppkAP///wD///8A0ppkANKaZADSmmQA0ppkKtKaZLrSmmT/0ppk/9KaZP/SmmT/0ppkvNKaZCrSmmQA0ppkANKaZAD///8A////ANKaZADSmmQA0ppkANKaZADSmmQA0ppkUtKaZNzSmmTc0ppkVNKaZADSmmQA0ppkANKaZADSmmQA////AP5/AAD4HwAA4AcAAMADAACAAQAAgAEAAIABAACAAQAAgAEAAIABAACAAQAAgAEAAMADAADgBwAA+B8AAP5/AAAoAAAAIAAAAEAAAAABACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA////AP///wCCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAAyCRACMgkQA6oJEAOqCRACQgkQAEIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAA////AP///wD///8A////AIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRABigkQA5oJEAP+CRAD/gkQA/4JEAP+CRADqgkQAZoJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAAD///8A////AP///wD///8Ag
kQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAA4gkQAwoJEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAP+CRAD/gkQAxIJEADyCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAP///wD///8A////AP///wCCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAWgkQAmIJEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAJyCRAAYgkQAAIJEAACCRAAAgkQAAIJEAACCRAAA////AP///wD///8A////AIJEAACCRAAAgkQAAIJEAACCRAAAgkQAdIJEAPCCRAD/gkQA/4JEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAPSCRAB4gkQAAIJEAACCRAAAgkQAAIJEAAD///8A////AP///wD///8AgkQAAIJEAACCRAAAgkQASoJEANKCRAD/gkQA/4JEAP+CRAD/g0YA/39AAP9zLgD/bSQA/2shAP9rIQD/bSQA/3MuAP9/PwD/g0YA/4JEAP+CRAD/gkQA/4JEAP+CRADUgkQAToJEAACCRAAAgkQAAP///wD///8A////AP///wB+PwAAgkUAIoJEAKiCRAD/gkQA/4JEAP+CRAD/hEcA/4BBAP9sIwD/dTAA/5RfKv+viF7/vp56/76ee/+wiF7/lWAr/3YxAP9sIwD/f0AA/4RHAP+CRAD/gkQA/4JEAP+CRAD/gkQArIJEACaBQwAA////AP///wD///8A////AIBCAEBzNAD6f0EA/4NFAP+CRAD/gkQA/4VIAP92MwD/bSUA/6N1Tv/ezsL/////////////////////////////////38/D/6V3Uv9uJgD/dTEA/4VJAP+CRAD/gkQA/4JEAP+BQwD/fUAA/4FDAEj///8A////AP///wD///8AzJRd5qBlKf91NgD/dDUA/4JEAP+FSQD/cy4A/3YyAP/PuKP//////////////////////////////////////////////////////9K7qP94NQD/ciwA/4VJAP+CRAD/fkEA/35BAP+LSwD/mlYA6v///wD///8A////AP///wDdpnL/4qx3/8KJUv+PUhf/cTMA/3AsAP90LgD/4dK+/////////////////////////////////////////////////////////////////+TYxf91MAD/dTIA/31CAP+GRwD/llQA/6FcAP+gWwD8////AP///wD///8A////ANGZY/LSm2X/4ap3/92mcP+wdT3/byQA/8mwj////////////////////////////////////////////////////////////////////////////+LYxv9zLgP/jUoA/59bAP+hXAD/nFgA/5xYAPL///8A////AP///wD///8A0ppk8tKaZP/RmWL/1p9q/9ubXv/XqXj////////////////////////////7+fD/vZyG/6BxS/+gcUr/vJuE//r37f//////////////////////3MOr/5dQBf+dVQD/nVkA/5xYAP+cWAD/nFgA8v///wD///8A////AP///wDSmmTy0ppk/9KaZP/SmWP/yohJ//jo2P//////////////////////4NTG/4JDFf9lGAD/bSQA/20kAP9kGAD/fz8S/+Xb0f//////5NG9/6txN/+LOgD/m1QA/51aAP+cWAD/m1cA/5xYAP+cWADy////AP///wD///8A////ANKaZPLSmmT/0ppk/8+TWf/Unmv//v37//////////////////////+TWRr/VwsA/35AAP+ERgD/g0UA/4JGAP9lHgD/kFga/8KXX/+TRwD/jT4A/49CAP+VTQD/n
10A/5xYAP+OQQD/lk4A/55cAPL///8A////AP///wD///8A0ppk8tKaZP/SmmT/y4tO/92yiP//////////////////////8NnE/8eCQP+rcTT/ez0A/3IyAP98PgD/gEMA/5FSAP+USwD/jj8A/5lUAP+JNwD/yqV2/694Mf+HNQD/jkAA/82rf/+laBj/jT4A8v///wD///8A////AP///wDSmmTy0ppk/9KaZP/LiUr/4byY///////////////////////gupX/0I5P/+Wuev/Lklz/l1sj/308AP+QSwD/ol0A/59aAP+aVQD/k0oA/8yoh///////+fXv/6pwO//Lp3v///////Pr4f+oay7y////AP///wD///8A////ANKaZPLSmmT/0ppk/8uJSv/hvJj//////////////////////+G7l//Jhkb/0ppk/96nc//fqXX/x4xO/6dkFP+QSQD/llEA/5xXAP+USgD/yaOA///////38uv/qG05/8ijdv//////8efb/6ZpLPL///8A////AP///wD///8A0ppk8tKaZP/SmmT/zIxO/9yxh///////////////////////7dbA/8iEQf/Sm2X/0Zlj/9ScZv/eqHf/2KJv/7yAQf+XTgD/iToA/5lSAP+JNgD/yKFv/611LP+HNQD/jT8A/8qmeP+kZRT/jT4A8v///wD///8A////AP///wDSmmTy0ppk/9KaZP/Pk1n/1J5q//78+//////////////////+/fv/1aFv/8iEQv/Tm2b/0ppl/9GZY//Wn2z/1pZc/9eldf/Bl2b/kUcA/4w9AP+OQAD/lUwA/59eAP+cWQD/jT8A/5ZOAP+eXADy////AP///wD///8A////ANKaZPLSmmT/0ppk/9KZY//KiEn/8d/P///////////////////////47+f/05tm/8iCP//KiEj/yohJ/8eCP//RmGH//vfy///////n1sP/rXQ7/4k4AP+TTAD/nVoA/5xYAP+cVwD/nFgA/5xYAPL///8A////AP///wD///8A0ppk8tKaZP/SmmT/0ptl/8uLTf/aq37////////////////////////////+/fz/6c2y/961jv/etY7/6Myx//78+v//////////////////////3MWv/5xXD/+ORAD/mFQA/51ZAP+cWAD/nFgA8v///wD///8A////AP///wDSmmTy0ppk/9KaZP/SmmT/0ppk/8mFRP/s1b//////////////////////////////////////////////////////////////////////////////+PD/0JFU/7NzMv+WUQD/kUsA/5tXAP+dWQDy////AP///wD///8A////ANKaZP/SmmT/0ppk/9KaZP/Sm2X/z5NZ/8yMT//z5NX/////////////////////////////////////////////////////////////////9Ofa/8yNUP/UmGH/36p5/8yTWv+qaSD/kksA/5ROAPz///8A////AP///wD///8A0ppk5NKaZP/SmmT/0ppk/9KaZP/TnGf/zY9T/82OUv/t1sD//////////////////////////////////////////////////////+7Yw//OkFX/zI5R/9OcZ//SmmP/26V0/9ymdf/BhUf/ol8R6P///wD///8A////AP///wDSmmQ80ppk9tKaZP/SmmT/0ppk/9KaZP/TnGj/zpFW/8qJSv/dson/8uHS//////////////////////////////////Lj0//etIv/y4lL/86QVf/TnGj/0ppk/9KaZP/RmWP/05xn/9ymdfjUnWdC////AP///wD///8A////ANKaZADSmmQc0ppkotKaZP/SmmT/0ppk/9KaZP/Tm2b/0Zli/8qJSf/NjlH/16Z3/+G8mP/myKr/5
siq/+G8mP/Xp3f/zY5S/8qISf/RmGH/05tm/9KaZP/SmmT/0ppk/9KaZP/SmmSm0pljINWdaQD///8A////AP///wD///8A0ppkANKaZADSmmQA0ppkQtKaZMrSmmT/0ppk/9KaZP/SmmT/0ptl/9GYYf/Nj1P/y4lL/8qISP/KiEj/y4lK/82PU//RmGH/0ptl/9KaZP/SmmT/0ppk/9KaZP/SmmTO0ppkRtKaZADSmmQA0ppkAP///wD///8A////AP///wDSmmQA0ppkANKaZADSmmQA0ppkANKaZGzSmmTu0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmTw0ppkcNKaZADSmmQA0ppkANKaZADSmmQA////AP///wD///8A////ANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZBLSmmSQ0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppklNKaZBTSmmQA0ppkANKaZADSmmQA0ppkANKaZAD///8A////AP///wD///8A0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQy0ppkutKaZP/SmmT/0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppkvtKaZDbSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkAP///wD///8A////AP///wDSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkXNKaZODSmmT/0ppk/9KaZP/SmmT/0ppk5NKaZGDSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA////AP///wD///8A////ANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkBtKaZIbSmmTo0ppk6tKaZIrSmmQK0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZAD///8A////AP/8P///+B///+AH//+AAf//AAD//AAAP/AAAA/gAAAHwAAAA8AAAAPAAAADwAAAA8AAAAPAAAADwAAAA8AAAAPAAAADwAAAA8AAAAPAAAADwAAAA8AAAAPAAAADwAAAA+AAAAfwAAAP/AAAP/8AAP//gAH//+AH///4H////D//" rel="icon" />
  
  <!--[if lt IE 9]>
    <script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
  <![endif]-->
</head>
<body>
<div class="wrapper">
<header id="title-block-header">
<h1 class="title" style="text-align:center">Unicode in the Library, Part
1: UTF Transcoding</h1>
<table style="border:none;float:right">
  <tr>
    <td>Document #:</td>
    <td>P2728R4</td>
  </tr>
  <tr>
    <td>Date:</td>
    <td>2023-06-10</td>
  </tr>
  <tr>
    <td style="vertical-align:top">Project:</td>
    <td>Programming Language C++</td>
  </tr>
  <tr>
    <td style="vertical-align:top">Audience:</td>
    <td>
      SG-16 Unicode<br>
      LEWG-I<br>
      LEWG<br>
    </td>
  </tr>
  <tr>
    <td style="vertical-align:top">Reply-to:</td>
    <td>
      Zach Laine<br>&lt;<a href="mailto:whatwasthataddress@gmail.com" class="email">whatwasthataddress@gmail.com</a>&gt;<br>
    </td>
  </tr>
</table>
</header>
<div style="clear:both">
<div id="TOC" role="doc-toc">
<h1 id="toctitle">Contents</h1>
<ul>
<li><a href="#changelog" id="toc-changelog"><span class="toc-section-number">1</span> Changelog<span></span></a>
<ul>
<li><a href="#changes-since-r0" id="toc-changes-since-r0"><span class="toc-section-number">1.1</span> Changes since
R0<span></span></a></li>
<li><a href="#changes-since-r1" id="toc-changes-since-r1"><span class="toc-section-number">1.2</span> Changes since
R1<span></span></a></li>
<li><a href="#changes-since-r2" id="toc-changes-since-r2"><span class="toc-section-number">1.3</span> Changes since
R2<span></span></a></li>
<li><a href="#changes-since-r3" id="toc-changes-since-r3"><span class="toc-section-number">1.4</span> Changes since
R3<span></span></a></li>
</ul></li>
<li><a href="#motivation" id="toc-motivation"><span class="toc-section-number">2</span> Motivation<span></span></a>
<ul>
<li><a href="#a-note-about-p1629" id="toc-a-note-about-p1629"><span class="toc-section-number">2.1</span> A note about
P1629<span></span></a></li>
</ul></li>
<li><a href="#the-shortest-unicode-primer-imaginable" id="toc-the-shortest-unicode-primer-imaginable"><span class="toc-section-number">3</span> The shortest Unicode primer
imaginable<span></span></a></li>
<li><a href="#a-few-examples" id="toc-a-few-examples"><span class="toc-section-number">4</span> A few examples<span></span></a>
<ul>
<li><a href="#case-1-adapt-to-an-existing-range-interface-taking-a-different-utf" id="toc-case-1-adapt-to-an-existing-range-interface-taking-a-different-utf"><span class="toc-section-number">4.1</span> Case 1: Adapt to an existing range
interface taking a different UTF<span></span></a></li>
<li><a href="#case-2-adapt-to-an-existing-iterator-interface-taking-a-different-utf" id="toc-case-2-adapt-to-an-existing-iterator-interface-taking-a-different-utf"><span class="toc-section-number">4.2</span> Case 2: Adapt to an existing
iterator interface taking a different UTF<span></span></a></li>
<li><a href="#case-3-adapt-a-range-of-non-character-type-values" id="toc-case-3-adapt-a-range-of-non-character-type-values"><span class="toc-section-number">4.3</span> Case 3: Adapt a range of
non-character-type values<span></span></a></li>
<li><a href="#case-4-print-the-results-of-transcoding" id="toc-case-4-print-the-results-of-transcoding"><span class="toc-section-number">4.4</span> Case 4: Print the results of
transcoding<span></span></a></li>
</ul></li>
<li><a href="#proposed-design" id="toc-proposed-design"><span class="toc-section-number">5</span> Proposed design<span></span></a>
<ul>
<li><a href="#dependencies" id="toc-dependencies"><span class="toc-section-number">5.1</span> Dependencies<span></span></a></li>
<li><a href="#add-concepts-that-describe-parameters-to-transcoding-apis" id="toc-add-concepts-that-describe-parameters-to-transcoding-apis"><span class="toc-section-number">5.2</span> Add concepts that describe
parameters to transcoding APIs<span></span></a>
<ul>
<li><a href="#code-unit-option-1" id="toc-code-unit-option-1"><span class="toc-section-number">5.2.1</span> Code unit option
1<span></span></a></li>
<li><a href="#code-unit-option-2" id="toc-code-unit-option-2"><span class="toc-section-number">5.2.2</span> Code unit option
2<span></span></a></li>
<li><a href="#the-impact-of-options-1-and-2" id="toc-the-impact-of-options-1-and-2"><span class="toc-section-number">5.2.3</span> The impact of options 1 and
2<span></span></a></li>
</ul></li>
<li><a href="#add-a-null-terminated-sequence-sentinel" id="toc-add-a-null-terminated-sequence-sentinel"><span class="toc-section-number">5.3</span> Add a null-terminated sequence
sentinel<span></span></a></li>
<li><a href="#add-the-transcoding-iterator-template" id="toc-add-the-transcoding-iterator-template"><span class="toc-section-number">5.4</span> Add the transcoding iterator
template<span></span></a>
<ul>
<li><a href="#why-utf_iterator-is-constrained-the-way-it-is" id="toc-why-utf_iterator-is-constrained-the-way-it-is"><span class="toc-section-number">5.4.1</span> Why
<code class="sourceCode default">utf_iterator</code> is constrained the
way it is<span></span></a></li>
<li><a href="#why-utf_iterator-is-not-a-nested-type-within-utf_view" id="toc-why-utf_iterator-is-not-a-nested-type-within-utf_view"><span class="toc-section-number">5.4.2</span> Why
<code class="sourceCode default">utf_iterator</code> is not a nested
type within
<code class="sourceCode default">utf_view</code><span></span></a></li>
<li><a href="#optional-add-aliases-for-common-utf_iterator-specializations" id="toc-optional-add-aliases-for-common-utf_iterator-specializations"><span class="toc-section-number">5.4.3</span> Optional: Add aliases for common
<code class="sourceCode default">utf_iterator</code>
specializations<span></span></a></li>
<li><a href="#add-unpack_iterator_and_sentinel-cpo-for-iterator-unpacking" id="toc-add-unpack_iterator_and_sentinel-cpo-for-iterator-unpacking"><span class="toc-section-number">5.4.4</span> Add
<code class="sourceCode default">unpack_iterator_and_sentinel</code> CPO
for iterator “unpacking”<span></span></a></li>
<li><a href="#why-input-iterators-are-not-unpackable" id="toc-why-input-iterators-are-not-unpackable"><span class="toc-section-number">5.4.5</span> Why input iterators are not
unpackable<span></span></a></li>
</ul></li>
<li><a href="#add-code-unit-views-and-adaptors" id="toc-add-code-unit-views-and-adaptors"><span class="toc-section-number">5.5</span> Add code unit views and
adaptors<span></span></a>
<ul>
<li><a href="#why-as_charn_t-requires-utf_pointer" id="toc-why-as_charn_t-requires-utf_pointer"><span class="toc-section-number">5.5.1</span> Why
<code class="sourceCode default">as_charN_t</code> requires
<code class="sourceCode default">utf_pointer</code><span></span></a></li>
</ul></li>
<li><a href="#add-transcoding-views-and-adaptors" id="toc-add-transcoding-views-and-adaptors"><span class="toc-section-number">5.6</span> Add transcoding views and
adaptors<span></span></a>
<ul>
<li><a href="#why-there-are-three-utfn_views-views-plus-utf_view" id="toc-why-there-are-three-utfn_views-views-plus-utf_view"><span class="toc-section-number">5.6.1</span> Why there are three
<code class="sourceCode default">utfN_view</code>s views plus
<code class="sourceCode default">utf_view</code><span></span></a></li>
<li><a href="#unpacking_owning_view" id="toc-unpacking_owning_view"><span class="toc-section-number">5.6.2</span>
<code class="sourceCode default">unpacking_owning_view</code><span></span></a></li>
<li><a href="#more-examples" id="toc-more-examples"><span class="toc-section-number">5.6.3</span> More
examples<span></span></a></li>
<li><a href="#why-utf_view-always-uses-utf_iterator-even-in-utf-n-to-utf-n-cases" id="toc-why-utf_view-always-uses-utf_iterator-even-in-utf-n-to-utf-n-cases"><span class="toc-section-number">5.6.4</span> Why
<code class="sourceCode default">utf_view</code> always uses
<code class="sourceCode default">utf_iterator</code>, even in UTF-N to
UTF-N cases<span></span></a></li>
<li><a href="#add-utf_view-specialization-of-formatter" id="toc-add-utf_view-specialization-of-formatter"><span class="toc-section-number">5.6.5</span> Add
<code class="sourceCode default">utf_view</code> specialization of
<code class="sourceCode default">formatter</code><span></span></a></li>
</ul></li>
<li><a href="#add-a-feature-test-macro" id="toc-add-a-feature-test-macro"><span class="toc-section-number">5.7</span> Add a feature test
macro<span></span></a></li>
<li><a href="#design-notes" id="toc-design-notes"><span class="toc-section-number">5.8</span> Design notes<span></span></a></li>
</ul></li>
<li><a href="#implementation-experience" id="toc-implementation-experience"><span class="toc-section-number">6</span> Implementation
experience<span></span></a></li>
<li><a href="#bibliography" id="toc-bibliography"><span class="toc-section-number">7</span> References<span></span></a></li>
</ul>
</div>
<h1 data-number="1" id="changelog"><span class="header-section-number">1</span> Changelog<a href="#changelog" class="self-link"></a></h1>
<h2 data-number="1.1" id="changes-since-r0"><span class="header-section-number">1.1</span> Changes since R0<a href="#changes-since-r0" class="self-link"></a></h2>
<ul>
<li>When naming code points in interfaces, use
<code class="sourceCode default">char32_t</code>.</li>
<li>When naming code units in interfaces, use
<code class="sourceCode default">charN_t</code>.</li>
<li>Remove each eager algorithm, leaving in its corresponding view.</li>
<li>Remove all the output iterators.</li>
<li>Change template parameters to
<code class="sourceCode default">utfN_view</code> to the types of the
from-range, instead of the types of the transcoding iterators used to
implement the view.</li>
<li>Remove all make-functions.</li>
<li>Replace the misbegotten
<code class="sourceCode default">as_utfN()</code> functions with the
<code class="sourceCode default">as_utfN</code> view adaptors that
should have been there all along.</li>
<li>Add missing
<code class="sourceCode default">transcoding_error_handler</code>
concept.</li>
<li>Turn
<code class="sourceCode default">unpack_iterator_and_sentinel</code>
into a CPO.</li>
<li>Lower the UTF iterator concepts from bidirectional to input.</li>
</ul>
<h2 data-number="1.2" id="changes-since-r1"><span class="header-section-number">1.2</span> Changes since R1<a href="#changes-since-r1" class="self-link"></a></h2>
<ul>
<li>Reintroduce the transcoding-from-a-buffer example.</li>
<li>Generalize <code class="sourceCode default">null_sentinel_t</code>
to a non-Unicode-specific facility.</li>
<li>In utility functions that search for ill-formed encoding, take a
range argument instead of a pair of iterator arguments.</li>
<li>Replace <code class="sourceCode default">utf{8,16,32}_view</code>
with a single <code class="sourceCode default">utf_view</code>.</li>
</ul>
<h2 data-number="1.3" id="changes-since-r2"><span class="header-section-number">1.3</span> Changes since R2<a href="#changes-since-r2" class="self-link"></a></h2>
<ul>
<li>Add <code class="sourceCode default">noexcept</code> where
appropriate.</li>
<li>Remove non-essential constants and utility functions, and elaborate
on the usage of the ones that remain.</li>
<li>Note differences from similar elements proposed in <span class="citation" data-cites="P1629R1">[<a href="#ref-P1629R1" role="doc-biblioref">P1629R1</a>]</span>.</li>
<li>Extend the examples slightly.</li>
<li>Correct an error in the description of the view adaptors’ semantics,
and provide several examples of their use.</li>
</ul>
<h2 data-number="1.4" id="changes-since-r3"><span class="header-section-number">1.4</span> Changes since R3<a href="#changes-since-r3" class="self-link"></a></h2>
<ul>
<li>Changed the definition of the
<code class="sourceCode default">code_unit</code> concept, and added
<code class="sourceCode default">as_charN_t</code> adaptors.</li>
<li>Removed the utility functions and Unicode-related constants, except
<code class="sourceCode default">replacement_character</code>.</li>
<li>Changed the constraint on
<code class="sourceCode default">utf_iterator</code> slightly.</li>
<li>Changed <code class="sourceCode default">null_sentinel_t</code> back
to being Unicode-specific.</li>
</ul>
<h1 data-number="2" id="motivation"><span class="header-section-number">2</span> Motivation<a href="#motivation" class="self-link"></a></h1>
<p>Unicode is important to many, many users in everyday software. It is
not exotic or weird. Well, it’s weird, but it’s not weird to see it
used. C and C++ are the only major production languages with essentially
no support for Unicode.</p>
<p>Let’s fix.</p>
<p>To fix, first we start with the most basic representations of strings
in Unicode: UTF. You might get a UTF string from anywhere; on Windows
you often get them from the OS, in UTF-16. In web-adjacent applications,
strings are most commonly in UTF-8. In ASCII-only applications,
everything is in UTF-8, by its definition as a superset of ASCII.</p>
<p>Often, an application needs to switch between UTFs: 8 -&gt; 16, 32
-&gt; 16, etc. In SG-16 we’ve taken to calling such UTF-N -&gt; UTF-M
operations “transcoding”.</p>
<p>I’m proposing interfaces to do transcoding that meet certain design
requirements that I think are important; I hope you’ll agree:</p>
<ul>
<li>Ranges are the future. We should have range-friendly ways of doing
transcoding. This includes support for sentinels and lazy views.</li>
<li>Iterators are the present. We should support generic programming,
whether it is done in terms of pointers, a particular iterator, or an
iterator type specified as a template parameter.</li>
<li>A null-terminated string should not be treated as a special case.
The ubiquity of such strings means that they should be treated as
first-class strings.</li>
<li>It is common to want to view the same text as code points and code
units at different times. It is therefore important that transcoding
iterators have a convenient way to access the underlying sequence of
code units being transcoded.</li>
<li>Memory safety is important. Ensuring that the Unicode part of the
standard library is as memory safe as possible should be a
priority.</li>
</ul>
<h2 data-number="2.1" id="a-note-about-p1629"><span class="header-section-number">2.1</span> A note about P1629<a href="#a-note-about-p1629" class="self-link"></a></h2>
<p><span class="citation" data-cites="P1629R1">[<a href="#ref-P1629R1" role="doc-biblioref">P1629R1</a>]</span> from JeanHeyd Meneide is a much
more ambitious proposal that aims to standardize a general-purpose text
encoding conversion mechanism. This proposal is not at odds with P1629;
the two proposals have largely orthogonal aims. This proposal only
concerns itself with UTF interconversions, which is all that is required
for Unicode support. P1629 is concerned with those conversions, plus a
lot more. Accepting both proposals would not cause problems; in fact,
the APIs proposed here could be used to implement parts of the P1629
design.</p>
<p>There are some differences between how the transcode views and
iterators from <span class="citation" data-cites="P1629R1">[<a href="#ref-P1629R1" role="doc-biblioref">P1629R1</a>]</span> work and
how the transcoding view and iterators from this paper work. First,
<code class="sourceCode default">std::text::transcode_view</code> has no
direct support for null-terminated strings. Second, it does not do the
unpacking described in this paper. Third, it is not printable or
streamable.</p>
<h1 data-number="3" id="the-shortest-unicode-primer-imaginable"><span class="header-section-number">3</span> The shortest Unicode primer
imaginable<a href="#the-shortest-unicode-primer-imaginable" class="self-link"></a></h1>
<p>There are multiple encoding types defined in Unicode: UTF-8, UTF-16,
and UTF-32.</p>
<p>A <em>code unit</em> is the lowest-level datum-type in your Unicode
data. Examples are a <code class="sourceCode default">char8_t</code> in
UTF-8 and a <code class="sourceCode default">char32_t</code> in
UTF-32.</p>
<p>A <em>code point</em> is a 32-bit integral value that represents a
single Unicode value. Examples are U+0041 “A” “LATIN CAPITAL LETTER A”
and U+0308 “¨” “COMBINING DIAERESIS”.</p>
<p>A code point may consist of multiple code units. For instance, 3
UTF-8 code units in sequence may encode a particular code point.</p>
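<p>As an illustration of the relationship between code points and code
units, the following sketch encodes one code point as UTF-8. The
function name <code class="sourceCode default">encode_utf8</code> is
hypothetical, <code class="sourceCode default">std::string</code> is
used here simply as a container of bytes, and the input is assumed to be
a valid code point that is not a surrogate.</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encodes one code point as UTF-8 code units.
// Assumes cp is at most U+10FFFF and is not a surrogate; no validation
// is performed in this sketch.
inline std::string encode_utf8(char32_t cp)
{
    std::string out;
    if (cp < 0x80) {
        // One code unit: 0xxxxxxx
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        // Two code units: 110xxxxx 10xxxxxx
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        // Three code units: 1110xxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        // Four code units: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}
```

<p>Here U+0041 encodes to one code unit, U+0308 to two (0xCC 0x88), and
U+20AC to three (0xE2 0x82 0xAC).</p>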
<h1 data-number="4" id="a-few-examples"><span class="header-section-number">4</span> A few examples<a href="#a-few-examples" class="self-link"></a></h1>
<h2 data-number="4.1" id="case-1-adapt-to-an-existing-range-interface-taking-a-different-utf"><span class="header-section-number">4.1</span> Case 1: Adapt to an existing
range interface taking a different UTF<a href="#case-1-adapt-to-an-existing-range-interface-taking-a-different-utf" class="self-link"></a></h2>
<p>In this case, we have a generic range interface to transcode into, so
we use a transcoding view.</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="co">// A generic function that accepts sequences of UTF-16.</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>utf16_range R<span class="op">&gt;</span></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> process_input<span class="op">(</span>R r<span class="op">)</span>;</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> process_input_again<span class="op">(</span>std<span class="op">::</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf16, std<span class="op">::</span>ranges<span class="op">::</span>ref_view<span class="op">&lt;</span>std<span class="op">::</span>u8string<span class="op">&gt;&gt;</span> r<span class="op">)</span>;</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>u8string input <span class="op">=</span> get_utf8_input<span class="op">()</span>;</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> input_utf16 <span class="op">=</span> input <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16;</span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a>process_input<span class="op">(</span>input_utf16<span class="op">)</span>;</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>process_input_again<span class="op">(</span>input_utf16<span class="op">)</span>;</span></code></pre></div>
<h2 data-number="4.2" id="case-2-adapt-to-an-existing-iterator-interface-taking-a-different-utf"><span class="header-section-number">4.2</span> Case 2: Adapt to an existing
iterator interface taking a different UTF<a href="#case-2-adapt-to-an-existing-iterator-interface-taking-a-different-utf" class="self-link"></a></h2>
<p>This time, we have a generic iterator interface we want to transcode
into, so we want to use the transcoding iterators.</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="co">// A generic function that accepts sequences of UTF-16.</span></span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>utf16_iter I<span class="op">&gt;</span></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> process_input<span class="op">(</span>I first, I last<span class="op">)</span>;</span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>u8string input <span class="op">=</span> get_utf8_input<span class="op">()</span>;</span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb2-7"><a href="#cb2-7" aria-hidden="true" tabindex="-1"></a>process_input<span class="op">(</span></span>
<span id="cb2-8"><a href="#cb2-8" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>uc<span class="op">::</span>utf_iterator<span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf8, std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf16, std<span class="op">::</span>u8string<span class="op">::</span>iterator<span class="op">&gt;(</span></span>
<span id="cb2-9"><a href="#cb2-9" aria-hidden="true" tabindex="-1"></a>        input<span class="op">.</span>begin<span class="op">()</span>, input<span class="op">.</span>begin<span class="op">()</span>, input<span class="op">.</span>end<span class="op">())</span>,</span>
<span id="cb2-10"><a href="#cb2-10" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>uc<span class="op">::</span>utf_iterator<span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf8, std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf16, std<span class="op">::</span>u8string<span class="op">::</span>iterator<span class="op">&gt;(</span></span>
<span id="cb2-11"><a href="#cb2-11" aria-hidden="true" tabindex="-1"></a>        input<span class="op">.</span>begin<span class="op">()</span>, input<span class="op">.</span>end<span class="op">()</span>, input<span class="op">.</span>end<span class="op">()))</span>;</span>
<span id="cb2-12"><a href="#cb2-12" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb2-13"><a href="#cb2-13" aria-hidden="true" tabindex="-1"></a><span class="co">// Even more conveniently:</span></span>
<span id="cb2-14"><a href="#cb2-14" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> <span class="kw">const</span> utf16_view <span class="op">=</span> input <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16;</span>
<span id="cb2-15"><a href="#cb2-15" aria-hidden="true" tabindex="-1"></a>process_input<span class="op">(</span>utf16_view<span class="op">.</span>begin<span class="op">()</span>, utf16_view<span class="op">.</span>end<span class="op">())</span>;</span></code></pre></div>
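<p>The iterators in the example above are constructed from three
underlying iterators: the beginning of the input, the current position,
and the end. One plausible reason for carrying the bounds is that
decoding can then never read outside the underlying sequence. The class
below is a simplified, hypothetical stand-in that shows the idea; it is
not the proposed
<code class="sourceCode default">utf_iterator</code>. It decodes UTF-8
to UTF-32 rather than UTF-16 to stay short, it yields U+FFFD for
truncated or malformed sequences, and it does not reject overlong
encodings or encoded surrogates.</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified stand-in for a transcoding iterator
// (UTF-8 in, UTF-32 out). It stores [first, last) next to the current
// position so that decoding stays inside the underlying sequence.
class utf8_to_utf32_iterator {
public:
    using base_iterator = std::string::const_iterator;

    utf8_to_utf32_iterator(base_iterator first, base_iterator it, base_iterator last)
        : first_(first), it_(it), last_(last) {}

    // Gives access to the underlying code unit position.
    base_iterator base() const { return it_; }

    // Precondition for * and ++: base() != last.
    char32_t operator*() const { return decode().cp; }
    utf8_to_utf32_iterator& operator++() { it_ += decode().len; return *this; }

    // Precondition: base() != first. Steps back over continuation bytes,
    // but never before first_. For malformed input this is not guaranteed
    // to be the exact inverse of ++ in this sketch.
    utf8_to_utf32_iterator& operator--()
    {
        do { --it_; }
        while (it_ != first_ && (static_cast<unsigned char>(*it_) & 0xC0) == 0x80);
        return *this;
    }

    friend bool operator==(const utf8_to_utf32_iterator& a, const utf8_to_utf32_iterator& b)
    { return a.it_ == b.it_; }
    friend bool operator!=(const utf8_to_utf32_iterator& a, const utf8_to_utf32_iterator& b)
    { return !(a == b); }

private:
    struct decoded { char32_t cp; int len; };

    decoded decode() const
    {
        const unsigned char lead = static_cast<unsigned char>(*it_);
        if (lead < 0x80) return {lead, 1};
        int len = 0;
        char32_t cp = 0;
        if ((lead & 0xE0) == 0xC0)      { len = 2; cp = lead & 0x1F; }
        else if ((lead & 0xF0) == 0xE0) { len = 3; cp = lead & 0x0F; }
        else if ((lead & 0xF8) == 0xF0) { len = 4; cp = lead & 0x07; }
        else return {0xFFFD, 1};                    // invalid lead byte
        if (last_ - it_ < len) return {0xFFFD, 1};  // truncated: do not read past last_
        for (int i = 1; i < len; ++i) {
            const unsigned char c = static_cast<unsigned char>(it_[i]);
            if ((c & 0xC0) != 0x80) return {0xFFFD, 1};  // not a continuation byte
            cp = (cp << 6) | (c & 0x3F);
        }
        return {cp, len};
    }

    base_iterator first_, it_, last_;
};
```

<p>As in the example above, a begin iterator is built from
<code class="sourceCode default">(s.begin(), s.begin(), s.end())</code>
and an end iterator from
<code class="sourceCode default">(s.begin(), s.end(), s.end())</code>,
where <code class="sourceCode default">s</code> is a
<code class="sourceCode default">const std::string</code> holding UTF-8
bytes.</p>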
<h2 data-number="4.3" id="case-3-adapt-a-range-of-non-character-type-values"><span class="header-section-number">4.3</span> Case 3: Adapt a range of
non-character-type values<a href="#case-3-adapt-a-range-of-non-character-type-values" class="self-link"></a></h2>
<p>Let’s say that we want to take code points that we got from ICU, and
transcode them to UTF-8. The problem is that ICU’s code point type is
<code class="sourceCode default">int</code>. Since
<code class="sourceCode default">int</code> is not a character type,
it’s not deduced by <code class="sourceCode default">as_utf8</code> to
be UTF-32 data.</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="co">// A generic function that accepts sequences of UTF-8.</span></span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>utf8_range R<span class="op">&gt;</span></span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> process_input<span class="op">(</span>R r<span class="op">)</span>;</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>vector<span class="op">&lt;</span><span class="dt">int</span><span class="op">&gt;</span> input <span class="op">=</span> get_icu_code_points<span class="op">()</span>;</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a><span class="co">// This is ill formed without the as_char32_t adaptation.</span></span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> input_utf8 <span class="op">=</span> input <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_char32_t <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf8;</span>
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a>process_input<span class="op">(</span>input_utf8<span class="op">)</span>;</span></code></pre></div>
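<p>Conceptually, the
<code class="sourceCode default">as_char32_t</code> step only changes
the element type so that the values are recognized as UTF-32 code units.
The sketch below is an eager approximation of that step using standard
algorithms. The function name
<code class="sourceCode default">to_char32</code> is hypothetical, and
unlike the adaptation used above it copies the data into a new
string.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <string>
#include <vector>

// Hypothetical eager stand-in for the element-type adaptation: converts
// each int to char32_t so the result has a character type as its value
// type. No validation of the values is performed.
inline std::u32string to_char32(const std::vector<int>& code_points)
{
    std::u32string out;
    out.reserve(code_points.size());
    std::transform(code_points.begin(), code_points.end(),
                   std::back_inserter(out),
                   [](int v) { return static_cast<char32_t>(v); });
    return out;
}
```

<p>The resulting <code class="sourceCode default">std::u32string</code>
has a character type as its value type, which is the property the
deduction described above relies on.</p>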
<h2 data-number="4.4" id="case-4-print-the-results-of-transcoding"><span class="header-section-number">4.4</span> Case 4: Print the results of
transcoding<a href="#case-4-print-the-results-of-transcoding" class="self-link"></a></h2>
<p>Text processing is pretty useless without I/O. All of the Unicode
algorithms operate on code points, and so the output of any of those
algorithms will be in code points/UTF-32. It should be easy to print the
results to a <code class="sourceCode default">std::ostream</code>, to a
<code class="sourceCode default">std::wostream</code> on Windows, or
using <code class="sourceCode default">std::format</code> and
<code class="sourceCode default">std::print</code>.
<code class="sourceCode default">utf_view</code> is therefore printable
and streamable.</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> double_print<span class="op">(</span><span class="dt">char32_t</span> <span class="kw">const</span> <span class="op">*</span> str<span class="op">)</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> utf8 <span class="op">=</span> str <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf8;</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>print<span class="op">(</span><span class="st">&quot;{}&quot;</span>, utf8<span class="op">)</span>;</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>cerr <span class="op">&lt;&lt;</span> utf8;</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<h1 data-number="5" id="proposed-design"><span class="header-section-number">5</span> Proposed design<a href="#proposed-design" class="self-link"></a></h1>
<h2 data-number="5.1" id="dependencies"><span class="header-section-number">5.1</span> Dependencies<a href="#dependencies" class="self-link"></a></h2>
<p>This proposal depends on the existence of <a href="https://isocpp.org/files/papers/P2727R0.html">P2727</a>
“std::iterator_interface”.</p>
<h2 data-number="5.2" id="add-concepts-that-describe-parameters-to-transcoding-apis"><span class="header-section-number">5.2</span> Add concepts that describe
parameters to transcoding APIs<a href="#add-concepts-that-describe-parameters-to-transcoding-apis" class="self-link"></a></h2>
<p>The macro
<code class="sourceCode default">CODE_UNIT_CONCEPT_OPTION_2</code> is
used in the following code to select between the two options for how to
define <code class="sourceCode default">code_unit</code>. The two
options are described below.</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a>  <span class="kw">enum</span> <span class="kw">class</span> format <span class="op">{</span> utf8 <span class="op">=</span> <span class="dv">1</span>, utf16 <span class="op">=</span> <span class="dv">2</span>, utf32 <span class="op">=</span> <span class="dv">4</span> <span class="op">}</span>;</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <span class="kw">constexpr</span> format <em>wchar-t-format</em> <span class="op">=</span> <em>see below</em>;       <span class="co">// <em>exposition only</em></span></span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T, format F<span class="op">&gt;</span></span>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> code_unit <span class="op">=</span> <span class="op">(</span>same_as<span class="op">&lt;</span>T, <span class="dt">char8_t</span><span class="op">&gt;</span> <span class="op">&amp;&amp;</span> F <span class="op">==</span> format<span class="op">::</span>utf8<span class="op">)</span> <span class="op">||</span></span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a>                        <span class="op">(</span>same_as<span class="op">&lt;</span>T, <span class="dt">char16_t</span><span class="op">&gt;</span> <span class="op">&amp;&amp;</span> F <span class="op">==</span> format<span class="op">::</span>utf16<span class="op">)</span> <span class="op">||</span></span>
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a>                        <span class="op">(</span>same_as<span class="op">&lt;</span>T, <span class="dt">char32_t</span><span class="op">&gt;</span> <span class="op">&amp;&amp;</span> F <span class="op">==</span> format<span class="op">::</span>utf32<span class="op">)</span></span>
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a><span class="pp">#if CODE_UNIT_CONCEPT_OPTION_2</span></span>
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a>                        <span class="op">||</span> <span class="op">(</span>same_as<span class="op">&lt;</span>T, <span class="dt">char</span><span class="op">&gt;</span> <span class="op">&amp;&amp;</span> F <span class="op">==</span> format<span class="op">::</span>utf8<span class="op">)</span></span>
<span id="cb5-13"><a href="#cb5-13" aria-hidden="true" tabindex="-1"></a>                        <span class="op">||</span> <span class="op">(</span>same_as<span class="op">&lt;</span>T, <span class="dt">wchar_t</span><span class="op">&gt;</span> <span class="op">&amp;&amp;</span> F <span class="op">==</span> <em>wchar-t-format</em><span class="op">)</span></span>
<span id="cb5-14"><a href="#cb5-14" aria-hidden="true" tabindex="-1"></a><span class="pp">#endif</span></span>
<span id="cb5-15"><a href="#cb5-15" aria-hidden="true" tabindex="-1"></a>        ;</span>
<span id="cb5-16"><a href="#cb5-16" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-17"><a href="#cb5-17" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-18"><a href="#cb5-18" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf8_code_unit <span class="op">=</span> code_unit<span class="op">&lt;</span>T, format<span class="op">::</span>utf8<span class="op">&gt;</span>;</span>
<span id="cb5-19"><a href="#cb5-19" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-20"><a href="#cb5-20" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-21"><a href="#cb5-21" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf16_code_unit <span class="op">=</span> code_unit<span class="op">&lt;</span>T, format<span class="op">::</span>utf16<span class="op">&gt;</span>;</span>
<span id="cb5-22"><a href="#cb5-22" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-23"><a href="#cb5-23" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-24"><a href="#cb5-24" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf32_code_unit <span class="op">=</span> code_unit<span class="op">&lt;</span>T, format<span class="op">::</span>utf32<span class="op">&gt;</span>;</span>
<span id="cb5-25"><a href="#cb5-25" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-26"><a href="#cb5-26" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-27"><a href="#cb5-27" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf_code_unit <span class="op">=</span> utf8_code_unit<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf16_code_unit<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf32_code_unit<span class="op">&lt;</span>T<span class="op">&gt;</span>;</span>
<span id="cb5-28"><a href="#cb5-28" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-29"><a href="#cb5-29" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T, format F<span class="op">&gt;</span></span>
<span id="cb5-30"><a href="#cb5-30" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> code_unit_iter <span class="op">=</span></span>
<span id="cb5-31"><a href="#cb5-31" aria-hidden="true" tabindex="-1"></a>      input_iterator<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>T<span class="op">&gt;</span>, F<span class="op">&gt;</span>;</span>
<span id="cb5-32"><a href="#cb5-32" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T, format F<span class="op">&gt;</span></span>
<span id="cb5-33"><a href="#cb5-33" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> code_unit_pointer <span class="op">=</span></span>
<span id="cb5-34"><a href="#cb5-34" aria-hidden="true" tabindex="-1"></a>      is_pointer_v<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>T<span class="op">&gt;</span>, F<span class="op">&gt;</span>;</span>
<span id="cb5-35"><a href="#cb5-35" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T, format F<span class="op">&gt;</span></span>
<span id="cb5-36"><a href="#cb5-36" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> code_unit_range <span class="op">=</span> ranges<span class="op">::</span>input_range<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">&amp;&amp;</span></span>
<span id="cb5-37"><a href="#cb5-37" aria-hidden="true" tabindex="-1"></a>      code_unit<span class="op">&lt;</span>ranges<span class="op">::</span>range_value_t<span class="op">&lt;</span>T<span class="op">&gt;</span>, F<span class="op">&gt;</span>;</span>
<span id="cb5-38"><a href="#cb5-38" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-39"><a href="#cb5-39" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-40"><a href="#cb5-40" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf8_iter <span class="op">=</span> code_unit_iter<span class="op">&lt;</span>T, format<span class="op">::</span>utf8<span class="op">&gt;</span>;</span>
<span id="cb5-41"><a href="#cb5-41" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-42"><a href="#cb5-42" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf8_pointer <span class="op">=</span> code_unit_pointer<span class="op">&lt;</span>T, format<span class="op">::</span>utf8<span class="op">&gt;</span>;</span>
<span id="cb5-43"><a href="#cb5-43" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-44"><a href="#cb5-44" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf8_range <span class="op">=</span> code_unit_range<span class="op">&lt;</span>T, format<span class="op">::</span>utf8<span class="op">&gt;</span>;</span>
<span id="cb5-45"><a href="#cb5-45" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-46"><a href="#cb5-46" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-47"><a href="#cb5-47" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf16_iter <span class="op">=</span> code_unit_iter<span class="op">&lt;</span>T, format<span class="op">::</span>utf16<span class="op">&gt;</span>;</span>
<span id="cb5-48"><a href="#cb5-48" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-49"><a href="#cb5-49" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf16_pointer <span class="op">=</span> code_unit_pointer<span class="op">&lt;</span>T, format<span class="op">::</span>utf16<span class="op">&gt;</span>;</span>
<span id="cb5-50"><a href="#cb5-50" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-51"><a href="#cb5-51" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf16_range <span class="op">=</span> code_unit_range<span class="op">&lt;</span>T, format<span class="op">::</span>utf16<span class="op">&gt;</span>;</span>
<span id="cb5-52"><a href="#cb5-52" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-53"><a href="#cb5-53" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-54"><a href="#cb5-54" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf32_iter <span class="op">=</span> code_unit_iter<span class="op">&lt;</span>T, format<span class="op">::</span>utf32<span class="op">&gt;</span>;</span>
<span id="cb5-55"><a href="#cb5-55" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-56"><a href="#cb5-56" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf32_pointer <span class="op">=</span> code_unit_pointer<span class="op">&lt;</span>T, format<span class="op">::</span>utf32<span class="op">&gt;</span>;</span>
<span id="cb5-57"><a href="#cb5-57" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-58"><a href="#cb5-58" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf32_range <span class="op">=</span> code_unit_range<span class="op">&lt;</span>T, format<span class="op">::</span>utf32<span class="op">&gt;</span>;</span>
<span id="cb5-59"><a href="#cb5-59" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-60"><a href="#cb5-60" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-61"><a href="#cb5-61" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf_iter <span class="op">=</span> utf8_iter<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf16_iter<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf32_iter<span class="op">&lt;</span>T<span class="op">&gt;</span>;</span>
<span id="cb5-62"><a href="#cb5-62" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-63"><a href="#cb5-63" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf_pointer <span class="op">=</span> utf8_pointer<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf16_pointer<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf32_pointer<span class="op">&lt;</span>T<span class="op">&gt;</span>;</span>
<span id="cb5-64"><a href="#cb5-64" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-65"><a href="#cb5-65" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf_range <span class="op">=</span> utf8_range<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf16_range<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf32_range<span class="op">&lt;</span>T<span class="op">&gt;</span>;</span>
<span id="cb5-66"><a href="#cb5-66" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-67"><a href="#cb5-67" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-68"><a href="#cb5-68" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf_range_like <span class="op">=</span></span>
<span id="cb5-69"><a href="#cb5-69" aria-hidden="true" tabindex="-1"></a>      utf_range<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span> <span class="op">||</span> utf_pointer<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span>;</span>
<span id="cb5-70"><a href="#cb5-70" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-71"><a href="#cb5-71" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-72"><a href="#cb5-72" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf8_input_range_like <span class="op">=</span></span>
<span id="cb5-73"><a href="#cb5-73" aria-hidden="true" tabindex="-1"></a>      <span class="op">(</span>ranges<span class="op">::</span>input_range<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span> <span class="op">&amp;&amp;</span> utf8_code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;)</span> <span class="op">||</span></span>
<span id="cb5-74"><a href="#cb5-74" aria-hidden="true" tabindex="-1"></a>      utf8_pointer<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span>;</span>
<span id="cb5-75"><a href="#cb5-75" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-76"><a href="#cb5-76" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf16_input_range_like <span class="op">=</span></span>
<span id="cb5-77"><a href="#cb5-77" aria-hidden="true" tabindex="-1"></a>      <span class="op">(</span>ranges<span class="op">::</span>input_range<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span> <span class="op">&amp;&amp;</span> utf16_code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;)</span> <span class="op">||</span></span>
<span id="cb5-78"><a href="#cb5-78" aria-hidden="true" tabindex="-1"></a>      utf16_pointer<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span>;</span>
<span id="cb5-79"><a href="#cb5-79" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-80"><a href="#cb5-80" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf32_input_range_like <span class="op">=</span></span>
<span id="cb5-81"><a href="#cb5-81" aria-hidden="true" tabindex="-1"></a>      <span class="op">(</span>ranges<span class="op">::</span>input_range<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span> <span class="op">&amp;&amp;</span> utf32_code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;)</span> <span class="op">||</span></span>
<span id="cb5-82"><a href="#cb5-82" aria-hidden="true" tabindex="-1"></a>      utf32_pointer<span class="op">&lt;</span>remove_reference_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span>;</span>
<span id="cb5-83"><a href="#cb5-83" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-84"><a href="#cb5-84" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-85"><a href="#cb5-85" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> utf_input_range_like <span class="op">=</span></span>
<span id="cb5-86"><a href="#cb5-86" aria-hidden="true" tabindex="-1"></a>      utf8_input_range_like<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf16_input_range_like<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> utf32_input_range_like<span class="op">&lt;</span>T<span class="op">&gt;</span>;</span>
<span id="cb5-87"><a href="#cb5-87" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-88"><a href="#cb5-88" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb5-89"><a href="#cb5-89" aria-hidden="true" tabindex="-1"></a>    <span class="kw">concept</span> transcoding_error_handler <span class="op">=</span></span>
<span id="cb5-90"><a href="#cb5-90" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> <span class="op">(</span>T t, string_view msg<span class="op">)</span> <span class="op">{</span> <span class="op">{</span> t<span class="op">(</span>msg<span class="op">)</span> <span class="op">}</span> <span class="op">-&gt;</span> same_as<span class="op">&lt;</span><span class="dt">char32_t</span><span class="op">&gt;</span>; <span class="op">}</span>;</span>
<span id="cb5-91"><a href="#cb5-91" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb5-92"><a href="#cb5-92" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>There are two options for how the
<code class="sourceCode default">code_unit</code> concept is
defined.</p>
<h3 data-number="5.2.1" id="code-unit-option-1"><span class="header-section-number">5.2.1</span> Code unit option 1<a href="#code-unit-option-1" class="self-link"></a></h3>
<p>This is represented by
<code class="sourceCode default">CODE_UNIT_CONCEPT_OPTION_2 == 0</code>
in the code above. In this option, a code unit must be one of
<code class="sourceCode default">char8_t</code>,
<code class="sourceCode default">char16_t</code>, and
<code class="sourceCode default">char32_t</code>.</p>
<h3 data-number="5.2.2" id="code-unit-option-2"><span class="header-section-number">5.2.2</span> Code unit option 2<a href="#code-unit-option-2" class="self-link"></a></h3>
<p>This is represented by
<code class="sourceCode default">CODE_UNIT_CONCEPT_OPTION_2 == 1</code>
in the code above. In this option, a code unit must be a character type.
This includes the <code class="sourceCode default">charN_t</code>
character types from Option 1, plus
<code class="sourceCode default">char</code> and
<code class="sourceCode default">wchar_t</code>. The value of
<code class="sourceCode default"><em>wchar-t-format</em></code> is
implementation-defined, but must be
<code class="sourceCode default">uc::format::utf16</code> or
<code class="sourceCode default">uc::format::utf32</code>.</p>
<h3 data-number="5.2.3" id="the-impact-of-options-1-and-2"><span class="header-section-number">5.2.3</span> The impact of options 1 and
2<a href="#the-impact-of-options-1-and-2" class="self-link"></a></h3>
<p>Here are some examples of the differences between Options 1 and 2.
Note the use of <code class="sourceCode default">as_charN_t</code> below
with <code class="sourceCode default">std::wstring</code>. It is a
placeholder: whether you write
<code class="sourceCode default">as_char16_t</code> or
<code class="sourceCode default">as_char32_t</code> there is
implementation-dependent.</p>
<table>
<thead>
<tr class="header">
<th><div style="text-align:center">
<strong>Option 1</strong>
</div></th>
<th><div style="text-align:center">
<strong>Option 2</strong>
</div></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><div>

<div class="sourceCode" id="cb6"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="kw">using</span> <span class="kw">namespace</span> std<span class="op">::</span>uc;</span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v1  <span class="op">=</span> <span class="st">u8&quot;text&quot;</span> <span class="op">|</span> as_utf32;  <span class="co">// Ok.</span></span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v2  <span class="op">=</span> <span class="st">u&quot;text&quot;</span>  <span class="op">|</span> as_utf8;   <span class="co">// Ok.</span></span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v3  <span class="op">=</span> <span class="st">U&quot;text&quot;</span>  <span class="op">|</span> as_utf16;  <span class="co">// Ok.</span></span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v4  <span class="op">=</span> std<span class="op">::</span>u8string<span class="op">(</span><span class="st">u8&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> as_utf32;  <span class="co">// Ok.</span></span>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v5  <span class="op">=</span> std<span class="op">::</span>u16string<span class="op">(</span><span class="st">u&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> as_utf8;   <span class="co">// Ok.</span></span>
<span id="cb6-9"><a href="#cb6-9" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v6  <span class="op">=</span> std<span class="op">::</span>u32string<span class="op">(</span><span class="st">U&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> as_utf16;  <span class="co">// Ok.</span></span>
<span id="cb6-10"><a href="#cb6-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-11"><a href="#cb6-11" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v7  <span class="op">=</span> std<span class="op">::</span>string<span class="op">(</span><span class="st">&quot;text&quot;</span><span class="op">)</span>   <span class="op">|</span> as_utf32; <span class="co">// Error; ill-formed.</span></span>
<span id="cb6-12"><a href="#cb6-12" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v8  <span class="op">=</span> std<span class="op">::</span>wstring<span class="op">(</span><span class="st">L&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> as_utf8;  <span class="co">// Error; ill-formed.</span></span>
<span id="cb6-13"><a href="#cb6-13" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb6-14"><a href="#cb6-14" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v9  <span class="op">=</span> std<span class="op">::</span>string<span class="op">(</span><span class="st">&quot;text&quot;</span><span class="op">)</span>   <span class="op">|</span> as_char8_t <span class="op">|</span> as_utf32; <span class="co">// Ok.</span></span>
<span id="cb6-15"><a href="#cb6-15" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v10 <span class="op">=</span> std<span class="op">::</span>wstring<span class="op">(</span><span class="st">L&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> as_charN_t <span class="op">|</span> as_utf8;  <span class="co">// Ok.</span></span></code></pre></div>

</div></td>
<td><div>

<div class="sourceCode" id="cb7"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="kw">using</span> <span class="kw">namespace</span> std<span class="op">::</span>uc;</span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v1  <span class="op">=</span> <span class="st">u8&quot;text&quot;</span> <span class="op">|</span> as_utf32;  <span class="co">// Ok.</span></span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v2  <span class="op">=</span> <span class="st">u&quot;text&quot;</span>  <span class="op">|</span> as_utf8;   <span class="co">// Ok.</span></span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v3  <span class="op">=</span> <span class="st">U&quot;text&quot;</span>  <span class="op">|</span> as_utf16;  <span class="co">// Ok.</span></span>
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-7"><a href="#cb7-7" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v4  <span class="op">=</span> std<span class="op">::</span>u8string<span class="op">(</span><span class="st">u8&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> as_utf32;  <span class="co">// Ok.</span></span>
<span id="cb7-8"><a href="#cb7-8" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v5  <span class="op">=</span> std<span class="op">::</span>u16string<span class="op">(</span><span class="st">u&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> as_utf8;   <span class="co">// Ok.</span></span>
<span id="cb7-9"><a href="#cb7-9" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v6  <span class="op">=</span> std<span class="op">::</span>u32string<span class="op">(</span><span class="st">U&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> as_utf16;  <span class="co">// Ok.</span></span>
<span id="cb7-10"><a href="#cb7-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-11"><a href="#cb7-11" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v7  <span class="op">=</span> std<span class="op">::</span>string<span class="op">(</span><span class="st">&quot;text&quot;</span><span class="op">)</span>   <span class="op">|</span> as_utf32; <span class="co">// Ok.</span></span>
<span id="cb7-12"><a href="#cb7-12" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v8  <span class="op">=</span> std<span class="op">::</span>wstring<span class="op">(</span><span class="st">L&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> as_utf8;  <span class="co">// Ok.</span></span>
<span id="cb7-13"><a href="#cb7-13" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb7-14"><a href="#cb7-14" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v9  <span class="op">=</span> std<span class="op">::</span>string<span class="op">(</span><span class="st">&quot;text&quot;</span><span class="op">)</span>   <span class="op">|</span> as_char8_t <span class="op">|</span> as_utf32; <span class="co">// Ok.</span></span>
<span id="cb7-15"><a href="#cb7-15" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v10 <span class="op">=</span> std<span class="op">::</span>wstring<span class="op">(</span><span class="st">L&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> as_charN_t <span class="op">|</span> as_utf8;  <span class="co">// Ok.</span></span></code></pre></div>

</div></td>
</tr>
</tbody>
</table>
<p>In short, Option 1 forces you to write
“<code class="sourceCode default">| as_char8_t</code>” everywhere you
want to use a <code class="sourceCode default">std::string</code> with
the interfaces proposed in this paper.</p>
<p>Option 1 is supported by most of SG-16. Here is the relevant SG-16
poll:</p>
<p><em>UTF transcoding interfaces provided by the C++ standard library
should operate on charN_t types, with support for other types provided
by adapters, possibly with a special case for char and wchar_t when
their associated literal encodings are UTF.</em></p>
<table style="width:31%;">
<colgroup>
<col style="width: 6%" />
<col style="width: 5%" />
<col style="width: 5%" />
<col style="width: 5%" />
<col style="width: 6%" />
</colgroup>
<thead>
<tr class="header">
<th><div style="text-align:center">
<strong>SF</strong>
</div></th>
<th><div style="text-align:center">
<strong>F</strong>
</div></th>
<th><div style="text-align:center">
<strong>N</strong>
</div></th>
<th><div style="text-align:center">
<strong>A</strong>
</div></th>
<th><div style="text-align:center">
<strong>SA</strong>
</div></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td>6</td>
<td>1</td>
<td>0</td>
<td>0</td>
<td>1</td>
</tr>
</tbody>
</table>
<p>(I have chosen to ignore the “possibly with a special case for char
and wchar_t when their associated literal encodings are UTF” part.
Making the evaluation of a concept change based on the literal encoding
seems like a flaky move to me; the literal encoding can change TU to
TU.)</p>
<p>The feeling in SG-16 is that the
<code class="sourceCode default">charN_t</code> types are designed to
represent UTF encodings, and
<code class="sourceCode default">char</code> is not. A
<code class="sourceCode default">char const *</code> string could be in
any one of dozens (hundreds?) of encodings. The addition of
“<code class="sourceCode default">| as_char8_t</code>” to adapt ranges
of <code class="sourceCode default">char</code> is meant to act as a
lexical indicator of user intent.</p>
<p>I believe this decision is a mistake. I would very, very much
<em>not</em> like to standardize Unicode interfaces that do not easily
interoperate with <code class="sourceCode default">std::string</code>.
This is my reasoning:</p>
<p>First, <code class="sourceCode default">char</code> and
<code class="sourceCode default">char8_t</code> maintain exactly the
same set of invariants – the empty set. Note that this is true even for
string literals. The encoding of
<code class="sourceCode default">u8&quot;text&quot;</code> is not
necessarily UTF-8! It depends on the flags you pass to your compiler.
Those flags are allowed to vary TU by TU. I have been bitten by the
“<code class="sourceCode default">u8</code> does not necessarily mean
UTF-8” oddity of MSVC before.</p>
<p>Second, “<code class="sourceCode default">| as_char8_t</code>” is a
no-op when used with
<code class="sourceCode default">utfN_view</code>/<code class="sourceCode default">utf_view</code>.
It does not actually do anything to help you get your program’s text
into UTF-8 encoding, nor to detect that you have non-UTF-8 encoded text
in your program.</p>
<p>Third, people use <code class="sourceCode default">std::string</code>
a lot. They use <code class="sourceCode default">char</code> string
literals a lot. They use
<code class="sourceCode default">std::u8string</code> and
<code class="sourceCode default">char8_t</code> string literals almost
not at all. Using GitHub Code Search, I found 15.3M references to
<code class="sourceCode default">std::string</code> and 6.7k references
to <code class="sourceCode default">std::u8string</code>. Even if
everyone switched from
<code class="sourceCode default">std::string</code> to
<code class="sourceCode default">std::u8string</code> today, we would
still have to deal with lots and lots of
<code class="sourceCode default">char const *</code> strings for C API
compatibility.</p>
<p>Finally, whether a given range of code units is properly UTF encoded
may be a precondition of a given API that the user writes, but it is not
a precondition of <em>any</em> API proposed in this paper, nor is it a
precondition of any API I’m proposing in the papers that will follow
this one.</p>
<p>In short, I think <code class="sourceCode default">&quot;text&quot; | std::uc::as_utf32</code>
should “just work”. Making users write <code class="sourceCode default">&quot;text&quot; | std::uc::as_char8_t | std::uc::as_utf32</code>
when that does not increase correctness or efficiency seems wrongheaded
to me. Users who want to can still write the longer version under both
options.</p>
<h2 data-number="5.3" id="add-a-null-terminated-sequence-sentinel"><span class="header-section-number">5.3</span> Add a null-terminated sequence
sentinel<a href="#add-a-null-terminated-sequence-sentinel" class="self-link"></a></h2>
<div class="sourceCode" id="cb8"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb8-1"><a href="#cb8-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb8-2"><a href="#cb8-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> null_sentinel_t <span class="op">{</span></span>
<span id="cb8-3"><a href="#cb8-3" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> null_sentinel_t base<span class="op">()</span> <span class="kw">const</span> <span class="kw">noexcept</span> <span class="op">{</span> <span class="cf">return</span> <span class="op">{}</span>; <span class="op">}</span></span>
<span id="cb8-4"><a href="#cb8-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-5"><a href="#cb8-5" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb8-6"><a href="#cb8-6" aria-hidden="true" tabindex="-1"></a>      <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span><span class="kw">const</span> T<span class="op">*</span> p, null_sentinel_t<span class="op">)</span></span>
<span id="cb8-7"><a href="#cb8-7" aria-hidden="true" tabindex="-1"></a>        <span class="op">{</span> <span class="cf">return</span> <span class="op">*</span>p <span class="op">==</span> T<span class="op">{}</span>; <span class="op">}</span></span>
<span id="cb8-8"><a href="#cb8-8" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb8-9"><a href="#cb8-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb8-10"><a href="#cb8-10" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <span class="kw">constexpr</span> null_sentinel_t null_sentinel;</span>
<span id="cb8-11"><a href="#cb8-11" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>The <code class="sourceCode default">base()</code> member bears
explanation. It is there to make iterator/sentinel pairs easy to use in
a generic context. Consider a range
<code class="sourceCode default">r1</code> of code points delimited by a
pair of <code class="sourceCode default">utf_iterator&lt;format::utf8, format::utf32, char const *&gt;</code>
transcoding iterators (defined later in this paper). The range of
underlying UTF-8 code units is
[<code class="sourceCode default">r1.begin().base()</code>,
<code class="sourceCode default">r1.end().base()</code>).</p>
<p>Now consider a range <code class="sourceCode default">r2</code> of
code points that is delimited by a <code class="sourceCode default">utf_iterator&lt;format::utf8, format::utf32, char const *, null_sentinel_t&gt;</code>
transcoding iterator and a
<code class="sourceCode default">null_sentinel</code>. Now our
underlying range of UTF-8 is
[<code class="sourceCode default">r2.begin().base()</code>,
<code class="sourceCode default">null_sentinel</code>).</p>
<p>Instead of making people who write generic code special-case
the use of <code class="sourceCode default">null_sentinel</code>,
<code class="sourceCode default">null_sentinel_t</code> has a
<code class="sourceCode default">base()</code> member that lets us write
<code class="sourceCode default">r2.end().base()</code> instead of
<code class="sourceCode default">null_sentinel</code>. This means that
for any such range <code class="sourceCode default">r</code>, whether it
is <code class="sourceCode default">r1</code> or
<code class="sourceCode default">r2</code>, the underlying range of
UTF-8 code units is just
[<code class="sourceCode default">r.begin().base()</code>,
<code class="sourceCode default">r.end().base()</code>).</p>
<table>
<thead>
<tr class="header">
<th><div style="text-align:center">
<strong>Without
<code class="sourceCode default">null_sentinel_t::base()</code></strong>
</div></th>
<th><div style="text-align:center">
<strong>With
<code class="sourceCode default">null_sentinel_t::base()</code></strong>
</div></th>
</tr>
</thead>
<tbody>
<tr class="odd">
<td><div>

<div class="sourceCode" id="cb9"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb9-1"><a href="#cb9-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span><span class="kw">typename</span> UTF8To32Iter1, <span class="kw">typename</span> UTF8To32Iter2<span class="op">&gt;</span></span>
<span id="cb9-2"><a href="#cb9-2" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> f<span class="op">(</span>UTF8To32Iter1 first, UTF8To32Iter2 last<span class="op">)</span> <span class="op">{</span></span>
<span id="cb9-3"><a href="#cb9-3" aria-hidden="true" tabindex="-1"></a>  <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>std<span class="op">::</span>same_as<span class="op">&lt;</span>UTF8To32Iter2, std<span class="op">::</span>uc<span class="op">::</span>null_sentinel_t<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb9-4"><a href="#cb9-4" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> utf8_first <span class="op">=</span> first<span class="op">.</span>base<span class="op">()</span>;</span>
<span id="cb9-5"><a href="#cb9-5" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> utf8_last <span class="op">=</span> last;</span>
<span id="cb9-6"><a href="#cb9-6" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> <span class="co">/* use utf8_{first,last} ... */</span>;</span>
<span id="cb9-7"><a href="#cb9-7" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb9-8"><a href="#cb9-8" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> utf8_first <span class="op">=</span> first<span class="op">.</span>base<span class="op">()</span>;</span>
<span id="cb9-9"><a href="#cb9-9" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> utf8_last <span class="op">=</span> last<span class="op">.</span>base<span class="op">()</span>;</span>
<span id="cb9-10"><a href="#cb9-10" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> <span class="co">/* use utf8_{first,last} ... */</span>;</span>
<span id="cb9-11"><a href="#cb9-11" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span></span>
<span id="cb9-12"><a href="#cb9-12" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>

</div></td>
<td><div>

<div class="sourceCode" id="cb10"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb10-1"><a href="#cb10-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span><span class="kw">typename</span> UTF8To32Iter1, <span class="kw">typename</span> UTF8To32Iter2<span class="op">&gt;</span></span>
<span id="cb10-2"><a href="#cb10-2" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> f<span class="op">(</span>UTF8To32Iter1 first, UTF8To32Iter2 last<span class="op">)</span> <span class="op">{</span></span>
<span id="cb10-3"><a href="#cb10-3" aria-hidden="true" tabindex="-1"></a>  <span class="kw">auto</span> utf8_first <span class="op">=</span> first<span class="op">.</span>base<span class="op">()</span>;</span>
<span id="cb10-4"><a href="#cb10-4" aria-hidden="true" tabindex="-1"></a>  <span class="kw">auto</span> utf8_last <span class="op">=</span> last<span class="op">.</span>base<span class="op">()</span>;</span>
<span id="cb10-5"><a href="#cb10-5" aria-hidden="true" tabindex="-1"></a>  <span class="cf">return</span> <span class="co">/* use utf8_{first,last} ... */</span>;</span>
<span id="cb10-6"><a href="#cb10-6" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>

</div></td>
</tr>
</tbody>
</table>
<p>Without
<code class="sourceCode default">null_sentinel_t::base()</code>, we have
to account for the case in which the null sentinel is passed for the
second function parameter. This makes our very simple logic for getting
the underlying range out of an iterator/sentinel pair more than twice as
long.</p>
<h2 data-number="5.4" id="add-the-transcoding-iterator-template"><span class="header-section-number">5.4</span> Add the transcoding iterator
template<a href="#add-the-transcoding-iterator-template" class="self-link"></a></h2>
<p>I’m using <a href="https://isocpp.org/files/papers/P2727R0.html">P2727</a>’s
<code class="sourceCode default">iterator_interface</code> here for
simplicity.</p>
<p>First, the synopsis:</p>
<div class="sourceCode" id="cb11"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb11-1"><a href="#cb11-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb11-2"><a href="#cb11-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <span class="kw">constexpr</span> <span class="dt">char32_t</span> replacement_character <span class="op">=</span> <span class="bn">0xfffd</span>;</span>
<span id="cb11-3"><a href="#cb11-3" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-4"><a href="#cb11-4" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> use_replacement_character <span class="op">{</span></span>
<span id="cb11-5"><a href="#cb11-5" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">char32_t</span> <span class="kw">operator</span><span class="op">()(</span>string_view error_msg<span class="op">)</span> <span class="kw">const</span> <span class="kw">noexcept</span>;</span>
<span id="cb11-6"><a href="#cb11-6" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb11-7"><a href="#cb11-7" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-8"><a href="#cb11-8" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>format Format<span class="op">&gt;</span></span>
<span id="cb11-9"><a href="#cb11-9" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="kw">auto</span> <em>format-to-type</em><span class="op">()</span> <span class="op">{</span>                                   <span class="co">// <em>exposition only</em></span></span>
<span id="cb11-10"><a href="#cb11-10" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>Format <span class="op">==</span> format<span class="op">::</span>utf8<span class="op">)</span> <span class="op">{</span></span>
<span id="cb11-11"><a href="#cb11-11" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> <span class="dt">char8_t</span><span class="op">{}</span>;</span>
<span id="cb11-12"><a href="#cb11-12" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>Format <span class="op">==</span> format<span class="op">::</span>utf16<span class="op">)</span> <span class="op">{</span></span>
<span id="cb11-13"><a href="#cb11-13" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> <span class="dt">char16_t</span><span class="op">{}</span>;</span>
<span id="cb11-14"><a href="#cb11-14" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb11-15"><a href="#cb11-15" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> <span class="dt">char32_t</span><span class="op">{}</span>;</span>
<span id="cb11-16"><a href="#cb11-16" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb11-17"><a href="#cb11-17" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span></span>
<span id="cb11-18"><a href="#cb11-18" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-19"><a href="#cb11-19" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>format Format<span class="op">&gt;</span></span>
<span id="cb11-20"><a href="#cb11-20" aria-hidden="true" tabindex="-1"></a>  <span class="kw">using</span> <em>format-to-type-t</em> <span class="op">=</span> <span class="kw">decltype</span><span class="op">(</span><em>format-to-type</em><span class="op">&lt;</span>Format<span class="op">&gt;())</span>;        <span class="co">// <em>exposition only</em></span></span>
<span id="cb11-21"><a href="#cb11-21" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb11-22"><a href="#cb11-22" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb11-23"><a href="#cb11-23" aria-hidden="true" tabindex="-1"></a>    format FromFormat,</span>
<span id="cb11-24"><a href="#cb11-24" aria-hidden="true" tabindex="-1"></a>    format ToFormat,</span>
<span id="cb11-25"><a href="#cb11-25" aria-hidden="true" tabindex="-1"></a>    input_iterator I,</span>
<span id="cb11-26"><a href="#cb11-26" aria-hidden="true" tabindex="-1"></a>    sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S <span class="op">=</span> I,</span>
<span id="cb11-27"><a href="#cb11-27" aria-hidden="true" tabindex="-1"></a>    transcoding_error_handler ErrorHandler <span class="op">=</span> use_replacement_character<span class="op">&gt;</span></span>
<span id="cb11-28"><a href="#cb11-28" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> convertible_to<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;</span>, <em>format-to-type-t</em><span class="op">&lt;</span>FromFormat<span class="op">&gt;&gt;</span></span>
<span id="cb11-29"><a href="#cb11-29" aria-hidden="true" tabindex="-1"></a>  <span class="kw">class</span> utf_iterator;</span>
<span id="cb11-30"><a href="#cb11-30" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>Then the definitions:</p>
<div class="sourceCode" id="cb12"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb12-1"><a href="#cb12-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb12-2"><a href="#cb12-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I<span class="op">&gt;</span></span>
<span id="cb12-3"><a href="#cb12-3" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="kw">auto</span> <em>bidirectional-at-most</em><span class="op">()</span> <span class="op">{</span>    <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-4"><a href="#cb12-4" aria-hidden="true" tabindex="-1"></a>    <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>bidirectional_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb12-5"><a href="#cb12-5" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> bidirectional_iterator_tag<span class="op">{}</span>;</span>
<span id="cb12-6"><a href="#cb12-6" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>forward_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb12-7"><a href="#cb12-7" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> forward_iterator_tag<span class="op">{}</span>;</span>
<span id="cb12-8"><a href="#cb12-8" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>input_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb12-9"><a href="#cb12-9" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> input_iterator_tag<span class="op">{}</span>;</span>
<span id="cb12-10"><a href="#cb12-10" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb12-11"><a href="#cb12-11" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span></span>
<span id="cb12-12"><a href="#cb12-12" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-13"><a href="#cb12-13" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I<span class="op">&gt;</span></span>
<span id="cb12-14"><a href="#cb12-14" aria-hidden="true" tabindex="-1"></a>  <span class="kw">using</span> <em>bidirectional-at-most-t</em> <span class="op">=</span> <span class="kw">decltype</span><span class="op">(</span><em>bidirectional-at-most</em><span class="op">&lt;</span>I<span class="op">&gt;())</span>; <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-15"><a href="#cb12-15" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-16"><a href="#cb12-16" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">typename</span> I, <span class="dt">bool</span> SupportReverse <span class="op">=</span> bidirectional_iterator<span class="op">&lt;</span>I<span class="op">&gt;&gt;</span></span>
<span id="cb12-17"><a href="#cb12-17" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> <em>first-and-curr</em> <span class="op">{</span>                         <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-18"><a href="#cb12-18" aria-hidden="true" tabindex="-1"></a>    <em>first-and-curr</em><span class="op">()</span> <span class="op">=</span> <span class="cf">default</span>;</span>
<span id="cb12-19"><a href="#cb12-19" aria-hidden="true" tabindex="-1"></a>    <em>first-and-curr</em><span class="op">(</span>I curr<span class="op">)</span> <span class="op">:</span> curr<span class="op">{</span>curr<span class="op">}</span> <span class="op">{}</span></span>
<span id="cb12-20"><a href="#cb12-20" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I2<span class="op">&gt;</span></span>
<span id="cb12-21"><a href="#cb12-21" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> convertible_to<span class="op">&lt;</span>I2, I<span class="op">&gt;</span></span>
<span id="cb12-22"><a href="#cb12-22" aria-hidden="true" tabindex="-1"></a>        <em>first-and-curr</em><span class="op">(</span><span class="kw">const</span> <em>first-and-curr</em><span class="op">&lt;</span>I2<span class="op">&gt;&amp;</span> other<span class="op">)</span> <span class="op">:</span> curr<span class="op">{</span>other<span class="op">.</span>curr<span class="op">}</span> <span class="op">{}</span></span>
<span id="cb12-23"><a href="#cb12-23" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-24"><a href="#cb12-24" aria-hidden="true" tabindex="-1"></a>    I curr;</span>
<span id="cb12-25"><a href="#cb12-25" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb12-26"><a href="#cb12-26" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">typename</span> I<span class="op">&gt;</span></span>
<span id="cb12-27"><a href="#cb12-27" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> <em>first-and-curr</em><span class="op">&lt;</span>I, <span class="kw">true</span><span class="op">&gt;</span> <span class="op">{</span>                <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-28"><a href="#cb12-28" aria-hidden="true" tabindex="-1"></a>    <em>first-and-curr</em><span class="op">()</span> <span class="op">=</span> <span class="cf">default</span>;</span>
<span id="cb12-29"><a href="#cb12-29" aria-hidden="true" tabindex="-1"></a>    <em>first-and-curr</em><span class="op">(</span>I first, I curr<span class="op">)</span> <span class="op">:</span> first<span class="op">{</span>first<span class="op">}</span>, curr<span class="op">{</span>curr<span class="op">}</span> <span class="op">{}</span></span>
<span id="cb12-30"><a href="#cb12-30" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I2<span class="op">&gt;</span></span>
<span id="cb12-31"><a href="#cb12-31" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> convertible_to<span class="op">&lt;</span>I2, I<span class="op">&gt;</span></span>
<span id="cb12-32"><a href="#cb12-32" aria-hidden="true" tabindex="-1"></a>        <em>first-and-curr</em><span class="op">(</span><span class="kw">const</span> <em>first-and-curr</em><span class="op">&lt;</span>I2<span class="op">&gt;&amp;</span> other<span class="op">)</span> <span class="op">:</span> first<span class="op">{</span>other<span class="op">.</span>first<span class="op">}</span>, curr<span class="op">{</span>other<span class="op">.</span>curr<span class="op">}</span> <span class="op">{}</span></span>
<span id="cb12-33"><a href="#cb12-33" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-34"><a href="#cb12-34" aria-hidden="true" tabindex="-1"></a>    I first;</span>
<span id="cb12-35"><a href="#cb12-35" aria-hidden="true" tabindex="-1"></a>    I curr;</span>
<span id="cb12-36"><a href="#cb12-36" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb12-37"><a href="#cb12-37" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-38"><a href="#cb12-38" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> use_replacement_character <span class="op">{</span></span>
<span id="cb12-39"><a href="#cb12-39" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">char32_t</span> <span class="kw">operator</span><span class="op">()(</span>string_view<span class="op">)</span> <span class="kw">const</span> <span class="kw">noexcept</span> <span class="op">{</span> <span class="cf">return</span> replacement_character; <span class="op">}</span></span>
<span id="cb12-40"><a href="#cb12-40" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb12-41"><a href="#cb12-41" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-42"><a href="#cb12-42" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb12-43"><a href="#cb12-43" aria-hidden="true" tabindex="-1"></a>    format FromFormat,</span>
<span id="cb12-44"><a href="#cb12-44" aria-hidden="true" tabindex="-1"></a>    format ToFormat,</span>
<span id="cb12-45"><a href="#cb12-45" aria-hidden="true" tabindex="-1"></a>    input_iterator I,</span>
<span id="cb12-46"><a href="#cb12-46" aria-hidden="true" tabindex="-1"></a>    sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S,</span>
<span id="cb12-47"><a href="#cb12-47" aria-hidden="true" tabindex="-1"></a>    transcoding_error_handler ErrorHandler<span class="op">&gt;</span></span>
<span id="cb12-48"><a href="#cb12-48" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> convertible_to<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;</span>, <em>format-to-type-t</em><span class="op">&lt;</span>FromFormat<span class="op">&gt;&gt;</span></span>
<span id="cb12-49"><a href="#cb12-49" aria-hidden="true" tabindex="-1"></a>  <span class="kw">class</span> utf_iterator <span class="op">:</span> <span class="kw">public</span> iterator_interface<span class="op">&lt;</span></span>
<span id="cb12-50"><a href="#cb12-50" aria-hidden="true" tabindex="-1"></a>                         utf_iterator<span class="op">&lt;</span>FromFormat, ToFormat, I, S, ErrorHandler<span class="op">&gt;</span>,</span>
<span id="cb12-51"><a href="#cb12-51" aria-hidden="true" tabindex="-1"></a>                         <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>,</span>
<span id="cb12-52"><a href="#cb12-52" aria-hidden="true" tabindex="-1"></a>                         <em>format-to-type-t</em><span class="op">&lt;</span>ToFormat<span class="op">&gt;</span>,</span>
<span id="cb12-53"><a href="#cb12-53" aria-hidden="true" tabindex="-1"></a>                         <em>format-to-type-t</em><span class="op">&lt;</span>ToFormat<span class="op">&gt;&gt;</span> <span class="op">{</span></span>
<span id="cb12-54"><a href="#cb12-54" aria-hidden="true" tabindex="-1"></a>  <span class="kw">public</span><span class="op">:</span></span>
<span id="cb12-55"><a href="#cb12-55" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> value_type <span class="op">=</span> <em>format-to-type-t</em><span class="op">&lt;</span>ToFormat<span class="op">&gt;</span>;</span>
<span id="cb12-56"><a href="#cb12-56" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-57"><a href="#cb12-57" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_iterator<span class="op">()</span> <span class="op">=</span> <span class="cf">default</span>;</span>
<span id="cb12-58"><a href="#cb12-58" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-59"><a href="#cb12-59" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_iterator<span class="op">(</span>I first, I it, S last<span class="op">)</span> <span class="kw">requires</span> bidirectional_iterator<span class="op">&lt;</span>I<span class="op">&gt;</span></span>
<span id="cb12-60"><a href="#cb12-60" aria-hidden="true" tabindex="-1"></a>      <span class="op">:</span> first_and_curr_<span class="op">{</span>first, it<span class="op">}</span>, last_<span class="op">(</span>last<span class="op">)</span> <span class="op">{</span></span>
<span id="cb12-61"><a href="#cb12-61" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="op">(</span>curr<span class="op">()</span> <span class="op">!=</span> last_<span class="op">)</span></span>
<span id="cb12-62"><a href="#cb12-62" aria-hidden="true" tabindex="-1"></a>        read<span class="op">()</span>;</span>
<span id="cb12-63"><a href="#cb12-63" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb12-64"><a href="#cb12-64" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_iterator<span class="op">(</span>I it, S last<span class="op">)</span> <span class="kw">requires</span> <span class="op">(!</span>bidirectional_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span></span>
<span id="cb12-65"><a href="#cb12-65" aria-hidden="true" tabindex="-1"></a>      <span class="op">:</span> first_and_curr_<span class="op">{</span>it<span class="op">}</span>, last_<span class="op">(</span>last<span class="op">)</span> <span class="op">{</span></span>
<span id="cb12-66"><a href="#cb12-66" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="op">(</span>curr<span class="op">()</span> <span class="op">!=</span> last_<span class="op">)</span></span>
<span id="cb12-67"><a href="#cb12-67" aria-hidden="true" tabindex="-1"></a>        read<span class="op">()</span>;</span>
<span id="cb12-68"><a href="#cb12-68" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb12-69"><a href="#cb12-69" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-70"><a href="#cb12-70" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I2, <span class="kw">class</span> S2<span class="op">&gt;</span></span>
<span id="cb12-71"><a href="#cb12-71" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> convertible_to<span class="op">&lt;</span>I2, I<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> convertible_to<span class="op">&lt;</span>S2, S<span class="op">&gt;</span></span>
<span id="cb12-72"><a href="#cb12-72" aria-hidden="true" tabindex="-1"></a>        <span class="kw">constexpr</span> utf_iterator<span class="op">(</span><span class="kw">const</span> utf_iterator<span class="op">&lt;</span>FromFormat, ToFormat, I2, S2, ErrorHandler<span class="op">&gt;&amp;</span> other<span class="op">)</span> <span class="op">:</span></span>
<span id="cb12-73"><a href="#cb12-73" aria-hidden="true" tabindex="-1"></a>      buf_<span class="op">(</span>other<span class="op">.</span>buf_<span class="op">)</span>,</span>
<span id="cb12-74"><a href="#cb12-74" aria-hidden="true" tabindex="-1"></a>      first_and_curr_<span class="op">(</span>other<span class="op">.</span>first_and_curr_<span class="op">)</span>,</span>
<span id="cb12-75"><a href="#cb12-75" aria-hidden="true" tabindex="-1"></a>      buf_index_<span class="op">(</span>other<span class="op">.</span>buf_index_<span class="op">)</span>,</span>
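<span id="cb12-76"><a href="#cb12-76" aria-hidden="true" tabindex="-1"></a>      buf_last_<span class="op">(</span>other<span class="op">.</span>buf_last_<span class="op">)</span>, to_increment_<span class="op">(</span>other<span class="op">.</span>to_increment_<span class="op">)</span>,</span>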
<span id="cb12-77"><a href="#cb12-77" aria-hidden="true" tabindex="-1"></a>      last_<span class="op">(</span>other<span class="op">.</span>last_<span class="op">)</span></span>
<span id="cb12-78"><a href="#cb12-78" aria-hidden="true" tabindex="-1"></a>    <span class="op">{}</span></span>
<span id="cb12-79"><a href="#cb12-79" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-80"><a href="#cb12-80" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I begin<span class="op">()</span> <span class="kw">const</span> <span class="kw">requires</span> bidirectional_iterator<span class="op">&lt;</span>I<span class="op">&gt;</span> <span class="op">{</span> <span class="cf">return</span> first<span class="op">()</span>; <span class="op">}</span></span>
<span id="cb12-81"><a href="#cb12-81" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> S end<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> last_; <span class="op">}</span></span>
<span id="cb12-82"><a href="#cb12-82" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-83"><a href="#cb12-83" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I base<span class="op">()</span> <span class="kw">const</span> <span class="kw">requires</span> forward_iterator<span class="op">&lt;</span>I<span class="op">&gt;</span> <span class="op">{</span> <span class="cf">return</span> curr<span class="op">()</span>; <span class="op">}</span></span>
<span id="cb12-84"><a href="#cb12-84" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-85"><a href="#cb12-85" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> value_type <span class="kw">operator</span><span class="op">*()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> buf_<span class="op">[</span>buf_index_<span class="op">]</span>; <span class="op">}</span></span>
<span id="cb12-86"><a href="#cb12-86" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-87"><a href="#cb12-87" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">++()</span> <span class="op">{</span></span>
<span id="cb12-88"><a href="#cb12-88" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="op">(</span>buf_index_ <span class="op">+</span> <span class="dv">1</span> <span class="op">==</span> buf_last_ <span class="op">&amp;&amp;</span> curr<span class="op">()</span> <span class="op">!=</span> last_<span class="op">)</span> <span class="op">{</span></span>
<span id="cb12-89"><a href="#cb12-89" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>forward_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb12-90"><a href="#cb12-90" aria-hidden="true" tabindex="-1"></a>          advance<span class="op">(</span>curr<span class="op">()</span>, to_increment_<span class="op">)</span>;</span>
<span id="cb12-91"><a href="#cb12-91" aria-hidden="true" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb12-92"><a href="#cb12-92" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>curr<span class="op">()</span> <span class="op">==</span> last_<span class="op">)</span></span>
<span id="cb12-93"><a href="#cb12-93" aria-hidden="true" tabindex="-1"></a>          buf_index_ <span class="op">=</span> <span class="dv">0</span>;</span>
<span id="cb12-94"><a href="#cb12-94" aria-hidden="true" tabindex="-1"></a>        <span class="cf">else</span></span>
<span id="cb12-95"><a href="#cb12-95" aria-hidden="true" tabindex="-1"></a>          read<span class="op">()</span>;</span>
<span id="cb12-96"><a href="#cb12-96" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="op">(</span>buf_index_ <span class="op">+</span> <span class="dv">1</span> <span class="op">&lt;=</span> buf_last_<span class="op">)</span> <span class="op">{</span></span>
<span id="cb12-97"><a href="#cb12-97" aria-hidden="true" tabindex="-1"></a>        <span class="op">++</span>buf_index_;</span>
<span id="cb12-98"><a href="#cb12-98" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb12-99"><a href="#cb12-99" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> <span class="op">*</span><span class="kw">this</span>;</span>
<span id="cb12-100"><a href="#cb12-100" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb12-101"><a href="#cb12-101" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-102"><a href="#cb12-102" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_iterator<span class="op">&amp;</span> <span class="kw">operator</span><span class="op">--()</span> <span class="kw">requires</span> bidirectional_iterator<span class="op">&lt;</span>I<span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb12-103"><a href="#cb12-103" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="op">(!</span>buf_index_ <span class="op">&amp;&amp;</span> curr<span class="op">()</span> <span class="op">!=</span> first<span class="op">())</span></span>
<span id="cb12-104"><a href="#cb12-104" aria-hidden="true" tabindex="-1"></a>        read_reverse<span class="op">()</span>;</span>
<span id="cb12-105"><a href="#cb12-105" aria-hidden="true" tabindex="-1"></a>      <span class="cf">else</span> <span class="cf">if</span> <span class="op">(</span>buf_index_<span class="op">)</span></span>
<span id="cb12-106"><a href="#cb12-106" aria-hidden="true" tabindex="-1"></a>        <span class="op">--</span>buf_index_;</span>
<span id="cb12-107"><a href="#cb12-107" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> <span class="op">*</span><span class="kw">this</span>;</span>
<span id="cb12-108"><a href="#cb12-108" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb12-109"><a href="#cb12-109" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-110"><a href="#cb12-110" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span>utf_iterator lhs, utf_iterator rhs<span class="op">)</span></span>
<span id="cb12-111"><a href="#cb12-111" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> forward_iterator<span class="op">&lt;</span>I<span class="op">&gt;</span> <span class="op">||</span> <span class="kw">requires</span> <span class="op">(</span>I i<span class="op">)</span> <span class="op">{</span> i <span class="op">!=</span> i; <span class="op">}</span> <span class="op">{</span></span>
<span id="cb12-112"><a href="#cb12-112" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>forward_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb12-113"><a href="#cb12-113" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> lhs<span class="op">.</span>curr<span class="op">()</span> <span class="op">==</span> rhs<span class="op">.</span>curr<span class="op">()</span> <span class="op">&amp;&amp;</span> lhs<span class="op">.</span>buf_index_ <span class="op">==</span> rhs<span class="op">.</span>buf_index_;</span>
<span id="cb12-114"><a href="#cb12-114" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb12-115"><a href="#cb12-115" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>lhs<span class="op">.</span>curr<span class="op">()</span> <span class="op">!=</span> rhs<span class="op">.</span>curr<span class="op">())</span></span>
<span id="cb12-116"><a href="#cb12-116" aria-hidden="true" tabindex="-1"></a>          <span class="cf">return</span> <span class="kw">false</span>;</span>
<span id="cb12-117"><a href="#cb12-117" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-118"><a href="#cb12-118" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> <span class="op">(</span>lhs<span class="op">.</span>buf_index_ <span class="op">==</span> rhs<span class="op">.</span>buf_index_ <span class="op">&amp;&amp;</span></span>
<span id="cb12-119"><a href="#cb12-119" aria-hidden="true" tabindex="-1"></a>          lhs<span class="op">.</span>buf_last_ <span class="op">==</span> rhs<span class="op">.</span>buf_last_<span class="op">)</span> <span class="op">{</span></span>
<span id="cb12-120"><a href="#cb12-120" aria-hidden="true" tabindex="-1"></a>          <span class="cf">return</span> <span class="kw">true</span>;</span>
<span id="cb12-121"><a href="#cb12-121" aria-hidden="true" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb12-122"><a href="#cb12-122" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-123"><a href="#cb12-123" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> lhs<span class="op">.</span>buf_index_ <span class="op">==</span> lhs<span class="op">.</span>buf_last_ <span class="op">&amp;&amp;</span></span>
<span id="cb12-124"><a href="#cb12-124" aria-hidden="true" tabindex="-1"></a>             rhs<span class="op">.</span>buf_index_ <span class="op">==</span> rhs<span class="op">.</span>buf_last_;</span>
<span id="cb12-125"><a href="#cb12-125" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb12-126"><a href="#cb12-126" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb12-127"><a href="#cb12-127" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-128"><a href="#cb12-128" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span>utf_iterator lhs, S rhs<span class="op">)</span> <span class="kw">requires</span> <span class="op">(!</span>same_as<span class="op">&lt;</span>I, S<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb12-129"><a href="#cb12-129" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>forward_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb12-130"><a href="#cb12-130" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> lhs<span class="op">.</span>curr<span class="op">()</span> <span class="op">==</span> rhs;</span>
<span id="cb12-131"><a href="#cb12-131" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb12-132"><a href="#cb12-132" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> lhs<span class="op">.</span>curr<span class="op">()</span> <span class="op">==</span> rhs <span class="op">&amp;&amp;</span> lhs<span class="op">.</span>buf_index_ <span class="op">==</span> lhs<span class="op">.</span>buf_last_;</span>
<span id="cb12-133"><a href="#cb12-133" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb12-134"><a href="#cb12-134" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb12-135"><a href="#cb12-135" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-136"><a href="#cb12-136" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> base_type <span class="op">=</span>                   <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-137"><a href="#cb12-137" aria-hidden="true" tabindex="-1"></a>      iterator_interface<span class="op">&lt;</span></span>
<span id="cb12-138"><a href="#cb12-138" aria-hidden="true" tabindex="-1"></a>        utf_iterator<span class="op">&lt;</span>FromFormat, ToFormat, I, S, ErrorHandler<span class="op">&gt;</span>,</span>
<span id="cb12-139"><a href="#cb12-139" aria-hidden="true" tabindex="-1"></a>        <em>bidirectional-at-most-t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>,</span>
<span id="cb12-140"><a href="#cb12-140" aria-hidden="true" tabindex="-1"></a>        value_type,</span>
<span id="cb12-141"><a href="#cb12-141" aria-hidden="true" tabindex="-1"></a>        value_type<span class="op">&gt;</span>;</span>
<span id="cb12-142"><a href="#cb12-142" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> base_type<span class="op">::</span><span class="kw">operator</span><span class="op">++</span>;</span>
<span id="cb12-143"><a href="#cb12-143" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> base_type<span class="op">::</span><span class="kw">operator</span><span class="op">--</span>;</span>
<span id="cb12-144"><a href="#cb12-144" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-145"><a href="#cb12-145" aria-hidden="true" tabindex="-1"></a>  <span class="kw">private</span><span class="op">:</span></span>
<span id="cb12-146"><a href="#cb12-146" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">void</span> read<span class="op">()</span>;                                            <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-147"><a href="#cb12-147" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">void</span> read_reverse<span class="op">()</span>;                                    <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-148"><a href="#cb12-148" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-149"><a href="#cb12-149" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I first<span class="op">()</span> <span class="kw">const</span> <span class="kw">requires</span> bidirectional_iterator<span class="op">&lt;</span>I<span class="op">&gt;</span>      <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-150"><a href="#cb12-150" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> first_and_curr_<span class="op">.</span>first; <span class="op">}</span></span>
<span id="cb12-151"><a href="#cb12-151" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I<span class="op">&amp;</span> curr<span class="op">()</span> <span class="op">{</span> <span class="cf">return</span> first_and_curr_<span class="op">.</span>curr; <span class="op">}</span>              <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-152"><a href="#cb12-152" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> I curr<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> first_and_curr_<span class="op">.</span>curr; <span class="op">}</span>         <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-153"><a href="#cb12-153" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-154"><a href="#cb12-154" aria-hidden="true" tabindex="-1"></a>    array<span class="op">&lt;</span>value_type, <span class="dv">4</span> <span class="op">/</span> <span class="kw">static_cast</span><span class="op">&lt;</span><span class="dt">int</span><span class="op">&gt;(</span>ToFormat<span class="op">)&gt;</span> buf_;           <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-155"><a href="#cb12-155" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-156"><a href="#cb12-156" aria-hidden="true" tabindex="-1"></a>    <em>first-and-curr</em><span class="op">&lt;</span>I<span class="op">&gt;</span> first_and_curr_;                                <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-157"><a href="#cb12-157" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-158"><a href="#cb12-158" aria-hidden="true" tabindex="-1"></a>    <span class="dt">uint8_t</span> buf_index_ <span class="op">=</span> <span class="dv">0</span>;                                           <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-159"><a href="#cb12-159" aria-hidden="true" tabindex="-1"></a>    <span class="dt">uint8_t</span> buf_last_ <span class="op">=</span> <span class="dv">0</span>;                                            <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-160"><a href="#cb12-160" aria-hidden="true" tabindex="-1"></a>    <span class="dt">uint8_t</span> to_increment_ <span class="op">=</span> <span class="dv">0</span>;                                        <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-161"><a href="#cb12-161" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-162"><a href="#cb12-162" aria-hidden="true" tabindex="-1"></a>    <span class="op">[[</span><span class="at">no_unique_address</span><span class="op">]]</span> S last_;                                    <span class="co">// <em>exposition only</em></span></span>
<span id="cb12-163"><a href="#cb12-163" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb12-164"><a href="#cb12-164" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb12-165"><a href="#cb12-165" aria-hidden="true" tabindex="-1"></a>      format FromFormat2,</span>
<span id="cb12-166"><a href="#cb12-166" aria-hidden="true" tabindex="-1"></a>      format ToFormat2,</span>
<span id="cb12-167"><a href="#cb12-167" aria-hidden="true" tabindex="-1"></a>      input_iterator I2,</span>
<span id="cb12-168"><a href="#cb12-168" aria-hidden="true" tabindex="-1"></a>      sentinel_for<span class="op">&lt;</span>I2<span class="op">&gt;</span> S2,</span>
<span id="cb12-169"><a href="#cb12-169" aria-hidden="true" tabindex="-1"></a>      transcoding_error_handler ErrorHandler2<span class="op">&gt;</span> <span class="kw">requires</span> convertible_to<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>I2<span class="op">&gt;</span>, <em>format-to-type-t</em><span class="op">&lt;</span>FromFormat2<span class="op">&gt;&gt;</span></span>
<span id="cb12-170"><a href="#cb12-170" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">class</span> utf_iterator;</span>
<span id="cb12-171"><a href="#cb12-171" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb12-172"><a href="#cb12-172" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p><code class="sourceCode default">use_replacement_character</code> is
an error handler type that can be used with
<code class="sourceCode default">utf_iterator</code>. It accepts a
<code class="sourceCode default">string_view</code> error message, and
returns the replacement character. The user can substitute their own
type here, which may throw, abort, log, etc.</p>
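<p>As a rough illustration of the handler shape described above, the
sketch below shows two callables that accept a
<code class="sourceCode default">string_view</code> message and return a
code point. Both type names are hypothetical and are not part of the
proposal; the first only approximates what
<code class="sourceCode default">use_replacement_character</code> is
described as doing.</p>

```cpp
#include <stdexcept>
#include <string>
#include <string_view>

// Hypothetical stand-in for the default handler: ignore the message and
// substitute U+FFFD REPLACEMENT CHARACTER.
struct replace_with_fffd {
    constexpr char32_t operator()(std::string_view) const noexcept {
        return U'\uFFFD';
    }
};

// Hypothetical user-supplied handler that reports ill-formed input by
// throwing instead of substituting a character.
struct throw_on_error {
    char32_t operator()(std::string_view msg) const {
        throw std::runtime_error(std::string(msg));
    }
};
```

<p>A throwing handler such as the second one is only a sketch of the
"may throw" option; whether it is usable with a given iterator category
depends on the constraints stated elsewhere in this section.</p>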
<p><code class="sourceCode default">utf_iterator</code> is an iterator
that transcodes from UTF-N to UTF-M, where N and M are each one of 8,
16, or 32. N may equal M. UTF-N to UTF-N operation invokes the error
handler as appropriate, but does not change format.
<code class="sourceCode default">utf_iterator</code> does its work by
adapting an underlying range of code units. Each code point
<code class="sourceCode default">c</code> to be transcoded is decoded
from <code class="sourceCode default">FromFormat</code> in the
underlying range. <code class="sourceCode default">c</code> is then
encoded to <code class="sourceCode default">ToFormat</code> into an
internal buffer. If ill-formed UTF is encountered during the decoding
step, <code class="sourceCode default">c</code> is whatever invoking the
error handler returns; using the default error handler, this is
<code class="sourceCode default">replacement_character</code>.</p>
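<p>The encoding half of that process can be sketched in isolation. The
following is a minimal illustration, assuming a UTF-16 target; the names
<code class="sourceCode default">utf16_buffer</code> and
<code class="sourceCode default">encode_utf16</code> are hypothetical
and do not appear in the proposed wording.</p>

```cpp
#include <array>
#include <cstdint>

// Hypothetical helper type: the encoded form of one code point.
struct utf16_buffer {
    std::array<char16_t, 2> units{};
    std::uint8_t size = 0;  // plays the role that buf_last_ plays in the paper
};

// Encodes code point c as UTF-16. Surrogates and values above U+10FFFF are
// not Unicode scalar values, so this sketch replaces them with U+FFFD first.
inline utf16_buffer encode_utf16(char32_t c) {
    if ((c >= 0xD800 && c <= 0xDFFF) || c > 0x10FFFF)
        c = 0xFFFD;
    utf16_buffer buf;
    if (c < 0x10000) {
        // Basic Multilingual Plane: one code unit.
        buf.units[0] = static_cast<char16_t>(c);
        buf.size = 1;
    } else {
        // Supplementary planes: a high/low surrogate pair.
        c -= 0x10000;
        buf.units[0] = static_cast<char16_t>(0xD800 + (c >> 10));
        buf.units[1] = static_cast<char16_t>(0xDC00 + (c & 0x3FF));
        buf.size = 2;
    }
    return buf;
}
```

<p>For instance, <code class="sourceCode default">encode_utf16(0x1F600)</code>
yields the two code units <code class="sourceCode default">0xD83D</code>
and <code class="sourceCode default">0xDE00</code>.</p>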
<p><code class="sourceCode default">utf_iterator</code> maintains
certain invariants; the invariants differ based on whether
<code class="sourceCode default">utf_iterator</code> is an input
iterator.</p>
<p>For input iterators the invariant is: if
<code class="sourceCode default">*this</code> is at the end of the range
being adapted, then <code class="sourceCode default">curr()</code> ==
<code class="sourceCode default">last_</code>; otherwise, the position
of <code class="sourceCode default">curr()</code> is always at the end
of the current code point <code class="sourceCode default">c</code>
within the range being adapted, and
<code class="sourceCode default">buf_</code> contains the code units in
<code class="sourceCode default">ToFormat</code> that comprise
<code class="sourceCode default">c</code>.</p>
<p>For forward and bidirectional iterators, the invariant is: if
<code class="sourceCode default">*this</code> is at the end of the range
being adapted, then <code class="sourceCode default">curr()</code> ==
<code class="sourceCode default">last_</code>; otherwise, the position
of <code class="sourceCode default">curr()</code> is always at the
beginning of the current code point
<code class="sourceCode default">c</code> within the range being
adapted, and <code class="sourceCode default">buf_</code> contains the
code units in <code class="sourceCode default">ToFormat</code> that
comprise <code class="sourceCode default">c</code>.</p>
<p>When ill-formed UTF is encountered in the range being adapted,
<code class="sourceCode default">utf_iterator</code> calls
<code class="sourceCode default">ErrorHandler{}.operator()</code> to
produce a character to represent the ill-formed sequence. The number and
position of error handler invocations within the transcoded output is
the same, whether the range being adapted is traversed forward or
backward. The number and position of the error handler invocations
should use the “substitution of maximal subparts” approach described in
Chapter 3 of the Unicode standard.</p>
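<p>To make the "substitution of maximal subparts" policy concrete, here
is a forward-only sketch of a UTF-8 decoder that emits one U+FFFD per
maximal subpart of an ill-formed sequence. The function name
<code class="sourceCode default">decode_utf8_lossy</code> is
hypothetical, the replacement is hard-coded rather than delegated to an
error handler, and the reverse traversal that
<code class="sourceCode default">utf_iterator</code> must also support
is not shown.</p>

```cpp
#include <cstddef>
#include <string_view>
#include <vector>

// Decodes UTF-8 into code points, replacing each maximal subpart of an
// ill-formed sequence with a single U+FFFD.
inline std::vector<char32_t> decode_utf8_lossy(std::string_view s) {
    std::vector<char32_t> out;
    std::size_t i = 0;
    while (i < s.size()) {
        auto const b0 = static_cast<unsigned char>(s[i]);
        if (b0 < 0x80) { out.push_back(b0); ++i; continue; }

        // Sequence length and the permitted range of the second byte,
        // following the well-formed UTF-8 byte sequence table.
        int len = 0;
        unsigned char lo = 0x80, hi = 0xBF;
        char32_t cp = 0;
        if (b0 >= 0xC2 && b0 <= 0xDF) {
            len = 2; cp = b0 & 0x1F;
        } else if (b0 >= 0xE0 && b0 <= 0xEF) {
            len = 3; cp = b0 & 0x0F;
            if (b0 == 0xE0) lo = 0xA0;  // excludes overlong forms
            if (b0 == 0xED) hi = 0x9F;  // excludes surrogates
        } else if (b0 >= 0xF0 && b0 <= 0xF4) {
            len = 4; cp = b0 & 0x07;
            if (b0 == 0xF0) lo = 0x90;  // excludes overlong forms
            if (b0 == 0xF4) hi = 0x8F;  // excludes values above U+10FFFF
        } else {
            // 0x80..0xC1 and 0xF5..0xFF can never start a sequence.
            out.push_back(0xFFFD); ++i; continue;
        }

        std::size_t j = i + 1;
        bool ok = true;
        for (int k = 1; k < len; ++k, ++j) {
            if (j >= s.size()) { ok = false; break; }
            auto const b = static_cast<unsigned char>(s[j]);
            if (b < lo || b > hi) { ok = false; break; }
            cp = (cp << 6) | (b & 0x3F);
            lo = 0x80; hi = 0xBF;  // later bytes use the plain range
        }
        out.push_back(ok ? cp : char32_t{0xFFFD});
        i = j;  // on failure, the offending byte is re-examined as a lead byte
    }
    return out;
}
```

<p>Given the bytes
<code class="sourceCode default">61 F1 80 80 E1 80 C2 62 80 63 80 BF 64</code>,
this sketch produces
<code class="sourceCode default">0061 FFFD FFFD FFFD 0062 FFFD 0063 FFFD FFFD 0064</code>.</p>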
<p>Besides the constructors, no member function of
<code class="sourceCode default">utf_iterator</code> has preconditions.
As long as a <code class="sourceCode default">utf_iterator</code>
<code class="sourceCode default">i</code> is constructed with proper
arguments, all subsequent operations on
<code class="sourceCode default">i</code> are memory safe. This includes
decrementing a <code class="sourceCode default">utf_iterator</code> at
the beginning of the range being adapted, and incrementing or
dereferencing a <code class="sourceCode default">utf_iterator</code> at
the end of the range being adapted.</p>
<p>If <code class="sourceCode default">FromFormat</code> and
<code class="sourceCode default">ToFormat</code> are not each one of
<code class="sourceCode default">format::utf8</code>,
<code class="sourceCode default">format::utf16</code>, or
<code class="sourceCode default">format::utf32</code>, the program is
ill-formed.</p>
<p>If <code class="sourceCode default">input_iterator&lt;I&gt;</code> is
<code class="sourceCode default">true</code>, <code class="sourceCode default">noexcept(ErrorHandler{}(&quot;&quot;))</code>
must be <code class="sourceCode default">true</code> as well; otherwise,
the program is ill-formed.</p>
<p>The exposition-only member function
<code class="sourceCode default">read</code> decodes the code point
<code class="sourceCode default">c</code> as
<code class="sourceCode default">FromFormat</code> starting from
position <code class="sourceCode default">curr()</code> in the range
being adapted (<code class="sourceCode default">c</code> may be
<code class="sourceCode default">replacement_character</code>); sets
<code class="sourceCode default">to_increment_</code> to the number of
code units read while decoding
<code class="sourceCode default">c</code>; encodes
<code class="sourceCode default">c</code> as
<code class="sourceCode default">ToFormat</code> into
<code class="sourceCode default">buf_</code>; sets
<code class="sourceCode default">buf_index_</code> to
<code class="sourceCode default">0</code>; and sets
<code class="sourceCode default">buf_last_</code> to the number of code
units encoded into <code class="sourceCode default">buf_</code>. If
<code class="sourceCode default">forward_iterator&lt;I&gt;</code> is
<code class="sourceCode default">true</code>,
<code class="sourceCode default">curr()</code> is set to the position it
had before <code class="sourceCode default">read</code> was called. If
an exception is thrown during a call to
<code class="sourceCode default">read</code>, the call to
<code class="sourceCode default">read</code> has no effect.</p>
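<p>The bookkeeping performed by
<code class="sourceCode default">read</code> can be modeled outside the
iterator. The sketch below assumes UTF-16 input and UTF-8 output; the
struct <code class="sourceCode default">read_state</code> and the
function <code class="sourceCode default">read_one</code> are
hypothetical, and only the member names are borrowed from the
exposition-only members above.</p>

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <string_view>

// Hypothetical, simplified model of the state that read() updates.
struct read_state {
    std::array<char, 4> buf_{};      // code units of c in the target format
    std::uint8_t buf_index_ = 0;     // next code unit to hand out
    std::uint8_t buf_last_ = 0;      // number of code units in buf_
    std::uint8_t to_increment_ = 0;  // source code units consumed by c
};

// Decodes one code point from UTF-16 starting at s[pos], then encodes it
// as UTF-8. Precondition (of this sketch only): pos < s.size().
inline read_state read_one(std::u16string_view s, std::size_t pos) {
    read_state st;
    char32_t c = s[pos];
    st.to_increment_ = 1;
    if (c >= 0xD800 && c <= 0xDBFF) {
        // High surrogate: valid only when followed by a low surrogate.
        if (pos + 1 < s.size() && s[pos + 1] >= 0xDC00 && s[pos + 1] <= 0xDFFF) {
            c = 0x10000 + ((c - 0xD800) << 10) + (s[pos + 1] - 0xDC00);
            st.to_increment_ = 2;
        } else {
            c = 0xFFFD;
        }
    } else if (c >= 0xDC00 && c <= 0xDFFF) {
        c = 0xFFFD;  // low surrogate with no preceding high surrogate
    }

    // Append one UTF-8 code unit to the buffer.
    auto put = [&st](char32_t v) {
        st.buf_[st.buf_last_++] = static_cast<char>(v);
    };
    if (c < 0x80) {
        put(c);
    } else if (c < 0x800) {
        put(0xC0 | (c >> 6));
        put(0x80 | (c & 0x3F));
    } else if (c < 0x10000) {
        put(0xE0 | (c >> 12));
        put(0x80 | ((c >> 6) & 0x3F));
        put(0x80 | (c & 0x3F));
    } else {
        put(0xF0 | (c >> 18));
        put(0x80 | ((c >> 12) & 0x3F));
        put(0x80 | ((c >> 6) & 0x3F));
        put(0x80 | (c & 0x3F));
    }
    st.buf_index_ = 0;
    return st;
}
```

<p>Returning the state by value keeps the sketch free of partial
updates; how an implementation meets the "no effect on exception"
requirement is not specified here.</p>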
<p>The exposition-only member function
<code class="sourceCode default">read_reverse</code> decodes the code
point <code class="sourceCode default">c</code> as
<code class="sourceCode default">FromFormat</code> ending at position
<code class="sourceCode default">curr()</code> in the range being
adapted (<code class="sourceCode default">c</code> may be
<code class="sourceCode default">replacement_character</code>); sets
<code class="sourceCode default">to_increment_</code> to the number of
code units read while decoding
<code class="sourceCode default">c</code>; encodes
<code class="sourceCode default">c</code> as
<code class="sourceCode default">ToFormat</code> into
<code class="sourceCode default">buf_</code>; sets
<code class="sourceCode default">buf_last_</code> to the number of code
units encoded into <code class="sourceCode default">buf_</code>; and
sets <code class="sourceCode default">buf_index_</code> to
<code class="sourceCode default">buf_last_ - 1</code>. If an exception
is thrown during a call to
<code class="sourceCode default">read_reverse</code>, the call to
<code class="sourceCode default">read_reverse</code> has no effect.</p>
<h3 data-number="5.4.1" id="why-utf_iterator-is-constrained-the-way-it-is"><span class="header-section-number">5.4.1</span> Why
<code class="sourceCode default">utf_iterator</code> is constrained the
way it is<a href="#why-utf_iterator-is-constrained-the-way-it-is" class="self-link"></a></h3>
<p>The template parameter <code class="sourceCode default">I</code> to
<code class="sourceCode default">utf_iterator</code> is not constrained
with
<code class="sourceCode default">code_unit_iter&lt;FromFormat&gt;</code>
as it was in earlier revisions of this paper. Instead,
<code class="sourceCode default">I</code> must be an
<code class="sourceCode default">input_iterator</code> whose value type
is convertible to <code class="sourceCode default"><em>format-to-type-t</em>&lt;FromFormat&gt;</code>.
This allows two uses of
<code class="sourceCode default">utf_iterator</code> that the previous
constraint would not.</p>
<p>First, <code class="sourceCode default">utf_iterator</code> can be
used to adapt an iterator whose value type is some non-character type.
This is useful in general, since a lot of existing Unicode-aware user
code uses <code class="sourceCode default">uint32_t</code> for UTF-32,
<code class="sourceCode default">short</code> for UTF-16, and so on.
It is useful in particular because ICU uses
<code class="sourceCode default">int</code> for its UTF-32/code point
type.</p>
<p>Second, because of the first point, adaptations of ranges of
non-character types can be made more efficient. Consider:</p>
<div class="sourceCode" id="cb13"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb13-1"><a href="#cb13-1" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>vector<span class="op">&lt;</span><span class="dt">int</span><span class="op">&gt;</span> code_points_from_icu <span class="op">=</span> <span class="co">/* ... */</span>;</span>
<span id="cb13-2"><a href="#cb13-2" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v <span class="op">=</span> code_points_from_icu <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_char32_t <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf8;</span>
<span id="cb13-3"><a href="#cb13-3" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> first <span class="op">=</span> v<span class="op">.</span>begin<span class="op">()</span>;</span></code></pre></div>
<p>The type of <code class="sourceCode default">first</code> is:</p>
<div class="sourceCode" id="cb14"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb14-1"><a href="#cb14-1" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>uc<span class="op">::</span>utf_iterator<span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf32, std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf8, std<span class="op">::</span>vector<span class="op">&lt;</span><span class="dt">int</span><span class="op">&gt;::</span>iterator<span class="op">&gt;</span></span></code></pre></div>
<p>That is, the adapting iterator that
<code class="sourceCode default">as_char32_t</code> uses is gone. This
makes using <code class="sourceCode default">as_char32_t</code> more
efficient, when used in conjunction with
<code class="sourceCode default">as_utfN</code>. If
<code class="sourceCode default">utf_iterator</code>’s
<code class="sourceCode default">I</code> were required to be a
<code class="sourceCode default">utf_iter</code>, this optimization
would not work.</p>
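<p>The relaxed constraint can be approximated with standard type traits.
The sketch below is an assumption about its shape for UTF-32 input only;
the name <code class="sourceCode default">is_utf32_source_iter</code> is
hypothetical, and <code class="sourceCode default">char32_t</code>
stands in for
<code class="sourceCode default"><em>format-to-type-t</em>&lt;format::utf32&gt;</code>.</p>

```cpp
#include <iterator>
#include <string>
#include <type_traits>
#include <vector>

// Hypothetical trait approximating the relaxed constraint for UTF-32 input:
// an input iterator whose value type is convertible to char32_t.
// Written with C++17 traits; a C++20 concept would be the natural spelling.
template<class I>
constexpr bool is_utf32_source_iter =
    std::is_base_of_v<std::input_iterator_tag,
                      typename std::iterator_traits<I>::iterator_category> &&
    std::is_convertible_v<typename std::iterator_traits<I>::value_type,
                          char32_t>;

// ICU-style storage of code points as int is accepted...
static_assert(is_utf32_source_iter<std::vector<int>::iterator>);
// ...as is the character type itself.
static_assert(is_utf32_source_iter<char32_t const*>);
// A range of strings is rejected: std::string does not convert to char32_t.
static_assert(!is_utf32_source_iter<std::vector<std::string>::iterator>);
```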
<h3 data-number="5.4.2" id="why-utf_iterator-is-not-a-nested-type-within-utf_view"><span class="header-section-number">5.4.2</span> Why
<code class="sourceCode default">utf_iterator</code> is not a nested
type within <code class="sourceCode default">utf_view</code><a href="#why-utf_iterator-is-not-a-nested-type-within-utf_view" class="self-link"></a></h3>
<p>Most users will use views most of the time. However, it can be useful
to use iterators some of the time. For example, say I want to track
a user-visible cursor within some bit of text. If I want to
represent that cursor independently from the view within which it is
found, it can be awkward to do so without an independent iterator
template.</p>
<div class="sourceCode" id="cb15"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb15-1"><a href="#cb15-1" aria-hidden="true" tabindex="-1"></a><span class="co">// This is the easy case.  We have the View right there, and can use</span></span>
<span id="cb15-2"><a href="#cb15-2" aria-hidden="true" tabindex="-1"></a><span class="co">// ranges::iterator_t to get its iterator type.</span></span>
<span id="cb15-3"><a href="#cb15-3" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb15-4"><a href="#cb15-4" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span><span class="kw">typename</span> View<span class="op">&gt;</span></span>
<span id="cb15-5"><a href="#cb15-5" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> my_state_type</span>
<span id="cb15-6"><a href="#cb15-6" aria-hidden="true" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb15-7"><a href="#cb15-7" aria-hidden="true" tabindex="-1"></a>    View all_text_;</span>
<span id="cb15-8"><a href="#cb15-8" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>ranges<span class="op">::</span>iterator_t<span class="op">&lt;</span>View<span class="op">&gt;</span> current_position_;</span>
<span id="cb15-9"><a href="#cb15-9" aria-hidden="true" tabindex="-1"></a>    <span class="co">// other state ...</span></span>
<span id="cb15-10"><a href="#cb15-10" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span>
<span id="cb15-11"><a href="#cb15-11" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb15-12"><a href="#cb15-12" aria-hidden="true" tabindex="-1"></a><span class="co">// This one, not so much.  Since we don&#39;t have the View type, we have to make</span></span>
<span id="cb15-13"><a href="#cb15-13" aria-hidden="true" tabindex="-1"></a><span class="co">// the type of current_position_ a template parameter, even if there&#39;s only one</span></span>
<span id="cb15-14"><a href="#cb15-14" aria-hidden="true" tabindex="-1"></a><span class="co">// type ever in use for a given view.</span></span>
<span id="cb15-15"><a href="#cb15-15" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb15-16"><a href="#cb15-16" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span><span class="kw">typename</span> Iterator<span class="op">&gt;</span></span>
<span id="cb15-17"><a href="#cb15-17" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> my_other_state_type</span>
<span id="cb15-18"><a href="#cb15-18" aria-hidden="true" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb15-19"><a href="#cb15-19" aria-hidden="true" tabindex="-1"></a>    Iterator current_position_;</span>
<span id="cb15-20"><a href="#cb15-20" aria-hidden="true" tabindex="-1"></a>    <span class="co">// other state ...</span></span>
<span id="cb15-21"><a href="#cb15-21" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span></code></pre></div>
<p>Using <code class="sourceCode default">utf_iterator</code> allows us
to write more specific code. Sometimes, generic code is more desirable;
sometimes nongeneric code is more desirable.</p>
<div class="sourceCode" id="cb16"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb16-1"><a href="#cb16-1" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> my_other_state_type</span>
<span id="cb16-2"><a href="#cb16-2" aria-hidden="true" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb16-3"><a href="#cb16-3" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>uc<span class="op">::</span>utf_iterator<span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf8, std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf32, <span class="dt">char</span> <span class="kw">const</span><span class="op">*&gt;</span> current_position_;</span>
<span id="cb16-4"><a href="#cb16-4" aria-hidden="true" tabindex="-1"></a>    <span class="co">// other state ...</span></span>
<span id="cb16-5"><a href="#cb16-5" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span></code></pre></div>
<p>Further, <code class="sourceCode default">utf_iterator</code> has
configurability options that do not work for
<code class="sourceCode default">utfN_view</code>, like the
<code class="sourceCode default">ErrorHandler</code> template parameter.
This will not be used often, but some users will want it sometimes. I
don’t think such alternate uses are going to be common enough to justify
complicating <code class="sourceCode default">utfN_view</code>; those
uses belong in a lower-level interface like
<code class="sourceCode default">utf_iterator</code>.</p>
<h3 data-number="5.4.3" id="optional-add-aliases-for-common-utf_iterator-specializations"><span class="header-section-number">5.4.3</span> Optional: Add aliases for
common <code class="sourceCode default">utf_iterator</code>
specializations<a href="#optional-add-aliases-for-common-utf_iterator-specializations" class="self-link"></a></h3>
<div class="sourceCode" id="cb17"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb17-1"><a href="#cb17-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb17-2"><a href="#cb17-2" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb17-3"><a href="#cb17-3" aria-hidden="true" tabindex="-1"></a>        utf8_iter I,</span>
<span id="cb17-4"><a href="#cb17-4" aria-hidden="true" tabindex="-1"></a>        std<span class="op">::</span>sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S <span class="op">=</span> I,</span>
<span id="cb17-5"><a href="#cb17-5" aria-hidden="true" tabindex="-1"></a>        transcoding_error_handler ErrorHandler <span class="op">=</span> use_replacement_character<span class="op">&gt;</span></span>
<span id="cb17-6"><a href="#cb17-6" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> utf_8_to_16_iterator <span class="op">=</span></span>
<span id="cb17-7"><a href="#cb17-7" aria-hidden="true" tabindex="-1"></a>        utf_iterator<span class="op">&lt;</span>format<span class="op">::</span>utf8, format<span class="op">::</span>utf16, I, S, ErrorHandler<span class="op">&gt;</span>;</span>
<span id="cb17-8"><a href="#cb17-8" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb17-9"><a href="#cb17-9" aria-hidden="true" tabindex="-1"></a>        utf16_iter I,</span>
<span id="cb17-10"><a href="#cb17-10" aria-hidden="true" tabindex="-1"></a>        std<span class="op">::</span>sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S <span class="op">=</span> I,</span>
<span id="cb17-11"><a href="#cb17-11" aria-hidden="true" tabindex="-1"></a>        transcoding_error_handler ErrorHandler <span class="op">=</span> use_replacement_character<span class="op">&gt;</span></span>
<span id="cb17-12"><a href="#cb17-12" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> utf_16_to_8_iterator <span class="op">=</span></span>
<span id="cb17-13"><a href="#cb17-13" aria-hidden="true" tabindex="-1"></a>        utf_iterator<span class="op">&lt;</span>format<span class="op">::</span>utf16, format<span class="op">::</span>utf8, I, S, ErrorHandler<span class="op">&gt;</span>;</span>
<span id="cb17-14"><a href="#cb17-14" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb17-15"><a href="#cb17-15" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb17-16"><a href="#cb17-16" aria-hidden="true" tabindex="-1"></a>        utf8_iter I,</span>
<span id="cb17-17"><a href="#cb17-17" aria-hidden="true" tabindex="-1"></a>        std<span class="op">::</span>sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S <span class="op">=</span> I,</span>
<span id="cb17-18"><a href="#cb17-18" aria-hidden="true" tabindex="-1"></a>        transcoding_error_handler ErrorHandler <span class="op">=</span> use_replacement_character<span class="op">&gt;</span></span>
<span id="cb17-19"><a href="#cb17-19" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> utf_8_to_32_iterator <span class="op">=</span></span>
<span id="cb17-20"><a href="#cb17-20" aria-hidden="true" tabindex="-1"></a>        utf_iterator<span class="op">&lt;</span>format<span class="op">::</span>utf8, format<span class="op">::</span>utf32, I, S, ErrorHandler<span class="op">&gt;</span>;</span>
<span id="cb17-21"><a href="#cb17-21" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb17-22"><a href="#cb17-22" aria-hidden="true" tabindex="-1"></a>        utf32_iter I,</span>
<span id="cb17-23"><a href="#cb17-23" aria-hidden="true" tabindex="-1"></a>        std<span class="op">::</span>sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S <span class="op">=</span> I,</span>
<span id="cb17-24"><a href="#cb17-24" aria-hidden="true" tabindex="-1"></a>        transcoding_error_handler ErrorHandler <span class="op">=</span> use_replacement_character<span class="op">&gt;</span></span>
<span id="cb17-25"><a href="#cb17-25" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> utf_32_to_8_iterator <span class="op">=</span></span>
<span id="cb17-26"><a href="#cb17-26" aria-hidden="true" tabindex="-1"></a>        utf_iterator<span class="op">&lt;</span>format<span class="op">::</span>utf32, format<span class="op">::</span>utf8, I, S, ErrorHandler<span class="op">&gt;</span>;</span>
<span id="cb17-27"><a href="#cb17-27" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb17-28"><a href="#cb17-28" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb17-29"><a href="#cb17-29" aria-hidden="true" tabindex="-1"></a>        utf16_iter I,</span>
<span id="cb17-30"><a href="#cb17-30" aria-hidden="true" tabindex="-1"></a>        std<span class="op">::</span>sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S <span class="op">=</span> I,</span>
<span id="cb17-31"><a href="#cb17-31" aria-hidden="true" tabindex="-1"></a>        transcoding_error_handler ErrorHandler <span class="op">=</span> use_replacement_character<span class="op">&gt;</span></span>
<span id="cb17-32"><a href="#cb17-32" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> utf_16_to_32_iterator <span class="op">=</span></span>
<span id="cb17-33"><a href="#cb17-33" aria-hidden="true" tabindex="-1"></a>        utf_iterator<span class="op">&lt;</span>format<span class="op">::</span>utf16, format<span class="op">::</span>utf32, I, S, ErrorHandler<span class="op">&gt;</span>;</span>
<span id="cb17-34"><a href="#cb17-34" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span></span>
<span id="cb17-35"><a href="#cb17-35" aria-hidden="true" tabindex="-1"></a>        utf32_iter I,</span>
<span id="cb17-36"><a href="#cb17-36" aria-hidden="true" tabindex="-1"></a>        std<span class="op">::</span>sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S <span class="op">=</span> I,</span>
<span id="cb17-37"><a href="#cb17-37" aria-hidden="true" tabindex="-1"></a>        transcoding_error_handler ErrorHandler <span class="op">=</span> use_replacement_character<span class="op">&gt;</span></span>
<span id="cb17-38"><a href="#cb17-38" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> utf_32_to_16_iterator <span class="op">=</span></span>
<span id="cb17-39"><a href="#cb17-39" aria-hidden="true" tabindex="-1"></a>        utf_iterator<span class="op">&lt;</span>format<span class="op">::</span>utf32, format<span class="op">::</span>utf16, I, S, ErrorHandler<span class="op">&gt;</span>;</span>
<span id="cb17-40"><a href="#cb17-40" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>These aliases make it easier to spell
<code class="sourceCode default">utf_iterator</code>s. Consider <code class="sourceCode default">utf_8_to_32_iterator&lt;char const *&gt;</code>
versus <code class="sourceCode default">utf_iterator&lt;format::utf8, format::utf32, char const *&gt;</code>.
More importantly, they allow CTAD to work, as in <code class="sourceCode default">utf_8_to_32_iterator(first, it, last)</code>.
These aliases are completely optional, of course. Let us poll.</p>
<h3 data-number="5.4.4" id="add-unpack_iterator_and_sentinel-cpo-for-iterator-unpacking"><span class="header-section-number">5.4.4</span> Add
<code class="sourceCode default">unpack_iterator_and_sentinel</code> CPO
for iterator “unpacking”<a href="#add-unpack_iterator_and_sentinel-cpo-for-iterator-unpacking" class="self-link"></a></h3>
<div class="sourceCode" id="cb18"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb18-1"><a href="#cb18-1" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> no_op_repacker <span class="op">{</span></span>
<span id="cb18-2"><a href="#cb18-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb18-3"><a href="#cb18-3" aria-hidden="true" tabindex="-1"></a>    T <span class="kw">operator</span><span class="op">()(</span>T x<span class="op">)</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> x; <span class="op">}</span></span>
<span id="cb18-4"><a href="#cb18-4" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span>
<span id="cb18-5"><a href="#cb18-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb18-6"><a href="#cb18-6" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>format FormatTag, utf_iter I, sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S, <span class="kw">class</span> Repack<span class="op">&gt;</span></span>
<span id="cb18-7"><a href="#cb18-7" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> unpack_result <span class="op">{</span></span>
<span id="cb18-8"><a href="#cb18-8" aria-hidden="true" tabindex="-1"></a>  <span class="kw">static</span> <span class="kw">constexpr</span> format format_tag <span class="op">=</span> FormatTag;</span>
<span id="cb18-9"><a href="#cb18-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb18-10"><a href="#cb18-10" aria-hidden="true" tabindex="-1"></a>  I first;</span>
<span id="cb18-11"><a href="#cb18-11" aria-hidden="true" tabindex="-1"></a>  <span class="op">[[</span><span class="at">no_unique_address</span><span class="op">]]</span> S last;</span>
<span id="cb18-12"><a href="#cb18-12" aria-hidden="true" tabindex="-1"></a>  <span class="op">[[</span><span class="at">no_unique_address</span><span class="op">]]</span> Repack repack;</span>
<span id="cb18-13"><a href="#cb18-13" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span>
<span id="cb18-14"><a href="#cb18-14" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb18-15"><a href="#cb18-15" aria-hidden="true" tabindex="-1"></a><span class="co">// CPO equivalent to:</span></span>
<span id="cb18-16"><a href="#cb18-16" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>utf_iter I, sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S, <span class="kw">class</span> Repack <span class="op">=</span> no_op_repacker<span class="op">&gt;</span></span>
<span id="cb18-17"><a href="#cb18-17" aria-hidden="true" tabindex="-1"></a><span class="kw">constexpr</span> <span class="kw">auto</span> unpack_iterator_and_sentinel<span class="op">(</span>I first, S last, Repack repack <span class="op">=</span> Repack<span class="op">())</span>;</span></code></pre></div>
<p>Any <code class="sourceCode default">utf_iterator</code>
<code class="sourceCode default">ti</code> contains two iterators and a
sentinel. If one were to adapt
<code class="sourceCode default">ti</code> in another transcoding
iterator <code class="sourceCode default">ti2</code>, one quickly
encounters a problem: for example, <code class="sourceCode default">utf_iterator&lt;format::utf32, format::utf16, utf_iterator&lt;format::utf8, format::utf32, char const *&gt;&gt;</code>
would be the size of 9 pointers! Further, such an iterator would do a
UTF-8 to UTF-32 to UTF-16 conversion, when it could have done a direct
UTF-8 to UTF-16 conversion instead.</p>
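<p>The size arithmetic can be checked with a toy model. The template
below is hypothetical and stores only the three positions mentioned
above; the real iterator also carries a buffer and indices, so it would
be at least this large.</p>

```cpp
#include <cstddef>

// Hypothetical model of a transcoding iterator's storage: the start of the
// range, the current position, and the end of the range.
template<class I, class S = I>
struct three_part_iterator {
    I first;
    I curr;
    S last;
};

// One level of adaptation over a pointer holds three pointers.
using inner = three_part_iterator<char const*>;
// Adapting the adapted iterator triples that again.
using outer = three_part_iterator<inner>;

static_assert(sizeof(inner) == 3 * sizeof(char const*));
static_assert(sizeof(outer) == 9 * sizeof(char const*));
```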
<p>One would obviously never write a type like the monstrosity above.
However, it is quite possible to accidentally construct one in generic
code. Consider:</p>
<div class="sourceCode" id="cb19"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb19-1"><a href="#cb19-1" aria-hidden="true" tabindex="-1"></a><span class="kw">using</span> <span class="kw">namespace</span> std<span class="op">::</span>uc;</span>
<span id="cb19-2"><a href="#cb19-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb19-3"><a href="#cb19-3" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>format IterFormat, <span class="kw">typename</span> Iter<span class="op">&gt;</span></span>
<span id="cb19-4"><a href="#cb19-4" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> f<span class="op">(</span>Iter it, null_sentinel_t<span class="op">)</span> <span class="op">{</span></span>
<span id="cb19-5"><a href="#cb19-5" aria-hidden="true" tabindex="-1"></a><span class="pp">#if _MSC_VER</span></span>
<span id="cb19-6"><a href="#cb19-6" aria-hidden="true" tabindex="-1"></a>    <span class="co">// On Windows, do something with &#39;it&#39; that requires UTF-16.</span></span>
<span id="cb19-7"><a href="#cb19-7" aria-hidden="true" tabindex="-1"></a>    utf_iterator<span class="op">&lt;</span>IterFormat, format<span class="op">::</span>utf16, Iter, null_sentinel_t<span class="op">&gt;</span> it16;</span>
<span id="cb19-8"><a href="#cb19-8" aria-hidden="true" tabindex="-1"></a>    windows_function<span class="op">(</span>it16, null_sentinel<span class="op">)</span>;</span>
<span id="cb19-9"><a href="#cb19-9" aria-hidden="true" tabindex="-1"></a><span class="pp">#endif</span></span>
<span id="cb19-10"><a href="#cb19-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb19-11"><a href="#cb19-11" aria-hidden="true" tabindex="-1"></a>    <span class="co">// ... etc.</span></span>
<span id="cb19-12"><a href="#cb19-12" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb19-13"><a href="#cb19-13" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb19-14"><a href="#cb19-14" aria-hidden="true" tabindex="-1"></a><span class="dt">int</span> main<span class="op">(</span><span class="dt">int</span> argc, <span class="dt">char</span> <span class="kw">const</span> <span class="op">*</span> argv<span class="op">[])</span> <span class="op">{</span></span>
<span id="cb19-15"><a href="#cb19-15" aria-hidden="true" tabindex="-1"></a>    utf_iterator<span class="op">&lt;</span>format<span class="op">::</span>utf8, format<span class="op">::</span>utf32, <span class="dt">char</span> <span class="kw">const</span> <span class="op">*</span>, null_sentinel_t<span class="op">&gt;</span> it<span class="op">(</span>argv<span class="op">[</span><span class="dv">1</span><span class="op">]</span>, null_sentinel<span class="op">)</span>;</span>
<span id="cb19-16"><a href="#cb19-16" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb19-17"><a href="#cb19-17" aria-hidden="true" tabindex="-1"></a>    f<span class="op">&lt;</span>format<span class="op">::</span>utf32<span class="op">&gt;(</span>it, null_sentinel<span class="op">)</span>;</span>
<span id="cb19-18"><a href="#cb19-18" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb19-19"><a href="#cb19-19" aria-hidden="true" tabindex="-1"></a>    <span class="co">// ... etc.</span></span>
<span id="cb19-20"><a href="#cb19-20" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>This example is a bit contrived, since users will not create
iterators directly like this very often. Users are much more likely to
use the <code class="sourceCode default">utfN_view</code> views and
<code class="sourceCode default">as_utfN</code> view adaptors being
proposed below. The view adaptors are defined in such a way that they
avoid this problem altogether. They do this by unpacking the view they
are adapting before adapting it. For instance:</p>
<div class="sourceCode" id="cb20"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb20-1"><a href="#cb20-1" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>u8string str <span class="op">=</span> <span class="st">u8&quot;some text&quot;</span>;</span>
<span id="cb20-2"><a href="#cb20-2" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb20-3"><a href="#cb20-3" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> utf16_str <span class="op">=</span> str <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16;</span>
<span id="cb20-4"><a href="#cb20-4" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb20-5"><a href="#cb20-5" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>same_as<span class="op">&lt;</span></span>
<span id="cb20-6"><a href="#cb20-6" aria-hidden="true" tabindex="-1"></a>    <span class="kw">decltype</span><span class="op">(</span>utf16_str<span class="op">.</span>begin<span class="op">())</span>,</span>
<span id="cb20-7"><a href="#cb20-7" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>uc<span class="op">::</span>utf_iterator<span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf8, std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf16, std<span class="op">::</span>u8string<span class="op">::</span>iterator<span class="op">&gt;</span></span>
<span id="cb20-8"><a href="#cb20-8" aria-hidden="true" tabindex="-1"></a><span class="op">&gt;)</span>;</span>
<span id="cb20-9"><a href="#cb20-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb20-10"><a href="#cb20-10" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> utf32_str <span class="op">=</span> utf16_str <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf32;</span>
<span id="cb20-11"><a href="#cb20-11" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb20-12"><a href="#cb20-12" aria-hidden="true" tabindex="-1"></a><span class="co">// Poof!  The intermediate utf_iterator&lt;format::utf8, format::utf16, ...&gt; layer disappeared!</span></span>
<span id="cb20-13"><a href="#cb20-13" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>same_as<span class="op">&lt;</span></span>
<span id="cb20-14"><a href="#cb20-14" aria-hidden="true" tabindex="-1"></a>    <span class="kw">decltype</span><span class="op">(</span>utf32_str<span class="op">.</span>begin<span class="op">())</span>,</span>
<span id="cb20-15"><a href="#cb20-15" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>uc<span class="op">::</span>utf_iterator<span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf8, std<span class="op">::</span>uc<span class="op">::</span>format<span class="op">::</span>utf32, std<span class="op">::</span>u8string<span class="op">::</span>iterator<span class="op">&gt;</span></span>
<span id="cb20-16"><a href="#cb20-16" aria-hidden="true" tabindex="-1"></a><span class="op">&gt;)</span>;</span></code></pre></div>
<p>The view adaptors apply this unpacking logic, as shown above.
This allows you to write
<code class="sourceCode default">r | std::uc::as_utf32</code> in a
generic context, without caring whether
<code class="sourceCode default">r</code> is a range of UTF-8, UTF-16,
or UTF-32. You also do not need to care whether
<code class="sourceCode default">r</code> is a common range, or whether
its iterators are raw pointers, some other kind of iterator, or
transcoding iterators.</p>
<p>This becomes especially useful in the APIs proposed in later papers
that depend on this paper. In particular, APIs in subsequent papers
accept any UTF-N iterator, and then transcode internally to UTF-32.
However, this creates a minor problem for some algorithms. Consider this
algorithm (not proposed) as an example.</p>
<div class="sourceCode" id="cb21"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb21-1"><a href="#cb21-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>input_iterator I, sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S, output_iterator<span class="op">&lt;</span><span class="dt">char8_t</span><span class="op">&gt;</span> O<span class="op">&gt;</span></span>
<span id="cb21-2"><a href="#cb21-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">requires</span> <span class="op">(</span>utf8_code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;&gt;</span> <span class="op">||</span> utf16_code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;&gt;)</span></span>
<span id="cb21-3"><a href="#cb21-3" aria-hidden="true" tabindex="-1"></a>transcode_result<span class="op">&lt;</span>I, O<span class="op">&gt;</span> transcode_to_utf32<span class="op">(</span>I first, S last, O out<span class="op">)</span>;</span></code></pre></div>
<p>Such a transcoding algorithm is pretty similar to
<code class="sourceCode default">std::ranges::copy</code>, in that you
should return both the output iterator <em>and</em> the final position
of the input iterator
(<code class="sourceCode default">transcode_result</code> is an alias
for <code class="sourceCode default">in_out_result</code>). For such
interfaces, it can be difficult in the general case to form an iterator
of type <code class="sourceCode default">I</code> to return to the
user:</p>
<div class="sourceCode" id="cb22"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb22-1"><a href="#cb22-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>input_iterator I, sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S, output_iterator<span class="op">&lt;</span><span class="dt">char8_t</span><span class="op">&gt;</span> O<span class="op">&gt;</span></span>
<span id="cb22-2"><a href="#cb22-2" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> <span class="op">(</span>utf8_code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;&gt;</span> <span class="op">||</span> utf16_code_unit<span class="op">&lt;</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;&gt;)</span></span>
<span id="cb22-3"><a href="#cb22-3" aria-hidden="true" tabindex="-1"></a>transcode_result<span class="op">&lt;</span>I, O<span class="op">&gt;</span> transcode_to_utf32<span class="op">(</span>I first, S last, O out<span class="op">)</span> <span class="op">{</span></span>
<span id="cb22-4"><a href="#cb22-4" aria-hidden="true" tabindex="-1"></a>    <span class="co">// Get the input as UTF-32.  This may involve unpacking, so possibly decltype(r.begin()) != I.</span></span>
<span id="cb22-5"><a href="#cb22-5" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> r <span class="op">=</span> ranges<span class="op">::</span>subrange<span class="op">(</span>first, last<span class="op">)</span> <span class="op">|</span> uc<span class="op">::</span>as_utf32;</span>
<span id="cb22-6"><a href="#cb22-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb22-7"><a href="#cb22-7" aria-hidden="true" tabindex="-1"></a>    <span class="co">// Do transcoding.</span></span>
<span id="cb22-8"><a href="#cb22-8" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> copy_result <span class="op">=</span> ranges<span class="op">::</span>copy<span class="op">(</span>r, out<span class="op">)</span>;</span>
<span id="cb22-9"><a href="#cb22-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb22-10"><a href="#cb22-10" aria-hidden="true" tabindex="-1"></a>    <span class="co">// Return an in_out_result.</span></span>
<span id="cb22-11"><a href="#cb22-11" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> transcode_result<span class="op">&lt;</span>I, O<span class="op">&gt;{</span><span class="co">/* ??? */</span>, copy_result<span class="op">.</span>out<span class="op">}</span>;</span>
<span id="cb22-12"><a href="#cb22-12" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>What should we write for
<code class="sourceCode default">/* ??? */</code>? That is, how do we
get back from the UTF-32 iterator
<code class="sourceCode default">r.begin()</code> to an
<code class="sourceCode default">I</code> iterator? It’s harder than it
first seems; consider the case where
<code class="sourceCode default">I</code> is <code class="sourceCode default">std::uc::utf_iterator&lt;std::uc::format::utf16, std::uc::format::utf32, std::uc::utf_iterator&lt;std::uc::format::utf8, std::uc::format::utf16, std::string::iterator&gt;&gt;</code>.
The solution is for the unpacking algorithm to remember the structure of
whatever iterator it unpacks, and then rebuild the structure when
returning the result. To demonstrate, here is the implementation of
<code class="sourceCode default">transcode_to_utf32</code> from
Boost.Text:</p>
<div class="sourceCode" id="cb23"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb23-1"><a href="#cb23-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>std<span class="op">::</span>input_iterator I, std<span class="op">::</span>sentinel_for<span class="op">&lt;</span>I<span class="op">&gt;</span> S, std<span class="op">::</span>output_iterator<span class="op">&lt;</span><span class="dt">char32_t</span><span class="op">&gt;</span> O<span class="op">&gt;</span></span>
<span id="cb23-2"><a href="#cb23-2" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> <span class="op">(</span>utf8_code_unit<span class="op">&lt;</span>std<span class="op">::</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;&gt;</span> <span class="op">||</span> utf16_code_unit<span class="op">&lt;</span>std<span class="op">::</span>iter_value_t<span class="op">&lt;</span>I<span class="op">&gt;&gt;)</span></span>
<span id="cb23-3"><a href="#cb23-3" aria-hidden="true" tabindex="-1"></a>transcode_result<span class="op">&lt;</span>I, O<span class="op">&gt;</span> transcode_to_utf32<span class="op">(</span>I first, S last, O out<span class="op">)</span></span>
<span id="cb23-4"><a href="#cb23-4" aria-hidden="true" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb23-5"><a href="#cb23-5" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> <span class="kw">const</span> r <span class="op">=</span> boost<span class="op">::</span>text<span class="op">::</span>unpack_iterator_and_sentinel<span class="op">(</span>first, last<span class="op">)</span>;</span>
<span id="cb23-6"><a href="#cb23-6" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> unpacked <span class="op">=</span> detail<span class="op">::</span>transcode_to_32<span class="op">&lt;</span><span class="kw">false</span><span class="op">&gt;(</span></span>
<span id="cb23-7"><a href="#cb23-7" aria-hidden="true" tabindex="-1"></a>        detail<span class="op">::</span>tag_t<span class="op">&lt;</span>r<span class="op">.</span>format_tag<span class="op">&gt;</span>, r<span class="op">.</span>first, r<span class="op">.</span>last, <span class="op">-</span><span class="dv">1</span>, out<span class="op">)</span>;</span>
<span id="cb23-8"><a href="#cb23-8" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> <span class="op">{</span>r<span class="op">.</span>repack<span class="op">(</span>unpacked<span class="op">.</span>in<span class="op">)</span>, unpacked<span class="op">.</span>out<span class="op">}</span>;</span>
<span id="cb23-9"><a href="#cb23-9" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>Note the call to <code class="sourceCode default">r.repack</code>.
This is an invocable created by the unpacking process itself.</p>
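<p>To make the unpack/repack idea concrete, here is a minimal sketch of
the technique. It is not the Boost.Text implementation and not proposed
wording; the names <code class="sourceCode default">layer</code>,
<code class="sourceCode default">is_layer</code>,
<code class="sourceCode default">unpack</code>, and
<code class="sourceCode default">repack_demo</code> are hypothetical
stand-ins, and <code class="sourceCode default">layer</code> does no
transcoding at all. It only shows how the unpacking step can build, as
it peels each adaptor, an invocable that restores the same nesting.</p>

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical stand-in for a transcoding iterator: one adaptor layer
// wrapped around an underlying iterator. Not part of the proposal.
template<class I>
struct layer {
    I it;
    I base() const { return it; }
};

template<class T> struct is_layer : std::false_type {};
template<class I> struct is_layer<layer<I>> : std::true_type {};

// Result of unpacking: the innermost iterator, plus an invocable that
// rebuilds the original nesting around any iterator of that type.
template<class I, class Repack>
struct unpack_result {
    I first;
    Repack repack;
};

// Peel one layer per recursive step. Each step extends `then` so that it
// re-wraps the layer just removed before applying the outer repacking.
template<class I, class Then>
auto unpack(I it, Then then) {
    if constexpr (is_layer<I>::value) {
        return unpack(it.base(), [then](auto inner) { return then(I{inner}); });
    } else {
        return unpack_result<I, Then>{it, then};
    }
}

// Entry point: start with an identity repacker.
template<class I>
auto unpack(I it) {
    return unpack(it, [](auto x) { return x; });
}

// Usage: unpack two layers, advance the raw pointer, then repack.
inline bool repack_demo() {
    char const* p = "abc";
    layer<layer<char const*>> nested{layer<char const*>{p}};
    auto r = unpack(nested);                // r.first is a char const*
    auto repacked = r.repack(r.first + 2);  // a layer<layer<char const*>> again
    return repacked.base().base() == p + 2;
}
```

<p>The algorithm works on <code class="sourceCode default">r.first</code>,
the innermost iterator, and calls
<code class="sourceCode default">r.repack</code> only once, on the final
input position, to produce a value of the caller's original iterator
type.</p>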
<p>This may sound complicated, but the implementation is modest. Here’s
the unpacking/repacking implementation from Boost.Text: <a href="https://github.com/tzlaine/text/blob/develop/include/boost/text/unpack.hpp">unpack.hpp</a>.</p>
<p><code class="sourceCode default">unpack_iterator_and_sentinel</code>
is a CPO. It is intended to work with UDTs that provide their own
unpacking implementation. It returns an
<code class="sourceCode default">unpack_result</code>.</p>
<h3 data-number="5.4.5" id="why-input-iterators-are-not-unpackable"><span class="header-section-number">5.4.5</span> Why input iterators are not
unpackable<a href="#why-input-iterators-are-not-unpackable" class="self-link"></a></h3>
<p>Input iterators behave quite differently from the other iterator
categories. Most importantly, they are single-pass. This means that when a
<code class="sourceCode default">utf_iterator</code> adapting an input
iterator reads the next code point from the range it is adapting, it
must leave the iterator at a location that is just after the current
code point. It has no choice, since it cannot backtrack.</p>
<p>It is possible to unpack an input iterator in an entirely different
way than other iterators. The unpack operation for input iterators could
be to produce the underlying code unit iterator (the adapted input
iterator itself), <em>plus</em> the current code point that the input
iterator was just used to read.</p>
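<p>As a rough illustration of what such an unpack operation would have
to hand back, consider the sketch below. The helpers
<code class="sourceCode default">read_code_point</code>,
<code class="sourceCode default">unpack_input</code>, and
<code class="sourceCode default">input_unpack_demo</code> are
hypothetical and not part of this proposal, and the decoder assumes
well-formed UTF-8 with no error handling. The point is only that, once
a code point has been read through a single-pass iterator, the iterator
has already moved past it, so the code point must travel alongside the
iterator.</p>

```cpp
#include <cassert>
#include <iterator>
#include <sstream>
#include <utility>

// Hypothetical helper: decodes one code point from well-formed UTF-8,
// advancing `it` past every code unit it consumes. No error handling.
template<class I>
char32_t read_code_point(I& it) {
    auto const lead = static_cast<unsigned char>(*it);
    ++it;
    int const extra = lead < 0x80 ? 0 : lead < 0xE0 ? 1 : lead < 0xF0 ? 2 : 3;
    char32_t cp = extra == 0 ? lead : lead & (0x3F >> extra);
    for (int i = 0; i < extra; ++i) {
        cp = (cp << 6) | (static_cast<unsigned char>(*it) & 0x3F);
        ++it;
    }
    return cp;
}

// Hypothetical "unpacked input transcoding iterator": the underlying
// iterator, already past the current code point, plus that code point.
template<class I>
std::pair<I, char32_t> unpack_input(I it) {
    char32_t const cp = read_code_point(it);
    return {it, cp};
}

// Usage with a genuinely single-pass source.
inline bool input_unpack_demo() {
    std::istringstream in("\xC3\xA9" "x");  // U+00E9 followed by 'x'
    std::istreambuf_iterator<char> it(in);
    auto [rest, cp] = unpack_input(it);
    // The two code units of U+00E9 are gone from the stream; only the
    // saved code point remembers them.
    return cp == U'\u00E9' && *rest == 'x';
}
```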
<p>However, this does not help much. Consider a case in which we need
to unpack a UTF-8 to UTF-32 transcoding iterator so we can form a UTF-8
to UTF-16 iterator instead. The unpack operation will produce an
unpacked input transcoding iterator – the moral equivalent of
<code class="sourceCode default">std::pair&lt;I, char32_t&gt;</code>.</p>
<p>What can you do with this? Well, you can try to construct a <code class="sourceCode default">utf_iterator&lt;format::utf8, format::utf16, I&gt;</code>
from it. That would mean adding a constructor that takes an input
iterator and a <code class="sourceCode default">char32_t</code>. This
would also mean that any user transcoding iterator types that are usable
with the
<code class="sourceCode default">unpack_iterator_and_sentinel</code> CPO
would also need to unpack their input iterator into an iterator/code
point pair, and that those user types would also need to add this odd
constructor.</p>
<p>This is all awkward, and the use case is small: people do not use
input iterators that often. Because this can always be added later, it
is not being proposed right now.</p>
<h2 data-number="5.5" id="add-code-unit-views-and-adaptors"><span class="header-section-number">5.5</span> Add code unit views and
adaptors<a href="#add-code-unit-views-and-adaptors" class="self-link"></a></h2>
<div class="sourceCode" id="cb24"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb24-1"><a href="#cb24-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb24-2"><a href="#cb24-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I<span class="op">&gt;</span></span>
<span id="cb24-3"><a href="#cb24-3" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> <em>iterator-to-tag</em><span class="op">()</span> <span class="op">{</span>                                <span class="co">// <em>exposition only</em></span></span>
<span id="cb24-4"><a href="#cb24-4" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>random_access_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb24-5"><a href="#cb24-5" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> random_access_iterator_tag<span class="op">{}</span>;</span>
<span id="cb24-6"><a href="#cb24-6" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>bidirectional_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb24-7"><a href="#cb24-7" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> bidirectional_iterator_tag<span class="op">{}</span>;</span>
<span id="cb24-8"><a href="#cb24-8" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>forward_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb24-9"><a href="#cb24-9" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> forward_iterator_tag<span class="op">{}</span>;</span>
<span id="cb24-10"><a href="#cb24-10" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb24-11"><a href="#cb24-11" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> input_iterator_tag<span class="op">{}</span>;</span>
<span id="cb24-12"><a href="#cb24-12" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb24-13"><a href="#cb24-13" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb24-14"><a href="#cb24-14" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-15"><a href="#cb24-15" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I<span class="op">&gt;</span></span>
<span id="cb24-16"><a href="#cb24-16" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> <em>iterator-to-tag_t</em> <span class="op">=</span> <span class="kw">decltype</span><span class="op">(</span><em>iterator-to-tag</em><span class="op">&lt;</span>I<span class="op">&gt;())</span>;         <span class="co">// <em>exposition only</em></span></span>
<span id="cb24-17"><a href="#cb24-17" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-18"><a href="#cb24-18" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> I, <span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb24-19"><a href="#cb24-19" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> <em>charn-projection-iterator</em>                                    <span class="co">// <em>exposition only</em></span></span>
<span id="cb24-20"><a href="#cb24-20" aria-hidden="true" tabindex="-1"></a>    <span class="op">:</span> proxy_iterator_interface<span class="op">&lt;</span><em>charn-projection-iterator</em><span class="op">&lt;</span>I, T<span class="op">&gt;</span>, <em>iterator-to-tag_t</em><span class="op">&lt;</span>I<span class="op">&gt;</span>, T<span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb24-21"><a href="#cb24-21" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <em>charn-projection-iterator</em><span class="op">()</span> <span class="op">=</span> <span class="cf">default</span>;</span>
<span id="cb24-22"><a href="#cb24-22" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <em>charn-projection-iterator</em><span class="op">(</span>I it<span class="op">)</span> <span class="op">:</span> it_<span class="op">(</span>std<span class="op">::</span>move<span class="op">(</span>it<span class="op">))</span> <span class="op">{}</span></span>
<span id="cb24-23"><a href="#cb24-23" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-24"><a href="#cb24-24" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> T <span class="kw">operator</span><span class="op">*()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> T<span class="op">(*</span>it_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb24-25"><a href="#cb24-25" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-26"><a href="#cb24-26" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> <span class="kw">constexpr</span> <span class="dt">bool</span> <span class="kw">operator</span><span class="op">==(</span><em>charn-projection-iterator</em> lhs, null_sentinel_t rhs<span class="op">)</span></span>
<span id="cb24-27"><a href="#cb24-27" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> sentinel_for<span class="op">&lt;</span>null_sentinel_t, I<span class="op">&gt;</span></span>
<span id="cb24-28"><a href="#cb24-28" aria-hidden="true" tabindex="-1"></a>        <span class="op">{</span> <span class="cf">return</span> lhs<span class="op">.</span>it_ <span class="op">==</span> rhs; <span class="op">}</span></span>
<span id="cb24-29"><a href="#cb24-29" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-30"><a href="#cb24-30" aria-hidden="true" tabindex="-1"></a>  <span class="kw">private</span><span class="op">:</span></span>
<span id="cb24-31"><a href="#cb24-31" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> boost<span class="op">::</span>stl_interfaces<span class="op">::</span>access;</span>
<span id="cb24-32"><a href="#cb24-32" aria-hidden="true" tabindex="-1"></a>    I <span class="op">&amp;</span> base_reference<span class="op">()</span> <span class="kw">noexcept</span> <span class="op">{</span> <span class="cf">return</span> it_; <span class="op">}</span></span>
<span id="cb24-33"><a href="#cb24-33" aria-hidden="true" tabindex="-1"></a>    I base_reference<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> it_; <span class="op">}</span></span>
<span id="cb24-34"><a href="#cb24-34" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-35"><a href="#cb24-35" aria-hidden="true" tabindex="-1"></a>    I it_;</span>
<span id="cb24-36"><a href="#cb24-36" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb24-37"><a href="#cb24-37" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-38"><a href="#cb24-38" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>ranges<span class="op">::</span>view V<span class="op">&gt;</span></span>
<span id="cb24-39"><a href="#cb24-39" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>input_range<span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> convertible_to<span class="op">&lt;</span>ranges<span class="op">::</span>range_reference_t<span class="op">&lt;</span>V<span class="op">&gt;</span>, <span class="dt">char8_t</span><span class="op">&gt;</span></span>
<span id="cb24-40"><a href="#cb24-40" aria-hidden="true" tabindex="-1"></a>  <span class="kw">class</span> char8_view <span class="op">:</span> <span class="kw">public</span> ranges<span class="op">::</span>view_interface<span class="op">&lt;</span>char8_view<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span></span>
<span id="cb24-41"><a href="#cb24-41" aria-hidden="true" tabindex="-1"></a>  <span class="op">{</span></span>
<span id="cb24-42"><a href="#cb24-42" aria-hidden="true" tabindex="-1"></a>    V base_ <span class="op">=</span> V<span class="op">()</span>;  <span class="co">// <em>exposition only</em></span></span>
<span id="cb24-43"><a href="#cb24-43" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-44"><a href="#cb24-44" aria-hidden="true" tabindex="-1"></a>  <span class="kw">public</span><span class="op">:</span></span>
<span id="cb24-45"><a href="#cb24-45" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> iterator <span class="op">=</span> <em>charn-projection-iterator</em><span class="op">&lt;</span>ranges<span class="op">::</span>iterator_t<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;</span>, <span class="dt">char8_t</span><span class="op">&gt;</span>;</span>
<span id="cb24-46"><a href="#cb24-46" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> sentinel <span class="op">=</span> conditional_t<span class="op">&lt;</span>ranges<span class="op">::</span>common_range<span class="op">&lt;</span>V<span class="op">&gt;</span>, iterator, ranges<span class="op">::</span>sentinel_t<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span>;</span>
<span id="cb24-47"><a href="#cb24-47" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-48"><a href="#cb24-48" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> char8_view<span class="op">()</span> <span class="kw">requires</span> default_initializable<span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">=</span> <span class="cf">default</span>;</span>
<span id="cb24-49"><a href="#cb24-49" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> char8_view<span class="op">(</span>V base<span class="op">)</span> <span class="op">:</span> base_<span class="op">{</span>std<span class="op">::</span>move<span class="op">(</span>base<span class="op">)}</span> <span class="op">{}</span></span>
<span id="cb24-50"><a href="#cb24-50" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-51"><a href="#cb24-51" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> V <span class="op">&amp;</span> base<span class="op">()</span> <span class="op">&amp;</span> <span class="op">{</span> <span class="cf">return</span> base_; <span class="op">}</span></span>
<span id="cb24-52"><a href="#cb24-52" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">const</span> V <span class="op">&amp;</span> base<span class="op">()</span> <span class="kw">const</span> <span class="op">&amp;</span> <span class="op">{</span> <span class="cf">return</span> base_; <span class="op">}</span></span>
<span id="cb24-53"><a href="#cb24-53" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> V base<span class="op">()</span> <span class="op">&amp;&amp;</span> <span class="op">{</span> <span class="cf">return</span> std<span class="op">::</span>move<span class="op">(</span>base_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb24-54"><a href="#cb24-54" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-55"><a href="#cb24-55" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> iterator begin<span class="op">()</span> <span class="op">{</span> <span class="cf">return</span> iterator<span class="op">{</span>ranges<span class="op">::</span>begin<span class="op">(</span>base_<span class="op">)}</span>; <span class="op">}</span></span>
<span id="cb24-56"><a href="#cb24-56" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> sentinel end<span class="op">()</span> <span class="op">{</span></span>
<span id="cb24-57"><a href="#cb24-57" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>ranges<span class="op">::</span>common_range<span class="op">&lt;</span>V<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb24-58"><a href="#cb24-58" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> iterator<span class="op">{</span>ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">)}</span>;</span>
<span id="cb24-59"><a href="#cb24-59" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb24-60"><a href="#cb24-60" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">)</span>;</span>
<span id="cb24-61"><a href="#cb24-61" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb24-62"><a href="#cb24-62" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb24-63"><a href="#cb24-63" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-64"><a href="#cb24-64" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> begin<span class="op">()</span> <span class="kw">const</span> <span class="kw">requires</span> ranges<span class="op">::</span>view<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;</span></span>
<span id="cb24-65"><a href="#cb24-65" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> iterator<span class="op">{</span>ranges<span class="op">::</span>begin<span class="op">(</span>base_<span class="op">)}</span>; <span class="op">}</span></span>
<span id="cb24-66"><a href="#cb24-66" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> end<span class="op">()</span> <span class="kw">const</span> <span class="kw">requires</span> ranges<span class="op">::</span>view<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb24-67"><a href="#cb24-67" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>ranges<span class="op">::</span>common_range<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb24-68"><a href="#cb24-68" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> iterator<span class="op">{</span>ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">)}</span>;</span>
<span id="cb24-69"><a href="#cb24-69" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb24-70"><a href="#cb24-70" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">)</span>;</span>
<span id="cb24-71"><a href="#cb24-71" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb24-72"><a href="#cb24-72" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb24-73"><a href="#cb24-73" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-74"><a href="#cb24-74" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> size<span class="op">()</span> <span class="kw">requires</span> ranges<span class="op">::</span>sized_range<span class="op">&lt;</span>V<span class="op">&gt;</span></span>
<span id="cb24-75"><a href="#cb24-75" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> ranges<span class="op">::</span>size<span class="op">(</span>base_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb24-76"><a href="#cb24-76" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> size<span class="op">()</span> <span class="kw">const</span> <span class="kw">requires</span> ranges<span class="op">::</span>sized_range<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;</span></span>
<span id="cb24-77"><a href="#cb24-77" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> ranges<span class="op">::</span>size<span class="op">(</span>base_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb24-78"><a href="#cb24-78" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb24-79"><a href="#cb24-79" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-80"><a href="#cb24-80" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>ranges<span class="op">::</span>view V<span class="op">&gt;</span></span>
<span id="cb24-81"><a href="#cb24-81" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>input_range<span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> convertible_to<span class="op">&lt;</span>ranges<span class="op">::</span>range_reference_t<span class="op">&lt;</span>V<span class="op">&gt;</span>, <span class="dt">char16_t</span><span class="op">&gt;</span></span>
<span id="cb24-82"><a href="#cb24-82" aria-hidden="true" tabindex="-1"></a>  <span class="kw">class</span> char16_view <span class="op">:</span> <span class="kw">public</span> ranges<span class="op">::</span>view_interface<span class="op">&lt;</span>char16_view<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span></span>
<span id="cb24-83"><a href="#cb24-83" aria-hidden="true" tabindex="-1"></a>  <span class="op">{</span></span>
<span id="cb24-84"><a href="#cb24-84" aria-hidden="true" tabindex="-1"></a>    V base_ <span class="op">=</span> V<span class="op">()</span>;  <span class="co">// <em>exposition only</em></span></span>
<span id="cb24-85"><a href="#cb24-85" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-86"><a href="#cb24-86" aria-hidden="true" tabindex="-1"></a>  <span class="kw">public</span><span class="op">:</span></span>
<span id="cb24-87"><a href="#cb24-87" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> iterator <span class="op">=</span> <em>charn-projection-iterator</em><span class="op">&lt;</span>ranges<span class="op">::</span>iterator_t<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;</span>, <span class="dt">char16_t</span><span class="op">&gt;</span>;</span>
<span id="cb24-88"><a href="#cb24-88" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> sentinel <span class="op">=</span> conditional_t<span class="op">&lt;</span>ranges<span class="op">::</span>common_range<span class="op">&lt;</span>V<span class="op">&gt;</span>, iterator, ranges<span class="op">::</span>sentinel_t<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span>;</span>
<span id="cb24-89"><a href="#cb24-89" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-90"><a href="#cb24-90" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> char16_view<span class="op">()</span> <span class="kw">requires</span> default_initializable<span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">=</span> <span class="cf">default</span>;</span>
<span id="cb24-91"><a href="#cb24-91" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> char16_view<span class="op">(</span>V base<span class="op">)</span> <span class="op">:</span> base_<span class="op">{</span>std<span class="op">::</span>move<span class="op">(</span>base<span class="op">)}</span> <span class="op">{}</span></span>
<span id="cb24-92"><a href="#cb24-92" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-93"><a href="#cb24-93" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> V <span class="op">&amp;</span> base<span class="op">()</span> <span class="op">&amp;</span> <span class="op">{</span> <span class="cf">return</span> base_; <span class="op">}</span></span>
<span id="cb24-94"><a href="#cb24-94" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">const</span> V <span class="op">&amp;</span> base<span class="op">()</span> <span class="kw">const</span> <span class="op">&amp;</span> <span class="op">{</span> <span class="cf">return</span> base_; <span class="op">}</span></span>
<span id="cb24-95"><a href="#cb24-95" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> V base<span class="op">()</span> <span class="op">&amp;&amp;</span> <span class="op">{</span> <span class="cf">return</span> std<span class="op">::</span>move<span class="op">(</span>base_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb24-96"><a href="#cb24-96" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-97"><a href="#cb24-97" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> iterator begin<span class="op">()</span> <span class="op">{</span> <span class="cf">return</span> iterator<span class="op">{</span>ranges<span class="op">::</span>begin<span class="op">(</span>base_<span class="op">)}</span>; <span class="op">}</span></span>
<span id="cb24-98"><a href="#cb24-98" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> sentinel end<span class="op">()</span> <span class="op">{</span></span>
<span id="cb24-99"><a href="#cb24-99" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>ranges<span class="op">::</span>common_range<span class="op">&lt;</span>V<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb24-100"><a href="#cb24-100" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> iterator<span class="op">{</span>ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">)}</span>;</span>
<span id="cb24-101"><a href="#cb24-101" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb24-102"><a href="#cb24-102" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">)</span>;</span>
<span id="cb24-103"><a href="#cb24-103" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb24-104"><a href="#cb24-104" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb24-105"><a href="#cb24-105" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-106"><a href="#cb24-106" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> begin<span class="op">()</span> <span class="kw">const</span> <span class="kw">requires</span> ranges<span class="op">::</span>view<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;</span></span>
<span id="cb24-107"><a href="#cb24-107" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> iterator<span class="op">{</span>ranges<span class="op">::</span>begin<span class="op">(</span>base_<span class="op">)}</span>; <span class="op">}</span></span>
<span id="cb24-108"><a href="#cb24-108" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> end<span class="op">()</span> <span class="kw">const</span> <span class="kw">requires</span> ranges<span class="op">::</span>view<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb24-109"><a href="#cb24-109" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>ranges<span class="op">::</span>common_range<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb24-110"><a href="#cb24-110" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> iterator<span class="op">{</span>ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">)}</span>;</span>
<span id="cb24-111"><a href="#cb24-111" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb24-112"><a href="#cb24-112" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">)</span>;</span>
<span id="cb24-113"><a href="#cb24-113" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb24-114"><a href="#cb24-114" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb24-115"><a href="#cb24-115" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-116"><a href="#cb24-116" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> size<span class="op">()</span> <span class="kw">requires</span> ranges<span class="op">::</span>sized_range<span class="op">&lt;</span>V<span class="op">&gt;</span></span>
<span id="cb24-117"><a href="#cb24-117" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> ranges<span class="op">::</span>size<span class="op">(</span>base_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb24-118"><a href="#cb24-118" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> size<span class="op">()</span> <span class="kw">const</span> <span class="kw">requires</span> ranges<span class="op">::</span>sized_range<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;</span></span>
<span id="cb24-119"><a href="#cb24-119" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> ranges<span class="op">::</span>size<span class="op">(</span>base_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb24-120"><a href="#cb24-120" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb24-121"><a href="#cb24-121" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-122"><a href="#cb24-122" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>ranges<span class="op">::</span>view V<span class="op">&gt;</span></span>
<span id="cb24-123"><a href="#cb24-123" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>input_range<span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> convertible_to<span class="op">&lt;</span>ranges<span class="op">::</span>range_reference_t<span class="op">&lt;</span>V<span class="op">&gt;</span>, <span class="dt">char32_t</span><span class="op">&gt;</span></span>
<span id="cb24-124"><a href="#cb24-124" aria-hidden="true" tabindex="-1"></a>  <span class="kw">class</span> char32_view <span class="op">:</span> <span class="kw">public</span> ranges<span class="op">::</span>view_interface<span class="op">&lt;</span>char32_view<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span></span>
<span id="cb24-125"><a href="#cb24-125" aria-hidden="true" tabindex="-1"></a>  <span class="op">{</span></span>
<span id="cb24-126"><a href="#cb24-126" aria-hidden="true" tabindex="-1"></a>    V base_ <span class="op">=</span> V<span class="op">()</span>;  <span class="co">// <em>exposition only</em></span></span>
<span id="cb24-127"><a href="#cb24-127" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-128"><a href="#cb24-128" aria-hidden="true" tabindex="-1"></a>  <span class="kw">public</span><span class="op">:</span></span>
<span id="cb24-129"><a href="#cb24-129" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> iterator <span class="op">=</span> <em>charn-projection-iterator</em><span class="op">&lt;</span>ranges<span class="op">::</span>iterator_t<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;</span>, <span class="dt">char32_t</span><span class="op">&gt;</span>;</span>
<span id="cb24-130"><a href="#cb24-130" aria-hidden="true" tabindex="-1"></a>    <span class="kw">using</span> sentinel <span class="op">=</span> conditional_t<span class="op">&lt;</span>ranges<span class="op">::</span>common_range<span class="op">&lt;</span>V<span class="op">&gt;</span>, iterator, ranges<span class="op">::</span>sentinel_t<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span>;</span>
<span id="cb24-131"><a href="#cb24-131" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-132"><a href="#cb24-132" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> char32_view<span class="op">()</span> <span class="kw">requires</span> default_initializable<span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">=</span> <span class="cf">default</span>;</span>
<span id="cb24-133"><a href="#cb24-133" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> char32_view<span class="op">(</span>V base<span class="op">)</span> <span class="op">:</span> base_<span class="op">{</span>std<span class="op">::</span>move<span class="op">(</span>base<span class="op">)}</span> <span class="op">{}</span></span>
<span id="cb24-134"><a href="#cb24-134" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-135"><a href="#cb24-135" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> V <span class="op">&amp;</span> base<span class="op">()</span> <span class="op">&amp;</span> <span class="op">{</span> <span class="cf">return</span> base_; <span class="op">}</span></span>
<span id="cb24-136"><a href="#cb24-136" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">const</span> V <span class="op">&amp;</span> base<span class="op">()</span> <span class="kw">const</span> <span class="op">&amp;</span> <span class="op">{</span> <span class="cf">return</span> base_; <span class="op">}</span></span>
<span id="cb24-137"><a href="#cb24-137" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> V base<span class="op">()</span> <span class="op">&amp;&amp;</span> <span class="op">{</span> <span class="cf">return</span> std<span class="op">::</span>move<span class="op">(</span>base_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb24-138"><a href="#cb24-138" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-139"><a href="#cb24-139" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> iterator begin<span class="op">()</span> <span class="op">{</span> <span class="cf">return</span> iterator<span class="op">{</span>ranges<span class="op">::</span>begin<span class="op">(</span>base_<span class="op">)}</span>; <span class="op">}</span></span>
<span id="cb24-140"><a href="#cb24-140" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> sentinel end<span class="op">()</span> <span class="op">{</span></span>
<span id="cb24-141"><a href="#cb24-141" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>ranges<span class="op">::</span>common_range<span class="op">&lt;</span>V<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb24-142"><a href="#cb24-142" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> iterator<span class="op">{</span>ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">)}</span>;</span>
<span id="cb24-143"><a href="#cb24-143" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb24-144"><a href="#cb24-144" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">)</span>;</span>
<span id="cb24-145"><a href="#cb24-145" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb24-146"><a href="#cb24-146" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb24-147"><a href="#cb24-147" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-148"><a href="#cb24-148" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> begin<span class="op">()</span> <span class="kw">const</span> <span class="kw">requires</span> ranges<span class="op">::</span>view<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;</span></span>
<span id="cb24-149"><a href="#cb24-149" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> iterator<span class="op">{</span>ranges<span class="op">::</span>begin<span class="op">(</span>base_<span class="op">)}</span>; <span class="op">}</span></span>
<span id="cb24-150"><a href="#cb24-150" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> end<span class="op">()</span> <span class="kw">const</span> <span class="kw">requires</span> ranges<span class="op">::</span>view<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb24-151"><a href="#cb24-151" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>ranges<span class="op">::</span>common_range<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb24-152"><a href="#cb24-152" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> iterator<span class="op">{</span>ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">)}</span>;</span>
<span id="cb24-153"><a href="#cb24-153" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb24-154"><a href="#cb24-154" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">)</span>;</span>
<span id="cb24-155"><a href="#cb24-155" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb24-156"><a href="#cb24-156" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb24-157"><a href="#cb24-157" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-158"><a href="#cb24-158" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> size<span class="op">()</span> <span class="kw">requires</span> ranges<span class="op">::</span>sized_range<span class="op">&lt;</span>V<span class="op">&gt;</span></span>
<span id="cb24-159"><a href="#cb24-159" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> ranges<span class="op">::</span>size<span class="op">(</span>base_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb24-160"><a href="#cb24-160" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> size<span class="op">()</span> <span class="kw">const</span> <span class="kw">requires</span> ranges<span class="op">::</span>sized_range<span class="op">&lt;</span><span class="kw">const</span> V<span class="op">&gt;</span></span>
<span id="cb24-161"><a href="#cb24-161" aria-hidden="true" tabindex="-1"></a>      <span class="op">{</span> <span class="cf">return</span> ranges<span class="op">::</span>size<span class="op">(</span>base_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb24-162"><a href="#cb24-162" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb24-163"><a href="#cb24-163" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-164"><a href="#cb24-164" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> R<span class="op">&gt;</span></span>
<span id="cb24-165"><a href="#cb24-165" aria-hidden="true" tabindex="-1"></a>  char8_view<span class="op">(</span>R <span class="op">&amp;&amp;)</span> <span class="op">-&gt;</span> char8_view<span class="op">&lt;</span>views<span class="op">::</span>all_t<span class="op">&lt;</span>R<span class="op">&gt;&gt;</span>;</span>
<span id="cb24-166"><a href="#cb24-166" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> R<span class="op">&gt;</span></span>
<span id="cb24-167"><a href="#cb24-167" aria-hidden="true" tabindex="-1"></a>  char16_view<span class="op">(</span>R <span class="op">&amp;&amp;)</span> <span class="op">-&gt;</span> char16_view<span class="op">&lt;</span>views<span class="op">::</span>all_t<span class="op">&lt;</span>R<span class="op">&gt;&gt;</span>;</span>
<span id="cb24-168"><a href="#cb24-168" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> R<span class="op">&gt;</span></span>
<span id="cb24-169"><a href="#cb24-169" aria-hidden="true" tabindex="-1"></a>  char32_view<span class="op">(</span>R <span class="op">&amp;&amp;)</span> <span class="op">-&gt;</span> char32_view<span class="op">&lt;</span>views<span class="op">::</span>all_t<span class="op">&lt;</span>R<span class="op">&gt;&gt;</span>;</span>
<span id="cb24-170"><a href="#cb24-170" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb24-171"><a href="#cb24-171" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-172"><a href="#cb24-172" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>ranges <span class="op">{</span></span>
<span id="cb24-173"><a href="#cb24-173" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V<span class="op">&gt;</span></span>
<span id="cb24-174"><a href="#cb24-174" aria-hidden="true" tabindex="-1"></a>    <span class="kw">inline</span> <span class="kw">constexpr</span> <span class="dt">bool</span> enable_borrowed_range<span class="op">&lt;</span>uc<span class="op">::</span>char8_view<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span> <span class="op">=</span> enable_borrowed_range<span class="op">&lt;</span>V<span class="op">&gt;</span>;</span>
<span id="cb24-175"><a href="#cb24-175" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V<span class="op">&gt;</span></span>
<span id="cb24-176"><a href="#cb24-176" aria-hidden="true" tabindex="-1"></a>    <span class="kw">inline</span> <span class="kw">constexpr</span> <span class="dt">bool</span> enable_borrowed_range<span class="op">&lt;</span>uc<span class="op">::</span>char16_view<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span> <span class="op">=</span> enable_borrowed_range<span class="op">&lt;</span>V<span class="op">&gt;</span>;</span>
<span id="cb24-177"><a href="#cb24-177" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V<span class="op">&gt;</span></span>
<span id="cb24-178"><a href="#cb24-178" aria-hidden="true" tabindex="-1"></a>    <span class="kw">inline</span> <span class="kw">constexpr</span> <span class="dt">bool</span> enable_borrowed_range<span class="op">&lt;</span>uc<span class="op">::</span>char32_view<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span> <span class="op">=</span> enable_borrowed_range<span class="op">&lt;</span>V<span class="op">&gt;</span>;</span>
<span id="cb24-179"><a href="#cb24-179" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb24-180"><a href="#cb24-180" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb24-181"><a href="#cb24-181" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb24-182"><a href="#cb24-182" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <span class="kw">constexpr</span> <em>unspecified</em> as_char8_t;</span>
<span id="cb24-183"><a href="#cb24-183" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <span class="kw">constexpr</span> <em>unspecified</em> as_char16_t;</span>
<span id="cb24-184"><a href="#cb24-184" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <span class="kw">constexpr</span> <em>unspecified</em> as_char32_t;</span>
<span id="cb24-185"><a href="#cb24-185" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p><code class="sourceCode default">char8_view</code> produces a view of
elements from another view as
<code class="sourceCode default">char8_t</code> values.
<code class="sourceCode default">char16_view</code> produces a view of
elements from another view as
<code class="sourceCode default">char16_t</code> values.
<code class="sourceCode default">char32_view</code> produces a view of
elements from another view as
<code class="sourceCode default">char32_t</code> values. Let
<code class="sourceCode default">charN_view</code> denote any one of the
views <code class="sourceCode default">char8_view</code>,
<code class="sourceCode default">char16_view</code>, and
<code class="sourceCode default">char32_view</code>.</p>
<p>The names <code class="sourceCode default">as_char8_t</code>,
<code class="sourceCode default">as_char16_t</code>, and
<code class="sourceCode default">as_char32_t</code> denote range adaptor
objects ([range.adaptor.object]).
<code class="sourceCode default">as_char8_t</code> produces
<code class="sourceCode default">char8_view</code>s,
<code class="sourceCode default">as_char16_t</code> produces
<code class="sourceCode default">char16_view</code>s, and
<code class="sourceCode default">as_char32_t</code> produces
<code class="sourceCode default">char32_view</code>s. Let
<code class="sourceCode default">as_charN_t</code> denote any one of
<code class="sourceCode default">as_char8_t</code>,
<code class="sourceCode default">as_char16_t</code>, and
<code class="sourceCode default">as_char32_t</code>, and let
<code class="sourceCode default">V</code> denote the
<code class="sourceCode default">charN_view</code> associated with that
object. Let <code class="sourceCode default">E</code> be an expression
and let <code class="sourceCode default">T</code> be <code class="sourceCode default">remove_cvref_t&lt;decltype((E))&gt;</code>.
Let <code class="sourceCode default">F</code> be the
<code class="sourceCode default">format</code> enumerator associated
with <code class="sourceCode default">as_charN_t</code>. If
<code class="sourceCode default">T</code> does not model
<code class="sourceCode default">utf_pointer</code> and
<code class="sourceCode default">charN_view(E)</code> is ill-formed,
<code class="sourceCode default">as_charN_t(E)</code> is ill-formed. The
expression <code class="sourceCode default">as_charN_t(E)</code> is
expression-equivalent to:</p>
<ul>
<li><p>If <code class="sourceCode default">T</code> is a specialization
of <code class="sourceCode default">empty_view</code>
([range.empty.view]), then <code class="sourceCode default">empty_view&lt;<em>format-to-type-t</em>&lt;F&gt;&gt;{}</code>.</p></li>
<li><p>Otherwise, if
<code class="sourceCode default">is_pointer_v&lt;T&gt;</code> is
<code class="sourceCode default">true</code>, then <code class="sourceCode default">V(ranges::subrange(E, null_sentinel))</code>.</p></li>
<li><p>Otherwise, <code class="sourceCode default">V(E)</code>.</p></li>
</ul>
<p>[Example 1:</p>
<div class="sourceCode" id="cb25"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb25-1"><a href="#cb25-1" aria-hidden="true" tabindex="-1"></a><span class="dt">char32_t</span> chars<span class="op">[]</span> <span class="op">=</span> <span class="st">U&quot;Unicode&quot;</span>;</span>
<span id="cb25-2"><a href="#cb25-2" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>vector<span class="op">&lt;</span><span class="dt">int</span><span class="op">&gt;</span> v<span class="op">(</span>std<span class="op">::</span>ranges<span class="op">::</span>begin<span class="op">(</span>chars<span class="op">)</span>, std<span class="op">::</span>ranges<span class="op">::</span>end<span class="op">(</span>chars<span class="op">)</span> <span class="op">-</span> <span class="dv">1</span><span class="op">)</span>; <span class="co">// omit the null terminator</span></span>
<span id="cb25-3"><a href="#cb25-3" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> <span class="op">(</span><span class="dt">char8_t</span> c <span class="op">:</span> v <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_char8_t<span class="op">)</span></span>
<span id="cb25-4"><a href="#cb25-4" aria-hidden="true" tabindex="-1"></a>  std<span class="op">::</span>cout <span class="op">&lt;&lt;</span> <span class="op">(</span><span class="dt">char</span><span class="op">)</span>c <span class="op">&lt;&lt;</span> <span class="ch">&#39; &#39;</span>; <span class="co">// prints U n i c o d e </span></span></code></pre></div>
<p>— end example]</p>
<h3 data-number="5.5.1" id="why-as_charn_t-requires-utf_pointer"><span class="header-section-number">5.5.1</span> Why
<code class="sourceCode default">as_charN_t</code> requires
<code class="sourceCode default">utf_pointer</code><a href="#why-as_charn_t-requires-utf_pointer" class="self-link"></a></h3>
<p>It may seem odd that
<code class="sourceCode default">foo | as_charN_t</code> is well-formed
if <code class="sourceCode default">decltype(foo)</code> is
<code class="sourceCode default">std::vector&lt;int&gt;</code>, but
ill-formed if <code class="sourceCode default">decltype(foo)</code> is
<code class="sourceCode default">int *</code>. However, this is
intentional.</p>
<p>If you write <code class="sourceCode default">std::vector&lt;int&gt;{/* ... */} | as_char32_t</code>,
the result is always a view whose value type is
<code class="sourceCode default">char32_t</code>. If you write:</p>
<div class="sourceCode" id="cb26"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb26-1"><a href="#cb26-1" aria-hidden="true" tabindex="-1"></a><span class="dt">int</span> <span class="op">*</span> ptr <span class="op">=</span> <span class="co">/* ... */</span>;</span>
<span id="cb26-2"><a href="#cb26-2" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> v <span class="op">=</span> ptr <span class="op">|</span> as_char32_t;</span></code></pre></div>
<p>If this were well-formed, <code class="sourceCode default">v</code> might be a view of
<code class="sourceCode default">char32_t</code> values that ends at a
null terminator, or, if the buffer contains no terminating zero, iterating
over it might read past the end of the buffer, which is undefined behavior.
Null-terminated strings are very common, but null-terminated sequences of
values that are not of a character type are not. It seems far safer and
more idiomatic to restrict the pointer-adaptation case to
<code class="sourceCode default">utf_pointer</code>s.</p>
<h2 data-number="5.6" id="add-transcoding-views-and-adaptors"><span class="header-section-number">5.6</span> Add transcoding views and
adaptors<a href="#add-transcoding-views-and-adaptors" class="self-link"></a></h2>
<p>The macro
<code class="sourceCode default">CODE_UNIT_CONCEPT_OPTION_2</code> is
used below to indicate the two options for how to define
<code class="sourceCode default"><em>format-of</em></code>, based on the
definition of <code class="sourceCode default">code_unit</code>.</p>
<div class="sourceCode" id="cb27"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb27-1"><a href="#cb27-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb27-2"><a href="#cb27-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">typename</span> T<span class="op">&gt;</span></span>
<span id="cb27-3"><a href="#cb27-3" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> format <em>format-of</em><span class="op">()</span> <span class="op">{</span>                                    <span class="co">// <em>exposition only</em></span></span>
<span id="cb27-4"><a href="#cb27-4" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>same_as<span class="op">&lt;</span>T, <span class="dt">char8_t</span><span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-5"><a href="#cb27-5" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> format<span class="op">::</span>utf8;</span>
<span id="cb27-6"><a href="#cb27-6" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>same_as<span class="op">&lt;</span>T, <span class="dt">char16_t</span><span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-7"><a href="#cb27-7" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> format<span class="op">::</span>utf16;</span>
<span id="cb27-8"><a href="#cb27-8" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>same_as<span class="op">&lt;</span>T, <span class="dt">char32_t</span><span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-9"><a href="#cb27-9" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> format<span class="op">::</span>utf32;</span>
<span id="cb27-10"><a href="#cb27-10" aria-hidden="true" tabindex="-1"></a>  <span class="pp">#if CODE_UNIT_CONCEPT_OPTION_2</span></span>
<span id="cb27-11"><a href="#cb27-11" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>same_as<span class="op">&lt;</span>T, <span class="dt">char</span><span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-12"><a href="#cb27-12" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> format<span class="op">::</span>utf8;</span>
<span id="cb27-13"><a href="#cb27-13" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>same_as<span class="op">&lt;</span>T, <span class="dt">wchar_t</span><span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-14"><a href="#cb27-14" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> <em>wchar-t-format</em>;</span>
<span id="cb27-15"><a href="#cb27-15" aria-hidden="true" tabindex="-1"></a>  <span class="pp">#endif</span></span>
<span id="cb27-16"><a href="#cb27-16" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb27-17"><a href="#cb27-17" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb27-18"><a href="#cb27-18" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-19"><a href="#cb27-19" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>ranges<span class="op">::</span>range R<span class="op">&gt;</span></span>
<span id="cb27-20"><a href="#cb27-20" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> movable<span class="op">&lt;</span>R<span class="op">&gt;</span> <span class="op">&amp;&amp;</span> <span class="op">(!</span><em>is-initializer-list</em><span class="op">&lt;</span>R<span class="op">&gt;)</span></span>
<span id="cb27-21"><a href="#cb27-21" aria-hidden="true" tabindex="-1"></a>  <span class="kw">class</span> unpacking_owning_view <span class="op">:</span> <span class="kw">public</span> ranges<span class="op">::</span>view_interface<span class="op">&lt;</span>unpacking_owning_view<span class="op">&lt;</span>R<span class="op">&gt;&gt;</span> <span class="op">{</span></span>
<span id="cb27-22"><a href="#cb27-22" aria-hidden="true" tabindex="-1"></a>    R r_ <span class="op">=</span> R<span class="op">()</span>;</span>
<span id="cb27-23"><a href="#cb27-23" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-24"><a href="#cb27-24" aria-hidden="true" tabindex="-1"></a>  <span class="kw">public</span><span class="op">:</span></span>
<span id="cb27-25"><a href="#cb27-25" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> unpacking_owning_view<span class="op">()</span> <span class="kw">requires</span> default_initializable<span class="op">&lt;</span>R<span class="op">&gt;</span> <span class="op">=</span> <span class="cf">default</span>;</span>
<span id="cb27-26"><a href="#cb27-26" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> unpacking_owning_view<span class="op">(</span>R<span class="op">&amp;&amp;</span> r<span class="op">)</span> <span class="op">:</span> r_<span class="op">(</span>std<span class="op">::</span>move<span class="op">(</span>r<span class="op">))</span> <span class="op">{}</span></span>
<span id="cb27-27"><a href="#cb27-27" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-28"><a href="#cb27-28" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> R<span class="op">&amp;</span> base<span class="op">()</span> <span class="op">&amp;</span> <span class="kw">noexcept</span> <span class="op">{</span> <span class="cf">return</span> r_; <span class="op">}</span></span>
<span id="cb27-29"><a href="#cb27-29" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">const</span> R<span class="op">&amp;</span> base<span class="op">()</span> <span class="kw">const</span> <span class="op">&amp;</span> <span class="kw">noexcept</span> <span class="op">{</span> <span class="cf">return</span> r_; <span class="op">}</span></span>
<span id="cb27-30"><a href="#cb27-30" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> R<span class="op">&amp;&amp;</span> base<span class="op">()</span> <span class="op">&amp;&amp;</span> <span class="kw">noexcept</span> <span class="op">{</span> <span class="cf">return</span> std<span class="op">::</span>move<span class="op">(</span>r_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb27-31"><a href="#cb27-31" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">const</span> R<span class="op">&amp;&amp;</span> base<span class="op">()</span> <span class="kw">const</span> <span class="op">&amp;&amp;</span> <span class="kw">noexcept</span> <span class="op">{</span> <span class="cf">return</span> std<span class="op">::</span>move<span class="op">(</span>r_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb27-32"><a href="#cb27-32" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-33"><a href="#cb27-33" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> code_units<span class="op">()</span> <span class="kw">const</span> <span class="kw">noexcept</span> <span class="op">{</span></span>
<span id="cb27-34"><a href="#cb27-34" aria-hidden="true" tabindex="-1"></a>      <span class="kw">auto</span> unpacked <span class="op">=</span> uc<span class="op">::</span>unpack_iterator_and_sentinel<span class="op">(</span>ranges<span class="op">::</span>begin<span class="op">(</span>r_<span class="op">)</span>, ranges<span class="op">::</span>end<span class="op">(</span>r_<span class="op">))</span>;</span>
<span id="cb27-35"><a href="#cb27-35" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> ranges<span class="op">::</span>subrange<span class="op">(</span>unpacked<span class="op">.</span>first, unpacked<span class="op">.</span>last<span class="op">)</span>;</span>
<span id="cb27-36"><a href="#cb27-36" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb27-37"><a href="#cb27-37" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-38"><a href="#cb27-38" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> begin<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> ranges<span class="op">::</span>begin<span class="op">(</span>r_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb27-39"><a href="#cb27-39" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> end<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> ranges<span class="op">::</span>end<span class="op">(</span>r_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb27-40"><a href="#cb27-40" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb27-41"><a href="#cb27-41" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-42"><a href="#cb27-42" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb27-43"><a href="#cb27-43" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">bool</span> <em>is-unpacking-owning-view</em> <span class="op">=</span> <span class="kw">false</span>;                  <span class="co">// <em>exposition only</em></span></span>
<span id="cb27-44"><a href="#cb27-44" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> R<span class="op">&gt;</span></span>
<span id="cb27-45"><a href="#cb27-45" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">bool</span> <em>is-unpacking-owning-view</em><span class="op">&lt;</span>unpacking_owning_view<span class="op">&lt;</span>R<span class="op">&gt;&gt;</span> <span class="op">=</span> <span class="kw">true</span>; <span class="co">// <em>exposition only</em></span></span>
<span id="cb27-46"><a href="#cb27-46" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-47"><a href="#cb27-47" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> T<span class="op">&gt;</span></span>
<span id="cb27-48"><a href="#cb27-48" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">bool</span> <em>is-charn-view</em> <span class="op">=</span> <span class="kw">false</span>;                             <span class="co">// <em>exposition only</em></span></span>
<span id="cb27-49"><a href="#cb27-49" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V<span class="op">&gt;</span></span>
<span id="cb27-50"><a href="#cb27-50" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">bool</span> <em>is-charn-view</em><span class="op">&lt;</span>char8_view<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span> <span class="op">=</span> <span class="kw">true</span>;               <span class="co">// <em>exposition only</em></span></span>
<span id="cb27-51"><a href="#cb27-51" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V<span class="op">&gt;</span></span>
<span id="cb27-52"><a href="#cb27-52" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">bool</span> <em>is-charn-view</em><span class="op">&lt;</span>char16_view<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span> <span class="op">=</span> <span class="kw">true</span>;              <span class="co">// <em>exposition only</em></span></span>
<span id="cb27-53"><a href="#cb27-53" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V<span class="op">&gt;</span></span>
<span id="cb27-54"><a href="#cb27-54" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">bool</span> <em>is-charn-view</em><span class="op">&lt;</span>char32_view<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span> <span class="op">=</span> <span class="kw">true</span>;              <span class="co">// <em>exposition only</em></span></span>
<span id="cb27-55"><a href="#cb27-55" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-56"><a href="#cb27-56" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>format Format, utf_range V<span class="op">&gt;</span></span>
<span id="cb27-57"><a href="#cb27-57" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>view<span class="op">&lt;</span>V<span class="op">&gt;</span></span>
<span id="cb27-58"><a href="#cb27-58" aria-hidden="true" tabindex="-1"></a>  <span class="kw">class</span> utf_view <span class="op">:</span> <span class="kw">public</span> ranges<span class="op">::</span>view_interface<span class="op">&lt;</span>utf_view<span class="op">&lt;</span>Format, V<span class="op">&gt;&gt;</span> <span class="op">{</span></span>
<span id="cb27-59"><a href="#cb27-59" aria-hidden="true" tabindex="-1"></a>    V base_ <span class="op">=</span> V<span class="op">()</span>;                                          <span class="co">// <em>exposition only</em></span></span>
<span id="cb27-60"><a href="#cb27-60" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-61"><a href="#cb27-61" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span>format FromFormat, <span class="kw">class</span> I, <span class="kw">class</span> S<span class="op">&gt;</span></span>
<span id="cb27-62"><a href="#cb27-62" aria-hidden="true" tabindex="-1"></a>      <span class="kw">static</span> <span class="kw">constexpr</span> <span class="kw">auto</span> make_begin<span class="op">(</span>I first, S last<span class="op">)</span> <span class="op">{</span>   <span class="co">// <em>exposition only</em></span></span>
<span id="cb27-63"><a href="#cb27-63" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>bidirectional_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-64"><a href="#cb27-64" aria-hidden="true" tabindex="-1"></a>          <span class="cf">return</span> utf_iterator<span class="op">&lt;</span>FromFormat, Format, I, S<span class="op">&gt;{</span></span>
<span id="cb27-65"><a href="#cb27-65" aria-hidden="true" tabindex="-1"></a>            first, first, last<span class="op">}</span>;</span>
<span id="cb27-66"><a href="#cb27-66" aria-hidden="true" tabindex="-1"></a>        <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb27-67"><a href="#cb27-67" aria-hidden="true" tabindex="-1"></a>          <span class="cf">return</span> utf_iterator<span class="op">&lt;</span>FromFormat, Format, I, S<span class="op">&gt;{</span>first, last<span class="op">}</span>;</span>
<span id="cb27-68"><a href="#cb27-68" aria-hidden="true" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb27-69"><a href="#cb27-69" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb27-70"><a href="#cb27-70" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span>format FromFormat, <span class="kw">class</span> I, <span class="kw">class</span> S<span class="op">&gt;</span></span>
<span id="cb27-71"><a href="#cb27-71" aria-hidden="true" tabindex="-1"></a>      <span class="kw">static</span> <span class="kw">constexpr</span> <span class="kw">auto</span> make_end<span class="op">(</span>I first, S last<span class="op">)</span> <span class="op">{</span>     <span class="co">// <em>exposition only</em></span></span>
<span id="cb27-72"><a href="#cb27-72" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(!</span>same_as<span class="op">&lt;</span>I, S<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-73"><a href="#cb27-73" aria-hidden="true" tabindex="-1"></a>          <span class="cf">return</span> last;</span>
<span id="cb27-74"><a href="#cb27-74" aria-hidden="true" tabindex="-1"></a>        <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>bidirectional_iterator<span class="op">&lt;</span>I<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-75"><a href="#cb27-75" aria-hidden="true" tabindex="-1"></a>          <span class="cf">return</span> utf_iterator<span class="op">&lt;</span>FromFormat, Format, I, S<span class="op">&gt;{</span></span>
<span id="cb27-76"><a href="#cb27-76" aria-hidden="true" tabindex="-1"></a>            first, last, last<span class="op">}</span>;</span>
<span id="cb27-77"><a href="#cb27-77" aria-hidden="true" tabindex="-1"></a>        <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb27-78"><a href="#cb27-78" aria-hidden="true" tabindex="-1"></a>          <span class="cf">return</span> utf_iterator<span class="op">&lt;</span>FromFormat, Format, I, S<span class="op">&gt;{</span>last, last<span class="op">}</span>;</span>
<span id="cb27-79"><a href="#cb27-79" aria-hidden="true" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb27-80"><a href="#cb27-80" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb27-81"><a href="#cb27-81" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-82"><a href="#cb27-82" aria-hidden="true" tabindex="-1"></a>  <span class="kw">public</span><span class="op">:</span></span>
<span id="cb27-83"><a href="#cb27-83" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_view<span class="op">()</span> <span class="kw">requires</span> default_initializable<span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">=</span> <span class="cf">default</span>;</span>
<span id="cb27-84"><a href="#cb27-84" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf_view<span class="op">(</span>V base<span class="op">)</span> <span class="op">:</span> base_<span class="op">{</span>std<span class="op">::</span>move<span class="op">(</span>base<span class="op">)}</span> <span class="op">{}</span></span>
<span id="cb27-85"><a href="#cb27-85" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-86"><a href="#cb27-86" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> V base<span class="op">()</span> <span class="kw">const</span> <span class="op">&amp;</span> <span class="kw">requires</span> copy_constructible<span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">{</span> <span class="cf">return</span> base_; <span class="op">}</span></span>
<span id="cb27-87"><a href="#cb27-87" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> V base<span class="op">()</span> <span class="op">&amp;&amp;</span> <span class="op">{</span> <span class="cf">return</span> std<span class="op">::</span>move<span class="op">(</span>base_<span class="op">)</span>; <span class="op">}</span></span>
<span id="cb27-88"><a href="#cb27-88" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-89"><a href="#cb27-89" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> code_units<span class="op">()</span> <span class="kw">const</span> <span class="kw">noexcept</span></span>
<span id="cb27-90"><a href="#cb27-90" aria-hidden="true" tabindex="-1"></a>      <span class="kw">requires</span> copy_constructible<span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">||</span> <em>is-unpacking-owning-view</em><span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb27-91"><a href="#cb27-91" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span><em>is-unpacking-owning-view</em><span class="op">&lt;</span>V<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-92"><a href="#cb27-92" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> base_<span class="op">.</span>code_units<span class="op">()</span>;</span>
<span id="cb27-93"><a href="#cb27-93" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb27-94"><a href="#cb27-94" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> base_;</span>
<span id="cb27-95"><a href="#cb27-95" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb27-96"><a href="#cb27-96" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb27-97"><a href="#cb27-97" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-98"><a href="#cb27-98" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> begin<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span></span>
<span id="cb27-99"><a href="#cb27-99" aria-hidden="true" tabindex="-1"></a>      <span class="kw">constexpr</span> format from_format <span class="op">=</span> <em>format-of</em><span class="op">&lt;</span>ranges<span class="op">::</span>range_value_t<span class="op">&lt;</span>V<span class="op">&gt;&gt;()</span>;</span>
<span id="cb27-100"><a href="#cb27-100" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span><em>is-charn-view</em><span class="op">&lt;</span>V<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-101"><a href="#cb27-101" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> make_begin<span class="op">&lt;</span>from_format<span class="op">&gt;(</span>std<span class="op">::</span>ranges<span class="op">::</span>begin<span class="op">(</span>base_<span class="op">.</span>base<span class="op">())</span>, std<span class="op">::</span>ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">.</span>base<span class="op">()))</span>;</span>
<span id="cb27-102"><a href="#cb27-102" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span><em>is-unpacking-owning-view</em><span class="op">&lt;</span>V<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-103"><a href="#cb27-103" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> make_begin<span class="op">&lt;</span>from_format<span class="op">&gt;(</span>ranges<span class="op">::</span>begin<span class="op">(</span>base_<span class="op">.</span>code_units<span class="op">())</span>, ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">.</span>code_units<span class="op">()))</span>;</span>
<span id="cb27-104"><a href="#cb27-104" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb27-105"><a href="#cb27-105" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> make_begin<span class="op">&lt;</span>from_format<span class="op">&gt;(</span>ranges<span class="op">::</span>begin<span class="op">(</span>base_<span class="op">)</span>, ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">))</span>;</span>
<span id="cb27-106"><a href="#cb27-106" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb27-107"><a href="#cb27-107" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb27-108"><a href="#cb27-108" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">auto</span> end<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span></span>
<span id="cb27-109"><a href="#cb27-109" aria-hidden="true" tabindex="-1"></a>      <span class="kw">constexpr</span> format from_format <span class="op">=</span> <em>format-of</em><span class="op">&lt;</span>ranges<span class="op">::</span>range_value_t<span class="op">&lt;</span>V<span class="op">&gt;&gt;()</span>;</span>
<span id="cb27-110"><a href="#cb27-110" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span><em>is-charn-view</em><span class="op">&lt;</span>V<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-111"><a href="#cb27-111" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> make_end<span class="op">&lt;</span>from_format<span class="op">&gt;(</span>std<span class="op">::</span>ranges<span class="op">::</span>begin<span class="op">(</span>base_<span class="op">.</span>base<span class="op">())</span>, std<span class="op">::</span>ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">.</span>base<span class="op">()))</span>;</span>
<span id="cb27-112"><a href="#cb27-112" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span><em>is-unpacking-owning-view</em><span class="op">&lt;</span>V<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-113"><a href="#cb27-113" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> make_end<span class="op">&lt;</span>from_format<span class="op">&gt;(</span>ranges<span class="op">::</span>begin<span class="op">(</span>base_<span class="op">.</span>code_units<span class="op">())</span>, ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">.</span>code_units<span class="op">()))</span>;</span>
<span id="cb27-114"><a href="#cb27-114" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb27-115"><a href="#cb27-115" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> make_end<span class="op">&lt;</span>from_format<span class="op">&gt;(</span>ranges<span class="op">::</span>begin<span class="op">(</span>base_<span class="op">)</span>, ranges<span class="op">::</span>end<span class="op">(</span>base_<span class="op">))</span>;</span>
<span id="cb27-116"><a href="#cb27-116" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb27-117"><a href="#cb27-117" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb27-118"><a href="#cb27-118" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-119"><a href="#cb27-119" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> ostream <span class="op">&amp;</span> <span class="kw">operator</span><span class="op">&lt;&lt;(</span>ostream <span class="op">&amp;</span> os, utf_view v<span class="op">)</span>;</span>
<span id="cb27-120"><a href="#cb27-120" aria-hidden="true" tabindex="-1"></a>    <span class="kw">friend</span> wostream <span class="op">&amp;</span> <span class="kw">operator</span><span class="op">&lt;&lt;(</span>wostream <span class="op">&amp;</span> os, utf_view v<span class="op">)</span>;</span>
<span id="cb27-121"><a href="#cb27-121" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb27-122"><a href="#cb27-122" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-123"><a href="#cb27-123" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>utf_range V<span class="op">&gt;</span></span>
<span id="cb27-124"><a href="#cb27-124" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>view<span class="op">&lt;</span>V<span class="op">&gt;</span></span>
<span id="cb27-125"><a href="#cb27-125" aria-hidden="true" tabindex="-1"></a>  <span class="kw">class</span> utf8_view <span class="op">:</span> <span class="kw">public</span> utf_view<span class="op">&lt;</span>format<span class="op">::</span>utf8, V<span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb27-126"><a href="#cb27-126" aria-hidden="true" tabindex="-1"></a>  <span class="kw">public</span><span class="op">:</span></span>
<span id="cb27-127"><a href="#cb27-127" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf8_view<span class="op">()</span> <span class="kw">requires</span> default_initializable<span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">=</span> <span class="cf">default</span>;</span>
<span id="cb27-128"><a href="#cb27-128" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf8_view<span class="op">(</span>V base<span class="op">)</span> <span class="op">:</span> utf_view<span class="op">&lt;</span>format<span class="op">::</span>utf8, V<span class="op">&gt;{</span>std<span class="op">::</span>move<span class="op">(</span>base<span class="op">)}</span> <span class="op">{}</span></span>
<span id="cb27-129"><a href="#cb27-129" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb27-130"><a href="#cb27-130" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>utf_range V<span class="op">&gt;</span></span>
<span id="cb27-131"><a href="#cb27-131" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>view<span class="op">&lt;</span>V<span class="op">&gt;</span></span>
<span id="cb27-132"><a href="#cb27-132" aria-hidden="true" tabindex="-1"></a>  <span class="kw">class</span> utf16_view <span class="op">:</span> <span class="kw">public</span> utf_view<span class="op">&lt;</span>format<span class="op">::</span>utf16, V<span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb27-133"><a href="#cb27-133" aria-hidden="true" tabindex="-1"></a>  <span class="kw">public</span><span class="op">:</span></span>
<span id="cb27-134"><a href="#cb27-134" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf16_view<span class="op">()</span> <span class="kw">requires</span> default_initializable<span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">=</span> <span class="cf">default</span>;</span>
<span id="cb27-135"><a href="#cb27-135" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf16_view<span class="op">(</span>V base<span class="op">)</span> <span class="op">:</span> utf_view<span class="op">&lt;</span>format<span class="op">::</span>utf16, V<span class="op">&gt;{</span>std<span class="op">::</span>move<span class="op">(</span>base<span class="op">)}</span> <span class="op">{}</span></span>
<span id="cb27-136"><a href="#cb27-136" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb27-137"><a href="#cb27-137" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>utf_range V<span class="op">&gt;</span></span>
<span id="cb27-138"><a href="#cb27-138" aria-hidden="true" tabindex="-1"></a>    <span class="kw">requires</span> ranges<span class="op">::</span>view<span class="op">&lt;</span>V<span class="op">&gt;</span></span>
<span id="cb27-139"><a href="#cb27-139" aria-hidden="true" tabindex="-1"></a>  <span class="kw">class</span> utf32_view <span class="op">:</span> <span class="kw">public</span> utf_view<span class="op">&lt;</span>format<span class="op">::</span>utf32, V<span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb27-140"><a href="#cb27-140" aria-hidden="true" tabindex="-1"></a>  <span class="kw">public</span><span class="op">:</span></span>
<span id="cb27-141"><a href="#cb27-141" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf32_view<span class="op">()</span> <span class="kw">requires</span> default_initializable<span class="op">&lt;</span>V<span class="op">&gt;</span> <span class="op">=</span> <span class="cf">default</span>;</span>
<span id="cb27-142"><a href="#cb27-142" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> utf32_view<span class="op">(</span>V base<span class="op">)</span> <span class="op">:</span> utf_view<span class="op">&lt;</span>format<span class="op">::</span>utf32, V<span class="op">&gt;{</span>std<span class="op">::</span>move<span class="op">(</span>base<span class="op">)}</span> <span class="op">{}</span></span>
<span id="cb27-143"><a href="#cb27-143" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb27-144"><a href="#cb27-144" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-145"><a href="#cb27-145" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> R<span class="op">&gt;</span></span>
<span id="cb27-146"><a href="#cb27-146" aria-hidden="true" tabindex="-1"></a>  utf8_view<span class="op">(</span>R <span class="op">&amp;&amp;)</span> <span class="op">-&gt;</span> utf8_view<span class="op">&lt;</span>views<span class="op">::</span>all_t<span class="op">&lt;</span>R<span class="op">&gt;&gt;</span>;</span>
<span id="cb27-147"><a href="#cb27-147" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> R<span class="op">&gt;</span></span>
<span id="cb27-148"><a href="#cb27-148" aria-hidden="true" tabindex="-1"></a>  utf16_view<span class="op">(</span>R <span class="op">&amp;&amp;)</span> <span class="op">-&gt;</span> utf16_view<span class="op">&lt;</span>views<span class="op">::</span>all_t<span class="op">&lt;</span>R<span class="op">&gt;&gt;</span>;</span>
<span id="cb27-149"><a href="#cb27-149" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> R<span class="op">&gt;</span></span>
<span id="cb27-150"><a href="#cb27-150" aria-hidden="true" tabindex="-1"></a>  utf32_view<span class="op">(</span>R <span class="op">&amp;&amp;)</span> <span class="op">-&gt;</span> utf32_view<span class="op">&lt;</span>views<span class="op">::</span>all_t<span class="op">&lt;</span>R<span class="op">&gt;&gt;</span>;</span>
<span id="cb27-151"><a href="#cb27-151" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb27-152"><a href="#cb27-152" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-153"><a href="#cb27-153" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>ranges <span class="op">{</span></span>
<span id="cb27-154"><a href="#cb27-154" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span>uc<span class="op">::</span>format Format, <span class="kw">class</span> V<span class="op">&gt;</span></span>
<span id="cb27-155"><a href="#cb27-155" aria-hidden="true" tabindex="-1"></a>    <span class="kw">inline</span> <span class="kw">constexpr</span> <span class="dt">bool</span> enable_borrowed_range<span class="op">&lt;</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>Format, V<span class="op">&gt;&gt;</span> <span class="op">=</span> enable_borrowed_range<span class="op">&lt;</span>V<span class="op">&gt;</span>;</span>
<span id="cb27-156"><a href="#cb27-156" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V<span class="op">&gt;</span></span>
<span id="cb27-157"><a href="#cb27-157" aria-hidden="true" tabindex="-1"></a>    <span class="kw">inline</span> <span class="kw">constexpr</span> <span class="dt">bool</span> enable_borrowed_range<span class="op">&lt;</span>uc<span class="op">::</span>utf8_view<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span> <span class="op">=</span> enable_borrowed_range<span class="op">&lt;</span>V<span class="op">&gt;</span>;</span>
<span id="cb27-158"><a href="#cb27-158" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V<span class="op">&gt;</span></span>
<span id="cb27-159"><a href="#cb27-159" aria-hidden="true" tabindex="-1"></a>    <span class="kw">inline</span> <span class="kw">constexpr</span> <span class="dt">bool</span> enable_borrowed_range<span class="op">&lt;</span>uc<span class="op">::</span>utf16_view<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span> <span class="op">=</span> enable_borrowed_range<span class="op">&lt;</span>V<span class="op">&gt;</span>;</span>
<span id="cb27-160"><a href="#cb27-160" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V<span class="op">&gt;</span></span>
<span id="cb27-161"><a href="#cb27-161" aria-hidden="true" tabindex="-1"></a>    <span class="kw">inline</span> <span class="kw">constexpr</span> <span class="dt">bool</span> enable_borrowed_range<span class="op">&lt;</span>uc<span class="op">::</span>utf32_view<span class="op">&lt;</span>V<span class="op">&gt;&gt;</span> <span class="op">=</span> enable_borrowed_range<span class="op">&lt;</span>V<span class="op">&gt;</span>;</span>
<span id="cb27-162"><a href="#cb27-162" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb27-163"><a href="#cb27-163" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-164"><a href="#cb27-164" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std<span class="op">::</span>uc <span class="op">{</span></span>
<span id="cb27-165"><a href="#cb27-165" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> R<span class="op">&gt;</span></span>
<span id="cb27-166"><a href="#cb27-166" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="kw">decltype</span><span class="op">(</span><span class="kw">auto</span><span class="op">)</span> unpack_range<span class="op">(</span>R <span class="op">&amp;&amp;</span> r<span class="op">)</span> <span class="op">{</span></span>
<span id="cb27-167"><a href="#cb27-167" aria-hidden="true" tabindex="-1"></a>      <span class="kw">using</span> T <span class="op">=</span> remove_cvref_t<span class="op">&lt;</span>R<span class="op">&gt;</span>;</span>
<span id="cb27-168"><a href="#cb27-168" aria-hidden="true" tabindex="-1"></a>      <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>ranges<span class="op">::</span>forward_range<span class="op">&lt;</span>T<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-169"><a href="#cb27-169" aria-hidden="true" tabindex="-1"></a>        <span class="kw">auto</span> unpacked <span class="op">=</span> uc<span class="op">::</span>unpack_iterator_and_sentinel<span class="op">(</span>ranges<span class="op">::</span>begin<span class="op">(</span>r<span class="op">)</span>, ranges<span class="op">::</span>end<span class="op">(</span>r<span class="op">))</span>;</span>
<span id="cb27-170"><a href="#cb27-170" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>is_bounded_array_v<span class="op">&lt;</span>T<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-171"><a href="#cb27-171" aria-hidden="true" tabindex="-1"></a>          <span class="kw">constexpr</span> <span class="kw">auto</span> n <span class="op">=</span> extent_v<span class="op">&lt;</span>T<span class="op">&gt;</span>;</span>
<span id="cb27-172"><a href="#cb27-172" aria-hidden="true" tabindex="-1"></a>          <span class="cf">if</span> <span class="op">(</span>n <span class="op">&amp;&amp;</span> <span class="op">!</span>r<span class="op">[</span>n <span class="op">-</span> <span class="dv">1</span><span class="op">])</span></span>
<span id="cb27-173"><a href="#cb27-173" aria-hidden="true" tabindex="-1"></a>            <span class="op">--</span>unpacked<span class="op">.</span>last;</span>
<span id="cb27-174"><a href="#cb27-174" aria-hidden="true" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb27-175"><a href="#cb27-175" aria-hidden="true" tabindex="-1"></a>        <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(</span>ranges<span class="op">::</span>borrowed_range<span class="op">&lt;</span>T<span class="op">&gt;</span> <span class="op">||</span> is_lvalue_reference_v<span class="op">&lt;</span>R<span class="op">&gt;)</span> <span class="op">{</span></span>
<span id="cb27-176"><a href="#cb27-176" aria-hidden="true" tabindex="-1"></a>          <span class="cf">return</span> ranges<span class="op">::</span>subrange<span class="op">(</span>unpacked<span class="op">.</span>first, unpacked<span class="op">.</span>last<span class="op">)</span>;</span>
<span id="cb27-177"><a href="#cb27-177" aria-hidden="true" tabindex="-1"></a>        <span class="op">}</span> <span class="cf">else</span> <span class="cf">if</span> <span class="kw">constexpr</span> <span class="op">(!</span>same_as<span class="op">&lt;</span><span class="kw">decltype</span><span class="op">(</span>unpacked<span class="op">.</span>first<span class="op">)</span>, ranges<span class="op">::</span>iterator_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;</span> <span class="op">||</span></span>
<span id="cb27-178"><a href="#cb27-178" aria-hidden="true" tabindex="-1"></a>                             <span class="op">!</span>same_as<span class="op">&lt;</span><span class="kw">decltype</span><span class="op">(</span>unpacked<span class="op">.</span>last<span class="op">)</span>, ranges<span class="op">::</span>sentinel_t<span class="op">&lt;</span>T<span class="op">&gt;&gt;)</span> <span class="op">{</span></span>
<span id="cb27-179"><a href="#cb27-179" aria-hidden="true" tabindex="-1"></a>          <span class="cf">return</span> unpacking_owning_view<span class="op">(</span>std<span class="op">::</span>move<span class="op">(</span>r<span class="op">))</span>;</span>
<span id="cb27-180"><a href="#cb27-180" aria-hidden="true" tabindex="-1"></a>        <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb27-181"><a href="#cb27-181" aria-hidden="true" tabindex="-1"></a>          <span class="cf">return</span> forward<span class="op">&lt;</span>R<span class="op">&gt;(</span>r<span class="op">)</span>;</span>
<span id="cb27-182"><a href="#cb27-182" aria-hidden="true" tabindex="-1"></a>        <span class="op">}</span></span>
<span id="cb27-183"><a href="#cb27-183" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span> <span class="cf">else</span> <span class="op">{</span></span>
<span id="cb27-184"><a href="#cb27-184" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> forward<span class="op">&lt;</span>R<span class="op">&gt;(</span>r<span class="op">)</span>;</span>
<span id="cb27-185"><a href="#cb27-185" aria-hidden="true" tabindex="-1"></a>      <span class="op">}</span></span>
<span id="cb27-186"><a href="#cb27-186" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb27-187"><a href="#cb27-187" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb27-188"><a href="#cb27-188" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <span class="kw">constexpr</span> <em>unspecified</em> as_utf8;</span>
<span id="cb27-189"><a href="#cb27-189" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <span class="kw">constexpr</span> <em>unspecified</em> as_utf16;</span>
<span id="cb27-190"><a href="#cb27-190" aria-hidden="true" tabindex="-1"></a>  <span class="kw">inline</span> <span class="kw">constexpr</span> <em>unspecified</em> as_utf32;</span>
<span id="cb27-191"><a href="#cb27-191" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p><code class="sourceCode default">unpacking_owning_view</code>
contains an owned range of type
<code class="sourceCode default">R</code> and knows how to unpack that
range into code unit iterators using
<code class="sourceCode default">unpack_iterator_and_sentinel</code>.</p>
<p><code class="sourceCode default">utf_view</code> produces a view in
UTF format <code class="sourceCode default">Format</code> of the
elements from another UTF view.
<code class="sourceCode default">utf8_view</code> produces a UTF-8 view
of the elements from another UTF view.
<code class="sourceCode default">utf16_view</code> produces a UTF-16
view of the elements from another UTF view.
<code class="sourceCode default">utf32_view</code> produces a UTF-32
view of the elements from another UTF view. Let
<code class="sourceCode default">utfN_view</code> denote any one of the
views <code class="sourceCode default">utf8_view</code>,
<code class="sourceCode default">utf16_view</code>, and
<code class="sourceCode default">utf32_view</code>.</p>
<p>The names <code class="sourceCode default">as_utf8</code>,
<code class="sourceCode default">as_utf16</code>, and
<code class="sourceCode default">as_utf32</code> denote range adaptor
objects ([range.adaptor.object]).
<code class="sourceCode default">as_utf8</code> produces
<code class="sourceCode default">utf8_view</code>s,
<code class="sourceCode default">as_utf16</code> produces
<code class="sourceCode default">utf16_view</code>s, and
<code class="sourceCode default">as_utf32</code> produces
<code class="sourceCode default">utf32_view</code>s. Let
<code class="sourceCode default">as_utfN</code> denote any one of
<code class="sourceCode default">as_utf8</code>,
<code class="sourceCode default">as_utf16</code>, and
<code class="sourceCode default">as_utf32</code>, and let
<code class="sourceCode default">V</code> denote the
<code class="sourceCode default">utfN_view</code> associated with that
object. Let <code class="sourceCode default">charN_view</code> denote
any one of <code class="sourceCode default">char8_view</code>,
<code class="sourceCode default">char16_view</code>, and
<code class="sourceCode default">char32_view</code>. Let
<code class="sourceCode default">E</code> be an expression and let
<code class="sourceCode default">T</code> be <code class="sourceCode default">remove_cvref_t&lt;decltype((E))&gt;</code>.
Let <code class="sourceCode default">F</code> be the
<code class="sourceCode default">format</code> enumerator associated
with <code class="sourceCode default">as_utfN</code>. If
<code class="sourceCode default">decltype((E))</code> does not model
<code class="sourceCode default">utf_range_like</code>,
<code class="sourceCode default">as_utfN(E)</code> is ill-formed. The
expression <code class="sourceCode default">as_utfN(E)</code> is
expression-equivalent to:</p>
<ul>
<li><p>If <code class="sourceCode default">T</code> is a specialization
of <code class="sourceCode default">empty_view</code>
([range.empty.view]), then <code class="sourceCode default">empty_view&lt;<em>format-to-type-t</em>&lt;F&gt;&gt;{}</code>.</p></li>
<li><p>Otherwise, if <code class="sourceCode default">T</code> is a
specialization of <code class="sourceCode default">utfN_view</code>,
then <code class="sourceCode default">V(E.base())</code>.</p></li>
<li><p>Otherwise, if <code class="sourceCode default">T</code> is a
specialization of <code class="sourceCode default">charN_view</code>,
then <code class="sourceCode default">V(E)</code>.</p></li>
<li><p>Otherwise, if
<code class="sourceCode default">is_pointer_v&lt;T&gt;</code> is
<code class="sourceCode default">true</code>, then <code class="sourceCode default">V(ranges::subrange(E, null_sentinel))</code>.</p></li>
<li><p>Otherwise,
<code class="sourceCode default">V(<em>unpack-range</em>(E))</code>.</p></li>
</ul>
<p>[Example 1:</p>
<div class="sourceCode" id="cb28"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb28-1"><a href="#cb28-1" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>u32string s <span class="op">=</span> <span class="st">U&quot;Unicode&quot;</span>;</span>
<span id="cb28-2"><a href="#cb28-2" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> <span class="op">(</span><span class="dt">char8_t</span> c <span class="op">:</span> s <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf8<span class="op">)</span></span>
<span id="cb28-3"><a href="#cb28-3" aria-hidden="true" tabindex="-1"></a>  cout <span class="op">&lt;&lt;</span> <span class="op">(</span><span class="dt">char</span><span class="op">)</span>c <span class="op">&lt;&lt;</span> <span class="ch">&#39; &#39;</span>; <span class="co">// prints U n i c o d e </span></span></code></pre></div>
<p>— end example]</p>
<p>[Example 2:</p>
<div class="sourceCode" id="cb29"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb29-1"><a href="#cb29-1" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> <span class="op">*</span> s <span class="op">=</span> <span class="st">L&quot;is weird.&quot;</span>;</span>
<span id="cb29-2"><a href="#cb29-2" aria-hidden="true" tabindex="-1"></a><span class="cf">for</span> <span class="op">(</span><span class="dt">char8_t</span> c <span class="op">:</span> s <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf8<span class="op">)</span></span>
<span id="cb29-3"><a href="#cb29-3" aria-hidden="true" tabindex="-1"></a>  cout <span class="op">&lt;&lt;</span> <span class="op">(</span><span class="dt">char</span><span class="op">)</span>c <span class="op">&lt;&lt;</span> <span class="ch">&#39; &#39;</span>; <span class="co">// prints i s   w e i r d . </span></span></code></pre></div>
<p>— end example]</p>
<p>The <code class="sourceCode default">ostream</code> and
<code class="sourceCode default">wostream</code> stream insertion operators
transcode the <code class="sourceCode default">utf_view</code> to UTF-8
and UTF-16, respectively, if transcoding is needed. The
<code class="sourceCode default">wostream</code> overload is only
defined on Windows.</p>
<h3 data-number="5.6.1" id="why-there-are-three-utfn_views-views-plus-utf_view"><span class="header-section-number">5.6.1</span> Why there are three
<code class="sourceCode default">utfN_view</code>s plus
<code class="sourceCode default">utf_view</code><a href="#why-there-are-three-utfn_views-views-plus-utf_view" class="self-link"></a></h3>
<p>The views in <code class="sourceCode default">std::ranges</code> are
constrained to accept only
<code class="sourceCode default">std::ranges::view</code> template
parameters. However, they accept
<code class="sourceCode default">std::ranges::viewable_range</code>s in
practice, because they each have a deduction guide that looks like
this:</p>
<div class="sourceCode" id="cb30"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb30-1"><a href="#cb30-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> R<span class="op">&gt;</span></span>
<span id="cb30-2"><a href="#cb30-2" aria-hidden="true" tabindex="-1"></a>utf8_view<span class="op">(</span>R <span class="op">&amp;&amp;)</span> <span class="op">-&gt;</span> utf8_view<span class="op">&lt;</span>views<span class="op">::</span>all_t<span class="op">&lt;</span>R<span class="op">&gt;&gt;</span>;</span></code></pre></div>
<p>It’s not possible to make this work for
<code class="sourceCode default">utf_view</code>, since to use it you
must supply a <code class="sourceCode default">format</code> NTTP. So,
we need the <code class="sourceCode default">utfN_view</code>s. It might
be possible to make <code class="sourceCode default">utf_view</code> an
exposition-only implementation detail, but I think some users might find
use for it, especially in generic contexts. For instance:</p>
<div class="sourceCode" id="cb31"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb31-1"><a href="#cb31-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>format F, <span class="kw">typename</span> V<span class="op">&gt;</span></span>
<span id="cb31-2"><a href="#cb31-2" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> f<span class="op">(</span>std<span class="op">::</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>F, V<span class="op">&gt;</span> <span class="kw">const</span> <span class="op">&amp;</span> view<span class="op">)</span> <span class="op">{</span></span>
<span id="cb31-3"><a href="#cb31-3" aria-hidden="true" tabindex="-1"></a>    <span class="co">// Use F, V, and view here....</span></span>
<span id="cb31-4"><a href="#cb31-4" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<h3 data-number="5.6.2" id="unpacking_owning_view"><span class="header-section-number">5.6.2</span>
<code class="sourceCode default">unpacking_owning_view</code><a href="#unpacking_owning_view" class="self-link"></a></h3>
<p>To see why
<code class="sourceCode default">unpacking_owning_view</code> is needed,
consider:</p>
<div class="sourceCode" id="cb32"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb32-1"><a href="#cb32-1" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> my_text_type</span>
<span id="cb32-2"><a href="#cb32-2" aria-hidden="true" tabindex="-1"></a><span class="op">{</span></span>
<span id="cb32-3"><a href="#cb32-3" aria-hidden="true" tabindex="-1"></a>    my_text_type<span class="op">()</span> <span class="op">=</span> <span class="cf">default</span>;</span>
<span id="cb32-4"><a href="#cb32-4" aria-hidden="true" tabindex="-1"></a>    my_text_type<span class="op">(</span>std<span class="op">::</span>u8string utf8<span class="op">)</span> <span class="op">:</span> utf8_<span class="op">(</span>std<span class="op">::</span>move<span class="op">(</span>utf8<span class="op">))</span> <span class="op">{}</span></span>
<span id="cb32-5"><a href="#cb32-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb32-6"><a href="#cb32-6" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> begin<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> utf_8_to_32_iterator<span class="op">(</span>utf8_<span class="op">.</span>begin<span class="op">()</span>, utf8_<span class="op">.</span>begin<span class="op">()</span>, utf8_<span class="op">.</span>end<span class="op">())</span>; <span class="op">}</span></span>
<span id="cb32-7"><a href="#cb32-7" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> end<span class="op">()</span> <span class="kw">const</span> <span class="op">{</span> <span class="cf">return</span> utf_8_to_32_iterator<span class="op">(</span>utf8_<span class="op">.</span>begin<span class="op">()</span>, utf8_<span class="op">.</span>end<span class="op">()</span>, utf8_<span class="op">.</span>end<span class="op">())</span>; <span class="op">}</span></span>
<span id="cb32-8"><a href="#cb32-8" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb32-9"><a href="#cb32-9" aria-hidden="true" tabindex="-1"></a><span class="kw">private</span><span class="op">:</span></span>
<span id="cb32-10"><a href="#cb32-10" aria-hidden="true" tabindex="-1"></a>    std<span class="op">::</span>u8string utf8_;</span>
<span id="cb32-11"><a href="#cb32-11" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span>
<span id="cb32-12"><a href="#cb32-12" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb32-13"><a href="#cb32-13" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> f<span class="op">()</span> <span class="op">{</span></span>
<span id="cb32-14"><a href="#cb32-14" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> view <span class="op">=</span> my_text_type<span class="op">(</span><span class="st">u8&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16;</span>
<span id="cb32-15"><a href="#cb32-15" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>This type <code class="sourceCode default">my_text_type</code> is a
bit odd. We cannot just unpack a
<code class="sourceCode default">my_text_type</code> rvalue and store
the resulting unpacked range in a
<code class="sourceCode default">utf_view</code>, because that would
create a view with dangling iterators. Nor can we just store the
not-unpacked <code class="sourceCode default">my_text_type</code>,
because its iterator type is then not unpacked, and
<code class="sourceCode default">utf_view::begin()</code> would not work
as written above.</p>
<p>We could just store
<code class="sourceCode default">utf_view::base_</code> as it is given
to us (that is, as a <code class="sourceCode default">decltype(r)</code>
when the user writes
<code class="sourceCode default">r | as_utfN</code>), then unpack it and
form a <code class="sourceCode default">utf_iterator</code> in each of
<code class="sourceCode default">utf_view::begin()</code> and
<code class="sourceCode default">utf_view::end()</code>.</p>
<p>That approach has drawbacks, however. The unpacking
logic currently exists in the
<code class="sourceCode default">as_utfN</code> adaptors, and moving it
into <code class="sourceCode default">utf_view</code> seems to be at
odds with how the existing standard adaptors work – the adaptors usually
contain as much of the adaptation logic as possible, leaving the view
itself comparatively simple. It also seems a shame to repeatedly unpack
in <code class="sourceCode default">utf_view::begin()</code> and
<code class="sourceCode default">utf_view::end()</code> when, for the
vast majority of cases, just unpacking once in the adaptor would
suffice. (Remember, we only need to make this special case of an rvalue
with unpackable iterators work, like <code class="sourceCode default">my_text_type(u8&quot;text&quot;) | std::uc::as_utf16</code>.)</p>
<p>Another option would be to give
<code class="sourceCode default">utf_view</code> and all the
<code class="sourceCode default">utfN_view</code>s knowledge of the
special case, in the form of a
<code class="sourceCode default">bool</code> template parameter
indicating that the adapted view needs to be unpacked. Adding an NTTP
like that to <code class="sourceCode default">utfN_view</code> would
make it awkward to use outside the context of the
<code class="sourceCode default">as_utfN</code> adaptors. For
instance:</p>
<div class="sourceCode" id="cb33"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb33-1"><a href="#cb33-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span><span class="kw">typename</span> V<span class="op">&gt;</span></span>
<span id="cb33-2"><a href="#cb33-2" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> f<span class="op">(</span>std<span class="op">::</span>uc<span class="op">::</span>utf32_view<span class="op">&lt;</span>V, <span class="co">/* ??? */</span><span class="op">&gt;</span> <span class="kw">const</span> <span class="op">&amp;</span> utf32<span class="op">)</span> <span class="op">{</span></span>
<span id="cb33-3"><a href="#cb33-3" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>What do we write for the
<code class="sourceCode default">/* ??? */</code>, the NTTP that
indicates whether <code class="sourceCode default">V</code> is already
unpacked or still needs to be unpacked? We have to do a nontrivial
amount of work involving
<code class="sourceCode default">V</code> to know what to write
there.</p>
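<p>The idea of owning the range and unpacking on demand, which this section argues for, can be modelled in a few lines. Everything in the sketch below is hypothetical: <code class="sourceCode default">packed_iterator</code>, <code class="sourceCode default">packed_text</code>, and <code class="sourceCode default">unpacking_owner_sketch</code> are simplified stand-ins that unpack through a plain <code class="sourceCode default">base()</code> call. The proposed <code class="sourceCode default">unpacking_owning_view</code> uses <code class="sourceCode default">unpack_iterator_and_sentinel</code> instead and is a proper view, which this model is not.</p>

```cpp
#include <string>
#include <utility>

// Hypothetical "packed" iterator: wraps a code unit iterator and exposes it
// through base(), loosely modelling a transcoding iterator.
struct packed_iterator {
  std::string::const_iterator it;
  std::string::const_iterator base() const { return it; }
};

// Hypothetical text type whose begin()/end() return packed iterators.
struct packed_text {
  std::string storage;
  packed_iterator begin() const { return {storage.begin()}; }
  packed_iterator end() const { return {storage.end()}; }
};

// Simplified model: own the range, and unpack its iterators each time
// begin()/end() is called, so they always point into storage we keep alive.
template<class R>
class unpacking_owner_sketch {
public:
  explicit unpacking_owner_sketch(R&& r) : r_(std::move(r)) {}
  auto begin() const { return r_.begin().base(); }
  auto end() const { return r_.end().base(); }
private:
  R r_;
};

// Builds an owner from a temporary and copies out the unpacked code units.
// The iterators stay valid because the owner holds the moved-in range.
inline std::string collect_units(std::string s) {
  unpacking_owner_sketch<packed_text> owner(packed_text{std::move(s)});
  return std::string(owner.begin(), owner.end());
}
```

<p>Unpacking the temporary first and storing only the resulting iterators would leave them pointing into a destroyed <code class="sourceCode default">packed_text</code>, which is the dangling case described above.</p>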
<h3 data-number="5.6.3" id="more-examples"><span class="header-section-number">5.6.3</span> More examples<a href="#more-examples" class="self-link"></a></h3>
<div class="sourceCode" id="cb34"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb34-1"><a href="#cb34-1" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb34-2"><a href="#cb34-2" aria-hidden="true" tabindex="-1"></a>              <span class="kw">decltype</span><span class="op">(</span>my_text_type<span class="op">(</span><span class="st">u8&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb34-3"><a href="#cb34-3" aria-hidden="true" tabindex="-1"></a>              std<span class="op">::</span>uc<span class="op">::</span>utf16_view<span class="op">&lt;</span>std<span class="op">::</span>uc<span class="op">::</span>unpacking_owning_view<span class="op">&lt;</span></span>
<span id="cb34-4"><a href="#cb34-4" aria-hidden="true" tabindex="-1"></a>                  my_text_type,</span>
<span id="cb34-5"><a href="#cb34-5" aria-hidden="true" tabindex="-1"></a>                  std<span class="op">::</span>ranges<span class="op">::</span>subrange<span class="op">&lt;</span>std<span class="op">::</span>u8string<span class="op">::</span>const_iterator<span class="op">&gt;&gt;&gt;&gt;)</span>;</span>
<span id="cb34-6"><a href="#cb34-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb34-7"><a href="#cb34-7" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb34-8"><a href="#cb34-8" aria-hidden="true" tabindex="-1"></a>              <span class="kw">decltype</span><span class="op">(</span><span class="st">u8&quot;text&quot;</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb34-9"><a href="#cb34-9" aria-hidden="true" tabindex="-1"></a>              std<span class="op">::</span>uc<span class="op">::</span>utf16_view<span class="op">&lt;</span>std<span class="op">::</span>ranges<span class="op">::</span>subrange<span class="op">&lt;</span><span class="kw">const</span> <span class="dt">char8_t</span> <span class="op">*&gt;&gt;&gt;)</span>;</span>
<span id="cb34-10"><a href="#cb34-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb34-11"><a href="#cb34-11" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb34-12"><a href="#cb34-12" aria-hidden="true" tabindex="-1"></a>              <span class="kw">decltype</span><span class="op">(</span>std<span class="op">::</span>u8string<span class="op">(</span><span class="st">u8&quot;text&quot;</span><span class="op">)</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb34-13"><a href="#cb34-13" aria-hidden="true" tabindex="-1"></a>              std<span class="op">::</span>uc<span class="op">::</span>utf16_view<span class="op">&lt;</span>std<span class="op">::</span>ranges<span class="op">::</span>owning_view<span class="op">&lt;</span>std<span class="op">::</span>u8string<span class="op">&gt;&gt;&gt;)</span>;</span>
<span id="cb34-14"><a href="#cb34-14" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb34-15"><a href="#cb34-15" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>u8string str <span class="op">=</span> <span class="st">u8&quot;text&quot;</span>;</span>
<span id="cb34-16"><a href="#cb34-16" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb34-17"><a href="#cb34-17" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb34-18"><a href="#cb34-18" aria-hidden="true" tabindex="-1"></a>              <span class="kw">decltype</span><span class="op">(</span>str <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb34-19"><a href="#cb34-19" aria-hidden="true" tabindex="-1"></a>              std<span class="op">::</span>uc<span class="op">::</span>utf16_view<span class="op">&lt;</span></span>
<span id="cb34-20"><a href="#cb34-20" aria-hidden="true" tabindex="-1"></a>                  std<span class="op">::</span>ranges<span class="op">::</span>subrange<span class="op">&lt;</span>std<span class="op">::</span>u8string<span class="op">::</span>iterator<span class="op">&gt;&gt;&gt;)</span>;</span>
<span id="cb34-21"><a href="#cb34-21" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb34-22"><a href="#cb34-22" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb34-23"><a href="#cb34-23" aria-hidden="true" tabindex="-1"></a>              <span class="kw">decltype</span><span class="op">(</span>str<span class="op">.</span>c_str<span class="op">()</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb34-24"><a href="#cb34-24" aria-hidden="true" tabindex="-1"></a>              std<span class="op">::</span>uc<span class="op">::</span>utf16_view<span class="op">&lt;</span>std<span class="op">::</span>ranges<span class="op">::</span>subrange<span class="op">&lt;</span></span>
<span id="cb34-25"><a href="#cb34-25" aria-hidden="true" tabindex="-1"></a>                  <span class="kw">const</span> <span class="dt">char8_t</span> <span class="op">*</span>,</span>
<span id="cb34-26"><a href="#cb34-26" aria-hidden="true" tabindex="-1"></a>                  std<span class="op">::</span>uc<span class="op">::</span>null_sentinel_t<span class="op">&gt;&gt;&gt;)</span>;</span>
<span id="cb34-27"><a href="#cb34-27" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb34-28"><a href="#cb34-28" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb34-29"><a href="#cb34-29" aria-hidden="true" tabindex="-1"></a>              <span class="kw">decltype</span><span class="op">(</span>std<span class="op">::</span>ranges<span class="op">::</span>empty_view<span class="op">&lt;</span><span class="dt">char32_t</span><span class="op">&gt;{}</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb34-30"><a href="#cb34-30" aria-hidden="true" tabindex="-1"></a>              std<span class="op">::</span>ranges<span class="op">::</span>empty_view<span class="op">&lt;</span><span class="dt">char16_t</span><span class="op">&gt;&gt;)</span>;</span>
<span id="cb34-31"><a href="#cb34-31" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb34-32"><a href="#cb34-32" aria-hidden="true" tabindex="-1"></a>std<span class="op">::</span>u16string str2 <span class="op">=</span> <span class="st">u&quot;text&quot;</span>;</span>
<span id="cb34-33"><a href="#cb34-33" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb34-34"><a href="#cb34-34" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb34-35"><a href="#cb34-35" aria-hidden="true" tabindex="-1"></a>              <span class="kw">decltype</span><span class="op">(</span>str2 <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb34-36"><a href="#cb34-36" aria-hidden="true" tabindex="-1"></a>              std<span class="op">::</span>uc<span class="op">::</span>utf16_view<span class="op">&lt;</span></span>
<span id="cb34-37"><a href="#cb34-37" aria-hidden="true" tabindex="-1"></a>                  std<span class="op">::</span>ranges<span class="op">::</span>subrange<span class="op">&lt;</span>std<span class="op">::</span>u16string<span class="op">::</span>iterator<span class="op">&gt;&gt;&gt;)</span>;</span>
<span id="cb34-38"><a href="#cb34-38" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb34-39"><a href="#cb34-39" aria-hidden="true" tabindex="-1"></a><span class="kw">static_assert</span><span class="op">(</span>std<span class="op">::</span>is_same_v<span class="op">&lt;</span></span>
<span id="cb34-40"><a href="#cb34-40" aria-hidden="true" tabindex="-1"></a>              <span class="kw">decltype</span><span class="op">(</span>str2<span class="op">.</span>c_str<span class="op">()</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf16<span class="op">)</span>,</span>
<span id="cb34-41"><a href="#cb34-41" aria-hidden="true" tabindex="-1"></a>              std<span class="op">::</span>uc<span class="op">::</span>utf16_view<span class="op">&lt;</span>std<span class="op">::</span>ranges<span class="op">::</span>subrange<span class="op">&lt;</span></span>
<span id="cb34-42"><a href="#cb34-42" aria-hidden="true" tabindex="-1"></a>                  <span class="kw">const</span> <span class="dt">char16_t</span> <span class="op">*</span>,</span>
<span id="cb34-43"><a href="#cb34-43" aria-hidden="true" tabindex="-1"></a>                  std<span class="op">::</span>uc<span class="op">::</span>null_sentinel_t<span class="op">&gt;&gt;&gt;)</span>;</span></code></pre></div>
<h3 data-number="5.6.4" id="why-utf_view-always-uses-utf_iterator-even-in-utf-n-to-utf-n-cases"><span class="header-section-number">5.6.4</span> Why
<code class="sourceCode default">utf_view</code> always uses
<code class="sourceCode default">utf_iterator</code>, even in UTF-N to
UTF-N cases<a href="#why-utf_view-always-uses-utf_iterator-even-in-utf-n-to-utf-n-cases" class="self-link"></a></h3>
<p>You might expect that if <code class="sourceCode default">r</code> in
<code class="sourceCode default">r | as_utfN</code> is already in UTF-N,
<code class="sourceCode default">r | as_utfN</code> might just be
<code class="sourceCode default">r</code>. This is not what the
<code class="sourceCode default">as_utfN</code> adaptors do, though.</p>
<p>The adaptors each produce a view
<code class="sourceCode default">utfv</code> that stores a view of type
<code class="sourceCode default">V</code>, where
<code class="sourceCode default">V</code> is made from the result of
unpacking <code class="sourceCode default">r</code>. Further,
<code class="sourceCode default">utfv.begin()</code> is always a
specialization of <code class="sourceCode default">utf_iterator</code>.
<code class="sourceCode default">utfv.end()</code> is also a
specialization of <code class="sourceCode default">utf_iterator</code>
(if <code class="sourceCode default">common_range&lt;V&gt;</code>), or
otherwise the sentinel value for
<code class="sourceCode default">V</code>.</p>
<p>This gives <code class="sourceCode default">r | as_utfN</code> some
nice, consistent properties. With the exception of
<code class="sourceCode default">empty_view&lt;T&gt;{} | as_utfN</code>,
the following are always true:</p>
<ul>
<li><p><code class="sourceCode default">r | as_utfN</code> produces
well-formed UTF. The adaptors always use the default
<code class="sourceCode default">ErrorHandler</code> template parameter
to <code class="sourceCode default">utf_iterator</code>,
<code class="sourceCode default">use_replacement_character</code>, so
any ill-formed UTF is replaced with
<code class="sourceCode default">replacement_character</code>. This is
true even when the input was already UTF-N; remember, UTF-N input can
still contain ill-formed sequences.</p></li>
<li><p><code class="sourceCode default">r | as_utfN</code> has a
consistent API. If <code class="sourceCode default">r | as_utfN</code>
were sometimes <code class="sourceCode default">r</code>, and since
<code class="sourceCode default">r</code> may be a reference to an
array, you’d have to use
<code class="sourceCode default">std::ranges::begin(r)</code> and
<code class="sourceCode default">::end(r)</code> all the time. However,
you’d probably write <code class="sourceCode default">r.begin()</code>
and <code class="sourceCode default">r.end()</code>, only to one day get
bitten by an array-reference
<code class="sourceCode default">r</code>.</p></li>
<li><p><code class="sourceCode default">r | as_utfN</code> is
formattable/printable. This means you can adapt anything that can be
UTF-transcoded to do I/O in a consistent way. For example:</p></li>
</ul>
<div class="sourceCode" id="cb35"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb35-1"><a href="#cb35-1" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> str0 <span class="op">=</span> std<span class="op">::</span>format<span class="op">(</span><span class="st">&quot;{}&quot;</span>, std<span class="op">::</span>u8string<span class="op">{})</span>;                    <span class="co">// Error: ill-formed!</span></span>
<span id="cb35-2"><a href="#cb35-2" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> str1 <span class="op">=</span> std<span class="op">::</span>format<span class="op">(</span><span class="st">&quot;{}&quot;</span>, std<span class="op">::</span>u8string<span class="op">{}</span> <span class="op">|</span> std<span class="op">::</span>uc<span class="op">::</span>as_utf8<span class="op">)</span>; <span class="co">// Ok.</span></span></code></pre></div>
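<p>The array-reference hazard from the second bullet can be sketched in
standard C++20 alone. The helper <code class="sourceCode default">count_units</code>
below is a hypothetical name used only for illustration and is not part
of this proposal. It shows that generic code has to go through
<code class="sourceCode default">std::ranges::begin</code> and
<code class="sourceCode default">std::ranges::end</code> to accept
arrays, which is the burden that always returning a view removes.</p>
<div class="sourceCode"><pre class="sourceCode cpp"><code class="sourceCode cpp">#include &lt;cstddef&gt;
#include &lt;iterator&gt;
#include &lt;ranges&gt;

// Hypothetical helper: counts the code units in any forward range.
template &lt;std::ranges::forward_range R&gt;
std::size_t count_units(R&amp;&amp; r) {
    // std::ranges::begin/end accept arrays as well as class types;
    // r.begin() and r.end() would fail to compile when R is an array reference.
    return static_cast&lt;std::size_t&gt;(
        std::ranges::distance(std::ranges::begin(r), std::ranges::end(r)));
}</code></pre></div>
<p>With this helper, both a <code class="sourceCode default">char8_t</code>
array and a <code class="sourceCode default">std::u8string</code> are
accepted; a version written with
<code class="sourceCode default">r.begin()</code> would reject the
array.</p>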
<h3 data-number="5.6.5" id="add-utf_view-specialization-of-formatter"><span class="header-section-number">5.6.5</span> Add
<code class="sourceCode default">utf_view</code> specialization of
<code class="sourceCode default">formatter</code><a href="#add-utf_view-specialization-of-formatter" class="self-link"></a></h3>
<p>These specializations should be added to the list of “the debug-enabled string type
specializations” in [format.formatter.spec]. This allows
<code class="sourceCode default">utf_view</code> and
<code class="sourceCode default">utfN_view</code> to be used in
<code class="sourceCode default">std::format()</code> and
<code class="sourceCode default">std::print()</code>. The intention is
that the formatter will transcode to UTF-8 if the formatter’s
<code class="sourceCode default">CharT</code> is
<code class="sourceCode default">char</code>, or to UTF-16 or UTF-32
(which one is implementation defined) if the formatter’s
<code class="sourceCode default">CharT</code> is
<code class="sourceCode default">wchar_t</code> – if transcoding is
necessary at all.</p>
<div class="sourceCode" id="cb36"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb36-1"><a href="#cb36-1" aria-hidden="true" tabindex="-1"></a><span class="kw">namespace</span> std <span class="op">{</span></span>
<span id="cb36-2"><a href="#cb36-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span>uc<span class="op">::</span>format Format, <span class="kw">class</span> V, <span class="kw">class</span> CharT<span class="op">&gt;</span></span>
<span id="cb36-3"><a href="#cb36-3" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> formatter<span class="op">&lt;</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>Format, V<span class="op">&gt;</span>, CharT<span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb36-4"><a href="#cb36-4" aria-hidden="true" tabindex="-1"></a>  <span class="kw">private</span><span class="op">:</span></span>
<span id="cb36-5"><a href="#cb36-5" aria-hidden="true" tabindex="-1"></a>    formatter<span class="op">&lt;</span>basic_string<span class="op">&lt;</span>CharT<span class="op">&gt;</span>, CharT<span class="op">&gt;</span> <em>underlying_</em>;                <span class="co">// <em>exposition only</em></span></span>
<span id="cb36-6"><a href="#cb36-6" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb36-7"><a href="#cb36-7" aria-hidden="true" tabindex="-1"></a>  <span class="kw">public</span><span class="op">:</span></span>
<span id="cb36-8"><a href="#cb36-8" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> ParseContext<span class="op">&gt;</span></span>
<span id="cb36-9"><a href="#cb36-9" aria-hidden="true" tabindex="-1"></a>      <span class="kw">constexpr</span> <span class="kw">typename</span> ParseContext<span class="op">::</span>iterator</span>
<span id="cb36-10"><a href="#cb36-10" aria-hidden="true" tabindex="-1"></a>        parse<span class="op">(</span>ParseContext<span class="op">&amp;</span> ctx<span class="op">)</span>;</span>
<span id="cb36-11"><a href="#cb36-11" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb36-12"><a href="#cb36-12" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> FormatContext<span class="op">&gt;</span></span>
<span id="cb36-13"><a href="#cb36-13" aria-hidden="true" tabindex="-1"></a>      <span class="kw">typename</span> FormatContext<span class="op">::</span>iterator</span>
<span id="cb36-14"><a href="#cb36-14" aria-hidden="true" tabindex="-1"></a>        format<span class="op">(</span><span class="kw">const</span> uc<span class="op">::</span>utf_view<span class="op">&lt;</span>Format, V<span class="op">&gt;&amp;</span> view, FormatContext<span class="op">&amp;</span> ctx<span class="op">)</span> <span class="kw">const</span>;</span>
<span id="cb36-15"><a href="#cb36-15" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb36-16"><a href="#cb36-16" aria-hidden="true" tabindex="-1"></a>    <span class="kw">constexpr</span> <span class="dt">void</span> set_debug_format<span class="op">()</span> <span class="kw">noexcept</span>;</span>
<span id="cb36-17"><a href="#cb36-17" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb36-18"><a href="#cb36-18" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb36-19"><a href="#cb36-19" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V, <span class="kw">class</span> CharT<span class="op">&gt;</span></span>
<span id="cb36-20"><a href="#cb36-20" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> formatter<span class="op">&lt;</span>uc<span class="op">::</span>utf8_view<span class="op">&lt;</span>V<span class="op">&gt;</span>, CharT<span class="op">&gt;</span> <span class="op">:</span> formatter<span class="op">&lt;</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>uc<span class="op">::</span>format<span class="op">::</span>utf8, V<span class="op">&gt;</span>, CharT<span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb36-21"><a href="#cb36-21" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> FormatContext<span class="op">&gt;</span></span>
<span id="cb36-22"><a href="#cb36-22" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> format<span class="op">(</span><span class="kw">const</span> uc<span class="op">::</span>utf8_view<span class="op">&lt;</span>V<span class="op">&gt;&amp;</span> view, FormatContext<span class="op">&amp;</span> ctx<span class="op">)</span> <span class="kw">const</span> <span class="op">{</span></span>
<span id="cb36-23"><a href="#cb36-23" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> formatter<span class="op">&lt;</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>uc<span class="op">::</span>format<span class="op">::</span>utf8, V<span class="op">&gt;</span>,CharT<span class="op">&gt;::</span>format<span class="op">(</span>view, ctx<span class="op">)</span>;</span>
<span id="cb36-24"><a href="#cb36-24" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb36-25"><a href="#cb36-25" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb36-26"><a href="#cb36-26" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb36-27"><a href="#cb36-27" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V, <span class="kw">class</span> CharT<span class="op">&gt;</span></span>
<span id="cb36-28"><a href="#cb36-28" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> formatter<span class="op">&lt;</span>uc<span class="op">::</span>utf16_view<span class="op">&lt;</span>V<span class="op">&gt;</span>, CharT<span class="op">&gt;</span> <span class="op">:</span> formatter<span class="op">&lt;</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>uc<span class="op">::</span>format<span class="op">::</span>utf16, V<span class="op">&gt;</span>, CharT<span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb36-29"><a href="#cb36-29" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> FormatContext<span class="op">&gt;</span></span>
<span id="cb36-30"><a href="#cb36-30" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> format<span class="op">(</span><span class="kw">const</span> uc<span class="op">::</span>utf16_view<span class="op">&lt;</span>V<span class="op">&gt;&amp;</span> view, FormatContext<span class="op">&amp;</span> ctx<span class="op">)</span> <span class="kw">const</span> <span class="op">{</span></span>
<span id="cb36-31"><a href="#cb36-31" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> formatter<span class="op">&lt;</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>uc<span class="op">::</span>format<span class="op">::</span>utf16, V<span class="op">&gt;</span>,CharT<span class="op">&gt;::</span>format<span class="op">(</span>view, ctx<span class="op">)</span>;</span>
<span id="cb36-32"><a href="#cb36-32" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb36-33"><a href="#cb36-33" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb36-34"><a href="#cb36-34" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb36-35"><a href="#cb36-35" aria-hidden="true" tabindex="-1"></a>  <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> V, <span class="kw">class</span> CharT<span class="op">&gt;</span></span>
<span id="cb36-36"><a href="#cb36-36" aria-hidden="true" tabindex="-1"></a>  <span class="kw">struct</span> formatter<span class="op">&lt;</span>uc<span class="op">::</span>utf32_view<span class="op">&lt;</span>V<span class="op">&gt;</span>, CharT<span class="op">&gt;</span> <span class="op">:</span> formatter<span class="op">&lt;</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>uc<span class="op">::</span>format<span class="op">::</span>utf32, V<span class="op">&gt;</span>, CharT<span class="op">&gt;</span> <span class="op">{</span></span>
<span id="cb36-37"><a href="#cb36-37" aria-hidden="true" tabindex="-1"></a>    <span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> FormatContext<span class="op">&gt;</span></span>
<span id="cb36-38"><a href="#cb36-38" aria-hidden="true" tabindex="-1"></a>    <span class="kw">auto</span> format<span class="op">(</span><span class="kw">const</span> uc<span class="op">::</span>utf32_view<span class="op">&lt;</span>V<span class="op">&gt;&amp;</span> view, FormatContext<span class="op">&amp;</span> ctx<span class="op">)</span> <span class="kw">const</span> <span class="op">{</span></span>
<span id="cb36-39"><a href="#cb36-39" aria-hidden="true" tabindex="-1"></a>      <span class="cf">return</span> formatter<span class="op">&lt;</span>uc<span class="op">::</span>utf_view<span class="op">&lt;</span>uc<span class="op">::</span>format<span class="op">::</span>utf32, V<span class="op">&gt;</span>,CharT<span class="op">&gt;::</span>format<span class="op">(</span>view, ctx<span class="op">)</span>;</span>
<span id="cb36-40"><a href="#cb36-40" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb36-41"><a href="#cb36-41" aria-hidden="true" tabindex="-1"></a>  <span class="op">}</span>;</span>
<span id="cb36-42"><a href="#cb36-42" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<div class="sourceCode" id="cb37"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb37-1"><a href="#cb37-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> ParseContext<span class="op">&gt;</span></span>
<span id="cb37-2"><a href="#cb37-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">constexpr</span> <span class="kw">typename</span> ParseContext<span class="op">::</span>iterator</span>
<span id="cb37-3"><a href="#cb37-3" aria-hidden="true" tabindex="-1"></a>    parse<span class="op">(</span>ParseContext<span class="op">&amp;</span> ctx<span class="op">)</span>;</span></code></pre></div>
<p>Effects: Equivalent to:</p>
<div class="sourceCode" id="cb38"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb38-1"><a href="#cb38-1" aria-hidden="true" tabindex="-1"></a><span class="cf">return</span> <em>underlying_</em><span class="op">.</span>parse<span class="op">(</span>ctx<span class="op">)</span>;</span></code></pre></div>
<div class="sourceCode" id="cb39"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb39-1"><a href="#cb39-1" aria-hidden="true" tabindex="-1"></a><span class="kw">template</span><span class="op">&lt;</span><span class="kw">class</span> FormatContext<span class="op">&gt;</span></span>
<span id="cb39-2"><a href="#cb39-2" aria-hidden="true" tabindex="-1"></a>  <span class="kw">typename</span> FormatContext<span class="op">::</span>iterator</span>
<span id="cb39-3"><a href="#cb39-3" aria-hidden="true" tabindex="-1"></a>    format<span class="op">(</span><span class="kw">const</span> uc<span class="op">::</span>utf_view<span class="op">&lt;</span>Format, V<span class="op">&gt;&amp;</span> view, FormatContext<span class="op">&amp;</span> ctx<span class="op">)</span> <span class="kw">const</span>;</span></code></pre></div>
<p>Effects: Equivalent to:</p>
<div class="sourceCode" id="cb40"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb40-1"><a href="#cb40-1" aria-hidden="true" tabindex="-1"></a><span class="kw">auto</span> adaptor <span class="op">=</span> <em>see below</em>;</span>
<span id="cb40-2"><a href="#cb40-2" aria-hidden="true" tabindex="-1"></a><span class="cf">return</span> <em>underlying_</em><span class="op">.</span>format<span class="op">(</span>basic_string<span class="op">&lt;</span>CharT<span class="op">&gt;(</span>from_range, view <span class="op">|</span> adaptor<span class="op">)</span>, ctx<span class="op">)</span>;</span></code></pre></div>
<p><code class="sourceCode default">adaptor</code> is
<code class="sourceCode default">uc::as_utf8</code> if
<code class="sourceCode default">CharT</code> is
<code class="sourceCode default">char</code>. Otherwise, it is
implementation defined whether
<code class="sourceCode default">adaptor</code> is
<code class="sourceCode default">uc::as_utf16</code> or
<code class="sourceCode default">uc::as_utf32</code>.</p>
<div class="sourceCode" id="cb41"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb41-1"><a href="#cb41-1" aria-hidden="true" tabindex="-1"></a><span class="kw">constexpr</span> <span class="dt">void</span> set_debug_format<span class="op">()</span> <span class="kw">noexcept</span>;</span></code></pre></div>
<p>Effects: Equivalent to:</p>
<div class="sourceCode" id="cb42"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb42-1"><a href="#cb42-1" aria-hidden="true" tabindex="-1"></a><em>underlying_</em><span class="op">.</span>set_debug_format<span class="op">()</span>;</span></code></pre></div>
<h2 data-number="5.7" id="add-a-feature-test-macro"><span class="header-section-number">5.7</span> Add a feature test macro<a href="#add-a-feature-test-macro" class="self-link"></a></h2>
<p>Add the feature test macro
<code class="sourceCode default">__cpp_lib_unicode_transcoding</code>.</p>
<h2 data-number="5.8" id="design-notes"><span class="header-section-number">5.8</span> Design notes<a href="#design-notes" class="self-link"></a></h2>
<p>None of the proposed interfaces is subject to change in future
versions of Unicode; each relates to the guaranteed-stable subset. Just
sayin’.</p>
<p>None of the proposed interfaces allocates or throws, unless the user
supplies a throwing <code class="sourceCode default">ErrorHandler</code>
template parameter to
<code class="sourceCode default">utf_iterator</code>.</p>
<p>The proposed interfaces allow users to choose amongst multiple
convenience-vs-compatibility tradeoffs. Explicitly, they are:</p>
<ul>
<li>If you need compatibility with existing iterator-based algorithms
(such as the standard algorithms), use the transcoding iterators.</li>
<li>If you want streamability or the convenience of constructing ranges
with a single use of the <code class="sourceCode default">| as_utfN</code>
adaptor, use the transcoding views.</li>
</ul>
<p>All the transcoding iterators allow you access to the underlying
iterator via <code class="sourceCode default">.base()</code> (except
when adapting an input iterator), following the convention of the
iterator adaptors already in the standard.</p>
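<p>For readers less familiar with that convention,
<code class="sourceCode default">std::reverse_iterator</code> is one
standard adaptor that exposes its underlying iterator through
<code class="sourceCode default">.base()</code>. A minimal sketch using
only the standard library follows; the helper name
<code class="sourceCode default">underlying_offset</code> is
hypothetical.</p>
<div class="sourceCode"><pre class="sourceCode cpp"><code class="sourceCode cpp">#include &lt;cstddef&gt;
#include &lt;iterator&gt;
#include &lt;string&gt;

// Hypothetical helper: offset within s of the iterator wrapped by a reverse iterator.
inline std::ptrdiff_t underlying_offset(const std::u8string&amp; s,
                                        std::u8string::const_reverse_iterator it) {
    // base() hands back the underlying (forward) iterator held by the adaptor.
    return it.base() - s.begin();
}</code></pre></div>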
<p>The transcoding views are lazy, as you’d expect. They also compose
with the standard view adaptors, so transcoding at most 10 UTF-16
code units out of some UTF input can be done with <code class="sourceCode default">foo | std::uc::as_utf16 | std::ranges::views::take(10)</code>.</p>
<p>Error handling is explicitly configurable in the transcoding
iterators. This gives control to those who want to do something other
than the default. The default, according to Unicode, is to produce a
replacement character (<code class="sourceCode default">0xfffd</code>)
in the output when broken UTF encoding is seen in the input. This is
what all these interfaces do, unless you configure one of the iterators
as mentioned above.</p>
<p>The production of replacement characters as an error-handling strategy
is good for memory compactness and safety. It allows us to store all our
text as UTF-8 (or, less compactly, as UTF-16), and then process code
points as transcoding views. If an error occurs, the transcoding views
will simply produce a replacement character; there is no danger of
UB.</p>
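<p>To make the replacement strategy concrete, here is a minimal, eager
sketch of a UTF-8 to UTF-32 decoder. It is not the proposed interface:
<code class="sourceCode default">decode_utf8_lossy</code> is a
hypothetical name, it allocates a
<code class="sourceCode default">std::u32string</code> (the proposed
views are lazy and do not allocate), and it emits one replacement
character per rejected code unit, which may differ from the number of
replacement characters the proposed iterators produce for the same
input.</p>
<div class="sourceCode"><pre class="sourceCode cpp"><code class="sourceCode cpp">#include &lt;cstddef&gt;
#include &lt;string&gt;
#include &lt;string_view&gt;

// U+FFFD REPLACEMENT CHARACTER.
inline constexpr char32_t replacement_character = 0xFFFD;

// Hypothetical helper: decodes UTF-8 to UTF-32, substituting U+FFFD for
// ill-formed input instead of reading out of bounds or throwing.
inline std::u32string decode_utf8_lossy(std::string_view in) {
    std::u32string out;
    std::size_t i = 0;
    while (i &lt; in.size()) {
        const unsigned char lead = static_cast&lt;unsigned char&gt;(in[i]);
        std::size_t len = 0; // sequence length announced by the lead byte
        char32_t cp = 0;     // code point accumulated so far
        char32_t min = 0;    // smallest code point allowed for this length
        if (lead &lt; 0x80)                { len = 1; cp = lead;        min = 0; }
        else if ((lead &amp; 0xE0) == 0xC0) { len = 2; cp = lead &amp; 0x1F; min = 0x80; }
        else if ((lead &amp; 0xF0) == 0xE0) { len = 3; cp = lead &amp; 0x0F; min = 0x800; }
        else if ((lead &amp; 0xF8) == 0xF0) { len = 4; cp = lead &amp; 0x07; min = 0x10000; }

        // Reject unknown lead bytes and sequences truncated by the end of input.
        bool ok = len != 0 &amp;&amp; i + len &lt;= in.size();
        for (std::size_t k = 1; ok &amp;&amp; k &lt; len; ++k) {
            const unsigned char c = static_cast&lt;unsigned char&gt;(in[i + k]);
            ok = (c &amp; 0xC0) == 0x80; // must be a continuation byte
            cp = (cp &lt;&lt; 6) | (c &amp; 0x3F);
        }
        // Reject overlong forms, surrogates, and values above U+10FFFF.
        ok = ok &amp;&amp; cp &gt;= min &amp;&amp; cp &lt;= 0x10FFFF &amp;&amp; !(cp &gt;= 0xD800 &amp;&amp; cp &lt;= 0xDFFF);

        if (ok) { out.push_back(cp); i += len; }
        else    { out.push_back(replacement_character); i += 1; }
    }
    return out;
}</code></pre></div>
<p>For example, <code class="sourceCode default">decode_utf8_lossy(&quot;a\xE0&quot;)</code>
yields the two code points U+0061 and U+FFFD: the truncated lead byte
<code class="sourceCode default">0xe0</code> becomes a replacement
character, and nothing past the end of the input is read.</p>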
<p>A null-terminated pointer <code class="sourceCode default">p</code>
to an 8-, 16-, or 32-bit string of code units is considered the implicit
range <code class="sourceCode default">[p, null_sentinel)</code>. This
makes user code much more natural;
<code class="sourceCode default">&quot;foo&quot; | as_utf16</code>,
<code class="sourceCode default">&quot;foo&quot;sv | as_utf16</code>,
and <code class="sourceCode default">&quot;foo&quot;s | as_utf16</code>
are roughly equivalent (though the iterator type of the resulting view
may differ).</p>
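<p>One way such an implicit range could be modelled is with a sentinel
that compares equal to a pointer once the pointee is zero. The sketch
below assumes C++20; <code class="sourceCode default">demo_null_sentinel_t</code>
and <code class="sourceCode default">null_terminated</code> are
hypothetical names chosen for illustration, not the proposed
<code class="sourceCode default">std::uc::null_sentinel_t</code>.</p>
<div class="sourceCode"><pre class="sourceCode cpp"><code class="sourceCode cpp">#include &lt;ranges&gt;

// Hypothetical sentinel marking the end of a null-terminated string.
struct demo_null_sentinel_t {
    // Equal when the pointer has reached the terminating zero code unit.
    // C++20 synthesizes the reversed and != forms from this one operator.
    template &lt;class T&gt;
    friend constexpr bool operator==(const T* p, demo_null_sentinel_t) {
        return *p == T{};
    }
};

// Views a null-terminated string as the range [p, sentinel),
// without a separate pass to compute its length first.
template &lt;class T&gt;
constexpr auto null_terminated(const T* p) {
    return std::ranges::subrange(p, demo_null_sentinel_t{});
}</code></pre></div>
<p>Iterating <code class="sourceCode default">null_terminated(u8&quot;text&quot;)</code>
visits the four code units and stops at the terminator, so the end of
the string is discovered during iteration rather than beforehand.</p>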
<p>Iterators are constructed from more than one underlying iterator. To
do iteration in many text-handling contexts, you need to know the
beginning and the end of the range you are iterating over, just to be
able to do iteration correctly. Note that this is not a safety issue,
but a correctness one. For example, say we have a string
<code class="sourceCode default">s</code> of UTF-8 code units that we
would like to iterate over to produce UTF-32 code points. If the last
code unit in <code class="sourceCode default">s</code> is
<code class="sourceCode default">0xe0</code>, we should expect two more
code units to follow. They are not present, though, because
<code class="sourceCode default">0xe0</code> is the last code unit. Now
consider how you would implement
<code class="sourceCode default">operator++()</code> for an iterator
<code class="sourceCode default">iter</code> that transcodes from UTF-8
to UTF-32. If you advance far enough to get the next UTF-32 code point
in each call to <code class="sourceCode default">operator++()</code>,
you may run off the end of <code class="sourceCode default">s</code>
when you find <code class="sourceCode default">0xe0</code> and try to
read two more code units. Note that it does not matter that
<code class="sourceCode default">iter</code> probably comes from a range
with an end-iterator or sentinel as its mate; inside
<code class="sourceCode default">iter</code>’s
<code class="sourceCode default">operator++()</code> this is no help.
<code class="sourceCode default">iter</code> must therefore have the
end-iterator or sentinel as a data member. The same logic applies to the
other end of the range if <code class="sourceCode default">iter</code>
is bidirectional — it must also have the iterator to the start of the
underlying range as a data member. This unfortunate reality comes up
over and over in the proposed iterators, not just the ones that are UTF
transcoding iterators. This is why iterators in this proposal (and the
ones to come) usually consist of three underlying iterators.</p>
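<p>The need for all three positions can be sketched with a small cursor
over UTF-8 code units. This is an illustration under simplifying
assumptions, not the proposed
<code class="sourceCode default">utf_iterator</code>: the name
<code class="sourceCode default">utf8_cursor</code> and its members are
hypothetical, it only moves between code point boundaries without
decoding, and on ill-formed input its forward and backward steps are
not guaranteed to agree.</p>
<div class="sourceCode"><pre class="sourceCode cpp"><code class="sourceCode cpp">// Hypothetical cursor holding the three positions discussed above.
struct utf8_cursor {
    const char* first; // start of the underlying range (bounds retreat())
    const char* curr;  // current position
    const char* last;  // end of the underlying range (bounds advance())

    static bool is_continuation(char c) {
        return (static_cast&lt;unsigned char&gt;(c) &amp; 0xC0) == 0x80;
    }

    // Number of code units the lead byte announces (1 for anything else).
    static int announced_length(char c) {
        const unsigned char u = static_cast&lt;unsigned char&gt;(c);
        if ((u &amp; 0xE0) == 0xC0) return 2;
        if ((u &amp; 0xF0) == 0xE0) return 3;
        if ((u &amp; 0xF8) == 0xF0) return 4;
        return 1;
    }

    // Steps forward over one code point. Precondition: curr != last.
    // Without `last`, a truncated sequence would be read past the end.
    void advance() {
        int n = announced_length(*curr);
        ++curr;
        while (--n &gt; 0 &amp;&amp; curr != last &amp;&amp; is_continuation(*curr)) ++curr;
    }

    // Steps backward over one code point. Precondition: curr != first.
    // Without `first`, a run of continuation bytes would be read before the start.
    void retreat() {
        int n = 0;
        --curr;
        while (n &lt; 3 &amp;&amp; curr != first &amp;&amp; is_continuation(*curr)) { --curr; ++n; }
    }
};</code></pre></div>
<p>On the two code units <code class="sourceCode default">&quot;a\xE0&quot;</code>,
the second call to <code class="sourceCode default">advance()</code>
stops at <code class="sourceCode default">last</code> even though
<code class="sourceCode default">0xe0</code> announces two more code
units.</p>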
<h1 data-number="6" id="implementation-experience"><span class="header-section-number">6</span> Implementation experience<a href="#implementation-experience" class="self-link"></a></h1>
<p>All the interfaces proposed here have been implemented, and
re-implemented, several times over the last 5 years or so. They are part
of a proposed (but not yet accepted!) Boost library, <a href="https://github.com/tzlaine/text">Boost.Text</a>.</p>
<p>The library has hundreds of stars, though I’m not sure how many users
that equates to. All of the interfaces proposed here are among the
best-exercised in the library. There are comprehensive tests for all the
proposed entities, and those entities are used as the foundation upon
which all the other library entities are composed.</p>
<p>Though there are a lot of individual entities proposed here, at one
time or another I have needed each one of them, though maybe not in every
UTF-N -&gt; UTF-M permutation. Those transcoding permutations are there
mostly for completeness. I have only ever needed UTF-8 &lt;-&gt;
UTF-32 in any of my work that uses Unicode. Frequent Windows users
will also need to convert to and from UTF-16 sometimes, because that is
the UTF that the OS APIs use.</p>
<h1 data-number="7" id="bibliography"><span class="header-section-number">7</span> References<a href="#bibliography" class="self-link"></a></h1>
<div id="refs" class="references csl-bib-body hanging-indent" role="doc-bibliography">
<div id="ref-P1629R1" class="csl-entry" role="doc-biblioentry">
[P1629R1] JeanHeyd Meneide. 2020-03-02. Transcoding the world - Standard
Text Encoding. <a href="https://wg21.link/p1629r1"><div class="csl-block">https://wg21.link/p1629r1</div></a>
</div>
</div>
</div>
</div>
</body>
</html>
