<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml" lang xml:lang>
<head>
  <meta charset="utf-8" />
  <meta name="generator" content="mpark/wg21" />
  <meta name="viewport" content="width=device-width, initial-scale=1.0, user-scalable=yes" />
  <meta name="dcterms.date" content="2024-10-14" />
  <title>Make idiomatic usage of `offsetof` well-defined</title>
  <style>
code{white-space: pre-wrap;}
span.smallcaps{font-variant: small-caps;}
span.underline{text-decoration: underline;}
div.column{display: inline-block; vertical-align: top; width: 50%;}
div.csl-block{margin-left: 1.5em;}
ul.task-list{list-style: none;}
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { display: inline-block; line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
div.sourceCode { margin: 1em 0; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ background-color: #f6f8fa; }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span { } 
code span.al { color: #ff0000; } 
code span.an { } 
code span.at { } 
code span.bn { color: #9f6807; } 
code span.bu { color: #9f6807; } 
code span.cf { color: #00607c; } 
code span.ch { color: #9f6807; } 
code span.cn { } 
code span.co { color: #008000; font-style: italic; } 
code span.cv { color: #008000; font-style: italic; } 
code span.do { color: #008000; } 
code span.dt { color: #00607c; } 
code span.dv { color: #9f6807; } 
code span.er { color: #ff0000; font-weight: bold; } 
code span.ex { } 
code span.fl { color: #9f6807; } 
code span.fu { } 
code span.im { } 
code span.in { color: #008000; } 
code span.kw { color: #00607c; } 
code span.op { color: #af1915; } 
code span.ot { } 
code span.pp { color: #6f4e37; } 
code span.re { } 
code span.sc { color: #9f6807; } 
code span.ss { color: #9f6807; } 
code span.st { color: #9f6807; } 
code span.va { } 
code span.vs { color: #9f6807; } 
code span.wa { color: #008000; font-weight: bold; } 
code.diff {color: #898887}
code.diff span.va {color: #006e28}
code.diff span.st {color: #bf0303}
</style>
  <style type="text/css">
body {
margin: 5em;
font-family: serif;

hyphens: auto;
line-height: 1.35;
text-align: justify;
}
@media screen and (max-width: 30em) {
body {
margin: 1.5em;
}
}
div.wrapper {
max-width: 60em;
margin: auto;
}

ul {
list-style-type: none;
padding-left: 2.5em;
margin-top: -0.2em;
margin-bottom: -0.2em;
}
ol {
padding-left: 2.5em;
}
a {
text-decoration: none;
color: #4183C4;
}
a.hidden_link {
text-decoration: none;
color: inherit;
}
li {
margin-top: 0.6em;
margin-bottom: 0.6em;
}
h1, h2, h3, h4 {
position: relative;
line-height: 1;
}
a.self-link {
position: absolute;
top: 0;
left: calc(-1 * (3.5rem - 26px));
width: calc(3.5rem - 26px);
height: 2em;
text-align: center;
border: none;
transition: opacity .2s;
opacity: .5;
font-family: sans-serif;
font-weight: normal;
font-size: 83%;
}
a.self-link:hover { opacity: 1; }
a.self-link::before { content: "§"; }
ul > li:before {
content: "\2014";
position: absolute;
margin-left: -1.5em;
}

#TOC ul > li:before {
content: none;
}
#TOC > ul {
padding-left: 0;
}

.toc-section-number {
margin-right: 0.5em;
}
:target { background-color: #C9FBC9; }
:target .codeblock { background-color: #C9FBC9; }
:target ul { background-color: #C9FBC9; }
.abbr_ref { float: right; }
.folded_abbr_ref { float: right; }
:target .folded_abbr_ref { display: none; }
:target .unfolded_abbr_ref { float: right; display: inherit; }
.unfolded_abbr_ref { display: none; }
.secnum { display: inline-block; min-width: 35pt; }
.header-section-number { display: inline-block; min-width: 35pt; }
.annexnum { display: block; }
div.sourceLinkParent {
float: right;
}
a.sourceLink {
position: absolute;
opacity: 0;
margin-left: 10pt;
}
a.sourceLink:hover {
opacity: 1;
}
a.itemDeclLink {
position: absolute;
font-size: 75%;
text-align: right;
width: 5em;
opacity: 0;
}
a.itemDeclLink:hover { opacity: 1; }
span.marginalizedparent {
position: relative;
left: -5em;
}
li span.marginalizedparent { left: -7em; }
li ul > li span.marginalizedparent { left: -9em; }
li ul > li ul > li span.marginalizedparent { left: -11em; }
li ul > li ul > li ul > li span.marginalizedparent { left: -13em; }
div.footnoteNumberParent {
position: relative;
left: -4.7em;
}
a.marginalized {
position: absolute;
font-size: 75%;
text-align: right;
width: 5em;
}
a.enumerated_item_num {
position: relative;
left: -3.5em;
display: inline-block;
margin-right: -3em;
text-align: right;
width: 3em;
}
div.para { margin-bottom: 0.6em; margin-top: 0.6em; text-align: justify; }
div.section { text-align: justify; }
div.sentence { display: inline; }
span.indexparent {
display: inline;
position: relative;
float: right;
right: -1em;
}
a.index {
position: absolute;
display: none;
}
a.index:before { content: "⟵"; }

a.index:target {
display: inline;
}
.indexitems {
margin-left: 2em;
text-indent: -2em;
}
div.itemdescr {
margin-left: 3em;
}
.bnf {
font-family: serif;
margin-left: 40pt;
margin-top: 0.5em;
margin-bottom: 0.5em;
}
.ncbnf {
font-family: serif;
margin-top: 0.5em;
margin-bottom: 0.5em;
margin-left: 40pt;
}
.ncsimplebnf {
font-family: serif;
font-style: italic;
margin-top: 0.5em;
margin-bottom: 0.5em;
margin-left: 40pt;
background: inherit; 
}
span.textnormal {
font-style: normal;
font-family: serif;
white-space: normal;
display: inline-block;
}
span.rlap {
display: inline-block;
width: 0px;
}
span.descr { font-style: normal; font-family: serif; }
span.grammarterm { font-style: italic; }
span.term { font-style: italic; }
span.terminal { font-family: monospace; font-style: normal; }
span.nonterminal { font-style: italic; }
span.tcode { font-family: monospace; font-style: normal; }
span.textbf { font-weight: bold; }
span.textsc { font-variant: small-caps; }
a.nontermdef { font-style: italic; font-family: serif; }
span.emph { font-style: italic; }
span.techterm { font-style: italic; }
span.mathit { font-style: italic; }
span.mathsf { font-family: sans-serif; }
span.mathrm { font-family: serif; font-style: normal; }
span.textrm { font-family: serif; }
span.textsl { font-style: italic; }
span.mathtt { font-family: monospace; font-style: normal; }
span.mbox { font-family: serif; font-style: normal; }
span.ungap { display: inline-block; width: 2pt; }
span.textit { font-style: italic; }
span.texttt { font-family: monospace; }
span.tcode_in_codeblock { font-family: monospace; font-style: normal; }
span.phantom { color: white; }

span.math { font-style: normal; }
span.mathblock {
display: block;
margin-left: auto;
margin-right: auto;
margin-top: 1.2em;
margin-bottom: 1.2em;
text-align: center;
}
span.mathalpha {
font-style: italic;
}
span.synopsis {
font-weight: bold;
margin-top: 0.5em;
display: block;
}
span.definition {
font-weight: bold;
display: block;
}
.codeblock {
margin-left: 1.2em;
line-height: 127%;
}
.outputblock {
margin-left: 1.2em;
line-height: 127%;
}
div.itemdecl {
margin-top: 2ex;
}
code.itemdeclcode {
white-space: pre;
display: block;
}
span.textsuperscript {
vertical-align: super;
font-size: smaller;
line-height: 0;
}
.footnotenum { vertical-align: super; font-size: smaller; line-height: 0; }
.footnote {
font-size: small;
margin-left: 2em;
margin-right: 2em;
margin-top: 0.6em;
margin-bottom: 0.6em;
}
div.minipage {
display: inline-block;
margin-right: 3em;
}
div.numberedTable {
text-align: center;
margin: 2em;
}
div.figure {
text-align: center;
margin: 2em;
}
table {
border: 1px solid black;
border-collapse: collapse;
margin-left: auto;
margin-right: auto;
margin-top: 0.8em;
text-align: left;
hyphens: none; 
}
td, th {
padding-left: 1em;
padding-right: 1em;
vertical-align: top;
}
td.empty {
padding: 0px;
padding-left: 1px;
}
td.left {
text-align: left;
}
td.right {
text-align: right;
}
td.center {
text-align: center;
}
td.justify {
text-align: justify;
}
td.border {
border-left: 1px solid black;
}
tr.rowsep, td.cline {
border-top: 1px solid black;
}
tr.even, tr.odd {
border-bottom: 1px solid black;
}
tr.capsep {
border-top: 3px solid black;
border-top-style: double;
}
tr.header {
border-bottom: 3px solid black;
border-bottom-style: double;
}
th {
border-bottom: 1px solid black;
}
span.centry {
font-weight: bold;
}
div.table {
display: block;
margin-left: auto;
margin-right: auto;
text-align: center;
width: 90%;
}
span.indented {
display: block;
margin-left: 2em;
margin-bottom: 1em;
margin-top: 1em;
}
ol.enumeratea { list-style-type: none; background: inherit; }
ol.enumerate { list-style-type: none; background: inherit; }

code.sourceCode > span { display: inline; }
</style>
  <link href="data:image/x-icon;base64,AAABAAIAEBAAAAEAIABoBAAAJgAAACAgAAABACAAqBAAAI4EAAAoAAAAEAAAACAAAAABACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA////AIJEAACCRAAAgkQAAIJEAACCRAAAgkQAVoJEAN6CRADegkQAWIJEAACCRAAAgkQAAIJEAACCRAAA////AP///wCCRAAAgkQAAIJEAACCRAAsgkQAvoJEAP+CRAD/gkQA/4JEAP+CRADAgkQALoJEAACCRAAAgkQAAP///wD///8AgkQAAIJEABSCRACSgkQA/IJEAP99PQD/dzMA/3czAP99PQD/gkQA/4JEAPyCRACUgkQAFIJEAAD///8A////AHw+AFiBQwDqgkQA/4BBAP9/PxP/uZd6/9rJtf/bybX/upd7/39AFP+AQQD/gkQA/4FDAOqAQgBc////AP///wDKklv4jlEa/3o7AP+PWC//8+3o///////////////////////z7un/kFox/35AAP+GRwD/mVYA+v///wD///8A0Zpk+NmibP+0d0T/8evj///////+/fv/1sKz/9bCs//9/fr//////+/m2/+NRwL/nloA/5xYAPj///8A////ANKaZPjRmGH/5cKh////////////k149/3UwAP91MQD/lmQ//86rhv+USg3/m1YA/5hSAP+bVgD4////AP///wDSmmT4zpJY/+/bx///////8+TV/8mLT/+TVx//gkIA/5lVAP+VTAD/x6B//7aEVv/JpH7/s39J+P///wD///8A0ppk+M6SWP/u2sf///////Pj1f/Nj1T/2KFs/8mOUv+eWhD/lEsA/8aee/+0glT/x6F7/7J8Rvj///8A////ANKaZPjRmGH/48Cf///////+/v7/2qt//82PVP/OkFX/37KJ/86siv+USg7/mVQA/5hRAP+bVgD4////AP///wDSmmT40ppk/9CVXP/69O////////7+/v/x4M//8d/P//7+/f//////9u7n/6tnJf+XUgD/nFgA+P///wD///8A0ppk+NKaZP/RmWL/1qNy//r07///////////////////////+vXw/9akdP/Wnmn/y5FY/6JfFvj///8A////ANKaZFTSmmTo0ppk/9GYYv/Ql1//5cWm//Hg0P/x4ND/5cWm/9GXYP/RmGH/0ppk/9KaZOjVnmpY////AP///wDSmmQA0ppkEtKaZI7SmmT60ppk/9CWX//OkVb/zpFW/9CWX//SmmT/0ppk/NKaZJDSmmQS0ppkAP///wD///8A0ppkANKaZADSmmQA0ppkKtKaZLrSmmT/0ppk/9KaZP/SmmT/0ppkvNKaZCrSmmQA0ppkANKaZAD///8A////ANKaZADSmmQA0ppkANKaZADSmmQA0ppkUtKaZNzSmmTc0ppkVNKaZADSmmQA0ppkANKaZADSmmQA////AP5/AAD4HwAA4AcAAMADAACAAQAAgAEAAIABAACAAQAAgAEAAIABAACAAQAAgAEAAMADAADgBwAA+B8AAP5/AAAoAAAAIAAAAEAAAAABACAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA////AP///wCCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAAyCRACMgkQA6oJEAOqCRACQgkQAEIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAA////AP///wD///8A////AIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRABigkQA5oJEAP+CRAD/gkQA/4JEAP+CRADqgkQAZoJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAAD///8A////AP///wD///8Ag
kQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAA4gkQAwoJEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAP+CRAD/gkQAxIJEADyCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAAgkQAAP///wD///8A////AP///wCCRAAAgkQAAIJEAACCRAAAgkQAAIJEAACCRAAWgkQAmIJEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAJyCRAAYgkQAAIJEAACCRAAAgkQAAIJEAACCRAAA////AP///wD///8A////AIJEAACCRAAAgkQAAIJEAACCRAAAgkQAdIJEAPCCRAD/gkQA/4JEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAP+CRAD/gkQA/4JEAPSCRAB4gkQAAIJEAACCRAAAgkQAAIJEAAD///8A////AP///wD///8AgkQAAIJEAACCRAAAgkQASoJEANKCRAD/gkQA/4JEAP+CRAD/g0YA/39AAP9zLgD/bSQA/2shAP9rIQD/bSQA/3MuAP9/PwD/g0YA/4JEAP+CRAD/gkQA/4JEAP+CRADUgkQAToJEAACCRAAAgkQAAP///wD///8A////AP///wB+PwAAgkUAIoJEAKiCRAD/gkQA/4JEAP+CRAD/hEcA/4BBAP9sIwD/dTAA/5RfKv+viF7/vp56/76ee/+wiF7/lWAr/3YxAP9sIwD/f0AA/4RHAP+CRAD/gkQA/4JEAP+CRAD/gkQArIJEACaBQwAA////AP///wD///8A////AIBCAEBzNAD6f0EA/4NFAP+CRAD/gkQA/4VIAP92MwD/bSUA/6N1Tv/ezsL/////////////////////////////////38/D/6V3Uv9uJgD/dTEA/4VJAP+CRAD/gkQA/4JEAP+BQwD/fUAA/4FDAEj///8A////AP///wD///8AzJRd5qBlKf91NgD/dDUA/4JEAP+FSQD/cy4A/3YyAP/PuKP//////////////////////////////////////////////////////9K7qP94NQD/ciwA/4VJAP+CRAD/fkEA/35BAP+LSwD/mlYA6v///wD///8A////AP///wDdpnL/4qx3/8KJUv+PUhf/cTMA/3AsAP90LgD/4dK+/////////////////////////////////////////////////////////////////+TYxf91MAD/dTIA/31CAP+GRwD/llQA/6FcAP+gWwD8////AP///wD///8A////ANGZY/LSm2X/4ap3/92mcP+wdT3/byQA/8mwj////////////////////////////////////////////////////////////////////////////+LYxv9zLgP/jUoA/59bAP+hXAD/nFgA/5xYAPL///8A////AP///wD///8A0ppk8tKaZP/RmWL/1p9q/9ubXv/XqXj////////////////////////////7+fD/vZyG/6BxS/+gcUr/vJuE//r37f//////////////////////3MOr/5dQBf+dVQD/nVkA/5xYAP+cWAD/nFgA8v///wD///8A////AP///wDSmmTy0ppk/9KaZP/SmWP/yohJ//jo2P//////////////////////4NTG/4JDFf9lGAD/bSQA/20kAP9kGAD/fz8S/+Xb0f//////5NG9/6txN/+LOgD/m1QA/51aAP+cWAD/m1cA/5xYAP+cWADy////AP///wD///8A////ANKaZPLSmmT/0ppk/8+TWf/Unmv//v37//////////////////////+TWRr/VwsA/35AAP+ERgD/g0UA/4JGAP9lHgD/kFga/8KXX/+TRwD/jT4A/49CAP+VTQD/n
10A/5xYAP+OQQD/lk4A/55cAPL///8A////AP///wD///8A0ppk8tKaZP/SmmT/y4tO/92yiP//////////////////////8NnE/8eCQP+rcTT/ez0A/3IyAP98PgD/gEMA/5FSAP+USwD/jj8A/5lUAP+JNwD/yqV2/694Mf+HNQD/jkAA/82rf/+laBj/jT4A8v///wD///8A////AP///wDSmmTy0ppk/9KaZP/LiUr/4byY///////////////////////gupX/0I5P/+Wuev/Lklz/l1sj/308AP+QSwD/ol0A/59aAP+aVQD/k0oA/8yoh///////+fXv/6pwO//Lp3v///////Pr4f+oay7y////AP///wD///8A////ANKaZPLSmmT/0ppk/8uJSv/hvJj//////////////////////+G7l//Jhkb/0ppk/96nc//fqXX/x4xO/6dkFP+QSQD/llEA/5xXAP+USgD/yaOA///////38uv/qG05/8ijdv//////8efb/6ZpLPL///8A////AP///wD///8A0ppk8tKaZP/SmmT/zIxO/9yxh///////////////////////7dbA/8iEQf/Sm2X/0Zlj/9ScZv/eqHf/2KJv/7yAQf+XTgD/iToA/5lSAP+JNgD/yKFv/611LP+HNQD/jT8A/8qmeP+kZRT/jT4A8v///wD///8A////AP///wDSmmTy0ppk/9KaZP/Pk1n/1J5q//78+//////////////////+/fv/1aFv/8iEQv/Tm2b/0ppl/9GZY//Wn2z/1pZc/9eldf/Bl2b/kUcA/4w9AP+OQAD/lUwA/59eAP+cWQD/jT8A/5ZOAP+eXADy////AP///wD///8A////ANKaZPLSmmT/0ppk/9KZY//KiEn/8d/P///////////////////////47+f/05tm/8iCP//KiEj/yohJ/8eCP//RmGH//vfy///////n1sP/rXQ7/4k4AP+TTAD/nVoA/5xYAP+cVwD/nFgA/5xYAPL///8A////AP///wD///8A0ppk8tKaZP/SmmT/0ptl/8uLTf/aq37////////////////////////////+/fz/6c2y/961jv/etY7/6Myx//78+v//////////////////////3MWv/5xXD/+ORAD/mFQA/51ZAP+cWAD/nFgA8v///wD///8A////AP///wDSmmTy0ppk/9KaZP/SmmT/0ppk/8mFRP/s1b//////////////////////////////////////////////////////////////////////////////+PD/0JFU/7NzMv+WUQD/kUsA/5tXAP+dWQDy////AP///wD///8A////ANKaZP/SmmT/0ppk/9KaZP/Sm2X/z5NZ/8yMT//z5NX/////////////////////////////////////////////////////////////////9Ofa/8yNUP/UmGH/36p5/8yTWv+qaSD/kksA/5ROAPz///8A////AP///wD///8A0ppk5NKaZP/SmmT/0ppk/9KaZP/TnGf/zY9T/82OUv/t1sD//////////////////////////////////////////////////////+7Yw//OkFX/zI5R/9OcZ//SmmP/26V0/9ymdf/BhUf/ol8R6P///wD///8A////AP///wDSmmQ80ppk9tKaZP/SmmT/0ppk/9KaZP/TnGj/zpFW/8qJSv/dson/8uHS//////////////////////////////////Lj0//etIv/y4lL/86QVf/TnGj/0ppk/9KaZP/RmWP/05xn/9ymdfjUnWdC////AP///wD///8A////ANKaZADSmmQc0ppkotKaZP/SmmT/0ppk/9KaZP/Tm2b/0Zli/8qJSf/NjlH/16Z3/+G8mP/myKr/5
siq/+G8mP/Xp3f/zY5S/8qISf/RmGH/05tm/9KaZP/SmmT/0ppk/9KaZP/SmmSm0pljINWdaQD///8A////AP///wD///8A0ppkANKaZADSmmQA0ppkQtKaZMrSmmT/0ppk/9KaZP/SmmT/0ptl/9GYYf/Nj1P/y4lL/8qISP/KiEj/y4lK/82PU//RmGH/0ptl/9KaZP/SmmT/0ppk/9KaZP/SmmTO0ppkRtKaZADSmmQA0ppkAP///wD///8A////AP///wDSmmQA0ppkANKaZADSmmQA0ppkANKaZGzSmmTu0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmTw0ppkcNKaZADSmmQA0ppkANKaZADSmmQA////AP///wD///8A////ANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZBLSmmSQ0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppklNKaZBTSmmQA0ppkANKaZADSmmQA0ppkANKaZAD///8A////AP///wD///8A0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQy0ppkutKaZP/SmmT/0ppk/9KaZP/SmmT/0ppk/9KaZP/SmmT/0ppkvtKaZDbSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkAP///wD///8A////AP///wDSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkXNKaZODSmmT/0ppk/9KaZP/SmmT/0ppk5NKaZGDSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA////AP///wD///8A////ANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkBtKaZIbSmmTo0ppk6tKaZIrSmmQK0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZADSmmQA0ppkANKaZAD///8A////AP/8P///+B///+AH//+AAf//AAD//AAAP/AAAA/gAAAHwAAAA8AAAAPAAAADwAAAA8AAAAPAAAADwAAAA8AAAAPAAAADwAAAA8AAAAPAAAADwAAAA8AAAAPAAAADwAAAA+AAAAfwAAAP/AAAP/8AAP//gAH//+AH///4H////D//" rel="icon" />
  
  <!--[if lt IE 9]>
    <script src="//cdnjs.cloudflare.com/ajax/libs/html5shiv/3.7.3/html5shiv-printshiv.min.js"></script>
  <![endif]-->
</head>
<body>
<div class="wrapper">
<header id="title-block-header">
<h1 class="title" style="text-align:center">Make idiomatic usage of
<code class="sourceCode cpp">offsetof</code> well-defined</h1>
<table style="border:none;float:right">
  <tr>
    <td>Document #:</td>
    <td>P3407R0</td>
  </tr>
  <tr>
    <td>Date:</td>
    <td>2024-10-14</td>
  </tr>
  <tr>
    <td style="vertical-align:top">Project:</td>
    <td>Programming Language C++</td>
  </tr>
  <tr>
    <td style="vertical-align:top">Audience:</td>
    <td>
      SG22, EWG<br>
    </td>
  </tr>
  <tr>
    <td style="vertical-align:top">Reply-to:</td>
    <td>
      Brian Bi<br>&lt;<a href="mailto:bbi10@bloomberg.net" class="email">bbi10@bloomberg.net</a>&gt;<br>
    </td>
  </tr>
</table>
</header>
<div style="clear:both">
<div id="TOC" role="doc-toc">
<h1 id="toctitle">Contents</h1>
<ul>
<li><a href="#abstract" id="toc-abstract"><span class="toc-section-number">1</span> Abstract</a></li>
<li><a href="#introduction" id="toc-introduction"><span class="toc-section-number">2</span> Introduction</a></li>
<li><a href="#provenance-in-c" id="toc-provenance-in-c"><span class="toc-section-number">3</span> Provenance in C</a></li>
<li><a href="#provenance-in-c-1" id="toc-provenance-in-c-1"><span class="toc-section-number">4</span> Provenance in C++</a></li>
<li><a href="#removing-undefined-behavior-and-making-optimizations-opt-in" id="toc-removing-undefined-behavior-and-making-optimizations-opt-in"><span class="toc-section-number">5</span> Removing undefined behavior and
making optimizations opt-in</a></li>
<li><a href="#design-space-for-a-solution" id="toc-design-space-for-a-solution"><span class="toc-section-number">6</span> Design space for a solution</a></li>
<li><a href="#proposed-wording" id="toc-proposed-wording"><span class="toc-section-number">7</span> Proposed wording</a></li>
<li><a href="#appendix-a" id="toc-appendix-a"><span class="toc-section-number">8</span> Appendix A</a></li>
<li><a href="#bibliography" id="toc-bibliography"><span class="toc-section-number">9</span> References</a></li>
</ul>
</div>
<h1 data-number="1" id="abstract"><span class="header-section-number">1</span> Abstract<a href="#abstract" class="self-link"></a></h1>
<p>In C, the <code class="sourceCode cpp">offsetof</code> macro is
frequently used to obtain a pointer to an object given a pointer to one
of its subobjects. Such C code is often incompatible with C++ because of
two changes to the pointer provenance model made in C++17. Pointer
arithmetic within non-array objects became undefined, an issue
tackled by <span class="citation" data-cites="P1839R6">[<a href="https://open-std.org/jtc1/sc22/wg21/docs/papers/2024/p1839r6.html" role="doc-biblioref">P1839R6</a>]</span>; however, C++17 also introduced
reachability restrictions that are at cross purposes with the usage of
<code class="sourceCode cpp">offsetof</code> described above. Because
C++ should not break C features without a compelling reason, this paper
proposes to relax reachability restrictions in C++.</p>
<h1 data-number="2" id="introduction"><span class="header-section-number">2</span> Introduction<a href="#introduction" class="self-link"></a></h1>
<p>In C, an intrusive data structure, such as a doubly-linked list, must
be implemented using composition, not inheritance, since C does not have
inheritance. Given a pointer to a node within the data structure,
accessing the rest of the object requires the use of
<code class="sourceCode cpp">offsetof</code>:</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode c"><code class="sourceCode c"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> ListNode <span class="op">{</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a>    <span class="kw">struct</span> ListNode<span class="op">*</span> prev<span class="op">;</span></span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a>    <span class="kw">struct</span> ListNode<span class="op">*</span> next<span class="op">;</span></span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="kw">typedef</span> <span class="kw">struct</span> <span class="op">{</span></span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span> data<span class="op">;</span></span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a>    <span class="kw">struct</span> ListNode node<span class="op">;</span></span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a><span class="op">}</span> Foo<span class="op">;</span></span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>Foo<span class="op">*</span> next_foo<span class="op">(</span>Foo<span class="op">*</span> foo<span class="op">)</span> <span class="op">{</span></span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>    <span class="kw">struct</span> ListNode<span class="op">*</span> next_node <span class="op">=</span> foo<span class="op">-&gt;</span>node<span class="op">.</span>next<span class="op">;</span></span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> <span class="op">(</span>Foo<span class="op">*)((</span><span class="dt">char</span><span class="op">*)</span>next_node <span class="op">-</span> offsetof<span class="op">(</span>Foo<span class="op">,</span> node<span class="op">));</span></span>
<span id="cb1-14"><a href="#cb1-14" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>This pattern of casting to <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>,
subtracting the appropriate <code class="sourceCode cpp">offsetof</code>
value, and then casting to a pointer to the enclosing type, is often
encapsulated in a macro that is named
<code class="sourceCode cpp">container_of</code> or similar (see
<em>e.g.</em> <a href="https://github.com/search?q=container_of+language%3AC+&amp;type=code">GitHub
code search</a>)<a href="#fn1" class="footnote-ref" id="fnref1" role="doc-noteref"><sup>1</sup></a>.</p>
<p>A C++-only project would typically make
<code class="sourceCode cpp">ListNode</code> a base class. Converting a
<code class="sourceCode cpp">ListNode<span class="op">*</span></code> to
a <code class="sourceCode cpp">Foo<span class="op">*</span></code> could
then be done easily using
<code class="sourceCode cpp"><span class="kw">static_cast</span></code>,
and <code class="sourceCode cpp">offsetof</code> would be unnecessary.
This option is not available in C. In C, the
<code class="sourceCode cpp">container_of</code> pattern is the only
option, unless the <code class="sourceCode cpp">ListNode</code> can be
arranged to always be the first member of the enclosing struct.</p>
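<p>A minimal sketch of the C++-only alternative described above, assuming
<code class="sourceCode cpp">ListNode</code> is made a base class of
<code class="sourceCode cpp">Foo</code>. The function name
<code class="sourceCode cpp">next_foo_via_base</code> is hypothetical and is
used only to keep this variant distinct from the C function shown earlier.</p>

```cpp
// Hypothetical C++-only variant of the list example: ListNode is a base
// class of Foo instead of a data member, so no offsetof is needed.
struct ListNode {
    ListNode* prev = nullptr;
    ListNode* next = nullptr;
};

struct Foo : ListNode {
    int data = 0;
};

// Returns the Foo that follows `foo` in the list, or nullptr at the end.
// Assumed precondition: every node linked into the list is the ListNode
// base class subobject of a Foo object.
inline Foo* next_foo_via_base(Foo* foo) {
    // The derived-to-base conversion is implicit; the base-to-derived
    // direction needs an explicit static_cast. A null pointer stays null.
    return static_cast<Foo*>(foo->next);
}
```

<p>Given two objects <code class="sourceCode cpp">a</code> and
<code class="sourceCode cpp">b</code> of type
<code class="sourceCode cpp">Foo</code> with
<code class="sourceCode cpp">a.next</code> set to the address of
<code class="sourceCode cpp">b</code>, the call
<code class="sourceCode cpp">next_foo_via_base(&amp;a)</code> yields a pointer
to <code class="sourceCode cpp">b</code> without any pointer arithmetic.</p>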
<p>Unfortunately, the operand of the
<code class="sourceCode cpp"><span class="cf">return</span></code>
statement in <code class="sourceCode cpp">next_foo</code> has undefined
behavior in C++. There are two reasons for this. The first is that
casting <code class="sourceCode cpp">next_node</code> to type <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
does not yield a pointer that points into an array of
<code class="sourceCode cpp"><span class="dt">char</span></code>;
therefore, subtracting any value other than 0 can only have UB
(§<span>7.6.6
<a href="https://wg21.link/N4988#expr.add">[expr.add]</a><a href="#fn2" class="footnote-ref" id="fnref2" role="doc-noteref"><sup>2</sup></a></span>p4.3). This issue is already
being addressed by <span class="citation" data-cites="P1839R6">[<a href="https://open-std.org/jtc1/sc22/wg21/docs/papers/2024/p1839r6.html" role="doc-biblioref">P1839R6</a>]</span>, which proposes that object
representations be made arrays of <code class="sourceCode cpp"><span class="dt">unsigned</span> <span class="dt">char</span></code>
(and also allows pointers to
<code class="sourceCode cpp"><span class="dt">char</span></code> to
traverse such arrays). This issue has also been pointed out by <span class="citation" data-cites="P2883R0">[<a href="https://wg21.link/p2883r0" role="doc-biblioref">P2883R0</a>]</span>, which noted that,
although this use of <code class="sourceCode cpp">offsetof</code> has UB
in C++, every known C++ implementation “consistently produced the same
behavior as the C program”.</p>
<p>The second issue is that the adoption of <span class="citation" data-cites="P0137R1">[<a href="https://wg21.link/p0137r1" role="doc-biblioref">P0137R1</a>]</span> into C++17 introduced the
concept of <em>reachability</em>, which is now defined at §<span>6.8.4
<a href="https://wg21.link/N4988#basic.compound">[basic.compound]</a></span>p6:</p>
<blockquote>
<p>A byte of storage <em>b</em> is <em>reachable through</em> a pointer
value that points to an object <em>x</em> if there is an object
<em>y</em>, pointer-interconvertible with <em>x</em>, such that
<em>b</em> is within the storage occupied by <em>y</em>, or the
immediately-enclosing array object if <em>y</em> is an array
element.</p>
</blockquote>
<p>In the status quo (prior to the adoption of <span class="citation" data-cites="P1839R6">[<a href="https://open-std.org/jtc1/sc22/wg21/docs/papers/2024/p1839r6.html" role="doc-biblioref">P1839R6</a>]</span>, if any), reachability can
prevent some memory accesses even when no pointer arithmetic is
involved. For example:</p>
<div class="sourceCode" id="cb1"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb1-1"><a href="#cb1-1" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> S <span class="op">{</span></span>
<span id="cb1-2"><a href="#cb1-2" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span> a<span class="op">[</span><span class="dv">2</span><span class="op">]</span>;</span>
<span id="cb1-3"><a href="#cb1-3" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span> data;</span>
<span id="cb1-4"><a href="#cb1-4" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span>
<span id="cb1-5"><a href="#cb1-5" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-6"><a href="#cb1-6" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> f1<span class="op">(</span><span class="dt">int</span><span class="op">*</span> p<span class="op">)</span>;</span>
<span id="cb1-7"><a href="#cb1-7" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb1-8"><a href="#cb1-8" aria-hidden="true" tabindex="-1"></a><span class="dt">int</span> f2<span class="op">()</span> <span class="op">{</span></span>
<span id="cb1-9"><a href="#cb1-9" aria-hidden="true" tabindex="-1"></a>    S s;</span>
<span id="cb1-10"><a href="#cb1-10" aria-hidden="true" tabindex="-1"></a>    s<span class="op">.</span>data <span class="op">=</span> <span class="dv">1</span>;</span>
<span id="cb1-11"><a href="#cb1-11" aria-hidden="true" tabindex="-1"></a>    f1<span class="op">(&amp;</span>s<span class="op">.</span>a<span class="op">[</span><span class="dv">0</span><span class="op">])</span>;</span>
<span id="cb1-12"><a href="#cb1-12" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> s<span class="op">.</span>data;</span>
<span id="cb1-13"><a href="#cb1-13" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>If <code class="sourceCode cpp">f1</code> is defined as follows:</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> f1<span class="op">(</span><span class="dt">int</span><span class="op">*</span> p<span class="op">)</span> <span class="op">{</span></span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a>    <span class="kw">reinterpret_cast</span><span class="op">&lt;</span>S<span class="op">*&gt;(</span><span class="kw">reinterpret_cast</span><span class="op">&lt;</span><span class="dt">int</span> <span class="op">(*)[</span><span class="dv">2</span><span class="op">]&gt;(</span>p<span class="op">))-&gt;</span>data <span class="op">=</span> <span class="dv">2</span>;</span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>then calling <code class="sourceCode cpp">f1</code> has undefined
behavior, because the entire array
<code class="sourceCode cpp">s<span class="op">.</span>a</code> is not
pointer-interconvertible with the element <code class="sourceCode cpp">s<span class="op">.</span>a<span class="op">[</span><span class="dv">0</span><span class="op">]</span></code>.
The inner <code class="sourceCode cpp"><span class="kw">reinterpret_cast</span></code>
yields a “wrongly typed” pointer: a pointer value that is of type <code class="sourceCode cpp"><span class="dt">int</span> <span class="op">(*)[</span><span class="dv">2</span><span class="op">]</span></code>,
but points to a single
<code class="sourceCode cpp"><span class="dt">int</span></code>, namely
<code class="sourceCode cpp">s<span class="op">.</span>a<span class="op">[</span><span class="dv">0</span><span class="op">]</span></code>;
it does not point to the array
<code class="sourceCode cpp">s<span class="op">.</span>a</code>.
Consequently, the outer <code class="sourceCode cpp"><span class="kw">reinterpret_cast</span></code>,
which attempts to go from the first member of a standard-layout struct
to the struct itself (allowed in C++17), cannot work; instead another
wrongly typed pointer is produced: a value of type
<code class="sourceCode cpp">S<span class="op">*</span></code> that
points to <code class="sourceCode cpp">s<span class="op">.</span>a<span class="op">[</span><span class="dv">0</span><span class="op">]</span></code>
(not <code class="sourceCode cpp">s</code>). Dereferencing this pointer
yields an lvalue that does not refer to an
<code class="sourceCode cpp">S</code> object, which renders the
attempted access to <code class="sourceCode cpp">data</code> UB
(§<span>7.6.1.5
<a href="https://wg21.link/N4988#expr.ref">[expr.ref]</a></span>p9).</p>
<p>The
<code class="sourceCode cpp">std<span class="op">::</span>launder</code>
function, which can accept a pointer and return a different pointer
value that holds the same address, does not help, because it has a
reachability restriction: calling
<code class="sourceCode cpp">std<span class="op">::</span>launder</code>
on a wrongly typed pointer picks out the object of the correct type that
lives at the address that the pointer holds, but if there are bytes
reachable from that object that are not reachable from the object that
the original pointer points to, the behavior is undefined (§<span>17.6.5
<a href="https://wg21.link/N4988#ptr.launder">[ptr.launder]</a></span>p2).</p>
<p>Therefore, the implementation can assume that the call to
<code class="sourceCode cpp">f1</code> in
<code class="sourceCode cpp">f2</code> never modifies
<code class="sourceCode cpp">s<span class="op">.</span>data</code>: if
any attempt were made to do so, then the behavior of the program would
be undefined.</p>
<p>In P1839R6 (which is currently under preparation), I have attempted
to ensure that the proposed wording is consistent with the reachability
restrictions that exist in current C++, because there is no record of
EWG having discussed the question of whether those restrictions should
be relaxed. If the <code class="sourceCode cpp">get_next_foo</code>
example is to be made well-defined, then some reachability-based
assumptions that are currently allowed to implementations must be
invalidated. This paper proposes to do just that.</p>
<h1 data-number="3" id="provenance-in-c"><span class="header-section-number">3</span> Provenance in C<a href="#provenance-in-c" class="self-link"></a></h1>
<p>The C standard does not currently have a notion of
<em>provenance</em>, but it is widely assumed that one ought to exist.
For example, in the following translation unit:</p>
<div class="sourceCode" id="cb2"><pre class="sourceCode c"><code class="sourceCode c"><span id="cb2-1"><a href="#cb2-1" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> evil<span class="op">(</span><span class="dt">void</span><span class="op">);</span></span>
<span id="cb2-2"><a href="#cb2-2" aria-hidden="true" tabindex="-1"></a><span class="dt">int</span> main<span class="op">(</span><span class="dt">void</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb2-3"><a href="#cb2-3" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span> x <span class="op">=</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb2-4"><a href="#cb2-4" aria-hidden="true" tabindex="-1"></a>    evil<span class="op">();</span></span>
<span id="cb2-5"><a href="#cb2-5" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> x<span class="op">;</span></span>
<span id="cb2-6"><a href="#cb2-6" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>notwithstanding that <code class="sourceCode cpp">evil</code> might
be able to “guess” the address of <code class="sourceCode cpp">x</code>
based on knowledge of the platform ABI, it is widely agreed that
<code class="sourceCode cpp">evil</code> should not be allowed to read
or write the value of <code class="sourceCode cpp">x</code>, and,
therefore, the compiler can eliminate
<code class="sourceCode cpp">x</code> and optimize the last statement to
<code class="sourceCode cpp"><span class="cf">return</span> <span class="dv">1</span>;</code>.
GCC and Clang both perform this optimization at
<code class="sourceCode cpp"><span class="op">-</span>O1</code> and
higher.</p>
<p>One can say that even if <code class="sourceCode cpp">evil</code>
correctly guesses the numerical value of
<code class="sourceCode cpp">x</code>’s address, casting that numerical
value to <code class="sourceCode cpp"><span class="dt">int</span><span class="op">*</span></code>
would yield a pointer that <em>lacks provenance</em> and, therefore,
causes UB when dereferenced. Such provenance-based restrictions on the
use of pointers do not exist in the current C standard, but work is
underway on a Draft Technical Specification for pointer provenance in C
(referred to as the “Provenance TS” from this point onward). The latest
version of the Provenance TS is <span class="citation" data-cites="N3057">[<a href="https://wg21.link/n3057" role="doc-biblioref">N3057</a>]</span>.</p>
<p>In the Provenance TS, values of pointer-to-object type<a href="#fn3" class="footnote-ref" id="fnref3" role="doc-noteref"><sup>3</sup></a> are
augmented to include provenance, which may be empty. A non-empty
provenance is the ID of a <em>storage instance</em>, and a pointer value
whose provenance is the ID of a storage instance <em>I</em> can be used
only to access bytes that lie within <em>I</em>. In the example above, a
storage instance is created when <code class="sourceCode cpp">x</code>
is defined. In contrast to the address that a pointer value represents,
there is no way to directly change the provenance of a pointer, other
than by storing into it another pointer value that has the desired
provenance. That is, no cast or other operation in
<code class="sourceCode cpp">evil</code> can construct a pointer value
whose provenance is the ID of the storage instance of <code class="sourceCode cpp">x</code>.
Therefore, the implementation can assume that any pointer constructed by
<code class="sourceCode cpp">evil</code> that happens to represent the
address of <code class="sourceCode cpp">x</code> cannot be used to
access <code class="sourceCode cpp">x</code>, since the provenance of
such a pointer value is either empty or a storage ID other than that of
<code class="sourceCode cpp">x</code>.</p>
<p>Although the Provenance TS doesn’t explicitly state that subobjects
have the provenance of their complete object, the definition of “storage
instance” given in section 3.20 of Annex C implies that only a single
storage instance is created by an object definition. A note to section
3.20 states that two subobjects within an object of structure type share
a storage instance.</p>
<p>Therefore, under the Provenance TS, if the address of a subobject is
taken, the resulting pointer’s provenance is the ID of a storage instance
that contains at least the complete object<a href="#fn4" class="footnote-ref" id="fnref4" role="doc-noteref"><sup>4</sup></a>. Consequently, all bytes of
a complete object are always reachable starting from a valid pointer to
any subobject.</p>
<h1 data-number="4" id="provenance-in-c-1"><span class="header-section-number">4</span> Provenance in C++<a href="#provenance-in-c-1" class="self-link"></a></h1>
<p>C++ has had a provenance-based pointer model since <span class="citation" data-cites="P0137R1">[<a href="https://wg21.link/p0137r1" role="doc-biblioref">P0137R1</a>]</span>. However, the C++ standard does
not use the term “provenance”. Instead, every dereferenceable pointer in
C++ has a unique object or function to which it points. But the set of
bytes that an object pointer can reach is not necessarily limited to the
bytes occupied by the object that the pointer points to. For example, a
pointer to any element of an array can be used to access any byte of the
array, including bytes that are occupied by other elements. The formal
definition of “reachable” is given in §<span>6.8.4
<a href="https://wg21.link/N4988#basic.compound">[basic.compound]</a></span>p6.
C++ is more restrictive than the C Provenance TS: all bytes reachable
from the pointer value “pointer to <em>o</em>” (where <em>o</em> is an
object) lie within <em>o</em>’s complete object, but not all bytes of a
complete object are reachable from a pointer to a subobject. In
particular, as stated previously, if a pointer points to a non-static
data member of a standard-layout struct other than the first non-static
data member, no other members are reachable from that pointer.</p>
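<p>The array rule can be sketched as follows. The function names <code class="sourceCode cpp">sum_neighbors</code> and <code class="sourceCode cpp">demo_array</code> are hypothetical and chosen only for this illustration. A pointer to the middle element of an array is used to read both neighboring elements, which is well-defined because every element of the array is reachable from a pointer to any one of them.</p>

```cpp
#include <cassert>

// Precondition: mid points to an array element that has both a
// predecessor and a successor in the same array.
int sum_neighbors(int* mid) {
    return mid[-1] + mid[1];
}

// Usage: both accesses stay within the bounds of `a`.
int demo_array() {
    int a[3] = {10, 20, 30};
    return sum_neighbors(&a[1]);
}
```

<p>The analogous step from a pointer to one non-static data member of a struct to a sibling member is what the current reachability rules do not permit.</p>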
<p>To look at it from the point of view of the compiler, all
provenance-based optimizations that are valid in C are also valid in
C++. For example, Clang, GCC, and MSVC are all capable of performing the
optimization mentioned in the previous section (<em>i.e.</em> that the
value of <code class="sourceCode cpp">x</code> is not accessed by
<code class="sourceCode cpp">evil</code>). Since C++ is stricter than C,
some provenance-based optimizations that are not valid in C are valid in
C++. However, <strong>I have not been able to find any cases in which
C++ implementations exploit provenance-based optimizations that are not
valid in C.</strong> For example, in the following translation unit:</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> S <span class="op">{</span></span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span> x;</span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span> y;</span>
<span id="cb3-4"><a href="#cb3-4" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span>
<span id="cb3-5"><a href="#cb3-5" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> f4<span class="op">(</span><span class="dt">int</span><span class="op">*</span> p<span class="op">)</span>;</span>
<span id="cb3-6"><a href="#cb3-6" aria-hidden="true" tabindex="-1"></a><span class="dt">int</span> f3<span class="op">()</span> <span class="op">{</span></span>
<span id="cb3-7"><a href="#cb3-7" aria-hidden="true" tabindex="-1"></a>    S s;</span>
<span id="cb3-8"><a href="#cb3-8" aria-hidden="true" tabindex="-1"></a>    s<span class="op">.</span>x <span class="op">=</span> <span class="dv">1</span>;</span>
<span id="cb3-9"><a href="#cb3-9" aria-hidden="true" tabindex="-1"></a>    f4<span class="op">(&amp;</span>s<span class="op">.</span>y<span class="op">)</span>;</span>
<span id="cb3-10"><a href="#cb3-10" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> s<span class="op">.</span>x <span class="op">*</span> s<span class="op">.</span>x;</span>
<span id="cb3-11"><a href="#cb3-11" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>even at maximum optimization levels, Clang, GCC, and MSVC all
generate a load of
<code class="sourceCode cpp">s<span class="op">.</span>x</code> and an
<code class="sourceCode cpp">imul</code> instruction on x86-64; no
implementation assumes that, because only the address of
<code class="sourceCode cpp">s<span class="op">.</span>y</code> escapes
from <code class="sourceCode cpp">f3</code>, the value of
<code class="sourceCode cpp">s<span class="op">.</span>x</code> cannot
be changed.</p>
<p>I believe that the reason why such optimizations are not performed is
that C++ implementations wish to maintain a reasonable degree of
compatibility with C. Since C code often uses the
<code class="sourceCode cpp">container_of</code> idiom, which could be
used to obtain a pointer to <code class="sourceCode cpp">s</code> given
a pointer to
<code class="sourceCode cpp">s<span class="op">.</span>y</code>,
implementations make allowances for the same operation to take place in
a C++ program. Therefore, not only do implementations not currently
perform this optimization, but it is unlikely that future versions will
do so, either. Implementations are more constrained by the needs of
their users, in this case, than by the availability of compiler
engineers to implement the optimization.</p>
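<p>For concreteness, a <code class="sourceCode cpp">container_of</code>-style definition of <code class="sourceCode cpp">f4</code> might look like the sketch below. The body of <code class="sourceCode cpp">f4</code> and the value it stores are assumptions made for illustration. Under the current reachability rules this code formally has undefined behavior. It is shown only because it is the kind of code that implementations tolerate in practice and that this paper proposes to make well-defined.</p>

```cpp
#include <cassert>
#include <cstddef>

struct S {
    int x;
    int y;
};

// Hypothetical body for f4: recover the enclosing S from a pointer to its
// y member, then modify the sibling member x.
// Formally UB under the current C++ reachability rules.
void f4(int* p) {
    S* s = reinterpret_cast<S*>(reinterpret_cast<char*>(p) - offsetof(S, y));
    s->x = 5;
}

int f3() {
    S s;
    s.x = 1;
    s.y = 0;
    f4(&s.y);
    return s.x * s.x;  // must be reloaded if f4 is allowed to reach s.x
}
```

<p>An implementation that assumed <code class="sourceCode cpp">s<span class="op">.</span>x</code> to be unreachable from <code class="sourceCode cpp">f4</code> could fold the return value to 1, which is not what authors of such code expect.</p>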
<p>Similarly, the function <code class="sourceCode cpp">f1</code>
defined earlier could be given the following definition in C. The offset
value will always be 0 in this case, so the subtraction can be omitted
without changing the meaning.</p>
<div class="sourceCode" id="cb3"><pre class="sourceCode c"><code class="sourceCode c"><span id="cb3-1"><a href="#cb3-1" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> f1<span class="op">(</span><span class="dt">int</span><span class="op">*</span> p<span class="op">)</span> <span class="op">{</span></span>
<span id="cb3-2"><a href="#cb3-2" aria-hidden="true" tabindex="-1"></a>    <span class="op">((</span><span class="kw">struct</span> S<span class="op">*)((</span><span class="dt">char</span><span class="op">*)</span>p <span class="op">-</span> offsetof<span class="op">(</span><span class="kw">struct</span> S<span class="op">,</span> a<span class="op">[</span><span class="dv">0</span><span class="op">])))-&gt;</span>data <span class="op">=</span> <span class="dv">2</span><span class="op">;</span></span>
<span id="cb3-3"><a href="#cb3-3" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>Therefore, even in C++ mode, implementations do not assume that
<code class="sourceCode cpp">f1</code> cannot change the value of
<code class="sourceCode cpp">data</code>, even though the reachability
rules of the language permit optimizations based on this assumption.
Clang, GCC, and MSVC all emit both a store to
<code class="sourceCode cpp">s<span class="op">.</span>data</code>
before the call to <code class="sourceCode cpp">f1</code> and a load
after.</p>
<h1 data-number="5" id="removing-undefined-behavior-and-making-optimizations-opt-in"><span class="header-section-number">5</span> Removing undefined behavior and
making optimizations opt-in<a href="#removing-undefined-behavior-and-making-optimizations-opt-in" class="self-link"></a></h1>
<p>The overly strict reachability rules adopted in C++17 have an
additional disadvantage besides limiting compatibility with C: they
create a category of constructs that:</p>
<ol type="1">
<li>A programmer can easily use without realizing that UB will result,
and</li>
<li>Can be given perfectly sensible defined behavior (which may include
implementation-defined or unspecified results) only at the cost of minor
optimizations.</li>
</ol>
<p>My opinion is that the Committee should not create new forms of UB
that meet the above criteria, and should strongly consider removing any
such UB that already exists in the language. UB that is actually
exploited by compilers for optimization purposes makes the use of C++
less safe; UB that is not currently exploited still has a negative
impact on the perception of how safe C++ is, and is scary to beginners,
who don’t have enough context to distinguish between benign UB that is
unlikely to ever be exploited and dangerous UB that may eventually
result in an unbounded set of possible executions.<a href="#fn5" class="footnote-ref" id="fnref5" role="doc-noteref"><sup>5</sup></a> I
do not mean to suggest that all or even most UB can be removed from C++,
but when the two criteria above are met, I think the cost/benefit
analysis heavily favors giving the construct a defined behavior.</p>
<p>I believe that a better way to obtain the optimizations that such UB
is meant to enable is to provide mechanisms to opt in: that is, language
or library features whose sole purpose is to cause UB, which can then be
used to optimize; experts can use such features to produce faster code,
while beginners can easily avoid them because they cannot be used by
accident while writing code that uses other C++ features. (The <code class="sourceCode cpp"><span class="op">[[</span><span class="at">assume</span><span class="op">]]</span></code>
attribute is a well-known example of this genre.) It seems much more
defensible to provide “sharp tools” for experts to use in order to
improve performance than to build sharp edges into the most basic
language constructs, making it difficult for beginners to use them
safely.</p>
<p>Consider again this example from the previous section:</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> S <span class="op">{</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span> x;</span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span> y;</span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> f4<span class="op">(</span><span class="dt">int</span><span class="op">*</span> p<span class="op">)</span>;</span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a><span class="dt">int</span> f3<span class="op">()</span> <span class="op">{</span></span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>    S s;</span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a>    s<span class="op">.</span>x <span class="op">=</span> <span class="dv">1</span>;</span>
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a>    f4<span class="op">(&amp;</span>s<span class="op">.</span>y<span class="op">)</span>;</span>
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> s<span class="op">.</span>x <span class="op">*</span> s<span class="op">.</span>x;</span>
<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>This paper proposes that <code class="sourceCode cpp">f4</code> should
have the ability to modify
<code class="sourceCode cpp">s<span class="op">.</span>x</code>, and
that if there is sufficient interest from C++ experts in having a way to
tell the compiler that
<code class="sourceCode cpp">s<span class="op">.</span>x</code>
<em>cannot</em> be reached through the pointer passed to
<code class="sourceCode cpp">f4</code>, a new mechanism can be added to
the language. This possibility is discussed in Appendix A.</p>
<h1 data-number="6" id="design-space-for-a-solution"><span class="header-section-number">6</span> Design space for a solution<a href="#design-space-for-a-solution" class="self-link"></a></h1>
<p>To make the C++ standard match existing practice of implementations
and to bless <code class="sourceCode cpp">container_of</code>-like
constructs in C++, it is necessary to permit pointer arithmetic within
objects, which is already being proposed by <span class="citation" data-cites="P1839R6">[<a href="https://open-std.org/jtc1/sc22/wg21/docs/papers/2024/p1839r6.html" role="doc-biblioref">P1839R6</a>]</span>, and also to relax the
reachability rules in C++. However, this paper does not propose to relax
the C++ reachability rules all the way to the “complete object or
allocation” model proposed by the C Provenance TS because doing so is
not necessary to solve the immediate problem. Instead, it suffices to
allow a pointer to an object to reach all bytes of the complete object.
For example, this paper does not propose to enable the use of flexible
array members in C++, which are allowed by the C Provenance TS because
the trailing bytes belong to the same storage instance (allocation) as
the preceding members. The
<code class="sourceCode cpp">container_of</code> technique was valid in
C++ prior to C++17 and this paper aims to restore the <em>status quo
ante</em>, not to propose a new feature that has never been in C++.</p>
<p>Because typical <code class="sourceCode cpp">container_of</code>
macros in C use a cast to <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
(not <code class="sourceCode cpp"><span class="dt">unsigned</span> <span class="dt">char</span><span class="op">*</span></code>),
this paper proposes that a cast to <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
be allowed to yield a pointer to an object’s object representation, in
contrast to P1839, which supports only <code class="sourceCode cpp"><span class="dt">unsigned</span> <span class="dt">char</span></code>
and
<code class="sourceCode cpp">std<span class="op">::</span>byte</code>. A
further difference from P1839 is that in this paper, I propose that
every <em>complete</em> object has its own object representation, while
in P1839, subobjects also have their own object representations. Because
this paper allows pointer arithmetic within complete objects, a cast to
<code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
should yield a pointer into an array that occupies the entire storage
that the complete object does; the subobject arrays are not needed.</p>
<p>In some cases, a C-style cast to <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
already has well-defined behavior in C++ that is different from
producing a pointer to the object representation. One of these cases is
when the operand points to an object of class type that has a conversion
function to <em>cv</em> <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>.
I do not propose to change the behavior of such casts in C++; doing so
would be a disastrous breaking change that is not needed for C
compatibility, because C does not have conversion functions. The
remaining two cases are:</p>
<ol type="1">
<li>The cast is a
<code class="sourceCode cpp"><span class="kw">const_cast</span></code>
because the operand has type <em>cv</em> <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
or array of <em>cv</em>
<code class="sourceCode cpp"><span class="dt">char</span></code>. (This
includes the case where no conversion is needed at all.)</li>
<li>The cast can be interpreted as a <code class="sourceCode cpp"><span class="kw">reinterpret_cast</span></code>
followed by an optional
<code class="sourceCode cpp"><span class="kw">const_cast</span></code>
because there is a “real” <em>cv</em>
<code class="sourceCode cpp"><span class="dt">char</span></code> (not an
element of an object representation) that is located at the address
represented by the operand and is pointer-interconvertible with it.</li>
</ol>
<p>I searched GitHub for uses of
<code class="sourceCode cpp">container_of</code> and uses of
<code class="sourceCode cpp">offsetof</code> for the purpose of reaching
an enclosing struct. In the 65 files that I analyzed manually, I found
two files in which the pointer from which the
<code class="sourceCode cpp">offsetof</code> value is subtracted points
to an array of
<code class="sourceCode cpp"><span class="dt">char</span></code>. (In
one of these cases, the array was a flexible array member, which is not
part of standard C++, but is often accepted as an extension.) That is,
the relevant details of the code are similar to:</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> S2 <span class="op">{</span></span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span> data;</span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a>    <span class="dt">char</span> buf<span class="op">[</span><span class="dv">100</span><span class="op">]</span>;</span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a><span class="op">}</span>;</span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a><span class="dt">int</span> get_data<span class="op">(</span><span class="dt">char</span><span class="op">*</span> p<span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> <span class="op">((</span><span class="kw">struct</span> S2<span class="op">*)(</span>p <span class="op">-</span> offsetof<span class="op">(</span>S2, buf<span class="op">)))-&gt;</span>data;</span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span> f5<span class="op">()</span> <span class="op">{</span></span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a>    S2 s;</span>
<span id="cb5-10"><a href="#cb5-10" aria-hidden="true" tabindex="-1"></a>    <span class="co">// ...</span></span>
<span id="cb5-11"><a href="#cb5-11" aria-hidden="true" tabindex="-1"></a>    get_data<span class="op">(</span>s<span class="op">.</span>buf<span class="op">)</span>;</span>
<span id="cb5-12"><a href="#cb5-12" aria-hidden="true" tabindex="-1"></a>    <span class="co">// ...</span></span>
<span id="cb5-13"><a href="#cb5-13" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>In C++, this code performs out-of-bounds array arithmetic, and thus
exhibits UB even before the attempt to access
<code class="sourceCode cpp">data</code>.</p>
<p>Essentially, this gives us three design options to deal with Case
1.</p>
<ol type="1">
<li>We can say that <em>cv</em> <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
is exempt from bounds checking, just as it’s exempt from the strict
aliasing rule. In other words, while a <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
may point to a specific
<code class="sourceCode cpp"><span class="dt">char</span></code> object
during constant evaluation, in all other cases it merely points to a
byte of storage, and pointer arithmetic that would reach any other byte
in the same complete object is permitted. In this case, <em>cv</em>
<code class="sourceCode cpp"><span class="dt">unsigned</span> <span class="dt">char</span><span class="op">*</span></code>
would also be exempt from bounds checking (for symmetry with the strict
aliasing rule). This might have a negative impact on performance
relative to the status quo if compilers are currently relying on the
assumption that a pointer into a
<code class="sourceCode cpp"><span class="dt">char</span></code> array
that is a subobject cannot be used to perform pointer arithmetic outside
the bounds of the array. However, I have not yet found any examples
where compilers do use such assumptions for optimization. The more
likely impact is on sanitizers and static analyzers: they might be
forced to disable bounds checking for <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
and <code class="sourceCode cpp"><span class="dt">unsigned</span> <span class="dt">char</span><span class="op">*</span></code>,
which would reduce their ability to detect UB.</li>
<li>We can say that a C-style cast from <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
to <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
or a similar cast (as described in Case 1) sometimes <em>changes</em>
the pointer value such that the above example would have defined
behavior if <code class="sourceCode cpp">p</code> were to be cast to
<code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
prior to the pointer arithmetic. (A similar allowance would be made for
casts to <code class="sourceCode cpp"><span class="dt">unsigned</span> <span class="dt">char</span><span class="op">*</span></code>.)
In many cases in the real world, such a cast might be present because it
will have been introduced by a generic
<code class="sourceCode cpp">container_of</code>-like macro that is not
“aware” of the fact that the pointer argument, in some particular cases,
has type <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
already. However, this design has two disadvantages. First, some
compilers might simply discard casts from <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
to <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
at an early stage of semantic analysis, so that later stages are not
aware that the cast is there at all. In that case the cast cannot
achieve its purpose of giving the program defined behavior, and it is not
clear how much work would be required to change the implementations.
Second, it would violate the current design in which a C-style cast is
equivalent to trying C++-style casts in a particular order; instead, the
C-style cast would have the additional power of producing a pointer to
the object representation instead of performing a no-op
<code class="sourceCode cpp"><span class="kw">const_cast</span></code>.</li>
<li>We can say that we don’t care enough about solving the problem in
the case of pointers that are already of type <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>.
The example above would continue to have UB, regardless of whether an
additional cast is inserted. We would still solve 99% of the problem,
because in the vast majority of cases, the subobject pointer points to
an object of struct or union type, not a
<code class="sourceCode cpp"><span class="dt">char</span></code>.</li>
</ol>
<p>This paper proposes option 3 as the conservative option, without
prejudice to adopting something similar to option 1 or 2 in the future.
The small amount of C code that is similar to the example above could be
rewritten so that, if the subobject has type
<code class="sourceCode cpp"><span class="dt">char</span></code>, then
the pointer arithmetic is done using <code class="sourceCode cpp"><span class="dt">unsigned</span> <span class="dt">char</span><span class="op">*</span></code>,
and <em>vice versa</em>; it would then have the desired behavior in both
C and C++ under option 3.</p>
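<p>A sketch of such a rewrite of <code class="sourceCode cpp">get_data</code> follows. It assumes the rules proposed in this paper, under which the cast to <code class="sourceCode cpp"><span class="dt">unsigned</span> <span class="dt">char</span><span class="op">*</span></code> would yield a pointer into the object representation of the complete <code class="sourceCode cpp">S2</code> object. Under the current rules the pointer arithmetic is still formally undefined. The stored value 42 and the return type of <code class="sourceCode cpp">f5</code> are chosen only for illustration.</p>

```cpp
#include <cassert>
#include <cstddef>

struct S2 {
    int data;
    char buf[100];
};

// The subobject has type char, so the arithmetic is done through
// unsigned char* rather than directly on the char* argument.
int get_data(char* p) {
    unsigned char* bytes = reinterpret_cast<unsigned char*>(p);
    return reinterpret_cast<S2*>(bytes - offsetof(S2, buf))->data;
}

// Usage: pass a pointer to the first element of the buf member.
int f5() {
    S2 s;
    s.data = 42;
    return get_data(s.buf);
}
```

<p>The cast is not a no-op <code class="sourceCode cpp"><span class="kw">const_cast</span></code>, because the source and destination types differ, so it does not fall under Case 1.</p>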
<p>For Case 2, I also found two examples in the 65 files that I analyzed
in which the subobject pointer points to a struct that is
pointer-interconvertible with an <code class="sourceCode cpp"><span class="dt">unsigned</span> <span class="dt">char</span></code>
subobject. I didn’t find any examples with
<code class="sourceCode cpp"><span class="dt">char</span></code>, but
given that examples exist that use <code class="sourceCode cpp"><span class="dt">unsigned</span> <span class="dt">char</span></code>,
I assume there are others that use
<code class="sourceCode cpp"><span class="dt">char</span></code>. Such
code would have relevant details similar to:</p>
<div class="sourceCode" id="cb4"><pre class="sourceCode c"><code class="sourceCode c"><span id="cb4-1"><a href="#cb4-1" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> S3 <span class="op">{</span></span>
<span id="cb4-2"><a href="#cb4-2" aria-hidden="true" tabindex="-1"></a>    <span class="dt">char</span> a<span class="op">;</span></span>
<span id="cb4-3"><a href="#cb4-3" aria-hidden="true" tabindex="-1"></a>    <span class="dt">int</span>  b<span class="op">;</span></span>
<span id="cb4-4"><a href="#cb4-4" aria-hidden="true" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb4-5"><a href="#cb4-5" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> S4 <span class="op">{</span></span>
<span id="cb4-6"><a href="#cb4-6" aria-hidden="true" tabindex="-1"></a>    <span class="dt">char</span>      c<span class="op">;</span></span>
<span id="cb4-7"><a href="#cb4-7" aria-hidden="true" tabindex="-1"></a>    <span class="kw">struct</span> S3 d<span class="op">;</span></span>
<span id="cb4-8"><a href="#cb4-8" aria-hidden="true" tabindex="-1"></a><span class="op">};</span></span>
<span id="cb4-9"><a href="#cb4-9" aria-hidden="true" tabindex="-1"></a></span>
<span id="cb4-10"><a href="#cb4-10" aria-hidden="true" tabindex="-1"></a><span class="kw">struct</span> S4<span class="op">*</span> get_s4<span class="op">(</span><span class="kw">struct</span> S3<span class="op">*</span> s3<span class="op">)</span> <span class="op">{</span></span>
<span id="cb4-11"><a href="#cb4-11" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> <span class="op">(</span><span class="kw">struct</span> S4<span class="op">*)((</span><span class="dt">char</span><span class="op">*)</span>s3 <span class="op">-</span> offsetof<span class="op">(</span><span class="kw">struct</span> S4<span class="op">,</span> d<span class="op">));</span></span>
<span id="cb4-12"><a href="#cb4-12" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>In current C++, the cast to <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
yields a pointer to the <code class="sourceCode cpp">a</code> subobject.
Note, however, that the entire example has undefined behavior because of
the subsequent pointer arithmetic. If we change the rules of C++ so that
the cast would be allowed to yield a pointer to the object
representation of the <code class="sourceCode cpp">S4</code> object, we
could make this example well-defined when it currently is not. In order
to avoid changing the behavior of any code that is already well-defined,
we could say that the status quo interpretation of the cast takes
precedence, and a pointer to the object representation is obtained only
when the former interpretation would produce undefined behavior. This
specification strategy is similar to that of implicit object creation,
in which the specific objects that are created may only be determined by
the details of a later operation, which would have UB other than under
one particular choice of objects to create.</p>
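<p>For readers who want to try the idiom, the following is a minimal C++
sketch of the same function. It assumes that the argument really points
to the <code class="sourceCode cpp">d</code> member of some
<code class="sourceCode cpp">S4</code> object. Under the current rules it
formally has undefined behavior for the reasons given above, although
mainstream implementations are expected to produce the intended result;
it is the kind of code that this paper aims to make well-defined.</p>
<div class="sourceCode"><pre class="sourceCode cpp"><code class="sourceCode cpp">#include &lt;cstddef&gt;  // offsetof

struct S3 {
    char a;
    int  b;
};
struct S4 {
    char c;
    S3   d;
};

// Recover the enclosing S4 from a pointer to its `d` member.
// Assumption: `s3` points to the `d` member of a live S4 object.
inline S4* get_s4(S3* s3) {
    // Step back from the member to the start of the enclosing object.
    char* bytes = reinterpret_cast&lt;char*&gt;(s3);
    return reinterpret_cast&lt;S4*&gt;(bytes - offsetof(S4, d));
}</code></pre></div>
<p>A typical call site passes the address of a member, for example
<code class="sourceCode cpp">get_s4(&amp;s4.d)</code> for an object
<code class="sourceCode cpp">s4</code> of type
<code class="sourceCode cpp">S4</code>.</p>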
<h1 data-number="7" id="proposed-wording"><span class="header-section-number">7</span> Proposed wording<a href="#proposed-wording" class="self-link"></a></h1>
<p>This wording is a modified version of the wording in <span class="citation" data-cites="P1839R6">[<a href="https://open-std.org/jtc1/sc22/wg21/docs/papers/2024/p1839r6.html" role="doc-biblioref">P1839R6</a>]</span> and is relative to working
draft <span class="citation" data-cites="N4988">[<a href="https://wg21.link/n4988" role="doc-biblioref">N4988</a>]</span>.</p>
<p>Modify §<span>6.7.2
<a href="https://wg21.link/N4988#intro.object">[intro.object]</a></span>p4
as follows:</p>
<blockquote>
<p>An object <em>a</em> is <em>nested within</em> another object
<em>b</em> if</p>
<ul>
<li><em>a</em> is a subobject of <em>b</em>, or</li>
<li><em>b</em> provides storage for <em>a</em>, or</li>
<li><span class="add" style="color: #006e28"><ins><em>a</em> and
<em>b</em> are the object representations of two objects <em>o1</em> and
<em>o2</em>, where <em>o2</em> provides storage for <em>o1</em>,
or</ins></span></li>
<li>there exists an object <em>c</em> where <em>a</em> is nested within
<em>c</em>, and <em>c</em> is nested within <em>b</em>.</li>
</ul>
</blockquote>
<p>Modify §<span>6.7.2
<a href="https://wg21.link/N4988#intro.object">[intro.object]</a></span>p10
as follows:</p>
<blockquote>
<p>Unless an object is a bit-field or a subobject of zero size, the
address of that object is the address of the first byte it occupies. Two
objects with overlapping lifetimes that are not bit-fields may have the
same address if</p>
<ul>
<li>one is nested within the other,</li>
<li>at least one is a subobject of zero size and they are not of similar
types ([conv.qual]),<span class="rm" style="color: #bf0303"><del>or</del></span></li>
<li><span class="add" style="color: #006e28"><ins>at least one is an
element of an object representation, or</ins></span></li>
<li>they are both potentially non-unique objects;</li>
</ul>
<p>otherwise, they have distinct addresses and occupy distinct bytes of
storage.</p>
</blockquote>
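<p>As an illustration only (it is not part of the wording), the existing
“nested within” case of this paragraph can be observed with two
standard-layout classes. The type and function names below are made up
for the sketch.</p>
<div class="sourceCode"><pre class="sourceCode cpp"><code class="sourceCode cpp">struct Inner {
    int value;
};
struct Outer {
    Inner first;   // nested within the Outer object
    int   second;  // not nested within `first`, nor vice versa
};

// Compare two object addresses irrespective of their static types.
inline bool same_address(const void* p, const void* q) {
    return p == q;
}</code></pre></div>
<p>For an object <code class="sourceCode cpp">o</code> of type
<code class="sourceCode cpp">Outer</code>,
<code class="sourceCode cpp">o</code>,
<code class="sourceCode cpp">o.first</code>, and
<code class="sourceCode cpp">o.first.value</code> share an address
because each is nested within the previous one, whereas
<code class="sourceCode cpp">o.first</code> and
<code class="sourceCode cpp">o.second</code> must have distinct
addresses. The added bullet extends the list of permitted overlaps to
elements of object representations.</p>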
<p>Modify §<span>6.7.2
<a href="https://wg21.link/N4988#intro.object">[intro.object]</a></span>p14
as follows:</p>
<blockquote>
<p>Except during constant evaluation, an operation that begins the
lifetime of an array of <code class="sourceCode cpp"><span class="dt">unsigned</span> <span class="dt">char</span></code>
or <code class="sourceCode cpp">std<span class="op">::</span>byte</code>
<span class="add" style="color: #006e28"><ins>other than a synthesized
object representation ([basic.types.general])</ins></span> implicitly
creates objects within the region of storage occupied by the array.</p>
</blockquote>
<p>Insert a new paragraph after §<span>6.7.3
<a href="https://wg21.link/N4988#basic.life">[basic.life]</a></span>p3
as follows:</p>
<blockquote>
<p>The lifetime of a reference begins when its initialization is
complete. The lifetime of a reference ends as if it were a scalar object
requiring storage.</p>
</blockquote>
<blockquote>
<p>[<em>Note 1</em>: [class.base.init] describes the lifetime of base
and member subobjects. —<em>end note</em>]</p>
</blockquote>
<blockquote>
<p>The lifetime of the elements of a synthesized object representation
of an object begins when the lifetime of the object begins. For class
types, the lifetime of the elements of the synthesized object
representation ends when the destruction of the object is completed;
otherwise, the lifetime ends when the object is destroyed.</p>
</blockquote>
<p>Modify §<span>6.8.1
<a href="https://wg21.link/N4988#basic.types.general">[basic.types.general]</a></span>p4
as follows and add a paragraph after it:</p>
<blockquote>
<p>The <em>object representation</em> of a complete object type
<code class="sourceCode cpp">T</code> is the sequence of <em>N</em>
<span class="rm" style="color: #bf0303"><del><span><code class="sourceCode default">unsigned char</code></span>
objects</del></span><span class="add" style="color: #006e28"><ins>bytes</ins></span> taken up by a
non-bit-field complete object of type
<code class="sourceCode cpp">T</code>, where <em>N</em> equals <code class="sourceCode cpp"><span class="kw">sizeof</span><span class="op">(</span>T<span class="op">)</span></code>.
The <em>value representation</em> of a type
<code class="sourceCode cpp">T</code> is the set of bits in the object
representation of <code class="sourceCode cpp">T</code> that participate
in representing a value of type <code class="sourceCode cpp">T</code>.
The object and value representation of a non-bit-field complete object
of type <span class="add" style="color: #006e28"><ins><em>cv</em></ins></span>
<code class="sourceCode cpp">T</code> are the bytes and bits,
respectively, of the object corresponding to the object and value
representation of its type<span class="add" style="color: #006e28"><ins>; the object representation is considered to
be an array of <em>N</em> <em>cv</em>
<span><code class="sourceCode default">unsigned char</code></span> if
the object occupies contiguous bytes of storage
([intro.object])</ins></span>. The object representation of a bit-field
object is the sequence of <em>N</em> bits taken up by the object, where
<em>N</em> is the width of the bit-field ([class.bit]). The value
representation of a bit-field object is the set of bits in the object
representation that participate in representing its value. Bits in the
object representation of a type or object that are not part of the value
representation are padding bits. For trivially copyable types, the value
representation is a set of bits in the object representation that
determines a value, which is one discrete element of an
implementation-defined set of values.</p>
</blockquote>
<div class="add" style="color: #006e28">

<blockquote>
<p>For a complete object <em>o</em> with type <em>cv</em>
<code class="sourceCode default">T</code> whose object representation is
an array <em>A</em>:</p>
<ul>
<li>If <em>o</em> has type “array of <em>cv</em>
<code class="sourceCode default">unsigned char</code>”, then <em>A</em>
is <em>o</em>.</li>
<li>Otherwise, <em>A</em> is said to be a <em>synthesized object
representation</em>, and is distinct from any object that is not an
object representation.<br />
[<em>Note</em>: In particular, when an array <em>B</em> of <em>N</em>
<code class="sourceCode default">unsigned char</code> provides storage
for an object <em>o</em> of size <em>N</em>, the object representation
of <em>o</em> is a different array that occupies the same storage as
<em>B</em>. —<em>end note</em>]<br />
For each element <em>e</em> of <em>A</em>:
<ul>
<li>If <em>e</em> occupies the same storage as a non-bit-field subobject
of <em>o</em> having type <em>cv</em>
<code class="sourceCode default">char</code>, <em>cv</em>
<code class="sourceCode default">unsigned char</code>, or <em>cv</em>
<code class="sourceCode default">std::byte</code>, the value of
<em>e</em> is that of the subobject.</li>
<li>Otherwise, for each bit <em>b</em> in the byte of <em>o</em> that
corresponds to <em>e</em>, let <em>p(b)</em> be the smallest subobject
of <em>o</em> that contains <em>b</em>. If <em>p(b)</em> is not within
its lifetime or has an indeterminate value, or if <em>b</em> is not part
of the value representation of <em>p(b)</em>, then the bit of <em>e</em>
corresponding to <em>b</em> has indeterminate value. Otherwise, if
<em>b</em> has an erroneous value, then the bit of <em>e</em>
corresponding to <em>b</em> has an erroneous value. Otherwise, the bit
of <em>e</em> corresponding to <em>b</em> has an unspecified value.</li>
</ul></li>
</ul>
<p>[<em>Note</em>: An object representation is always a complete object.
—<em>end note</em>]</p>
</blockquote>

</div>
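<p>The following sketch shows the kind of code that the notion of an
object representation as an array of
<code class="sourceCode cpp"><span class="dt">unsigned</span> <span class="dt">char</span></code>
is meant to support: reading the bytes of an object one at a time. The
helper name <code class="sourceCode cpp">read_bytes</code> is
hypothetical, and the sketch only reads through the object
representation.</p>
<div class="sourceCode"><pre class="sourceCode cpp"><code class="sourceCode cpp">#include &lt;array&gt;
#include &lt;cstddef&gt;
#include &lt;type_traits&gt;

// Returns a copy of the sizeof(T) bytes that make up `obj`.
template &lt;class T&gt;
std::array&lt;unsigned char, sizeof(T)&gt; read_bytes(const T&amp; obj) {
    static_assert(std::is_trivially_copyable&lt;T&gt;::value,
                  "sketch is limited to trivially copyable types");
    std::array&lt;unsigned char, sizeof(T)&gt; out{};
    // Under the proposed wording, `rep` would point to the first element
    // of the object representation of `obj`.
    const unsigned char* rep = reinterpret_cast&lt;const unsigned char*&gt;(&amp;obj);
    for (std::size_t i = 0; i &lt; sizeof(T); ++i) {
        out[i] = rep[i];
    }
    return out;
}</code></pre></div>
<p>The result holds the same bytes that
<code class="sourceCode cpp">std::memcpy</code> would copy out of the
object; which values those bytes have for a given
<code class="sourceCode cpp">T</code> depends on the implementation.</p>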
<p>Modify §<span>6.8.4
<a href="https://wg21.link/N4988#basic.compound">[basic.compound]</a></span>p5
as follows:</p>
<blockquote>
<p>Two objects <em>a</em> and <em>b</em> are
<em>pointer-interconvertible</em> if <span class="add" style="color: #006e28"><ins>they have the same address
and</ins></span>:</p>
<ul>
<li><span class="rm" style="color: #bf0303"><del>they are the same
object, or</del></span></li>
<li><span class="rm" style="color: #bf0303"><del>one is a union object
and the other is a non-static data member of that object
([class.union]), or</del></span></li>
<li><span class="rm" style="color: #bf0303"><del>one is a
standard-layout class object and the other is the first non-static data
member of that object or any base class subobject of that object
([class.mem]), or</del></span></li>
<li><span class="rm" style="color: #bf0303"><del>there exists an object
<em>c</em> such that <em>a</em> and <em>c</em> are
pointer-interconvertible, and <em>c</em> and <em>b</em> are
pointer-interconvertible.</del></span></li>
<li><span class="add" style="color: #006e28"><ins>they have the same
complete object, or</ins></span></li>
<li><span class="add" style="color: #006e28"><ins>the complete object of
one is the object representation of the complete object of the
other.</ins></span></li>
</ul>
<p><span class="rm" style="color: #bf0303"><del>If two objects are
pointer-interconvertible, then they have the same address, and it is
possible to obtain a pointer to one from a pointer to the other via a
<span><code class="sourceCode default">reinterpret_cast</code></span>
([expr.reinterpret.cast]).</del></span><br />
<span class="add" style="color: #006e28"><ins>[<em>Note</em>: A
<span><code class="sourceCode default">reinterpret_cast</code></span>
([expr.reinterpret.cast]) never converts a pointer to <em>a</em> to a
pointer to <em>b</em> unless <em>a</em> and <em>b</em> are
pointer-interconvertible. —<em>end note</em>]</ins></span></p>
<p><span class="add" style="color: #006e28"><ins>[<em>Note</em>: A
standard-layout class object is pointer-interconvertible with its first
non-static data member (if any) and each of its base class subobjects
([class.mem]). An array object and an object that the array provides
storage for are not pointer-interconvertible. —<em>end
note</em>]</ins></span></p>
</blockquote>
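<p>As an illustration of the case mentioned in the second note, the
sketch below converts between a standard-layout class object and its
first non-static data member. The types are invented for the example, and
the function assumes that its argument points to the
<code class="sourceCode cpp">header</code> member of a
<code class="sourceCode cpp">Packet</code>.</p>
<div class="sourceCode"><pre class="sourceCode cpp"><code class="sourceCode cpp">#include &lt;type_traits&gt;

struct Header {
    int tag;
};
struct Packet {
    Header header;   // first non-static data member
    int    payload;
};

static_assert(std::is_standard_layout&lt;Packet&gt;::value,
              "the example relies on Packet being standard-layout");

// Assumption: `h` points to the `header` member of a live Packet.
inline Packet* packet_from_header(Header* h) {
    // The Packet and its first member share an address and a complete
    // object, so the cast yields a pointer to the Packet.
    return reinterpret_cast&lt;Packet*&gt;(h);
}</code></pre></div>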
<p>Modify §<span>6.8.4
<a href="https://wg21.link/N4988#basic.compound">[basic.compound]</a></span>p6
as follows:</p>
<blockquote>
<p>A byte of storage <em>b</em> is <em>reachable through</em> a pointer
value that points to an object <em>x</em> if <span class="rm" style="color: #bf0303"><del>there is an object <em>y</em>,
pointer-interconvertible with <em>x</em>, such that <em>b</em> is within
the storage occupied by <em>y</em>, or the immediately-enclosing array
object if <em>y</em> is an array element</del></span><span class="add" style="color: #006e28"><ins><em>b</em> is within the storage occupied by
<em>x</em>’s complete object</ins></span>.</p>
</blockquote>
<p>Modify §<span>7.3.2
<a href="https://wg21.link/N4988#conv.lval">[conv.lval]</a></span>p3.4,
as amended by the proposed resolution of <span class="citation" data-cites="CWG2901">[<a href="https://wg21.link/cwg2901" role="doc-biblioref">CWG2901</a>]</span>, as follows:</p>
<blockquote>
<ul>
<li>Otherwise, the object indicated by the glvalue is read
([defns.access]). Let <em>V</em> be the value contained in the object.
If <code class="sourceCode cpp">T</code> is an integer type <span class="add" style="color: #006e28"><ins>or <em>cv</em>
<span><code class="sourceCode default">std::byte</code></span></ins></span>,
the prvalue result is the value of <code class="sourceCode cpp">T</code>
congruent ([basic.fundamental]) to <em>V</em>, and <em>V</em> otherwise.
[…]</li>
</ul>
</blockquote>
<p>Modify §<span>7.6.1.9
<a href="https://wg21.link/N4988#expr.static.cast">[expr.static.cast]</a></span>p13
as follows:</p>
<blockquote>
<p>[…] Otherwise, if the original pointer value points to an object
<em>a</em>, <span class="rm" style="color: #bf0303"><del>and there is an
object <em>b</em> of type similar to
<span><code class="sourceCode default">T</code></span> that is
pointer-interconvertible ([basic.compound]) with <em>a</em>, the result
is a pointer to <em>b</em>. Otherwise, the pointer value is unchanged by
the conversion.</del></span><span class="add" style="color: #006e28"><ins>let <em>S</em> be the set of objects that
are pointer-interconvertible with <em>a</em> and have type similar to
<span><code class="sourceCode default">T</code></span>.</ins></span></p>
<div class="add" style="color: #006e28">

<ul>
<li>If <em>S</em> contains <em>a</em>, the result is a pointer to
<em>a</em>.</li>
<li>Otherwise, the result is a member of <em>S</em> whose complete
object is not a synthesized object representation if any such result
would give the program defined behavior. If there are multiple possible
results that would give the program defined behavior, the result is an
unspecified choice among them.</li>
<li>Otherwise (i.e. when there are no such members of <em>S</em> that
would give the program defined behavior), if the object representation
of <em>a</em>’s object is an array <em>A</em>,
<code class="sourceCode default">T</code> is similar to the type of
<em>A</em>, and <em>A</em> is a member of <em>S</em>, the result is a
pointer to <em>A</em>.</li>
<li>Otherwise, if the object representation of <em>a</em>’s complete
object is an array and <code class="sourceCode default">T</code> is
<em>cv</em> <code class="sourceCode default">unsigned char</code>, the
result is a pointer to the element of that object representation that
has the same address as <em>a</em>.</li>
<li>Otherwise, if <code class="sourceCode default">T</code> is
<em>cv</em> <code class="sourceCode default">char</code> or <em>cv</em>
<code class="sourceCode default">std::byte</code>, or an array of one of
these types, let <code class="sourceCode default">U</code> be the type
obtained from <code class="sourceCode default">T</code> by replacing
<code class="sourceCode default">char</code> or
<code class="sourceCode default">std::byte</code> with
<code class="sourceCode default">unsigned char</code>. If a
<code class="sourceCode default">static_cast</code> of the operand to
<code class="sourceCode default">U*</code> would be well-formed and
would yield a pointer to an object representation or element thereof,
the result of the cast to <code class="sourceCode default">T*</code> is
that pointer value.</li>
<li>Otherwise, the result is a pointer to <em>a</em>.</li>
</ul>
<p>Otherwise, if the original pointer value points past the end of an
object <em>a</em>:</p>
<ul>
<li>If the object representation of the complete object of <em>a</em> is
an array <em>A</em>, <code class="sourceCode default">T</code> is
similar to the type of <em>A</em>, and <em>a</em> has the same address
as <em>A</em>, the result is
<code class="sourceCode default">&amp;</code><em>A</em><code class="sourceCode default">+1</code>.</li>
<li>Otherwise, if the object representation of the complete object of
<em>a</em> is an array <em>A</em> and
<code class="sourceCode default">T</code> is <em>cv</em>
<code class="sourceCode default">unsigned char</code>, the result is a
pointer to the element of <em>A</em> (possibly the past-the-end element)
that has the same address as the one represented by the operand.</li>
<li>Otherwise, if <code class="sourceCode default">T</code> is
<em>cv</em> <code class="sourceCode default">char</code> or <em>cv</em>
<code class="sourceCode default">std::byte</code>, or an array of one of
these types, let <code class="sourceCode default">U</code> be the type
obtained from <code class="sourceCode default">T</code> by replacing
<code class="sourceCode default">char</code> or
<code class="sourceCode default">std::byte</code> with
<code class="sourceCode default">unsigned char</code>. If a
<code class="sourceCode default">static_cast</code> of the operand to
<code class="sourceCode default">U*</code> would be well-formed and
would yield a pointer value defined by one of the above cases, the
result of the cast to <code class="sourceCode default">T*</code> is that
pointer value.</li>
<li>Otherwise, the result is the value of the operand.</li>
</ul>

</div>
</blockquote>
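<p>To illustrate how we read the first two bullets, consider a cast whose
status quo interpretation already gives defined behavior. In the sketch
below (with a hypothetical function name), the
<code class="sourceCode cpp">a</code> member is pointer-interconvertible
with the <code class="sourceCode cpp">S3</code> object and has type
<code class="sourceCode cpp"><span class="dt">char</span></code>, so the
cast is expected to keep yielding a pointer to that member rather than to
an element of the object representation.</p>
<div class="sourceCode"><pre class="sourceCode cpp"><code class="sourceCode cpp">struct S3 {
    char a;
    int  b;
};

// Reads the first member through a cast of the enclosing object's address.
inline char first_member(S3* s3) {
    // Equivalent to static_cast&lt;char*&gt;(static_cast&lt;void*&gt;(s3)).
    return *reinterpret_cast&lt;char*&gt;(s3);
}</code></pre></div>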
<p>Modify §<span>7.6.6
<a href="https://wg21.link/N4988#expr.add">[expr.add]</a></span>p6 as
follows:</p>
<blockquote>
<p>For addition or subtraction, if the expressions
<code class="sourceCode cpp">P</code> or
<code class="sourceCode cpp">Q</code> have type “pointer to <em>cv</em>
<code class="sourceCode cpp">T</code>”<span class="rm" style="color: #bf0303"><del>, where
<span><code class="sourceCode default">T</code></span> and the array
element type are not similar, the behavior is undefined.</del></span>
<span class="add" style="color: #006e28"><ins>, one of the following
shall hold:</ins></span></p>
<ul>
<li><span class="add" style="color: #006e28"><ins><span><code class="sourceCode default">T</code></span>
is similar to the array element type, or</ins></span></li>
<li><span class="add" style="color: #006e28"><ins><span><code class="sourceCode default">T</code></span>
is similar to <span><code class="sourceCode default">char</code></span>
or <span><code class="sourceCode default">std::byte</code></span> and
the pointer value points to a (possibly-hypothetical) element of an
object representation.</ins></span></li>
</ul>
<p><span class="add" style="color: #006e28"><ins>Otherwise, the behavior
is undefined.</ins></span></p>
</blockquote>
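<p>The second bullet is what permits byte-wise pointer arithmetic within
an object. As a minimal sketch (the type and function names are
illustrative), the following computes the address of a member from the
address of its enclosing object and
<code class="sourceCode cpp">offsetof</code>; it is the forward
counterpart of the <code class="sourceCode cpp">container_of</code>
pattern.</p>
<div class="sourceCode"><pre class="sourceCode cpp"><code class="sourceCode cpp">#include &lt;cstddef&gt;  // offsetof

struct Point {
    int x;
    int y;
};

// Returns a pointer to `p-&gt;y`, computed from the address of `*p`.
inline int* member_y(Point* p) {
    // Under the proposed wording, `base` points to an element of the
    // object representation of `*p`, so the addition is permitted.
    char* base = reinterpret_cast&lt;char*&gt;(p);
    return reinterpret_cast&lt;int*&gt;(base + offsetof(Point, y));
}</code></pre></div>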
<h1 data-number="8" id="appendix-a"><span class="header-section-number">8</span> Appendix A<a href="#appendix-a" class="self-link"></a></h1>
<p>The C programming language already contains an opt-in feature that
can be used to tell the compiler that a pointer to part of an object
cannot be used to access other parts of the same object. That feature is
the <code class="sourceCode cpp">restrict</code> keyword. Using
<code class="sourceCode cpp">restrict</code>, the definition of the
function <code class="sourceCode cpp">f3</code> given previously could
be changed to:</p>
<div class="sourceCode" id="cb5"><pre class="sourceCode c"><code class="sourceCode c"><span id="cb5-1"><a href="#cb5-1" aria-hidden="true" tabindex="-1"></a><span class="dt">int</span> f3<span class="op">(</span><span class="dt">void</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb5-2"><a href="#cb5-2" aria-hidden="true" tabindex="-1"></a>    <span class="kw">struct</span> S s<span class="op">;</span></span>
<span id="cb5-3"><a href="#cb5-3" aria-hidden="true" tabindex="-1"></a>    s<span class="op">.</span>x <span class="op">=</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb5-4"><a href="#cb5-4" aria-hidden="true" tabindex="-1"></a>    <span class="op">{</span></span>
<span id="cb5-5"><a href="#cb5-5" aria-hidden="true" tabindex="-1"></a>        <span class="kw">struct</span> S<span class="op">*</span> <span class="dt">restrict</span> p <span class="op">=</span> <span class="op">&amp;</span>s<span class="op">;</span></span>
<span id="cb5-6"><a href="#cb5-6" aria-hidden="true" tabindex="-1"></a>        f4<span class="op">(&amp;</span>s<span class="op">.</span>y<span class="op">);</span></span>
<span id="cb5-7"><a href="#cb5-7" aria-hidden="true" tabindex="-1"></a>        <span class="cf">return</span> p<span class="op">-&gt;</span>x <span class="op">*</span> p<span class="op">-&gt;</span>x<span class="op">;</span></span>
<span id="cb5-8"><a href="#cb5-8" aria-hidden="true" tabindex="-1"></a>    <span class="op">}</span></span>
<span id="cb5-9"><a href="#cb5-9" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>In the example above, if
<code class="sourceCode cpp">s<span class="op">.</span>x</code> is
accessed through an lvalue that is <em>based on</em> the restricted
pointer <code class="sourceCode cpp">p</code> <strong>and</strong>
<code class="sourceCode cpp">s<span class="op">.</span>x</code> is
modified at any point during the execution of the block in which
<code class="sourceCode cpp">p</code> is defined, then all accesses to
<code class="sourceCode cpp">s<span class="op">.</span>x</code> during
that block must be through lvalues that are based on
<code class="sourceCode cpp">p</code>. The first condition (that
<code class="sourceCode cpp">s<span class="op">.</span>x</code> is
accessed through an lvalue based on
<code class="sourceCode cpp">p</code>) is already met by the return
statement in <code class="sourceCode cpp">f3</code>; the second
condition will be met if <code class="sourceCode cpp">f4</code> attempts
to modify
<code class="sourceCode cpp">s<span class="op">.</span>x</code>. In that
case, all accesses to
<code class="sourceCode cpp">s<span class="op">.</span>x</code> during
the lifetime of <code class="sourceCode cpp">p</code> would need to be
through lvalues based on <code class="sourceCode cpp">p</code>, but the
modification in <code class="sourceCode cpp">f4</code> could not be, so
the behavior would be undefined. The compiler can assume that this
scenario does not occur, and that
<code class="sourceCode cpp">s<span class="op">.</span>x</code> will
still have the value 1 upon return from
<code class="sourceCode cpp">f4</code>.</p>
<p>GCC does not actually perform this optimization, even with
<code class="sourceCode cpp"><span class="op">-</span>O3</code>. I can
only speculate as to the reason: I suspect that this is not the kind of
optimization that <code class="sourceCode cpp">restrict</code> was
designed to enable, and that such an optimization is simply not very
useful. However, let’s assume for the sake of argument that some experts
would benefit from being given a tool to enable such an optimization in
C++: one that (unlike the current reachability rules in C++) could
actually be used by implementations without breaking compatibility with
C. What might that tool look like?
<code class="sourceCode cpp">restrict</code> itself is unlikely to be
added to C++. If we were to design a different feature for this purpose,
we would probably want it to be in a form that could also be added to
C.</p>
<p>For example, we could change the definition of pointer values in the
C++ standard so that, in the case of an object pointer, the value not
only identifies the object that the pointer value points to or past the
end of, but also includes a <em>reachable range</em>, which is a
contiguous set of bytes; a pointer could be used to access memory only
at addresses that lie within the pointer value’s reachable range. This
provenance model is the one used by CHERI, which refers to the reachable
range as the <em>bounds</em> of a pointer value. The <em>CHERI C/C++
Programming Guide</em> <span class="citation" data-cites="CHERI">[<a href="https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-947.pdf" role="doc-biblioref">CHERI</a>]</span> states that the <em>subobject
bounds</em> feature (described in Section 4.3.3), in which taking the
address of a subobject produces a pointer value whose bounds are
narrowed to the memory occupied by the subobject, is not enabled by
default, and when enabled, breaks code that uses the
“<code class="sourceCode cpp">containerof</code> pattern” (p. 16); such
code must be modified to <em>opt out</em> of subobject bounds. However,
CHERI aims to provide improved safety (e.g., by “[preventing] an
overflow on [an array subobject] from affecting the remainder of the
structure”); when the objective of narrowing bounds is to create
potential UB and enable additional optimizations, an opt-in mechanism is
more appropriate. Such an opt-in mechanism, which would be based on Core
wording that defines reachable ranges, might be a library function like
the following:</p>
<div class="sourceCode" id="cb6"><pre class="sourceCode cpp"><code class="sourceCode cpp"><span id="cb6-1"><a href="#cb6-1" aria-hidden="true" tabindex="-1"></a><span class="co">/// If `p1` is a null pointer, return `p1`.  Otherwise, return a pointer that</span></span>
<span id="cb6-2"><a href="#cb6-2" aria-hidden="true" tabindex="-1"></a><span class="co">/// points to or past the end of the same object `o` as `p1` but whose</span></span>
<span id="cb6-3"><a href="#cb6-3" aria-hidden="true" tabindex="-1"></a><span class="co">/// reachable range consists of the bytes in [p2, p3).  The storage occupied by</span></span>
<span id="cb6-4"><a href="#cb6-4" aria-hidden="true" tabindex="-1"></a><span class="co">/// `o` shall be a subrange of [p2, p3), which shall be a subrange of the</span></span>
<span id="cb6-5"><a href="#cb6-5" aria-hidden="true" tabindex="-1"></a><span class="co">/// reachable range of `p1`; otherwise, the behavior is undefined.</span></span>
<span id="cb6-6"><a href="#cb6-6" aria-hidden="true" tabindex="-1"></a><span class="dt">void</span><span class="op">*</span> narrow_reachable_range_to<span class="op">(</span><span class="dt">void</span><span class="op">*</span> p1,</span>
<span id="cb6-7"><a href="#cb6-7" aria-hidden="true" tabindex="-1"></a>                                <span class="kw">const</span> <span class="dt">void</span><span class="op">*</span> p2,</span>
<span id="cb6-8"><a href="#cb6-8" aria-hidden="true" tabindex="-1"></a>                                <span class="kw">const</span> <span class="dt">void</span><span class="op">*</span> p3<span class="op">)</span>;</span></code></pre></div>
<p>The same library function could also be available in C; for example,
it could be in the <code class="sourceCode cpp"><span class="op">&lt;</span>stdlib<span class="op">.</span>h<span class="op">&gt;</span></code>
header. The previously given example would then become:</p>
<div class="sourceCode" id="cb7"><pre class="sourceCode c"><code class="sourceCode c"><span id="cb7-1"><a href="#cb7-1" aria-hidden="true" tabindex="-1"></a><span class="dt">int</span> f3<span class="op">(</span><span class="dt">void</span><span class="op">)</span> <span class="op">{</span></span>
<span id="cb7-2"><a href="#cb7-2" aria-hidden="true" tabindex="-1"></a>    <span class="kw">struct</span> S s<span class="op">;</span></span>
<span id="cb7-3"><a href="#cb7-3" aria-hidden="true" tabindex="-1"></a>    s<span class="op">.</span>x <span class="op">=</span> <span class="dv">1</span><span class="op">;</span></span>
<span id="cb7-4"><a href="#cb7-4" aria-hidden="true" tabindex="-1"></a>    f4<span class="op">((</span><span class="dt">int</span><span class="op">*)</span>narrow_reachable_range_to<span class="op">(&amp;</span>s<span class="op">.</span>y<span class="op">,</span> <span class="op">&amp;</span>s<span class="op">.</span>y<span class="op">,</span> <span class="op">&amp;</span>s<span class="op">.</span>y <span class="op">+</span> <span class="dv">1</span><span class="op">));</span></span>
<span id="cb7-5"><a href="#cb7-5" aria-hidden="true" tabindex="-1"></a>    <span class="cf">return</span> s<span class="op">.</span>x <span class="op">*</span> s<span class="op">.</span>x<span class="op">;</span></span>
<span id="cb7-6"><a href="#cb7-6" aria-hidden="true" tabindex="-1"></a><span class="op">}</span></span></code></pre></div>
<p>The C++ standard library could provide more convenient (presumably
templated) facilities built on top of
<code class="sourceCode cpp">narrow_reachable_range_to</code>.</p>
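<p>As a rough sketch of what such a templated facility might look like,
consider the wrapper below. Everything in it is hypothetical:
<code class="sourceCode cpp">narrow_to_object</code> is an invented name,
and <code class="sourceCode cpp">narrow_reachable_range_to</code> is
given a stand-in definition that simply returns its first argument, which
is how an implementation that does not track reachable ranges could
plausibly provide it.</p>
<div class="sourceCode"><pre class="sourceCode cpp"><code class="sourceCode cpp">// HYPOTHETICAL stand-in for the function described above. It performs no
// narrowing; it only preserves the pointer value.
inline void* narrow_reachable_range_to(void* p1,
                                       const void* /*p2*/,
                                       const void* /*p3*/) {
    return p1;
}

// HYPOTHETICAL convenience wrapper: narrows the reachable range of `p`
// to the storage of the single object `*p`. T is assumed to be a
// non-const object type.
template &lt;class T&gt;
T* narrow_to_object(T* p) {
    if (p == nullptr) {
        return nullptr;  // avoid forming `p + 1` from a null pointer
    }
    return static_cast&lt;T*&gt;(narrow_reachable_range_to(p, p, p + 1));
}</code></pre></div>
<p>With this wrapper, the call in the example above could be written as
<code class="sourceCode cpp">f4(narrow_to_object(&amp;s.y))</code>.</p>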
<p>This paper does not propose to add reachable ranges to the C++
standard, nor a library function similar to
<code class="sourceCode cpp">narrow_reachable_range_to</code>. This
Appendix merely aims to describe one possibility as to how the
optimizations that the paper seeks to invalidate could be recovered by a
future opt-in mechanism.</p>
<h1 data-number="9" id="bibliography"><span class="header-section-number">9</span> References<a href="#bibliography" class="self-link"></a></h1>
<div id="refs" class="references csl-bib-body hanging-indent" data-entry-spacing="1" role="doc-bibliography">
<div id="ref-CHERI" class="csl-entry" role="doc-biblioentry">
[CHERI] Robert N. M. Watson et al. 2020-06. CHERI C/C++ Programming
Guide. <a href="https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-947.pdf"><div class="csl-block">https://www.cl.cam.ac.uk/techreports/UCAM-CL-TR-947.pdf</div></a>
</div>
<div id="ref-CWG2901" class="csl-entry" role="doc-biblioentry">
[CWG2901] Jan Schultke. 2024-06-14. Unclear semantics for near-match
aliased access. <a href="https://wg21.link/cwg2901"><div class="csl-block">https://wg21.link/cwg2901</div></a>
</div>
<div id="ref-N3057" class="csl-entry" role="doc-biblioentry">
[N3057] Paul McKenney et al. 2010-03-11. Explicit Initializers for
Atomics. <a href="https://wg21.link/n3057"><div class="csl-block">https://wg21.link/n3057</div></a>
</div>
<div id="ref-N4988" class="csl-entry" role="doc-biblioentry">
[N4988] Thomas Köppe. 2024-08-05. Working Draft, Programming Languages —
C++. <a href="https://wg21.link/n4988"><div class="csl-block">https://wg21.link/n4988</div></a>
</div>
<div id="ref-P0137R1" class="csl-entry" role="doc-biblioentry">
[P0137R1] Richard Smith. 2016-06-23. Core Issue 1776: Replacement of
class objects containing reference members. <a href="https://wg21.link/p0137r1"><div class="csl-block">https://wg21.link/p0137r1</div></a>
</div>
<div id="ref-P1839R6" class="csl-entry" role="doc-biblioentry">
[P1839R6] Timur Doumler, Krystian Stasiowski, Brian Bi. 2024-10.
Accessing object representations. <a href="https://open-std.org/jtc1/sc22/wg21/docs/papers/2024/p1839r6.html"><div class="csl-block">https://open-std.org/jtc1/sc22/wg21/docs/papers/2024/p1839r6.html</div></a>
</div>
<div id="ref-P2795R5" class="csl-entry" role="doc-biblioentry">
[P2795R5] Thomas Köppe. 2024-03-22. Erroneous behaviour for
uninitialized reads. <a href="https://wg21.link/p2795r5"><div class="csl-block">https://wg21.link/p2795r5</div></a>
</div>
<div id="ref-P2883R0" class="csl-entry" role="doc-biblioentry">
[P2883R0] Alisdair Meredith. 2023-05-19. `offsetof` Should Be A Keyword
In C++26. <a href="https://wg21.link/p2883r0"><div class="csl-block">https://wg21.link/p2883r0</div></a>
</div>
</div>
<section id="footnotes" class="footnotes footnotes-end-of-document" role="doc-endnotes">
<hr />
<ol>
<li id="fn1"><p>In cases where the macro’s name is precisely
<code class="sourceCode cpp">container_of</code>, it appears that it
usually refers to the version defined by the Linux kernel. This version
uses <code class="sourceCode cpp"><span class="dt">void</span><span class="op">*</span></code>,
not <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>;
pointer arithmetic using <code class="sourceCode cpp"><span class="dt">void</span><span class="op">*</span></code>
is not proposed by this paper. However, <code class="sourceCode cpp"><span class="dt">char</span><span class="op">*</span></code>
is used in many other cases.<a href="#fnref1" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn2"><p>All citations to the Standard are to working draft N4988
unless otherwise specified.<a href="#fnref2" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn3"><p>In C,
<code class="sourceCode cpp"><span class="dt">void</span></code> is an
object type.<a href="#fnref3" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn4"><p>Note that the Provenance TS does not state that two
different complete objects always have different storage IDs. According
to section 3.20, a single allocation creates a single storage instance.
For example, when <code class="sourceCode cpp">malloc</code> succeeds,
it returns a pointer to “the allocated storage instance” (per section
7.22.3.4).<a href="#fnref4" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
<li id="fn5"><p>An example of dangerous UB is reading from uninitialized
variables. I’ve observed recent versions of Clang eliding branches along
which uninitialized variables are read, causing unit tests to fail when
Clang was upgraded. Such behavior will become (mostly) disallowed in
C++26 due to the adoption of <span class="citation" data-cites="P2795R5">[<a href="https://wg21.link/p2795r5" role="doc-biblioref">P2795R5</a>]</span>.<a href="#fnref5" class="footnote-back" role="doc-backlink">↩︎</a></p></li>
</ol>
</section>
</div>
</div>
</body>
</html>
