<html><head><meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<!doctype html>
<title>Language support for empty objects</title>

<style type="text/css">
  ins, ins * { text-decoration:none; font-weight:bold; background-color:#A0FFA0 }
  del, del * { text-decoration:line-through; background-color:#FFA0A0 }
  #hidedel:checked ~ * del, #hidedel:checked ~ * del * { display:none; visibility:hidden }

blockquote {
  padding: .5em;
  border: .5em;
  border-color: silver;
  border-left-style: solid;
}

blockquote.std { color: #000000; background-color: #F1F1F1;
  border: 1px solid #D1D1D1;
  padding-left: 0.5em; padding-right: 0.5em; }
blockquote.stdins { text-decoration: underline;
  color: #000000; background-color: #C8FFC8;
  border: 1px solid #B3EBB3; padding: 0.5em; }
blockquote.stddel { text-decoration: line-through;
  color: #000000; background-color: #FFEBFF;
  border: 1px solid #ECD7EC;
  padding-left: 0.5em; padding-right: 0.5em; }

p.grammarlhs { margin-bottom: 0 }
p.grammarrhs { margin-left:8em; margin-top:0; margin-bottom:0; text-indent:-4em }
</style>

<div style="text-align: right; float: right">
ISO/IEC JTC1 SC22 WG21<br>
P0840R2<br>
Richard Smith<br>
richard@metafoo.co.uk<br>
2018-03-12<br>
</div>

<!--
  TODO:
  * concern about standard-layout-ness of types with zero-size members
-->

<h1>Language support for empty objects</h1>

<p>
For motivation, see <a href="http://wg21.link/p0840r0">P0840R0</a>.

<h2>Proposal</h2>

<p>
We propose adding an attribute, <tt>[[no_unique_address]]</tt>, to
indicate that a unique address is not required for a non-static data member
of a class. A non-static data member with this attribute may share its address
(or its tail padding) with another object, if it could do so when used as a base class.
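<p>
As a minimal sketch of the intended use, assuming a compiler that recognizes the attribute, consider a class that stores a stateless policy object next to a pointer. The names <tt>Less</tt>, <tt>PlainHolder</tt>, and <tt>CompactHolder</tt> are hypothetical and chosen only for illustration. Whether the members actually overlap is up to the implementation, so the checks below state only what should hold either way.
</p>

```cpp
// Hypothetical stateless policy type: it has no data members.
struct Less {
  bool operator()(int a, int b) const { return a < b; }
};

// Without the attribute, `cmp` needs an address of its own, so it
// occupies at least one byte (usually followed by padding).
struct PlainHolder {
  Less cmp;
  int *data;
};

// With the attribute, `cmp` is permitted to share its address with `data`.
struct CompactHolder {
  [[no_unique_address]] Less cmp;
  int *data;
};

// These hold whether or not the implementation overlaps the members.
static_assert(sizeof(CompactHolder) <= sizeof(PlainHolder),
              "overlap is expected to shrink the layout, if it changes it at all");
static_assert(sizeof(CompactHolder) >= sizeof(int *),
              "the pointer still needs its own storage");
```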

<h3>FAQ</h3>

<details>
<summary>
Q: Does this allow multiple subobjects <i>of the same type</i> to have the same address?
</summary>

No. Just like with base classes, if two members are of the same type or have subobjects of the same type, the common subobjects will be given distinct addresses.
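<p>
A short sketch of this case, using hypothetical types <tt>Empty</tt> and <tt>TwoOfSameType</tt>: both members are empty and carry the attribute, yet they remain distinct objects of the same type and so cannot share an address.
</p>

```cpp
struct Empty {};

// Both members are empty and carry the attribute, but they have the same
// type, so they must still live at distinct addresses.
struct TwoOfSameType {
  [[no_unique_address]] Empty first;
  [[no_unique_address]] Empty second;
};

// Returns true when the two members have different addresses.
inline bool members_are_distinct(const TwoOfSameType &t) {
  return &t.first != &t.second;
}

// Two distinct addresses inside one object imply at least two bytes.
static_assert(sizeof(TwoOfSameType) >= 2,
              "same-type members cannot be folded onto one address");
```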
</details>

<details>
<summary>
Q: Can a standard library switch from EBO to this attribute without an ABI break?
</summary>

The intent is that an ABI can specify the same layout rule for a member with the attribute as it does for a base class. In an ABI that makes that choice, yes.
</details>

<details>
<summary>
Q: Does this allow reuse of tail padding? (E.g., the three bytes at the end of <tt>struct A { int n; char c; };</tt>)
</summary>

The general rule is that the layout is just like for a base class. Tail padding reuse is permitted for base classes, so it's also permitted for members with the attribute.
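<p>
The sketch below illustrates the question with the type from above. <tt>PlainPair</tt> and <tt>OverlapPair</tt> are hypothetical names. Permission is not an obligation: an ABI may decline to reuse the tail padding of a particular type, for a member just as for a base class, so the checks state only bounds that should hold in either case.
</p>

```cpp
struct A {
  int n;
  char c;
};  // typically followed by tail padding up to the alignment of int

// `d` must be placed after the whole of `a`, padding included.
struct PlainPair {
  A a;
  char d;
};

// `d` is permitted (not required) to live in the tail padding of `a`.
struct OverlapPair {
  [[no_unique_address]] A a;
  char d;
};

static_assert(sizeof(OverlapPair) <= sizeof(PlainPair),
              "padding reuse is expected to shrink the layout, if anything");
static_assert(sizeof(OverlapPair) >= sizeof(int) + 2 * sizeof(char),
              "n, c and d still need disjoint storage");
```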
</details>

<details>
<summary>
Q: When is a type using this attribute considered "empty"? (As visible via <tt>std::is_empty</tt>)
</summary>

This is up to the ABI, so it will be implementation-defined whether a type is considered empty if it has no data members (transitively) other than those marked by the attribute.
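<p>
For illustration, with hypothetical types <tt>Empty</tt> and <tt>OnlyOverlappingMembers</tt>: the first trait result is fixed, while the second is the implementation-defined case described above, so the sketch records it rather than asserting a value.
</p>

```cpp
#include <type_traits>

struct Empty {};

// The only data member is empty and marked with the attribute.
struct OnlyOverlappingMembers {
  [[no_unique_address]] Empty e;
};

// A class with no data members at all is always empty.
static_assert(std::is_empty<Empty>::value, "Empty has no data members");

// Implementation-defined: may be true or false depending on the ABI,
// so this sketch stores the answer instead of asserting it.
constexpr bool wrapper_reported_empty =
    std::is_empty<OnlyOverlappingMembers>::value;
```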
</details>

<details>
<summary>
Q: Does the attribute affect whether a type is standard-layout?
</summary>

No.
</details>

<details>
<summary>
Q: Does the attribute affect the "common initial sequence" rule?
</summary>

Yes. For two <tt>struct</tt>s to be considered to have a common initial sequence, their initial sequences of common members must make consistent use of the attribute.
</details>

<details>
<summary>
Q: Suppose I have members <tt>a</tt>, <tt>b</tt>, <tt>c</tt> (in that order, with the same access).
Today we guarantee that <tt>&amp;a</tt> &lt; <tt>&amp;b</tt> &lt; <tt>&amp;c</tt>.
What happens if <tt>b</tt> has the attribute?
</summary>

Two cases:

<ol>
<li>If the type of <tt>b</tt> is empty, then there is no guarantee about the address of <tt>b</tt> (other than that it is somewhere within the containing object).
<li>If the type of <tt>b</tt> is nonempty, then we still guarantee that <tt>&amp;a</tt> &lt; <tt>&amp;b</tt> &lt; <tt>&amp;c</tt>.
</ol>
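<p>
The two cases can be sketched as follows. The types <tt>Counter</tt>, <tt>NonEmptyMiddle</tt>, and <tt>EmptyMiddle</tt> and the helper function are hypothetical. For the nonempty case the declaration order is still reflected in the offsets. For the empty case the only thing checked is that the member lies somewhere within the containing object.
</p>

```cpp
#include <cstddef>

struct Empty {};
struct Counter { int value; };  // nonempty class type

// Case 2: `b` has a nonempty type, so the usual ordering is kept.
struct NonEmptyMiddle {
  int a;
  [[no_unique_address]] Counter b;
  int c;
};

// Case 1: `b` has an empty type, so only `a` and `c` are ordered.
struct EmptyMiddle {
  int a;
  [[no_unique_address]] Empty b;
  int c;
};

static_assert(offsetof(NonEmptyMiddle, a) < offsetof(NonEmptyMiddle, b),
              "a precedes b");
static_assert(offsetof(NonEmptyMiddle, b) < offsetof(NonEmptyMiddle, c),
              "b precedes c");
static_assert(offsetof(EmptyMiddle, a) < offsetof(EmptyMiddle, c),
              "a precedes c");

// Returns true when the empty member's address lies within the object.
inline bool empty_member_within_object(const EmptyMiddle &m) {
  const char *begin = reinterpret_cast<const char *>(&m);
  const char *pos = reinterpret_cast<const char *>(&m.b);
  return begin <= pos && pos < begin + sizeof(EmptyMiddle);
}
```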
</details>

<details>
<summary>
Q: Applying a <tt>[[no_unique_address]]</tt> attribute to a function, constant variable, etc., could be used to permit ICF ("Identical Code Folding" / "Identical Constant Folding" / "Identical COMDAT Folding"). Is this proposed?
</summary>

Not as part of this proposal.
</details>

<h2>Wording</h2>

<p>
Add a new paragraph before [intro.object] (6.6.2) paragraph 7:
</p>

<blockquote class="stdins">
A <i>potentially-overlapping subobject</i>
is either:
<ul>
<li>
  a base class subobject, or
<li>
  a non-static data member declared with the <tt>no_unique_address</tt> attribute ([dcl.attr.nouniqueaddr]).
</ul>
</blockquote>

<p>
Change in [intro.object] (6.6.2) paragraph 7:
</p>

<blockquote class="std">
<ins>
An object has nonzero size if it
<ul>
<li>is not a potentially-overlapping subobject, or
<li>is not of class type, or
<li>is of a class type with virtual member functions or virtual base classes, or
<li>has subobjects of nonzero size or bit-fields of nonzero length.
</ul>
Otherwise, if the object is a base class subobject of
a standard-layout class type
with no non-static data members, it has zero size.
Otherwise, the circumstances under which the object
has zero size are implementation-defined.
</ins>
Unless it is a bit-field (12.2.4), <del>a most derived</del>
<ins>an</ins>
object <del>shall have a</del> <ins>with</ins> nonzero
size <del>and</del> shall occupy one or more bytes of storage<ins>,
including every byte that is occupied in full or in part by
any of its subobjects</ins>.
<del>Base class subobjects
may have zero size.</del>
An object of trivially copyable or standard-layout type (6.9)
shall occupy contiguous bytes of storage.
<!--
<ins>A bit-field of length <i>N</i> shall occupy <i>N</i> bits of storage.</ins>
-->
</blockquote>

<p>
Change in [intro.object] (6.6.2) paragraph 8:
</p>

<blockquote class="std">
Unless an object is a
bit-field or a <del>base class</del> subobject of zero size,
the address of that object is the address of the first byte it occupies.
Two objects <del><tt>a</tt> and <tt>b</tt></del> with overlapping lifetimes
that are not bit-fields
may have the same address if one is nested within the other, or if
at least one is a <del>base class</del> subobject
of zero size and they are of different types; otherwise, they have distinct addresses
<ins>and shall occupy disjoint bytes of storage</ins>.
[Footnote]
[Example]
<!--
<ins>
The bits occupied by a bit-field
shall be disjoint from the bits or bytes occupied by any other object
with overlapping lifetime that the bit-field is not nested within.
</ins>
-->
<ins>
The address of a non-bit-field subobject of zero size is that of a byte of storage
occupied by the complete object of the subobject.
</ins>
</blockquote>

<p>
Change in [basic.life] (6.6.3) bullet 8.4:
</p>

<blockquote class="std">
<ins>neither</ins> the original object
<ins>nor the new object
is a potentially-overlapping subobject (6.6.2 [intro.object])</ins>
<del>was a most derived object (6.6.2) of type T and
the new object is a most derived object of type T
(that is, they are not base class subobjects)</del>.
</blockquote>

<p>
Change in [basic.types] (6.7) paragraph 2:
</p>

<blockquote class="std">
For any object (other than a <del>base-class</del>
<ins>potentially-overlapping</ins> subobject) of
trivially copyable type [&hellip;]
</blockquote>

<p>
Change in [basic.types] (6.7) paragraph 3:
</p>

<blockquote class="std">
For any trivially copyable type T,
if two pointers to T point to distinct T objects obj1 and obj2,
where neither obj1 nor obj2 is a <del>base-class</del>
<ins>potentially-overlapping</ins> subobject,
if the underlying bytes (6.6.1) making up obj1 are copied into obj2,
obj2 shall subsequently hold the same value as obj1.
[&hellip;]
</blockquote>

<p>
Change in [expr.sizeof] (8.5.2.3) paragraph 1:
</p>

<blockquote class="std">
The <tt>sizeof</tt> operator yields the number of bytes
<ins>occupied by a non-potentially-overlapping object
of the type</ins>
<del>in the object representation</del> of its operand.
[&hellip;]
</blockquote>

<p>
Change in [expr.sizeof] (8.5.2.3) paragraph 2:
</p>

<blockquote class="std">
When applied to a reference or a reference type,
the result is the size of the referenced type.
When applied to a class,
the result is the number of bytes in an object of that class
including any padding required for placing objects of that type in an array.
<del>The size of a most derived class shall be greater than zero (6.6.2).</del>
The result of applying <tt>sizeof</tt>
to a <del>base class</del> <ins>potentially-overlapping</ins> subobject
is the size of the <del>base class</del> type<ins>, not the size of the subobject</ins>.
[ Footnote:
The actual size of a <del>base class</del> <ins>potentially-overlapping</ins> subobject
may be less than the result of applying <tt>sizeof</tt> to the subobject,
due to virtual base classes and less strict padding requirements on <del>base class</del>
<ins>potentially-overlapping</ins> subobjects. ]
When applied to an array, the result is the total
number of bytes in the array. This implies that the size of an array of n
elements is n times the size of an element.
</blockquote>
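<p>
As a non-normative illustration of the <tt>sizeof</tt> change, with hypothetical types <tt>Empty</tt> and <tt>Holder</tt>: applying <tt>sizeof</tt> to a potentially-overlapping member still yields the size of its type, even when the subobject itself may occupy no storage of its own.
</p>

```cpp
struct Empty {};

struct Holder {
  [[no_unique_address]] Empty e;  // may be a subobject of zero size
  int value;
};

// sizeof of the type is never zero.
static_assert(sizeof(Empty) >= 1, "a complete Empty object has nonzero size");

// sizeof applied to the member yields the size of the type,
// not the (possibly zero) size of the subobject.
static_assert(sizeof(Holder::e) == sizeof(Empty),
              "sizeof on the member reports the size of its type");

// The containing class still needs room for `value`.
static_assert(sizeof(Holder) >= sizeof(int), "value needs storage");
```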

<p>
Change in [expr.rel] (8.5.9) paragraph 3:
</p>

<blockquote class="std">
Comparing unequal pointers to objects is defined as follows:
<ul>
<li>[&hellip;]
<li>If two pointers point to different non-static data members of the same object,
or to subobjects of such members, recursively, the pointer to the later declared
member compares greater provided the two members have the same access control (Clause 14)<ins>,
neither member is a subobject of zero size,</ins>
and <del>provided</del> their class is not a union.
<li>Otherwise, neither pointer compares greater than the other.
</ul>
</blockquote>

<p>
Add a new subclause [dcl.attr.nouniqueaddr] after [dcl.attr.noreturn]:
</p>

<blockquote class="stdins">
<h3>10.6.9 No unique address attribute [dcl.attr.nouniqueaddr]</h3>

<p>
The <i>attribute-token</i> <tt>no_unique_address</tt>
specifies that a non-static data member
is a potentially-overlapping subobject (6.6.2 [intro.object]).
<!--
[ Note: As a consequence, such a member can have the same
address as another non-static data member of its class if
it is of zero size. ]
-->
It shall appear at most once in each <i>attribute-list</i>
and no <i>attribute-argument-clause</i> shall be present.
The attribute may appertain to a non-static data member
other than a bit-field.
</p>

<p>
[ <i>Note</i>: The non-static data member can share the address
of another non-static data member or that of a base class,
and any padding that would normally be inserted at the
end of the object can be reused as storage for other
members. &mdash; <i>end note</i> ] [ <i>Example</i>:
<pre>
template&lt;typename Key, typename Value,
         typename Hash, typename Pred, typename Allocator&gt;
class hash_map {
  [[no_unique_address]] Hash hasher;
  [[no_unique_address]] Pred pred;
  [[no_unique_address]] Allocator alloc;
  Bucket *buckets;
  // ...
public:
  // ...
};
</pre>

Here, <tt>hasher</tt>, <tt>pred</tt>, and <tt>alloc</tt> could have the same
address as <tt>buckets</tt> if their respective types are all empty.
&mdash; <i>end example</i> ]
</blockquote>

<p>
Change in [class] (12) paragraph 4 and split into two paragraphs:
</p>

<blockquote class="std">
<ins>[ Note:</ins>
Complete objects <del>and member subobjects</del> of class type <del>shall</del> have nonzero size.
<del>[ Footnote:</del>
Base class subobjects
<ins>and members declared with the <tt>no_unique_address</tt> attribute ([dcl.attr.nouniqueaddr])</ins>
are not so constrained.
]
<p>
[ Note: ... ]
</blockquote>

<p>
Change in [class] (12) paragraph 7:
</p>

<blockquote class="std">
[&hellip;]
<ul>
  <li>
If X is a non-union class type <del>whose first</del> <ins>with a</ins>
non-static data member <del>has</del> <ins>of</ins> type X0
<ins>that is either of zero size or is the first non-static data member of <tt>X</tt></ins>
(where said member may be an anonymous union),
the set M(X) consists of X0 and the elements of M(X0).
</ul>
[&hellip;]
[Note: M(X) is the set of the types of all non-base-class subobjects
that <del>are guaranteed</del> <ins>may be at a zero offset</ins>
in <del>a standard-layout class to be at a zero offset in</del> X. &mdash;end note]
[&hellip;]
</blockquote>

<p>
Change in [class.mem] (12.2) paragraph 21:
</p>

<blockquote class="std">
The <i>common initial sequence</i> of two standard-layout struct (Clause 12) types
is the longest sequence of non-static data members and bit-fields in declaration order,
starting with the first such entity in each of the structs,
such that corresponding entities have layout-compatible types<ins>,
either both entities are declared with the <tt>no_unique_address</tt> attribute ([dcl.attr.nouniqueaddr]) or neither is,</ins>
and either <del>neither entity is a bit-field or</del> both <ins>entities</ins> are bit-fields with the same width <ins>or neither is a bit-field</ins>.
[ Example ]
</blockquote>

<p>
Change in [meta.unary.prop] (23.15.4.3) Table 42 ("Type property predicates"):
</p>

<blockquote class="std">
<table>
  <tr><th>Template<th>Condition<th>Predicate</tr>
  <tr><td colspan=3 align=center>&hellip;</tr>
    <tr><td><tt>template &lt;class T&gt; struct is_empty;</tt>
    <td>
      <tt>T</tt> is a class type, but not a union type,
      with no non-static data members
      other than <del>bit-fields of length 0</del>
      <ins>subobjects of zero size,</ins>
      no virtual member functions,
      no virtual base classes, and
      no base class <tt>B</tt> for which <tt>is_empty_v&lt;B&gt;</tt> is <tt>false</tt>.
    <td>
      If <tt>T</tt> is a non-union class type,
      <tt>T</tt> shall be a complete type.
  </tr>
  <tr><td colspan=3 align=center>&hellip;</tr>
</table>
</blockquote>
</body></html>
