<html>
<head>
<title>Issue 273: POD classes and operator&amp;()</title>
</head>

<body>

Jens Maurer<br>
2001-09-13

<h1>Issue 273: POD classes and operator&amp;()</h1>

<h2>The Issue</h2>

The following is a quote from issue 273, as submitted by Andrei
Iltchenko:

<blockquote>
I think that the definition of a POD class in the current version of
the Standard is overly permissive in that it allows for POD classes
for which a user-defined operator function operator&amp; may be
defined. Given that the idea behind POD classes was to achieve
compatibility with C structs and unions, this makes 'Plain old'
structs and unions behave not quite as one would expect them to.
<p>
[...]
<p>
The fact that the definition of a POD class allows for POD classes for
which a user-defined operator&amp; is defined, may also present major
obstacles to implementers of the offsetof macro from &lt;cstddef&gt;
</blockquote>


<h2>POD</h2>

These sections in the standard define the semantics of "POD":

<ul>
<li>1.8p5: An object of POD type shall occupy contiguous bytes of storage.
<li>3.6.2p1: Objects of POD type with static storage duration
initialized with constant expressions shall be initialized before any
dynamic initialization takes place.
<li>3.8p5: Pointers to POD objects may be used outside of the lifetime of the
object.
<li>3.8p6: POD lvalues may be used outside of the lifetime of the
object.
<li>3.9p2: Complete POD objects can be copied to "char" or "unsigned
char" arrays, preserving value.
<li>3.9p3: POD objects can be copied with std::memcpy.
<li>3.9p4: The value representation defines a value.
<li>5.2.2p7: POD classes can be passed via the ellipsis.
<li>5.3.4p15: POD class objects created with "new" have an indeterminate
value.
<li>5.19p4: Expressions designating the address of a POD class object
are address constant expressions.  Similar for reference constant
expressions.
<li>6.7p3: Jumps across local POD object declarations without an
initializer are permitted.
<li>6.7p4: A local static POD object initialized with a constant
expression is initialized before its block is first entered.
<li>8.5p9: Differences in initialization.
<li>8.5.1p14: Static initialization of aggregates that are PODs.
<li>9p4: A POD-struct is an aggregate that has no non-static data
members of type pointer to member, non-POD-struct, non-POD-union (and
arrays thereof), or reference, and no user-defined copy assignment nor
user-defined destructor.
<li>9.2p16: May inspect common initial sequences of members of
POD-unions.
<li>9.2p26: A pointer to a POD-struct object points to its initial
member.
<li>12.6.2p4: Differences in initialization.
<li>17.1.3: A character container class must be a POD type.
<li>18.1p5: Argument of "offsetof" must be POD struct or POD union.
</ul>
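<p>
Two of the properties listed above (copying with std::memcpy, and a
pointer to a POD-struct pointing to its initial member) can be
illustrated with a minimal sketch. The struct Point and both helper
functions are hypothetical names chosen for this illustration; they do
not appear in the standard or in the issue.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical POD-struct used only for illustration.
struct Point
{
  int x;
  int y;
};

// Copies a POD object with std::memcpy; the copy holds the same value.
Point copy_by_memcpy(const Point& src)
{
  Point dst;
  std::memcpy(&dst, &src, sizeof(Point));
  // Self-check: the object representations are now identical.
  assert(std::memcmp(&dst, &src, sizeof(Point)) == 0);
  return dst;
}

// A pointer to a POD-struct object, suitably converted, points to its
// initial member.
bool points_to_initial_member(Point& p)
{
  return reinterpret_cast<int*>(&p) == &p.x;
}
```

Note that Point does not overload operator&amp;, so the plain
address-of expressions in this sketch have their built-in meaning.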


<h2>Aggregate</h2>

These sections in the standard define the semantics of "aggregate":

<ul>
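<li>3.10p15: The stored value of an object may be accessed through an
lvalue of an aggregate or union type that includes a suitable type
among its members (aliasing).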
<li>8.5.1p1: Aggregate is an array or class with no user-declared
constructor, no private/protected non-static data members, no base
classes, no virtual functions.
<li>8.5.1p1: Aggregates can be initialized by an initializer-list in
braces.
</ul>


<h2>Analysis</h2>

Nothing in the current wording of the standard makes a struct or union
with a user-defined operator&amp; a non-POD type.  However, that is
certainly surprising.  At least one example in the standard (3.9p2)
assumes that operator&amp; has its default meaning.  The normative
text cautiously only talks about "underlying storage" (3.9p2) and
"pointer to T" (3.9p3).  It is not directly specified in the normative
text how that pointer may be obtained when operator&amp; is overloaded.
<p>
Older source code inherited from C which relies on the semantics laid
out in 3.9p2 will not have operator&amp; overloaded, since C does not
permit operator overloading.
<p>
The following sections demonstrate how the two problematic areas of
PODs with overloaded operator&amp; can be resolved: Getting the
address of an object and "offsetof".

<h3>Getting the address of an object with overloaded
operator&amp;</h3>

To retrieve the address of the storage for an object with an
overloaded operator&amp;, the following technique can be applied:
<pre>
#include &lt;cstring&gt;

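struct A
{
  int x;
  void operator&amp;() const;   // overloaded: &amp;a no longer yields an A*
};

int main()
{
  A a;
  unsigned char space[sizeof(A)];
  // Convert the lvalue to unsigned char&amp; first, so that the built-in
  // operator&amp; is applied instead of A::operator&amp;.
  std::memcpy(space, &amp;reinterpret_cast&lt;unsigned char&amp;&gt;(static_cast&lt;A&amp;&gt;(a)), sizeof(A));
}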
</pre>

Note that getting the address of the storage of some object a
requires a to be an lvalue; it can therefore legally be converted
into a reference.
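<p>
The same cast sequence can be wrapped in a small helper so that it need
not be spelled out at every use. The following is only a sketch: the
names address_of and round_trip are hypothetical, and the extra
const volatile step is an assumption added here so that the helper
also accepts cv-qualified objects. (Later revisions of the standard
library provide std::addressof for this purpose.)

```cpp
#include <cassert>
#include <cstring>

// Hypothetical helper: returns the real address of obj even if its
// class overloads operator&. The lvalue is first viewed as a byte, so
// the built-in address-of operator is the one that gets applied.
template <class T>
T* address_of(T& obj)
{
  return reinterpret_cast<T*>(
      &const_cast<unsigned char&>(
          reinterpret_cast<const volatile unsigned char&>(obj)));
}

struct A
{
  int x;
  void operator&() const {}   // overloaded; yields no usable address
};

// Copies an A into a byte buffer and back, as in 3.9p2, without ever
// calling A::operator&. Returns the value of x after the round trip.
int round_trip(int value)
{
  A a;
  a.x = value;
  unsigned char space[sizeof(A)];
  std::memcpy(space, address_of(a), sizeof(A));
  a.x = ~value;                                  // modify the object
  std::memcpy(address_of(a), space, sizeof(A));  // restore it
  assert(a.x == value);                          // self-check
  return a.x;
}
```

For a POD such as A, the pointer returned by address_of(a) compares
equal, after conversion to void*, to the address of its initial
member a.x.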

<h3>offsetof</h3>

"offsetof" is usually implemented (non-conformingly) as a macro along
the lines of
<pre>
#define offsetof(TYPE, MEMBER) ((size_t) &amp;((TYPE *)0)-&gt;MEMBER)
</pre>
(Implementation taken from GCC 3.0.1.)
<p>
This implementation fails if the type of the member has operator&amp;
overloaded:

<pre>
#include &lt;cstddef&gt;

struct A
{
  int x;
  void operator&amp;() const;
};

struct B
{
  A a;
};

int main()
{
  int o = offsetof(B, a);
}
</pre>

If "offsetof" is instead defined using the reinterpret_cast technique
from the previous section, the code compiles with some compilers:

<pre>
#define offsetof(TYPE, MEMBER) \
  ((size_t) &amp;reinterpret_cast&lt;unsigned char&amp;&gt;(((TYPE*)0)-&gt;MEMBER))
</pre>

Note that "offsetof" could also be implemented by other non-standard
means proprietary to the implementation; this shows only one example.
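<p>
For comparison, a member offset can also be computed from an existing
object, which avoids the null pointer used by the macros above. This is
a sketch of an alternative, not the technique proposed in this paper
and not a replacement for "offsetof" (it needs an object and does not
yield a constant expression). The names address_of and member_offset,
and the extra member pad in B, are hypothetical and were introduced
only for this illustration.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: real address of obj, bypassing any overloaded
// operator& (same cast technique as in the previous section).
template <class T>
T* address_of(T& obj)
{
  return reinterpret_cast<T*>(
      &const_cast<unsigned char&>(
          reinterpret_cast<const volatile unsigned char&>(obj)));
}

struct A
{
  int x;
  void operator&() const {}   // overloaded operator&
};

struct B
{
  int pad;   // hypothetical extra member so that 'a' has a non-zero offset
  A a;
};

// Distance in bytes from the start of obj to the given member.
// '&B::a' at the call site forms a pointer to member; it does not
// invoke A::operator&.
template <class T, class M>
std::size_t member_offset(const T& obj, M T::*member)
{
  const unsigned char* base =
      reinterpret_cast<const unsigned char*>(address_of(obj));
  const unsigned char* field =
      reinterpret_cast<const unsigned char*>(address_of(obj.*member));
  assert(field >= base);   // a member never precedes its object
  return static_cast<std::size_t>(field - base);
}
```

A call such as member_offset(b, &amp;B::a) for an object b of type B
then yields the offset of a within B.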


<h3>Conclusion</h3>

Option 1: Forbidding overloading of operator&amp; in PODs puts the
burden on the users, because they potentially have to change existing
code exploiting this "feature".
<p>
Option 2: Continuing to allow operator&amp; in PODs puts the burden on
the implementors; they have to adjust their "offsetof" macros for this
newly discovered problem.
<p>
I favor option 2, but I'd like to have some guidance from the core
working group on this.


<h2>Proposed Resolution</h2>

<h3>Option 1</h3>

Replace in <a href="class.html#class">9</a> paragraph 4
<blockquote>
A POD-struct is an aggregate class that has no non-static data members
of type pointer to member, non-POD-struct, non-POD-union (or array of
such types) or reference, and has no user-defined copy assignment
operator and no user-defined destructor. Similarly, a POD-union is an
aggregate union that has no non-static data members of type pointer to
member, non-POD-struct, non-POD-union (or array of such types) or
reference, and has no user-defined copy assignment operator and no
user-defined destructor. A POD class is a class that is either a
POD-struct or a POD-union.
</blockquote>
by
<blockquote>
A POD-struct is an aggregate class that has no non-static data members
of type pointer to member, non-POD-struct, non-POD-union (or array of
such types) or reference, and has no user-defined copy assignment
operator, no user-defined destructor<strong>, and no overloaded
operator &amp;</strong>. Similarly, a POD-union is an aggregate union
that has no non-static data members of type pointer to member,
non-POD-struct, non-POD-union (or array of such types) or reference,
and has no user-defined copy assignment operator, no user-defined
destructor<strong>, and no overloaded operator &amp;</strong>. A POD
class is a class that is either a POD-struct or a POD-union.
</blockquote>

<h3>Option 2</h3>
<p>

Replace the example in <a href="basic.html#basic.types">3.9</a>
paragraph 2 by
<blockquote>
<pre>
    #define N sizeof(T)
    char buf[N];
    T obj;                          //   obj  initialized to its original value
    // Note: PODs can have operator&amp; overloaded.
    memcpy(buf, &amp;reinterpret_cast&lt;unsigned char&amp;&gt;(static_cast&lt;T&amp;&gt;(obj)), N);

    //  between these two calls to  memcpy, obj  might be modified
    memcpy(&amp;reinterpret_cast&lt;unsigned char&amp;&gt;(static_cast&lt;T&amp;&gt;(obj)), buf, N);
    //  at this point, each subobject of  obj  of scalar type
                                    //  holds its original value
</pre>
</blockquote>

Add a footnote in
<a href="lib-support.html#lib.support.types">18.1</a> paragraph 5:
<blockquote>
[<em>Footnote:</em> Note that PODs can have operator&amp; overloaded.]
</blockquote>

</body>
</html>