<h1 id="automatically-generate-operator-">Automatically Generate <code>operator-&gt;</code></h1>
<pre>Document Number: P3039R0
Date: 2023-11-07
Author: David Stone (davidfromonline@gmail.com)
Audience: Evolution Working Group (EWG), Library Evolution Working Group (LEWG)</pre>

<h2 id="summary">Summary</h2>
<p>This proposal follows the lead of <code>operator&lt;=&gt;</code> (see <a href="http://open-std.org/JTC1/SC22/WG21/docs/papers/2017/p0515r3.pdf">Consistent Comparison</a>) by generating rewrite rules if a particular operator does not exist in current code. This is a follow-up to P1046R2, but it includes only <code>operator-&gt;</code> and <code>operator-&gt;*</code>.</p>
<ul>
<li>Rewrite <code>lhs-&gt;rhs</code> as <code>(*lhs).rhs</code></li>
<li>Rewrite <code>lhs-&gt;*rhs</code> as <code>(*lhs).*rhs</code></li>
</ul>
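<p>As a minimal sketch of what these two rewrites mean, the following uses a hypothetical <code>point_handle</code> type (not taken from any library) that defines only <code>operator*</code>. The functions spell out the rewritten forms by hand so that the example compiles without the proposed feature; the comments note which proposed expression each one stands for.</p>
<pre><code class="language-cpp">struct point {
    int x;
    int y;
};

// Hypothetical wrapper for illustration: it defines operator* only.
struct point_handle {
    point value;
    constexpr point&amp; operator*() noexcept { return value; }
    constexpr const point&amp; operator*() const noexcept { return value; }
    // Without this proposal, `h-&gt;x` needs a hand-written member such as:
    //   constexpr point* operator-&gt;() noexcept { return &amp;value; }
};

// Stands for `h-&gt;x`, which would be rewritten to `(*h).x`.
constexpr int read_x(const point_handle&amp; h) {
    return (*h).x;
}

// Stands for `h-&gt;*pm`, which would be rewritten to `(*h).*pm`.
constexpr int read_member(const point_handle&amp; h, int point::* pm) {
    return (*h).*pm;
}

static_assert(read_x(point_handle{{1, 2}}) == 1);
static_assert(read_member(point_handle{{1, 2}}, &amp;point::y) == 2);
</code></pre>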
<h2 id="design-goals">Design Goals</h2>
<p>The primary goal of this paper is that users should have little or no reason to write their own version of an operator that this paper proposes generating. It would be a strong indictment if there are many examples of well-written code for which this paper provides no simplification, so a fair amount of time in this paper will be spent looking into the edge cases and making sure that we generate the right thing. At the very least, types that would not have the correct operator generated should not generate an operator at all. In other words, it should be uncommon for users to define their own versions (the default should be good enough most of the time), and it should be extremely rare that users want to suppress the generated version (<code>= delete</code> should almost never appear).</p>
<p>This paper was split off from P1046R2. That paper proposed generating many operators; this paper covers only the arrow overloads. This is the part of that paper with the most motivation (as users cannot write the equivalent), the feature that gained the most support at the previous EWG meeting, and one of the least complicated changes.</p>
<p>This area has been well-studied for library solutions. This paper, however, traffics in rewrite rules (following the lead of <code>operator&lt;=&gt;</code>), not in terms of function calls. Because of this, we have one more option that the library-only solutions lack: we could define <code>lhs-&gt;rhs</code> as being equivalent to <code>(*lhs).rhs</code>. This neatly sidesteps all of the issues of library-only solutions (how do we get the address of the object? how do we handle temporaries?). It even plays nicely with existing rules around lifetime extension of temporaries. This solves many long-standing issues around proxy iterators that return by value.</p>
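<p>The interaction with temporaries can be illustrated with a small sketch. <code>by_value_iterator</code> is a hypothetical proxy-style iterator whose <code>operator*</code> returns by value. The function below uses the rewritten form <code>(*it).y</code> directly and relies on the existing lifetime-extension rule for references bound to a member of a temporary.</p>
<pre><code class="language-cpp">struct point {
    int x;
    int y;
};

// Hypothetical proxy-style iterator: operator* returns a prvalue.
struct by_value_iterator {
    int i;
    constexpr point operator*() const { return point{i, 2 * i}; }
};

constexpr int read_through_reference() {
    by_value_iterator it{21};
    // Stands for `it-&gt;y`. Binding a reference to a member of the temporary
    // extends the lifetime of the whole temporary to that of the reference.
    const int&amp; y = (*it).y;
    return y;
}

// Reading a dangling reference is not permitted during constant evaluation,
// so this also checks that the temporary is still alive when `y` is read.
static_assert(read_through_reference() == 42);
</code></pre>
<p>A library-only <code>operator-&gt;</code> that returns the address of a stored copy cannot offer the same guarantee, because the copy lives inside whatever object the operator returns.</p>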
<h2 id="comparison-with-comparison-rewrites">Comparison with comparison rewrites</h2>
<p>This change tries to follow the lead of the existing rewrite rules we have. However, this change is significantly simpler than the comparison operator rewrites and less likely to run into the corner cases we&#39;ve seen with <code>operator==</code> and <code>operator&lt;=&gt;</code>. Much of the complexity around the comparison operators has come from reversed candidates and dealing with potentially multiple arguments. The specification of these operators should be significantly simpler.</p>
<h3 id="-operator-"><code>operator-&gt;</code></h3>
<p><code>operator-&gt;</code> is the simplest. It cannot be defined outside of the class, so there is exactly one place to look for whether it exists today. It is also a unary operator (it does not depend on the right-hand side). This makes overload resolution and name lookup very simple.</p>
<h3 id="-operator-"><code>operator-&gt;*</code></h3>
<p><code>operator-&gt;*</code> is slightly more complicated. It can be defined as a free operator, a member function, and a hidden friend. However, we&#39;ve already solved the problems associated with that for comparison operators. It is a binary operator, but the order is meaningful and thus there are no complexities around reversed candidates. One of the base operators for this, <code>operator.*</code>, also cannot be overloaded. This means that the only overloaded operator that is relevant to <code>operator-&gt;*</code> is the same as for <code>operator-&gt;</code>: <code>operator*</code>, which is a unary operator. That fact also eliminates much of the complexity of <code>operator==</code> vs <code>operator!=</code>.</p>
<h2 id="design">Design</h2>
<p>If any of this conflicts with how <code>operator!=</code> is defined in terms of <code>operator==</code> and is not explicitly called out as a difference, that difference is unintentional.</p>
<p>All of these examples assume there is a variable <code>lhs</code> of type <code>LHS</code>.</p>
<h3 id="-operator-"><code>operator-&gt;</code></h3>
<p>If the expression <code>lhs-&gt;rhs</code> is encountered:</p>
<ul>
<li>If overload resolution for <code>operator-&gt;</code> would succeed, then that operator is called. This can also select a deleted overload.</li>
<li>If overload resolution does not succeed, but the expression <code>*lhs</code> is well-formed, then the expression is rewritten to <code>(*lhs).rhs</code>.</li>
<li>Otherwise, the expression is ill-formed.</li>
</ul>
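<p>The decision procedure above can be approximated with C++20 concepts. This is only a rough sketch under stated assumptions: the names <code>arrow_kind</code>, <code>has_member_arrow</code>, <code>dereferenceable</code>, and <code>classify_arrow</code> are invented for illustration, the sketch considers class types only, and it cannot model the deleted-overload case (a deleted <code>operator-&gt;</code> looks the same as a missing one to a <code>requires</code> expression, whereas the proposal treats selecting it as an error rather than falling back to the rewrite).</p>
<pre><code class="language-cpp">enum class arrow_kind { calls_operator, rewritten, ill_formed };

// True if a member operator-&gt; can be called on an lvalue of type T.
template &lt;class T&gt;
concept has_member_arrow = requires(T&amp; t) { t.operator-&gt;(); };

// True if `*t` is well-formed for an lvalue of type T.
template &lt;class T&gt;
concept dereferenceable = requires(T&amp; t) { *t; };

// Approximates which of the three bullets would apply to `lhs-&gt;rhs`.
template &lt;class T&gt;
constexpr arrow_kind classify_arrow() {
    if constexpr (has_member_arrow&lt;T&gt;) {
        return arrow_kind::calls_operator;
    } else if constexpr (dereferenceable&lt;T&gt;) {
        return arrow_kind::rewritten;
    } else {
        return arrow_kind::ill_formed;
    }
}

// Hypothetical sample types.
struct value { int member; };
struct with_arrow {
    value v;
    constexpr value* operator-&gt;() { return &amp;v; }
    constexpr value&amp; operator*() { return v; }
};
struct only_star {
    value v;
    constexpr value&amp; operator*() { return v; }
};
struct neither {};

static_assert(classify_arrow&lt;with_arrow&gt;() == arrow_kind::calls_operator);
static_assert(classify_arrow&lt;only_star&gt;() == arrow_kind::rewritten);
static_assert(classify_arrow&lt;neither&gt;() == arrow_kind::ill_formed);
</code></pre>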
<h3 id="-operator-"><code>operator-&gt;*</code></h3>
<p>If the expression <code>lhs-&gt;*rhs</code> is encountered:</p>
<ul>
<li>If overload resolution for <code>operator-&gt;*</code> would succeed, then that operator is called. This can also select a deleted overload.</li>
<li>If overload resolution does not succeed, but the expression <code>*lhs</code> is well-formed, then the expression is rewritten to <code>(*lhs).*rhs</code>.</li>
<li>Otherwise, the expression is ill-formed.</li>
</ul>
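<p>As an illustration with a standard type, <code>std::optional</code> declares no <code>operator-&gt;*</code>, so <code>o-&gt;*pm</code> does not compile without this proposal even though <code>(*o).*pm</code> does. The sketch below checks both facts; the concept <code>has_arrow_star</code> and the function <code>read_member</code> are hypothetical helpers introduced only for this example.</p>
<pre><code class="language-cpp">#include &lt;optional&gt;

struct point {
    int x;
    int y;
};

// True if `l-&gt;*r` is well-formed for an lvalue of type L and a value of type R.
template &lt;class L, class R&gt;
concept has_arrow_star = requires(L&amp; l, R r) { l-&gt;*r; };

// Without the proposed rewrite, there is no usable operator-&gt;* for std::optional.
static_assert(!has_arrow_star&lt;std::optional&lt;point&gt;, int point::*&gt;);

// Stands for `o-&gt;*pm`, which would be rewritten to `(*o).*pm`.
constexpr int read_member(const std::optional&lt;point&gt;&amp; o, int point::* pm) {
    return (*o).*pm;
}

static_assert(read_member(std::optional&lt;point&gt;{point{3, 4}}, &amp;point::y) == 4);
</code></pre>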
<h2 id="library-impact">Library impact</h2>
<p>Given the path we took for <code>operator&lt;=&gt;</code> of removing manual definitions of operators that can be synthesized and assuming we want to continue with that path by approving <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2019/p1732r1.pdf">&quot;Do not promise support for function syntax of operators&quot;</a>, this will allow removing a large amount of specification in the standard library by replacing it with blanket wording that these rewrites apply. I have surveyed the standard library to get an overview of what would change in response to this, and to ensure that the changes would work properly. This covers every type that was in the standard library as of early 2020.</p>
<h3 id="types-that-will-gain-operator-and-this-is-a-good-thing">Types that will gain <code>operator-&gt;</code> and this is a good thing</h3>
<ul>
<li><code>move_iterator</code> currently has a deprecated <code>operator-&gt;</code></li>
<li><code>counted_iterator</code></li>
<li><code>istreambuf_iterator</code></li>
<li><code>istreambuf_iterator::proxy</code> (exposition only type)</li>
<li><code>iota_view::iterator</code></li>
<li><code>transform_view::iterator</code></li>
<li><code>split_view::outer_iterator</code></li>
<li><code>split_view::inner_iterator</code></li>
<li><code>basic_istream_view::iterator</code></li>
<li><code>elements_view::iterator</code></li>
</ul>
<p>Most of these are iterators that return either by value or by <code>decltype(auto)</code> from some user-defined function. It is not possible to safely and consistently define <code>operator-&gt;</code> for these types, so we do not always do so, but under this proposal they would all do the right thing.</p>
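<p>A short sketch of the <code>transform_view::iterator</code> case: the transformation below returns a <code>std::pair</code> by value, and the view&#39;s iterator has no member <code>operator-&gt;</code>, so users currently have to write <code>(*it).second</code>. The concept <code>has_member_arrow</code> and the function <code>second_square</code> are hypothetical names used only here.</p>
<pre><code class="language-cpp">#include &lt;array&gt;
#include &lt;ranges&gt;
#include &lt;utility&gt;

// True if a member operator-&gt; can be called on an lvalue of type I.
template &lt;class I&gt;
concept has_member_arrow = requires(I&amp; i) { i.operator-&gt;(); };

constexpr int second_square() {
    std::array&lt;int, 3&gt; numbers{1, 2, 3};
    auto squares = numbers | std::views::transform(
        [](int n) { return std::pair&lt;int, int&gt;{n, n * n}; });
    auto it = squares.begin();

    // The iterator dereferences to a prvalue pair and has no operator-&gt;.
    static_assert(!has_member_arrow&lt;decltype(it)&gt;);

    ++it;
    // Stands for `it-&gt;second` under the proposed rewrite.
    return (*it).second;
}

static_assert(second_square() == 4);
</code></pre>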
<h3 id="types-that-will-technically-gain-operator-but-it-is-not-observable">Types that will technically gain <code>operator-&gt;</code> but it is not observable</h3>
<ul>
<li><code>insert_iterator</code></li>
<li><code>back_insert_iterator</code></li>
<li><code>front_insert_iterator</code></li>
<li><code>ostream_iterator</code></li>
</ul>
<p>The insert iterators and <code>ostream_iterator</code> technically gain an <code>operator-&gt;</code>, but <code>operator*</code> returns a reference to <code>*this</code> and the only members of those types are types, constructors, and operators, none of which are accessible through <code>operator-&gt;</code> using the syntaxes that are supported to access the standard library.</p>
<h3 id="types-that-will-gain-operator-and-it-s-weird-either-way">Types that will gain <code>operator-&gt;</code> and it&#39;s weird either way</h3>
<ul>
<li><code>ostreambuf_iterator</code></li>
</ul>
<p><code>ostreambuf_iterator</code> is the one example for which we might possibly want to explicitly delete <code>operator-&gt;</code>. It has an <code>operator*</code> that returns <code>*this</code>, and it has a member function <code>failed()</code>, so it would allow calling <code>it-&gt;failed()</code> with the same meaning as <code>it.failed()</code>.</p>
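<p>The two properties this relies on can be checked at the type level. The alias name <code>char_out_iterator</code> is an assumption made for this sketch.</p>
<pre><code class="language-cpp">#include &lt;iterator&gt;
#include &lt;type_traits&gt;
#include &lt;utility&gt;

using char_out_iterator = std::ostreambuf_iterator&lt;char&gt;;

// operator* returns a reference to the iterator itself...
static_assert(std::is_same_v&lt;
    decltype(*std::declval&lt;char_out_iterator&amp;&gt;()),
    char_out_iterator&amp;&gt;);

// ...so the rewritten form of `it-&gt;failed()` is well-formed and yields bool,
// just like `it.failed()`.
static_assert(std::is_same_v&lt;
    decltype((*std::declval&lt;char_out_iterator&amp;&gt;()).failed()),
    bool&gt;);
</code></pre>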
<h3 id="types-that-have-operator-now-and-it-behaves-the-same-as-the-synthesized-operator">Types that have <code>operator-&gt;</code> now and it behaves the same as the synthesized operator</h3>
<p>All types in this section have an <code>operator-&gt;</code> that is identical to the synthesized version if we do not wish to support users calling with the syntax <code>thing.operator-&gt;()</code>.</p>
<ul>
<li><code>optional</code></li>
<li><code>unique_ptr</code> (single object)</li>
<li><code>shared_ptr</code></li>
<li><code>weak_ptr</code></li>
<li><code>basic_string::iterator</code></li>
<li><code>basic_string_view::iterator</code></li>
<li><code>array::iterator</code></li>
<li><code>deque::iterator</code></li>
<li><code>forward_list::iterator</code></li>
<li><code>list::iterator</code></li>
<li><code>vector::iterator</code></li>
<li><code>map::iterator</code></li>
<li><code>multimap::iterator</code></li>
<li><code>set::iterator</code></li>
<li><code>multiset::iterator</code></li>
<li><code>unordered_map::iterator</code></li>
<li><code>unordered_set::iterator</code></li>
<li><code>unordered_multimap::iterator</code></li>
<li><code>unordered_multiset::iterator</code></li>
<li><code>span::iterator</code></li>
<li><code>istream_iterator</code></li>
<li><code>valarray::iterator</code></li>
<li><code>tzdb_list::const_iterator</code></li>
<li><code>filesystem::path::iterator</code></li>
<li><code>directory_iterator</code></li>
<li><code>recursive_directory_iterator</code></li>
<li><code>match_results::iterator</code></li>
<li><code>regex_iterator</code></li>
<li><code>regex_token_iterator</code></li>
<li><code>reverse_iterator</code></li>
<li><code>common_iterator</code></li>
<li><code>filter_view::iterator</code></li>
<li><code>join_view::iterator</code></li>
</ul>
<p>All of these types that are adapter types define their <code>operator-&gt;</code> as deferring to the base iterator&#39;s <code>operator-&gt;</code>. However, the Cpp17InputIterator requirements specify that <code>a-&gt;m</code> is equivalent to <code>(*a).m</code>, so anything a user passes to <code>reverse_iterator</code> must already meet this. <code>common_iterator</code>, <code>filter_view::iterator</code>, and <code>join_view::iterator</code> were added in C++20 and require <code>input_or_output_iterator</code> of their parameter, which says nothing about <code>-&gt;</code>. Its <code>operator-&gt;</code> is defined as the first in a series that compiles:</p>
<ol>
<li>Try calling member <code>operator-&gt;</code> on the base iterator</li>
<li>Try taking the address of the value returned from <code>operator*</code></li>
<li>Create a proxy object that stores by-value returns and returns the address of that</li>
</ol>
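<p>The three steps above can be sketched with <code>if constexpr</code>. This is an illustrative approximation rather than the standard library&#39;s wording or any implementation&#39;s code; <code>arrow_proxy</code>, <code>arrow_result</code>, and the sample iterator types are hypothetical names.</p>
<pre><code class="language-cpp">#include &lt;memory&gt;
#include &lt;type_traits&gt;

// Hypothetical proxy for step 3: owns a copy of the by-value result.
template &lt;class T&gt;
struct arrow_proxy {
    T value;
    constexpr const T* operator-&gt;() const noexcept { return std::addressof(value); }
};

// Returns an object that `-&gt;` can be applied to, using the first step that compiles.
template &lt;class I&gt;
constexpr auto arrow_result(const I&amp; it) {
    if constexpr (std::is_pointer_v&lt;I&gt; || requires { it.operator-&gt;(); }) {
        return it;                        // step 1: the base iterator handles -&gt;
    } else if constexpr (std::is_reference_v&lt;decltype(*it)&gt;) {
        return std::addressof(*it);       // step 2: address of the referenced object
    } else {
        // step 3: store the by-value result and hand out its address
        return arrow_proxy&lt;std::remove_cvref_t&lt;decltype(*it)&gt;&gt;{*it};
    }
}

// Hypothetical sample types.
struct point { int x; int y; };
struct by_reference_iterator {
    const point* p;
    constexpr const point&amp; operator*() const { return *p; }
};
struct by_value_iterator {
    int i;
    constexpr point operator*() const { return point{i, i + 1}; }
};

inline constexpr point sample{1, 2};

static_assert(arrow_result(&amp;sample)-&gt;x == 1);                        // step 1
static_assert(arrow_result(by_reference_iterator{&amp;sample})-&gt;y == 2); // step 2
static_assert(arrow_result(by_value_iterator{5})-&gt;y == 6);           // step 3
</code></pre>
<p>With the proposed rewrite, steps 2 and 3 correspond to what <code>(*it).m</code> already does without any helper type.</p>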
<p>If this paper were accepted, we would have two options:</p>
<ol>
<li>Get rid of the manual definition of <code>operator-&gt;</code> from those new types, which is a breaking change for iterator types with an <code>operator-&gt;</code> that does something meaningfully different from what their <code>operator*</code> does, or</li>
<li>Manually define it only when the wrapped type has a member <code>operator-&gt;</code>. This would keep step 1, but eliminate steps 2 and 3.</li>
</ol>
<h3 id="-iterator_traits-"><code>iterator_traits</code></h3>
<p><code>std::iterator_traits&lt;I&gt;::pointer</code> is essentially defined as <code>typename I::pointer</code> if such a type exists, otherwise <code>decltype(std::declval&lt;I &amp;&gt;().operator-&gt;())</code> if that expression is well-formed, otherwise <code>void</code>. The type appears to be unspecified for iterators into any standard container, depending on how you read the requirements. The only relevant requirement on standard container iterators (anything that meets Cpp17InputIterator) is that <code>a-&gt;m</code> is equivalent to <code>(*a).m</code>. We never specify that any other form is supported, nor do we specify that any of them contain the member type <code>pointer</code>. There are three options here:</p>
<ol>
<li>Change nothing. This would make <code>pointer</code> defined as <code>void</code> for types that have a synthesized <code>operator-&gt;</code>.</li>
<li>Specify a further fallback of <code>decltype(std::addressof(*a))</code> to maintain current behavior and allow users to delete their own <code>operator-&gt;</code> without changing the results of <code>iterator_traits</code>.</li>
<li>Deprecate or remove the <code>pointer</code> typedef, as it is not used anywhere in the standard except to define other <code>pointer</code> typedefs and it seems to have very little usefulness outside the standard.</li>
</ol>
<p>My recommendation is 2, 3, or both.</p>
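<p>Option 2 can be sketched as a type computation. This is a simplified model that follows the &quot;essentially defined as&quot; description above rather than the full <code>iterator_traits</code> specification; <code>pointer_type_tag</code>, <code>sketched_pointer_t</code>, and the sample types are hypothetical names.</p>
<pre><code class="language-cpp">#include &lt;memory&gt;
#include &lt;type_traits&gt;
#include &lt;utility&gt;

// Picks the first alternative that is well-formed and returns it wrapped in a tag.
template &lt;class I&gt;
constexpr auto pointer_type_tag() {
    if constexpr (requires { typename I::pointer; }) {
        return std::type_identity&lt;typename I::pointer&gt;{};
    } else if constexpr (requires(I&amp; i) { i.operator-&gt;(); }) {
        return std::type_identity&lt;decltype(std::declval&lt;I&amp;&gt;().operator-&gt;())&gt;{};
    } else if constexpr (requires(I&amp; i) { std::addressof(*i); }) {
        // The further fallback from option 2.
        return std::type_identity&lt;decltype(std::addressof(*std::declval&lt;I&amp;&gt;()))&gt;{};
    } else {
        return std::type_identity&lt;void&gt;{};
    }
}

template &lt;class I&gt;
using sketched_pointer_t = typename decltype(pointer_type_tag&lt;I&gt;())::type;

// Hypothetical sample types.
struct with_typedef { using pointer = const char*; };
struct only_star {
    int* p;
    constexpr int&amp; operator*() const { return *p; }
};
struct by_value {
    constexpr int operator*() const { return 0; }
};

static_assert(std::is_same_v&lt;sketched_pointer_t&lt;with_typedef&gt;, const char*&gt;);
static_assert(std::is_same_v&lt;sketched_pointer_t&lt;only_star&gt;, int*&gt;);
// std::addressof rejects rvalues, so a by-value operator* still yields void.
static_assert(std::is_same_v&lt;sketched_pointer_t&lt;by_value&gt;, void&gt;);
</code></pre>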
<h4 id="-to_address-and-pointer_traits-"><code>to_address</code> and <code>pointer_traits</code></h4>
<p>[pointer.conversion] specifies <code>to_address</code> in terms of calling <code>p.operator-&gt;()</code>, so its specification needs to be revisited as well.</p>
<p>The following standard types can be used to instantiate <code>pointer_traits</code>:</p>
<ul>
<li><code>T *</code></li>
<li><code>unique_ptr</code></li>
<li><code>shared_ptr</code></li>
<li><code>weak_ptr</code></li>
<li><code>span</code></li>
</ul>
<p>However, none of them are specified to have member <code>to_address</code>.</p>
<p>Note that <code>span</code> does not have <code>operator-&gt;</code> and is thus not relevant to the below discussion at all. <code>unique_ptr</code>, <code>shared_ptr</code>, and <code>weak_ptr</code> are not iterators, and are thus minimally relevant to the below discussion.</p>
<p><code>std::to_address</code> is specified as calling <code>pointer_traits&lt;Ptr&gt;::to_address(p)</code> if that is well formed, otherwise calling <code>operator-&gt;</code> with member function syntax. This leaves us with several options:</p>
<ol>
<li>Leave this function as-is and specify that all of the types that currently have <code>operator-&gt;</code> have a specialization of <code>pointer_traits</code> that defines <code>pointer_traits&lt;T&gt;::to_address</code></li>
<li>Specify that all types that currently have <code>operator-&gt;</code> work with <code>std::to_address</code></li>
<li>Define a second fallback if <code>p.operator-&gt;()</code> is not valid that would be defined as <code>std::addressof(*p)</code>. This is similar to the question for <code>std::iterator_traits&lt;I&gt;::pointer</code>.</li>
</ol>
<p>1 and 2 feel like the wrong approach -- they would mean that authors of iterator types still need to define their own <code>operator-&gt;</code>, or they must specialize some class template (if we agree that the current semantics with regard to iterators are correct), or they must overload <code>to_address</code> and we make that a customization point found by ADL.</p>
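<p>For comparison, option 3 might look roughly like the following. This is a sketch under assumptions, not proposed wording: <code>sketched_to_address</code> and <code>handle</code> are hypothetical names, and <code>handle</code> is written as a class template so that <code>std::pointer_traits</code> can deduce its element type.</p>
<pre><code class="language-cpp">#include &lt;memory&gt;
#include &lt;type_traits&gt;

template &lt;class Ptr&gt;
constexpr auto sketched_to_address(const Ptr&amp; p) {
    if constexpr (std::is_pointer_v&lt;Ptr&gt;) {
        return p;                                            // raw pointers pass through
    } else if constexpr (requires { std::pointer_traits&lt;Ptr&gt;::to_address(p); }) {
        return std::pointer_traits&lt;Ptr&gt;::to_address(p);      // existing first choice
    } else if constexpr (requires { p.operator-&gt;(); }) {
        return sketched_to_address(p.operator-&gt;());          // existing fallback
    } else {
        return std::addressof(*p);                           // second fallback (option 3)
    }
}

// Hypothetical pointer-like type that defines operator* but no operator-&gt;.
template &lt;class T&gt;
struct handle {
    T* target;
    constexpr T&amp; operator*() const { return *target; }
};

inline constexpr int answer = 42;

static_assert(sketched_to_address(handle&lt;const int&gt;{&amp;answer}) == &amp;answer);
static_assert(sketched_to_address(&amp;answer) == &amp;answer);
</code></pre>
<p>With this fallback, the author of <code>handle</code> does not need to define <code>operator-&gt;</code>, specialize <code>pointer_traits</code>, or provide an overload found by ADL.</p>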
