Document number J16/01-0010 = WG21 N1296

User-supplied specializations of standard library algorithms
============================================================

Peter Dimov, pdimov@mmltd.net
     with advice from
David Abrahams, abrahams@mediaone.net


The Problem
-----------

Consider the user-defined type

  template<class T> class X
  {
   public:
    X(): pt(new T) {}

    X(X const & rhs): pt(new T(*rhs.pt)) {}

    ~X() { delete pt; }

    void swap(X & rhs) { std::swap(pt, rhs.pt); }

    X & operator= (X const & rhs) { X(rhs).swap(*this); return *this; }

    friend inline bool operator< (X const & lhs, X const & rhs)
    {
      return *lhs.pt < *rhs.pt;
    }

   private:
    T * pt;
  };

which is a textbook example of the "Handle/Body" idiom, used among other things
to achieve the strong exception guarantee in X's assignment operator.

X can be inserted in standard containers, and it's possible to use standard
algorithms like std::sort on containers of X.

"Possible" does not mean "efficient" in this case, however. std::sort will
probably have to swap elements of type X, and it's likely that it will use the
standard algorithm std::swap to do it. The default std::swap will copy-construct
a temporary X and use two assignments. These operations involve three dynamic
memory allocations.

Being a model citizen, our class defines a specialized swap that is considerably
faster, does no memory allocations and will never throw; we only need to somehow
convince the standard library to use it:

  namespace std
  {
    template<class T> void swap(X<T> & lhs, X<T> & rhs) { lhs.swap(rhs); }
  }

The standard classes use this idiom, so it must be a good thing, right?

Wrong. Let's turn our attention to 17.4.3.1/1:

"It is undefined for a C++ program to add declarations or definitions to
namespace std or namespaces within namespace std unless otherwise specified. A
program may add template specializations for any standard library template to
namespace std. Such a specialization (complete or partial) of a standard library
template results in undefined behavior unless the declaration depends on a
user-defined name of external linkage and unless the specialization meets the
standard library requirements for the original template."

Our std::swap definition is not a specialization, but an overload. There is no
such thing as partial specialization of a function template. This means that it
results in undefined behavior, although the compiler will most likely not
diagnose it. Furthermore, even if this specialization were legal, there is no
guarantee that std::sort would find and use it.

The expectations raised by 17.4.3.1/1 include:

* Namespace std is a closed scope. The implementor has complete control over the
  declarations and definitions in std. The only exception are user-supplied
  template specializations that cannot introduce new templates and cannot
  interfere with overload resolution. In principle, this allows more efficient
  implementation and better error reporting.

* Users are able to provide more efficient versions of standard library
  algorithms for use with their own types, with std::swap being an important
  example.

* The standard library will take advantage of the more efficient algorithms
  provided by the users.

As the example shows, none of these expectations have been met in practice.

* Current compilers do not enforce the special properties of namespace std;
  inexperienced users have the tendency to "fix" problems with their code by
  introducing names into std.

* It is not possible to provide a specialized algorithm for a user-defined
  class template; see LWG issue 226.

* The standard library is not required to implement algorithms in terms of other
  more primitive ones (e.g. swap). These implementations fail to recognize user
  specializations, resulting in severe performance penalties. 

* On the other end of the spectrum, some implementations find (via argument
  depending lookup) and call user functions that match the signature of a
  standard library algorithm but do not satisfy the 17.4.3.1/1
  requirements. (See LWG issues 187, 225, 229.)


The Solutions
-------------

Of the three points outlined above, all but the second may be classified as
Quality of Implementation issues. The inability for a user to supply a
specialized algorithm for some of his/her own types, however, is an important
problem that has no practical answer within the current language rules. Possible
solutions include

* Open namespace std, allowing users to add overloads of standard library
  algorithms;

* Define, for each library algorithm, a set of function names that are used
  without qualification. This will allow user-defined functions to be found via
  argument-dependent lookup and participate in the overload resolution;

* Define, for each library algorithm that is a template, a corresponding helper
  class template that can be partially specialized by users;

* Do nothing about the problem in general and only fix std::swap

* Introduce a new feature into the core language - partial specialization of
  function templates.

Note: Only the first two solutions will help users who want to specialize
non-template library functions like std::abs.


Allow std:: overloads
---------------------

Making namespace std an ordinary, non-magical, namespace and allowing users to
add overloads into it is the most straightforward solution. Its benefits are (1)
simplicity, (2) it takes effect immediately.

Cons:

* std:: loses some of its special properties; this may or may not impact
  compiler implementors.

* It is not clear which user declarations (definitions) are legal in std; the
  standard library cannot be expected to gracefully handle anything the users
  throw at it. For example,

  namespace std
  {
    void swap(int &, double &);
  }

should not be allowed.

* Some user overloads, while obviously legal, allow incorrect code to compile:

  #include <algorithm>

  struct base {};

  namespace std { void swap(base&, base&); }

  struct derived: base {};

  base b; derived d;

  std::swap(b, d); // compiles

Without the overload the last line will not compile.

* It is possible to overload a standard algorithm without first including its
  corresponding header file, producing programs which silently compile but fail
  to work properly:

-- base.hpp:

  struct base {};
  namespace std { void swap(base&, base&); }

-- derived.hpp:

  #include "base.hpp"
  struct derived : base
  {
    int x; // might not be properly swapped!
  };

-- derived.cpp:

  #include <set> // some standard header
  #include "derived.hpp"

  derived a, b;

  std::swap(a, b); // which swap?

The behavior of the last line depends on whether <set> makes the default
std::swap visible or not. If not, std::swap(base&, base&) will be called with
potentially disastrous results. Either case will compile but the results will
silently change from implementation to implementation.


Enable argument-dependent lookup on standard names
--------------------------------------------------

Another possibility is to let users define their specialized algorithms in the
namespace of the corresponding user-defined type. Standard library functions
that use unqualified calls will find the overload via argument-dependent lookup:

  namespace N
  {
    template<class T> struct X { T * pt };

    template<class T> void swap(X<T> & a, X<T> & b)
    {
      std::swap(a.pt, b.pt);
    }
  }

  namespace std
  {
    template<class It> void iter_swap(It i, It j)
    {
      swap(*i, *j);
    }
  }

  void f()
  {
    X<int> a, b;
    std::iter_swap(&a, &b); // will find N::swap<X<int> >(X<int> &, X<int> &)
  }

This solution avoids opening std::, but otherwise has all the same problems as
the approach of adding std:: overloads. By keeping user overloads distinct from
the general algorithms in std::, it allows users to choose whether to use the
generalized or specialized algorithms. It may be worth considering whether this
flexibility (and associated complexity) is an advantage or not.

However, each and every unqualified call from within a standard
library function must be documented, because this effectively "reserves" the
unqualified name in all namespaces. Undocumented unqualified calls are illegal
(LWG issue 225.)

Reserving a significant number of names in all namespaces is obviously not
appropriate, but a limited subset may be acceptable (swap, abs, etc.) This may
have an impact on existing user code.

Another problem with this approach is that third-party library authors (or
generic algorithm authors in general) must not use the familiar syntax of:

  template<class T> void f(T & x, T & y)
  {
    std::swap(x, y);
  }

but the following quirky modification instead:

  template<class T> void f(T & x, T & y)
  {
    using std::swap;
    swap(x, y);
  }

(the using declaration makes the unqualified swap() call work when T is a
built-in type.)


Provide helper classes
----------------------

This approach is illustrated by the following example:

  namespace std
  {
    template<class T> struct swap_helper
    {
      static void do_swap(T & a, T & b)
      {
        T t(a);
        a = b;
        b = t;
      }
    };

    template<class T> void swap(T & a, T & b)
    {
      std::swap_helper<T>::do_swap(a, b);
    }
  }

Now users can partially specialize std::swap_helper for their own types. No
change to 17.4.3.1/1 is necessary.

This kind of helper classes are commonly used to emulate partial specialization
of function templates.

A disadvantage of this solution is the number of new class templates that have
to be introduced into the standard library. The ability of users to specialize
both the algorithm and its corresponding helper class template may also prove
problematic and lead to subtle bugs.

As already noted, this solution cannot be applied to standard functions that are
not templates, like std::abs. This problem may be solved by turning the
functions that need specializing into templates.


Only fix std::swap
------------------

Although this paper so far discusses the broad problem of standard functions in
general, many developers actually need to specialize one and only one algorithm,
namely std::swap. For typical uses of the standard library, the performance of
std::swap is very important.  Moreover, we have all been educated on the
importance of providing a "fast, non-throwing" swap for our classes and that
swap() is a central technique for building strongly exception-safe types out of
types which may provide only the basic guarantee. Yet the default std::swap is
neither fast nor non-throwing. Had the standard swap been adequate, this
discussion might not even have materialized.

There are two ways to improve the default definition of std::swap:

* Make std::swap(a, b) call a.swap(b) for user-defined types and retain the
  current semantics for built-in types.

This would still require users to explicitly provide the fast, non-throwing
swap, but will eliminate the need to specialize std::swap for every user-defined
type. The std::swap overloads for std::vector, std::string, etc. may be removed.

* Make the compiler generate std::swap specializations if necessary.

This proposal treats swap as the fundamental operation that many believe it
is. The rationale for this proposal is that a precedent has been already set
with the copy constructor and the assignment operator. The ability of the
compiler to compose new functions by recursively applying a function to
subobjects can easily be extended to std::swap. Such an automatically generated
swap() specialization will nearly always be adequate (which is not true for the
automatically generated assignment, for instance.) An explicit user
specialization will override the compiler-generated version, as
usual. Unfortunately, it doesn't help the user who wishes to define a
specialized swap for UDT templates.

(Note: It may be worth discussing whether the signature of std::swap should change to

  template<class T> void swap(T &, T &) throw();

in both cases.)


Partial specialization of function templates
--------------------------------------------

Allowing partial specialization of function template would solve part of the
problem in an elegant way without any changes to the existing standard library
definition, while at the same time avoiding most of the problems associated with
the other approaches discussed so far. It also addresses some other "holes" in
the current core language definition. See the follow-on paper "Partial
Specialization of Function Templates" for a detailed description.

Its disadvantages are that

* it requires a core language change to solve what is generally perceived as a
  library problem;

* it would take a while for compiler vendors to implement the feature, while in
  the meantime the problem will remain unsolved (although the std::swap change
  outlined in the previous section may be used as a stopgap measure.)

* As already noted, this solution cannot be applied to standard functions that
  are not templates, like std::abs. This problem may be solved by turning the
  functions that need specializing into templates.

