<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
    "http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
  <meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
  <title>Adding Multimethods to C++</title>
  <meta name="GENERATOR" content="Microsoft FrontPage 5.0">
</head>

<body lang="en">
<h1> <a name="1">Draft proposal for adding Multimethods to C++</a></h1>

<p>Document number: WG21/N1529 = J16/03-0112</p>

<p>Author: Julian Smith (<a
href="mailto:jules@op59.net">jules@op59.net</a>)</p>

<p>Version: 2nd draft, $Date: 2003/09/22 00:34:28 $ $Revision: 1.23 $</p>
<htmlcontents><blockquote>
    <a href="#1.1">1 History</a><blockquote>
      <a href="#1.1.1">1.1 Changes from post-Oxford mailing version (28 Apr 2003) to pre-Kona mailing version (22 Sep 2003)</a></blockquote>
    <a href="#1.2">2 Multimethods as a C++ extension</a><blockquote>
      <a href="#1.2.1">2.1 Syntax</a><br>
      <a href="#1.2.2">2.2 Dispatch algorithm</a><br>
      <a href="#1.2.3">2.3 Dispatch speed</a><br>
      <a href="#1.2.4">2.4 Miscellaneous issues</a><blockquote>
        <a href="#1.2.4.1">2.4.1 Polymorphic classes</a><br>
        <a href="#1.2.4.2">2.4.2 Allowed virtual parameter types</a><br>
        <a href="#1.2.4.3">2.4.3 Multimethod exceptions</a><br>
        <a href="#1.2.4.4">2.4.4 Exception specifications</a><br>
        <a href="#1.2.4.5">2.4.5 Multimethod overloads</a><br>
        <a href="#1.2.4.6">2.4.6 Default parameters</a><br>
        <a href="#1.2.4.7">2.4.7 Use of multimethods before <code>main()</code> is
entered</a></blockquote></blockquote>
    <a href="#1.3">3 Impact on the standard</a><br>
    <a href="#1.4">4 Possible extensions</a><blockquote>
      <a href="#1.4.1">4.1 Friendship</a><br>
      <a href="#1.4.2">4.2 Use of static types in the dispatch algorithm</a><br>
      <a href="#1.4.3">4.3 Use with templated virtual parameters</a></blockquote>
    <a href="#1.5">5 Omitted features</a><blockquote>
      <a href="#1.5.1">5.1 Covariant returns</a><br>
      <a href="#1.5.2">5.2 Multimethod member functions</a><br>
      <a href="#1.5.3">5.3 Multimethod templates</a><br>
      <a href="#1.5.4">5.4 Multimethod overloads</a></blockquote>
    <a href="#1.6">6 Other issues</a><blockquote>
      <a href="#1.6.1">6.1 Multimethod syntax, and getting direct access to
multimethod implementations</a><blockquote>
        <a href="#1.6.1.1">6.1.1 Using a naming convention for multimethod
implementations</a><br>
        <a href="#1.6.1.2">6.1.2 User specification of multimethod implementation
names</a><br>
        <a href="#1.6.1.3">6.1.3 Special namespace for multimethod
implementations</a><br>
        <a href="#1.6.1.4">6.1.4 Use <code>static_cast</code></a><br>
        <a href="#1.6.1.5">6.1.5 Using pure-virtual-like syntax</a></blockquote>
      <a href="#1.6.2">6.2 Raw pointer to multimethod implementation
function</a><br>
      <a href="#1.6.3">6.3 Use with dynamic loading of code</a><br>
      <a href="#1.6.4">6.4 Diagnostics</a></blockquote>
    <a href="#1.7">7 Multimethods resources</a><br>
    <a href="#1.8">8 Acknowledgements</a><br>
    <a href="#1.9">9 Cmm implementation issues</a><blockquote>
      <a href="#1.9.1">9.1 Generated code for multimethod dispatch</a><br>
      <a href="#1.9.2">9.2 Registering multimethods using static
initialisation</a><br>
      <a href="#1.9.3">9.3 Dispatch caches</a><br>
      <a href="#1.9.4">9.4 Dispatch functions</a><br>
      <a href="#1.9.5">9.5 Raw pointer to implementation function</a><br>
      <a href="#1.9.6">9.6 Constant-time dispatch speed</a></blockquote></blockquote>
</htmlcontents>


<h2>1 <a name="1.1">History</a></h2>

<h3>1.1 <a name="1.1.1">Changes from post-Oxford mailing version (28 Apr 2003) to pre-Kona mailing version (22 Sep 2003)</a></h3>

<p>Altered: multimethod implementation syntax changed from trailing underscores to use of <code>static</code> qualifiers; exception specification requirements fixed; ability to directly call multimethod implementations removed; covariant returns prohibited; exception class name changed; document number changed from 03-0046/N1463 to WG21/N1529 = J16/03-0112.</p>

<p>New: auto-generated contents; requirement for unambiguous downcasts; virtual parameters can be pointers as well as references, with <code>volatile</code> as well as <code>const</code> qualifiers; discussion of multimethod templates; default parameters prohibited in implementations; discussions of possible extensions (friendship, static types in dispatch, templated virtual parameters such as smart pointers); overloaded multimethods prohibited; alternative syntaxes discussed.</p>


<h2>2 <a name="1.2">Multimethods as a C++ extension</a></h2>

<h3>2.1 <a name="1.2.1">Syntax</a></h3>

<p>Stroustrup's <cite>Design and Evolution of C++</cite> suggests a syntax
for multimethods in C++, which this proposal approximately follows.</p>

<p>A function prototype is a multimethod function if one or more of its
parameters are qualified with the keyword <code>virtual</code>.
Implementations of the multimethod have the corresponding parameters
qualified with <code>static</code>.</p>

<p>In an implementation, the type of each parameter that corresponds to a
virtual parameter in the multimethod prototype must be the original type or a
type derived from it, and must have the same modifiers
(<code>const</code>/<code>volatile</code>). It must also derive
unambiguously from the original type - for example a cast from the base type
to the derived type must not give an `ambiguous base class' error.</p>

<p>Multimethod declarations shall occur before multimethod
implementation function definitions/declarations. Multimethod implementation
functions have the same name as the multimethod function but are not directly
callable; for example, the compiler may rename them internally.</p>

<p>Here is the classic multimethods example, an <code>overlap()</code>
function that takes references to a <code>shape</code> base class. Detecting
whether two shapes overlap requires different code for each combination of
shapes:</p>
<pre><code>struct  shape            {...};
struct  square   : shape {...};
struct  triangle : shape {...};

bool    overlap( virtual shape&amp; a, virtual shape&amp; b);
    
bool    overlap( static square&amp;   a, static triangle&amp; b) {...}
bool    overlap( static triangle&amp; a, static square&amp;   b) {...}
bool    overlap( static shape&amp;    a, static square&amp;   b) {...}
bool    overlap( static square&amp;   a, static shape&amp;    b) {...}</code></pre>

<p>The <code>overlap( virtual shape&amp; a, virtual shape&amp; b)</code>
prototype is replaced by a dispatch function with prototype <code>overlap(
shape&amp; a, shape&amp; b)</code>. Internally, this dispatch function uses
C++ RTTI to choose one of the available <code>overlap()</code> functions,
based on the dynamic types of its parameters.</p>
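<p>As a rough illustration of what such a dispatch function does, here is a
hand-written emulation in Standard C++. It is only a sketch: the name
<code>overlap_impl</code> and the placeholder function bodies are invented
for this example, and a real generated dispatch function would not be written
as a chain of <code>dynamic_cast</code> tests.</p>

```cpp
#include <cassert>
#include <stdexcept>

struct shape    { virtual ~shape() {} };   // polymorphic, so RTTI works
struct square   : shape {};
struct triangle : shape {};

// Hand-written stand-ins for two multimethod implementations.
// The bodies are placeholders; real code would test geometry.
bool overlap_impl( square&,   triangle&) { return true;  }
bool overlap_impl( triangle&, square&)   { return false; }

// Stand-in for the generated dispatch function: inspect the dynamic
// types of both arguments and forward to a matching implementation.
bool overlap( shape& a, shape& b)
{
    if ( square* sa = dynamic_cast< square*>( &a))
        if ( triangle* tb = dynamic_cast< triangle*>( &b))
            return overlap_impl( *sa, *tb);
    if ( triangle* ta = dynamic_cast< triangle*>( &a))
        if ( square* sb = dynamic_cast< square*>( &b))
            return overlap_impl( *ta, *sb);
    throw std::runtime_error( "no matching overlap() implementation");
}
```

<p>Callers only ever see <code>overlap( shape&amp;, shape&amp;)</code>; which
implementation runs depends on the objects passed in, not on the static types
at the call site.</p>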

<h3>2.2 <a name="1.2.2">Dispatch algorithm</a></h3>

<p>It is possible that there isn't any matching multimethod implementation
function for a particular combination of dynamic types. In this case, the
generated dispatch function will throw an exception. It will also throw an
exception if there is no clear best multimethod implementation for a
particular combination of dynamic types. See below for the new exception
types that can be thrown.</p>

<p>A multimethod implementation is considered best only if both of the
following conditions apply:</p>
<ol>
  <li>All of the best multimethod implementation's parameter types match the
    dynamic types.</li>
  <li>Each of the best multimethod implementation's parameter types is at
    least as derived as the corresponding parameter type in any other
    matching multimethod implementation.</li>
</ol>

<p>This is the same as the lookup rule used by C++ to resolve overloaded
functions at compile time, apart from compile-time C++'s use of conversion
operators and special-casing of built-in types.</p>

<p>Note that we cannot have duplicate multimethod implementations, so the
second condition implies that for each other matching multimethod
implementation X, the best multimethod implementation must have at least one
parameter type that is more derived than the corresponding parameter type in
X.</p>

<p>So if the available implementations of <code>overlap</code> are as in the
above code example, then:</p>
<pre><code>shape&amp;  s = *new square;
shape&amp;  t = *new triangle;
overlap( t, t); // Throws - no matching multimethod implementation.
overlap( s, t); // Calls overlap( static square&amp;,   static triangle&amp;).
overlap( t, s); // Calls overlap( static triangle&amp;, static square&amp;).
overlap( s, s); // Throws - two multimethod implementations
                // both match, but neither is a
                // better match than the other:
                //   overlap( static shape&amp;,  static square&amp;);
                //   overlap( static square&amp;, static shape&amp;);</code></pre>
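<p>The two conditions can be sketched in Standard C++ as follows. This is an
illustration under simplifying assumptions, not the proposed implementation:
it assumes single inheritance, so that `at least as derived' can be compared
using a hand-assigned depth below <code>shape</code>, and the names
<code>param</code>, <code>impl</code>, <code>find_best</code> and
<code>example_impls</code> are invented for the example.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct shape    { virtual ~shape() {} };
struct square   : shape {};
struct triangle : shape {};

template< class T>
bool is_a( shape& s) { return dynamic_cast< T*>( &s) != 0; }

// One parameter of an implementation: a run-time type test plus the depth
// of the parameter type below shape (shape = 0, square/triangle = 1).
struct param { bool (*matches)( shape&); int depth; };
struct impl  { param a, b; };

enum outcome { matched, unmatched, ambiguous };

// On success, stores the index of the best implementation in `best'.
outcome find_best( const std::vector< impl>& impls, shape& a, shape& b,
                   std::size_t& best)
{
    // Condition 1: keep implementations whose parameters all match.
    std::vector< std::size_t> matching;
    for ( std::size_t i = 0; i < impls.size(); ++i)
        if ( impls[i].a.matches( a) && impls[i].b.matches( b))
            matching.push_back( i);
    if ( matching.empty()) return unmatched;

    // Condition 2: every parameter at least as derived as in all others.
    for ( std::size_t i = 0; i < matching.size(); ++i)
    {
        const impl& x = impls[ matching[i]];
        bool is_best = true;
        for ( std::size_t j = 0; j < matching.size(); ++j)
        {
            const impl& y = impls[ matching[j]];
            if ( x.a.depth < y.a.depth || x.b.depth < y.b.depth)
                is_best = false;
        }
        if ( is_best) { best = matching[i]; return matched; }
    }
    return ambiguous;
}

// The four overlap() implementations from the example, in source order.
std::vector< impl> example_impls()
{
    const param sh = { &is_a< shape>,    0 };
    const param sq = { &is_a< square>,   1 };
    const param tr = { &is_a< triangle>, 1 };
    const impl all[] = { { sq, tr }, { tr, sq }, { sh, sq }, { sq, sh } };
    return std::vector< impl>( all, all + 4);
}
```

<p>With these four entries, the sketch reproduces the outcomes listed in the
example: a match for (square, triangle) and (triangle, square), no match for
two triangles, and an ambiguity for two squares.</p>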

<p>There is also the possibility of ambiguities caused by a dynamic type
multiply-inheriting from the same class more than once (a similar error can
already occur at compile time if a static type multiply-inherits from the
same class more than once).</p>

<h3>2.3 <a name="1.2.3">Dispatch speed</a></h3>

<p>The basic dispatch algorithm described above involves much use of RTTI and
is linear in the number of multimethod implementations available, so it is
very slow. But dispatch speed can be significantly improved using a cache.</p>

<p>For example, O(log N) dispatch speed can be easily obtained using a
<code>std::map</code>-style dispatch cache, mapping from arrays of
<code>std::type_info</code>s to function pointers.</p>
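<p>A minimal sketch of such a cache is shown below. It assumes C++11's
<code>std::type_index</code> as the key type (a pair of them, since
<code>overlap()</code> has two virtual parameters); <code>slow_lookup</code>,
<code>overlap_any</code> and the <code>slow_lookups</code> counter are
placeholders invented for the example.</p>

```cpp
#include <cassert>
#include <map>
#include <typeindex>
#include <typeinfo>
#include <utility>

struct shape    { virtual ~shape() {} };
struct square   : shape {};
struct triangle : shape {};

typedef bool (*overlap_fn)( shape&, shape&);
typedef std::pair< std::type_index, std::type_index> type_key;

// Placeholder for the slow, linear search described in section 2.2.
// It counts its calls so that the effect of the cache can be observed.
int slow_lookups = 0;
bool overlap_any( shape&, shape&) { return true; }
overlap_fn slow_lookup( shape&, shape&)
{
    ++slow_lookups;
    return overlap_any;
}

// Dispatch through a std::map cache: logarithmic in the number of distinct
// dynamic-type pairs seen so far; the slow search runs once per pair.
bool overlap( shape& a, shape& b)
{
    static std::map< type_key, overlap_fn> cache;
    const type_key key( typeid( a), typeid( b));
    std::map< type_key, overlap_fn>::iterator it = cache.find( key);
    if ( it == cache.end())
        it = cache.insert( std::make_pair( key, slow_lookup( a, b))).first;
    return it->second( a, b);
}
```

<p>Repeated calls with the same pair of dynamic types then cost one map
lookup plus an indirect call.</p>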

<p>It is also possible to get constant-time dispatch if classes are assigned
small unique integer identifiers, which could be stored in the vtable. This
allows a dispatch array of pointers to functions to be created for each
multimethod, using the small integer identifiers as indices (in effect, this
is making vtables that belong to individual multimethod dispatch functions
rather than individual classes).</p>

<p>To avoid whole-programme analysis at build-time, it may be best to
initialise these small integers to zero, and assign non-zero values at
runtime whenever a class is first involved in a multimethod call; this would
also avoid wasting integer space on classes that aren't involved in
multimethod calls, which in turn would reduce the sizes of the dispatch
arrays.</p>

<p>A dispatch function for the <code>overlap()</code> example using these
indices would look something like:</p>
<pre><code>bool overlap( shape&amp; a, shape&amp; b)
{
    typedef bool (*fnptr)( shape&amp; a, shape&amp; b);
    
    static fnptr** cache = NULL;

    int a_index = a._vtable-&gt;small_integer;
    int b_index = b._vtable-&gt;small_integer;
        
    if (   a_index
        &amp;&amp; b_index
        &amp;&amp; cache
        &amp;&amp; ((int) cache[0]) &gt; a_index
        &amp;&amp; cache[ a_index]
        &amp;&amp; ((int) cache[ a_index][0]) &gt; b_index
        &amp;&amp; cache[ a_index][ b_index])
    {
        // fast dispatch:
        return cache[ a_index][ b_index]( a, b);
    }
    else
    {
        /* Slow dispatch.
        Ensure _vtable-&gt;small_integer's are assigned.
        Ensure cache points to array of at least
            a_index+1 items, (storing array size in
            cache[0]).
        Ensure cache[ a_index] points to array of at
            least b_index+1 items, (storing array
            size in cache[ a_index][0]).
        Fill the cache using the slow algorithm
        Make function call. */
        ...
    }
}</code></pre>

<p>The code leading to the `fast dispatch' could be implemented in very few
assembler instructions (e.g. around 10 instructions on an ARM), resulting in
multimethod dispatching that is comparable in speed to existing virtual
function dispatch.</p>
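<p>The scheme can be emulated in Standard C++ to show the lazy assignment of
indices. In this sketch the vtable slot is imitated by a virtual function
returning a reference to a per-class static integer, and the arrays are
<code>std::vector</code>s rather than raw arrays with a stored size; the
names <code>index_of</code>, <code>slow_lookup</code> and
<code>overlap_any</code> are invented for the example.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Imitates the per-class `small_integer' vtable slot with a virtual
// function that returns a reference to a per-class static int.
struct shape
{
    virtual ~shape() {}
    virtual int& small_integer() { static int id = 0; return id; }
};
struct square : shape
{
    virtual int& small_integer() { static int id = 0; return id; }
};
struct triangle : shape
{
    virtual int& small_integer() { static int id = 0; return id; }
};

// Returns the class's index, assigning the next free non-zero value the
// first time the class takes part in a multimethod call.
int index_of( shape& s)
{
    static int next = 1;
    int& id = s.small_integer();
    if ( id == 0) id = next++;
    return id;
}

typedef bool (*overlap_fn)( shape&, shape&);

// Placeholder for the slow search; a real one would pick the best match.
bool overlap_any( shape&, shape&) { return true; }
overlap_fn slow_lookup( shape&, shape&) { return overlap_any; }

bool overlap( shape& a, shape& b)
{
    static std::vector< std::vector< overlap_fn> > cache;
    const std::size_t ai = index_of( a);
    const std::size_t bi = index_of( b);
    if ( cache.size() <= ai) cache.resize( ai + 1);
    if ( cache[ai].size() <= bi) cache[ai].resize( bi + 1);   // null-filled
    if ( !cache[ai][bi]) cache[ai][bi] = slow_lookup( a, b);  // slow path
    return cache[ai][bi]( a, b);                              // fast path
}
```

<p>Classes that never take part in a multimethod call keep the index zero
and so take up no room in the dispatch arrays.</p>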

<h3>2.4 <a name="1.2.4">Miscellaneous issues</a></h3>

<h4>2.4.1 <a name="1.2.4.1">Polymorphic classes</a></h4>

<p>Classes that are used in multimethods shall be polymorphic - i.e. have at
least one virtual function. (This is necessary to allow the generated
dispatch code to use RTTI on the classes.)</p>

<h4>2.4.2 <a name="1.2.4.2">Allowed virtual parameter types</a></h4>

<p>Each virtual parameter must be one of:</p>
<ul>
  <li>A reference to an optionally const/volatile-qualified class.</li>
  <li>A pointer to an optionally const/volatile-qualified class.</li>
</ul>

<p>If a virtual parameter is a pointer, passing a NULL pointer gives
undefined behaviour.</p>

<p>Multimethod implementations must match the use of reference/pointer
parameters in the original multimethod:</p>
<pre><code>bool overlap( virtual shape&amp;, virtual shape*);
...
bool overlap( static square&amp; a, static triangle* b){...}
  // Ok.
bool overlap( static circle&amp; a, static ellipse&amp; b) {...}
  // Error - not an implementation of the multimethod,
  // because second parameter should be a pointer.</code></pre>

<p>One can use non-virtual parameters in a multimethod in addition to the
virtual parameters. So a multimethod could look like:</p>
<pre><code>int foo(
    std::string&amp;,
    virtual Base&amp;,
    int x,
    virtual const Base&amp;);</code></pre>

<h4>2.4.3 <a name="1.2.4.3">Multimethod exceptions</a></h4>

<p>There is a new abstract <code>std::dispatch_error</code> class derived
from <code>std::exception</code>. Derived from
<code>std::dispatch_error</code> are two new classes,
<code>std::dispatch_ambiguous</code> and
<code>std::dispatch_unmatched</code>. How much information these classes
convey is implementation-defined.</p>
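<p>The hierarchy might look something like the following sketch. The classes
are placed in a hypothetical namespace <code>mm_sketch</code> rather than
<code>std</code>, the message strings are invented, and C++11's
<code>noexcept</code> is assumed; <code>is_dispatch_failure</code> is only a
usage illustration.</p>

```cpp
#include <cassert>
#include <exception>

namespace mm_sketch   // hypothetical; the proposal puts these in std
{
    // Abstract base: the pure virtual what() prevents instantiation.
    class dispatch_error : public std::exception
    {
    public:
        virtual const char* what() const noexcept = 0;
    };

    class dispatch_ambiguous : public dispatch_error
    {
    public:
        const char* what() const noexcept
        { return "ambiguous multimethod call"; }
    };

    class dispatch_unmatched : public dispatch_error
    {
    public:
        const char* what() const noexcept
        { return "no matching multimethod implementation"; }
    };
}

// Usage: a caller can catch either specific error via the common base.
bool is_dispatch_failure( void (*call)())
{
    try { call(); }
    catch ( const mm_sketch::dispatch_error&) { return true; }
    return false;
}
```

<p>Code that does not care which kind of dispatch failure occurred can catch
the base class; code that wants to distinguish the two cases can catch the
derived classes separately.</p>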

<h4>2.4.4 <a name="1.2.4.4">Exception specifications</a></h4>

<p>If a multimethod has an exception specification, then implementations of
the multimethod must have exception specifications that are at least as
restrictive. However, multimethod implementations can omit any
<code>std::dispatch_error</code> from their multimethod's exception
specification if they do not call any multimethods themselves - dispatch
exceptions are generated before a multimethod's implementation is called.</p>

<p>Interaction of multimethod dispatch with these exception specifications is
much the same as for any other function. For example, if a multimethod has an
exception specification that does not include
<code>std::dispatch_error</code>, and a multimethod call is ambiguous, then
<code>std::unexpected()</code> will be called.</p>

<h4>2.4.5 <a name="1.2.4.5">Multimethod overloads</a></h4>

<p>It is an error for a multimethod to overload a different multimethod with
compatible virtual types.</p>

<p>So:</p>
<pre><code>bool overlap( virtual shape&amp;,  virtual shape&amp;, int);

bool overlap( virtual square&amp;, virtual shape&amp;, int);
// Error - square is derived from shape, and all other
// parameters are identical, so this overloads the
// previous multimethod.

bool overlap( virtual square&amp;, virtual shape&amp;, double);
// Ok - the third parameter makes things unambiguous.</code></pre>

<h4>2.4.6 <a name="1.2.4.6">Default parameters</a></h4>

<p>Multimethod declarations can have default parameters, but multimethod
implementations cannot have default parameters.</p>

<h4>2.4.7 <a name="1.2.4.7">Use of multimethods before <code>main()</code> is
entered</a></h4>

<p>Multimethods shall not be called before <code>main()</code> - i.e. static
initialisation code shall not call multimethods.</p>

<p>This restriction is purely to allow implementations that initialise
globals before <code>main()</code> is entered, to use their own global
initialisation support to register multimethods at runtime, avoiding the need
to do whole-programme analysis at build time.</p>
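<p>One way such registration could work is sketched below, using a
namespace-scope object whose constructor adds an implementation to a
registry. All names here (<code>overlap_registry</code>,
<code>overlap_registrar</code>, <code>overlap_square_triangle</code>) are
invented for illustration, and the implementation body is a placeholder.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct shape { virtual ~shape() {} };
typedef bool (*overlap_fn)( shape&, shape&);

// A function-local static avoids depending on the order in which globals
// in different translation units are initialised.
std::vector< overlap_fn>& overlap_registry()
{
    static std::vector< overlap_fn> registry;
    return registry;
}

// Constructing one of these at namespace scope registers an
// implementation during static initialisation.
struct overlap_registrar
{
    explicit overlap_registrar( overlap_fn fn)
    {
        overlap_registry().push_back( fn);
    }
};

// What a translator might emit next to each implementation.
bool overlap_square_triangle( shape&, shape&) { return true; }
static overlap_registrar register_square_triangle( overlap_square_triangle);
```

<p>Because registrars in other translation units may not have run yet, a
multimethod call made during static initialisation could see an incomplete
registry, which is the reason for the restriction above.</p>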

<h2>3 <a name="1.3">Impact on the standard</a></h2>

<p>This proposal's syntax doesn't conflict with any existing C++ syntax.</p>

<p>Implementations that do not perform link-time analysis to collect all the
multimethod implementations in a programme, and do not initialise globals
before <code>main()</code> is entered, will have to use a custom technique to
ensure that multimethod implementations are registered correctly before
<code>main()</code> is entered.</p>

<h2>4 <a name="1.4">Possible extensions</a></h2>

<h3>4.1 <a name="1.4.1">Friendship</a></h3>

<p>Particular classes could nominate a multimethod as a friend; this would
make all implementations of the multimethod friends of the class too.</p>

<p>Friendship could also be specified using a mix of static/virtual
qualifiers to grant friendship to subsets of the available multimethod
implementations:</p>
<pre><code>
void foo( virtual base1&amp;, virtual base2&amp;);

struct derived1 : base1
{
    friend void foo( static derived1&amp;, static base2&amp;);
    // Simple case: only this exact implementation is
    // a friend.

    friend void foo( static derived1&amp;, virtual base2&amp;);
    // All multimethod implementations that take a
    // `derived1&amp;' as the first parameter are friends
    // of derived1, regardless of the second parameter
    // type.
    // E.g. the following would be friends of derived1:
    //     void foo( static derived1&amp; x, static base2&amp;);
    //     void foo( static derived1&amp; x, static derived2&amp;);
    
    friend void foo( virtual derived1&amp;, virtual base2&amp;);
    // all implementations are friends of derived1.
};</code></pre>

<h3>4.2 <a name="1.4.2">Use of static types in the dispatch algorithm</a></h3>

<p>We could allow the caller of a multimethod to mark some virtual parameters
with a <code>static</code> prefix, telling the multimethod dispatcher to use
static types for these parameters, rather than their dynamic types.</p>

<p>If all virtual parameters are marked in this way, then the dispatch can in
theory be resolved at build time. Note that this is not the same as
conventional overloading, because the dispatch will still see matching
implementations in all compilation units, not just the current compilation
unit:</p>
<pre><code>bool overlaps = overlap( a, b);
// Conventional dynamic dispatch using the dynamic types
// of both `a' and `b'.

bool overlaps = overlap( static a, b);
// Do multimethod dispatch but use the static type
// of `a' rather than its dynamic type, and the dynamic
// type of `b' as before.

bool overlaps = overlap( static a, static b);
// Do multimethod dispatch but use the static types
// of both `a' and `b' rather than their dynamic types.
// This call can be fully resolved at link-time.

bool overlaps = overlap(
    static static_cast&lt; triangle&amp;&gt;( a),
    static static_cast&lt; square&amp;&gt;( b));
// Example of using casts in conjunction with a request to
// use the static type. This avoids any runtime dispatch
// when we know something about the dynamic types at
// compile time.</code></pre>

<p>This is analogous to qualifying virtual function calls, where it is
guaranteed that all matching virtual functions have been seen (because all
base classes must be visible when a derived class is visible). It is
different from conventional compile-time overloading, which only considers
what is visible in the current compilation unit.</p>

<h3>4.3 <a name="1.4.3">Use with templated virtual parameters</a></h3>

<p>Sometimes it is useful to pass a smart pointer to a function rather than a
plain pointer or reference. For example the called function might want to
copy the smart pointer into a persistent data structure before returning.</p>

<p>The implicit <code>this</code> parameter passed to C++ member functions
doesn't allow this sort of usage without the user passing the smart pointer
explicitly:</p>
<pre><code>struct foo
{
    virtual void fn( shared_ptr&lt; foo&gt; ptr);
};
...
shared_ptr&lt; foo&gt; x = ...;
x-&gt;fn( x); // have to pass the smart pointer explicitly</code></pre>

<p>Multimethods could be extended to dispatch on the type of what a smart
pointer parameter points to, rather than the type of the parameter itself:</p>
<pre><code>void foo( shared_ptr&lt; virtual base&gt; x);
...
void foo( shared_ptr&lt; static derived&gt;      x) {...}
void foo( shared_ptr&lt; static otherderived&gt; x) {...}</code></pre>

<p>This requires three function templates to be provided that behave
similarly to <code>dynamic_cast</code>, <code>static_cast</code> and
<code>typeid</code>, except that they know about particular class templates
and operate on what the class templates are handles for, rather than the
class templates themselves:</p>
<ol>
  <li>A function template <code>template_dynamic_test</code> that is similar
    to <code>dynamic_cast</code>, except that it returns
    <code>true</code>/<code>false</code> depending on whether the dynamic
    type of the pointee means that one could cast from one instantiation of
    the class template to a different instantiation of the same class
    template.</li>
  <li>A function template <code>template_static_cast</code> that is similar
    to <code>static_cast</code>, except that it casts from one instantiation
    of the class template to a different instantiation of the same class
    template. It assumes that a similar call to
    <code>template_dynamic_test</code> would return <code>true</code>.</li>
  <li>A function <code>template_typeid</code> that is similar to
    <code>typeid</code> except that it takes an instance of an instantiation
    of a class template and returns the <code>std::type_info</code> of the
    object that it refers to.</li>
</ol>

<p>When <code>static</code> is used inside the declaration of a function
parameter type template, it identifies the class that that parameter should
be treated as when performing dispatch. This means that one can write a whole
set of implementations that take a smart pointer to various derived types,
and the dispatch behaviour will be the same as if plain references had been
used instead of smart pointers.</p>

<p>For example, to make this scheme work with Boost's
<code>boost::shared_ptr</code>, you would use the following definitions of
the three function templates:</p>
<pre><code>#include "boost/smart_ptr.hpp"

template&lt; class T&gt;
  const std::type_info&amp;
    template_typeid(
      boost::shared_ptr&lt; T&gt; x)
{
  return typeid( *x);
}
template&lt; class derived, class base&gt;
  bool
    template_dynamic_test(
      boost::shared_ptr&lt; base&gt; x)
{
  if ( boost::shared_dynamic_cast&lt; derived&gt;( x).get())
    return true;
  return false;
}
template&lt; class derived, class base&gt;
  boost::shared_ptr&lt; derived&gt;
    template_static_cast(
      boost::shared_ptr&lt; base&gt; x)
{
  return boost::shared_static_cast&lt; derived&gt;( x);
}</code></pre>
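<p>For comparison, the same three function templates could be written for
C++11's <code>std::shared_ptr</code> using the standard pointer casts. This
is a sketch only; the classes <code>base_t</code>, <code>derived_t</code> and
<code>other_t</code> are invented to exercise the templates.</p>

```cpp
#include <cassert>
#include <memory>
#include <typeinfo>

// Dynamic type of the pointee, not of the smart pointer itself.
template< class T>
const std::type_info& template_typeid( std::shared_ptr< T> x)
{
    return typeid( *x);
}

// True if the pointee's dynamic type allows a cast to `derived'.
template< class derived, class base>
bool template_dynamic_test( std::shared_ptr< base> x)
{
    return std::dynamic_pointer_cast< derived>( x) != nullptr;
}

// Unchecked cast; assumes template_dynamic_test would return true.
template< class derived, class base>
std::shared_ptr< derived> template_static_cast( std::shared_ptr< base> x)
{
    return std::static_pointer_cast< derived>( x);
}

// Small hierarchy used only to try the templates out.
struct base_t    { virtual ~base_t() {} };
struct derived_t : base_t {};
struct other_t   : base_t {};
```

<p>Given a <code>std::shared_ptr&lt; base_t&gt;</code> that actually points
to a <code>derived_t</code>, <code>template_dynamic_test&lt;
derived_t&gt;</code> reports a match and <code>template_static_cast&lt;
derived_t&gt;</code> yields a pointer that shares ownership with the
original.</p>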

<p>These function template definitions have to be seen before each
multimethod implementation.</p>

<p>Cmm 0.25 and later support templated virtual parameters (with slightly
different names for the function templates).</p>

<h2>5 <a name="1.5">Omitted features</a></h2>

<p>Here are some features that have been suggested, with reasons why they are
not included in the proposal.</p>

<h3>5.1 <a name="1.5.1">Covariant returns</a></h3>

<p>By analogy with virtual functions, implementations of a multimethod that
returns a reference or pointer to a type <code>T</code>, could be allowed to
return a reference or pointer to anything derived from <code>T</code>.
However, this would only be useful if the multimethod implementations could
be called directly, which is not the case. The suggested extension to support
qualifying parameters at the point of call with the <code>static</code>
keyword does not change this - the function call is still in principle a call
to the dispatch function, not a specific multimethod implementation.</p>

<h3>5.2 <a name="1.5.2">Multimethod member functions</a></h3>

<p>It has been suggested that it should be possible to make member functions
into multimethods. I don't like this idea because it adds complications for
no real gain in functionality. In particular, mixing virtual member functions
and multimethods would be plain confusing and not do anything that couldn't
be done by a straightforward multimethod.</p>

<h3>5.3 <a name="1.5.3">Multimethod templates</a></h3>

<p>In principle, it may be possible to have multimethod templates, such
as:</p>
<pre><code>template&lt; class shapebase&gt;
bool overlap( virtual shapebase&amp;, virtual shapebase&amp;);
// Dispatches to an overlap&lt; shapebase&gt;() implementation.

template&lt; class shapebase&gt;
bool overlap( static triangle&lt; shapebase&gt;&amp; t) {...}</code></pre>

<p>This would require extending export to instantiate any matching
multimethod implementations in all source files whenever a particular
multimethod template was instantiated.</p>

<p>In contrast, conventional class templates can have virtual functions
without requiring separate compilation of templates, because one can put the
virtual function definitions inside the class template itself, ensuring that
it is instantiated whenever the class template is instantiated.</p>

<p>However, if export was extended to support these multimethod templates,
then it would probably be straightforward to also have the equivalent of
templated virtual member functions, of both ordinary classes and class
templates, which are not supported in C++. This is because the restrictions
imposed by vtable implementations of virtual functions don't really exist
with multimethods. For example, multimethod dispatch tables are `owned' by
the multimethod, not by a class.</p>

<h3>5.4 <a name="1.5.4">Multimethod overloads</a></h3>

<p>In theory it would be possible to provide compile-time overloads of
multimethods:</p>
<pre><code>bool overlap( virtual shape&amp;,  virtual shape&amp;);
bool overlap( virtual square&amp;, virtual shape&amp;);</code></pre>

<p>This could be used to restrict the number of possible multimethod
implementations that are available for consideration at runtime, so possibly
improving dispatch speed where some type information is available at compile
time. It would require that multimethod implementations are registered with
more than one multimethod.</p>

<p>I suspect that this would be a little-used feature, which would add
unnecessary complexity to the proposal, so it has been specifically
excluded.</p>

<h2>6 <a name="1.6">Other issues</a></h2>

<h3>6.1 <a name="1.6.1">Multimethod syntax, and getting direct access to
multimethod implementations</a></h3>

<p>Some may question the use of the <code>static</code> keyword in
multimethod implementation function parameters. These keywords aren't
strictly required - the compiler knows whether a function definition is a
multimethod implementation by seeing whether it matches a previously
occurring multimethod declaration.</p>

<p>However, it is important to distinguish multimethod implementation
functions from conventional functions, because they are not visible directly.
Furthermore, the use of a special syntax enables the compiler to give a
warning when the user mis-types the name of a multimethod implementation so
that it doesn't match a multimethod declaration.</p>

<p>While <code>static</code> is already used for many different concepts in
C++, it has a useful similarity with the phrase <em>static type</em>, and it
is not currently used inside function parameter declarations and so its
usage here doesn't cause confusion.</p>

<p>Some alternatives are:</p>

<h4>6.1.1 <a name="1.6.1.1">Using a naming convention for multimethod
implementations</a></h4>

<p>Multimethod implementations could use a specific naming convention, such
as having a trailing underscore:</p>
<pre><code>bool overlap( virtual shape&amp;, virtual shape&amp;);
// A multimethod
bool overlap_( square&amp; a, triangle&amp; b) {...}
// An implementation of the above multimethod.</code></pre>

<p>This is simple and easy to understand, but seems too arbitrary to be
mandated in the C++ standard.</p>

<h4>6.1.2 <a name="1.6.1.2">User specification of multimethod implementation
names</a></h4>

<p>Instead of imposing a particular naming convention, we could let the user
specify it:</p>
<pre><code>bool overlap( virtual shape&amp;, virtual shape&amp;) = overlap_impl;
// A multimethod

bool overlap_impl( square&amp; a, triangle&amp; b) {...}
// An implementation of the above multimethod.</code></pre>

<h4>6.1.3 <a name="1.6.1.3">Special namespace for multimethod
implementations</a></h4>

<p>Another possibility would be to put multimethod implementations inside a
special namespace:</p>
<pre><code>bool overlap( virtual shape&amp;, virtual shape&amp;);
// Dispatches at runtime to one of the best-matching
// std_mm::overlap() functions.

namespace std_mm
{
  bool overlap( square&amp; a, triangle&amp; b) {...}
  ...
}

bool overlaps = overlap( a, b); // calls multimethod

bool overlaps = std_mm::overlap( a, b);
// calls one of the visible implementations of overlap(),
// using conventional compile-time overloading.</code></pre>

<h4>6.1.4 <a name="1.6.1.4">Use <code>static_cast</code></a></h4>

<p>A different possibility is to use the existing function casting mechanism
to select a particular multimethod implementation:</p>
<pre><code>
bool overlaps =
    static_cast&lt; bool (&amp;)( square&amp;, triangle&amp;)&gt;( overlap)
        ( shape1, shape2);</code></pre>

<p>There are two problems with this. First, casting to the base-most
parameters will give a base-most implementation (if it exists), which has the
curious result that casting the multimethod to its own type can give a
different function - the implementation that takes all base parameters:</p>
<pre><code>bool overlap( virtual shape&amp;, virtual shape&amp;);
bool overlap( square&amp;, square&amp;) {...} // implementation 1.
bool overlap( shape&amp;,  shape&amp;)  {...} // implementation 2.

static_cast&lt; bool (&amp;)( square&amp;, square&amp;)&gt;( overlap);
// Ok, gives implementation 1.

static_cast&lt; bool (&amp;)( shape&amp;, shape&amp;)&gt;( overlap);
// Gives implementation 2, or could give the multimethod
// itself, because they both have the same signature.

static_cast&lt; bool (&amp;)( shape&amp;, square&amp;)&gt;( overlap);
// Compile-time error - no exact match.
// But we might like it to give either implementation
// 2 or the multimethod, because they both match.</code></pre>

<p>The second problem is that the specified types in the
<code>static_cast</code> must match an implementation exactly. This is
contrary to the way qualified virtual functions work - calling
<code>base::some_fn()</code> will look for <code>some_fn</code> in class
<code>base</code> or any of <code>base</code>'s base classes.</p>

<h4>6.1.5 <a name="1.6.1.5">Using pure-virtual-like syntax</a></h4>

<p>One way of avoiding the use of <code>static</code> is to use
<code>virtual</code> in both multimethod declarations and multimethod
implementations, and to add the pure-virtual syntax <code>=0</code> to the
multimethod declaration:</p>
<pre><code>bool overlap( virtual base&amp;, virtual base&amp;) = 0;
// multimethod declaration.

bool overlap( virtual square&amp; a, virtual triangle&amp; b)
{...}
// multimethod implementation.</code></pre>

<h3>6.2 <a name="1.6.2">Raw pointer to multimethod implementation
function</a></h3>

<p>For speed critical loops, it might be useful to provide a way for the user
to get a pointer to the best multimethod implementation function for a
particular set of dynamic types. This enables client code that calls a
multimethod on the same parameters many times in a loop, to cache the
function pointer and so avoid any dispatch overhead.</p>

<p>For example, Cmm implements this as an extra dispatch function with the
same name as the multimethod, but with a suffix <code>cmm_getimpl</code>.</p>
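<p>The intended usage can be sketched in Standard C++ as follows. The lookup
function <code>overlap_getimpl</code> is a hypothetical stand-in written by
hand for this example (it is not Cmm's generated function), and the
<code>lookups</code> counter exists only to show that dispatch happens
once.</p>

```cpp
#include <cassert>

struct shape    { virtual ~shape() {} };
struct square   : shape {};
struct triangle : shape {};

typedef bool (*overlap_fn)( shape&, shape&);

int lookups = 0;   // counts dispatch lookups, for illustration only

bool overlap_square_triangle( shape&, shape&) { return true; }

// Hypothetical lookup function: performs dispatch once and returns the
// chosen implementation instead of calling it. Returns 0 if none matches.
overlap_fn overlap_getimpl( shape& a, shape& b)
{
    ++lookups;
    if ( dynamic_cast< square*>( &a) && dynamic_cast< triangle*>( &b))
        return overlap_square_triangle;
    return 0;
}

// Calls the implementation n times but pays for dispatch only once.
int count_overlaps( shape& a, shape& b, int n)
{
    overlap_fn fn = overlap_getimpl( a, b);
    int hits = 0;
    for ( int i = 0; fn && i < n; ++i)
        if ( fn( a, b)) ++hits;
    return hits;
}
```

<p>The cached pointer is only valid while the dynamic types of the arguments
stay the same, so it should not outlive the loop in which those objects are
fixed.</p>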

<h3>6.3 <a name="1.6.3">Use with dynamic loading of code</a></h3>

<p>If support for dynamic loading of code is added to the C++ standard in the
future, then multimethod implementations in shared libraries/DLLs should be
automatically registered/deregistered when shared libraries/DLLs are
loaded/unloaded. The Cmm implementation already does this for Unix and Cygwin
platforms.</p>

<h3>6.4 <a name="1.6.4">Diagnostics</a></h3>

<p>If a multimethod implementation takes a virtual parameter that derives
multiply from the base class used in the multimethod's corresponding virtual
parameter, then it will never be possible for that multimethod implementation
to be called, because down-casting from the base class to the derived class is
ambiguous. Maybe this should thus cause a diagnostic at compile time?</p>

<h2>7 <a name="1.7">Multimethods resources</a></h2>

<p>Bjarne Stroustrup writes about multimethods in <cite>The Design and
Evolution of C++</cite> (Section 13.8, pages 297-301).</p>

<p>The Dylan language (see <a
href="http://www.functionalobjects.com/resources/index.phtml">http://www.functionalobjects.com/resources/index.phtml</a>)
supports multimethods natively, as does CLOS (Common Lisp).</p>

<p>If one restricts oneself to Standard C++, it is possible to approximate
multimethods at the expense of verbosity in source code. See Item 31 in Scott
Meyers' <cite>More Effective C++</cite>, and chapter 11 of Andrei
Alexandrescu's <cite>Modern C++ Design</cite>, where template techniques are
used extensively. Bill Weston has a slightly different template technique
that supports non-exact matches, see <a
href="http://homepage.ntlworld.com/w.weston/">http://homepage.ntlworld.com/w.weston/</a>.</p>

<p>Cmm, a C++ multimethods implementation in the form of a source-code
translator, is available from <a
href="http://www.op59.net/cmm/readme.html">http://www.op59.net/cmm/readme.html</a>.</p>

<p>A paper about multimethods, C++ and Cmm is on the ACCU 2003 conference CD,
also available online at <a
href="http://www.op59.net/accu-2003-multimethods.html">http://www.op59.net/accu-2003-multimethods.html</a>.</p>

<p>The Frost project (<a
href="http://frost.flewid.de/">http://frost.flewid.de</a>) adds support for
multimethods to the i386-version of the gcc compiler.</p>

<h2>8 <a name="1.8">Acknowledgements</a></h2>

<p>Thanks to Bill Weston for many interesting and useful discussions about
multimethods.</p>

<p>Thanks to Lois Goldthwaite, Kevlin Henney, Anthony Williams and other
members of the UK C++ committee for many comments about this proposal.</p>

<h2>9 <a name="1.9">Cmm implementation issues</a></h2>

<p>Cmm (See <a
href="http://www.op59.net/cmm/readme.html">http://www.op59.net/cmm/readme.html</a>)
is a source-code processor which implements a multimethods language extension
for C++. The description of Cmm's implementation here corresponds to cmm-0.26
(8 September 2003).</p>

<p>Cmm takes individual compilation units containing multimethod code that
have already been run through the C++ preprocessor (e.g.
<code>#include</code>'s have been expanded), and generates legal C++
compilation units, which can then be compiled and linked together
conventionally.</p>

<p>The generated C++ code calls some support functions that are provided as a
single source file called <code>dispatch.cpp</code>. This contains functions
that manage data structures that store all known multimethod functions and
their implementations, the actual runtime dispatch functions, functions to
support dispatch caching and also support for the exception types thrown for
ambiguous/unmatched dispatches.</p>

<p>Cmm has been designed to generate multimethod code that supports dynamic
loading and unloading of code, which means that all information about
multimethods and their implementations is stored in dynamic data structures.
This is probably not directly relevant to this proposal, because the C++
standard doesn't concern itself with dynamic loading of code. However, it
gives an example of the flexibility that the multimethods model can give.</p>

<p>Such implementation issues are not directly part of this proposal, but Cmm
demonstrates that multimethods need not break the simple C compiler/linker
model.</p>

<h3>9.1 <a name="1.9.1">Generated code for multimethod dispatch</a></h3>

<p>In the simple <em>overlap</em> multimethod example described earlier, the
virtual function was:</p>
<pre><code>bool overlap( virtual shape&amp;, virtual shape&amp;);</code></pre>

<p>Consider an implementation of <code>overlap</code> that is specialised for
a first parameter <code>square</code> and second parameter
<code>triangle</code>. This will look like:</p>
<pre><code>// user-written multimethod implementation function
bool overlap_( square&amp; a, triangle&amp; b) {...}</code></pre>

<p>In order to perform multimethod dispatch, one has to first decide which
multimethod implementations match the dynamic types, and then try to find one
of the multimethod implementations which can be considered a better match
than all of the others.</p>

<p>The first step is done by calling auxiliary functions that Cmm creates for
each multimethod implementation function, which take the base parameters and
return <code>true</code> only if each of the parameters can be
<code>dynamic_cast</code>-ed to the multimethod implementation's parameters.
Because these functions take the virtual function's base parameters, we
cannot use conventional overloading to distinguish them, and so Cmm makes the
function names unique using a mangling scheme which, for simplicity, will be
denoted by <code>_XYZ</code> in the following:</p>
<pre><code>// Cmm-generated match function for the function
// bool overlap_( square&amp; a, triangle&amp; b);
bool overlap_cmm_match_XYZ( shape&amp; a, shape&amp; b)
{
    if ( !dynamic_cast&lt; square*  &gt;( &amp;a)) return false;
    if ( !dynamic_cast&lt; triangle*&gt;( &amp;b)) return false;
    return true;
} </code></pre>

<p>This separate function is generated in the same compilation unit as the
multimethod implementation, which enables the <code>dynamic_cast</code> to
work with derived types defined in anonymous namespaces. [Actually, the
<code>overlap_cmm_match_XYZ</code> function takes an array of two
<code>void*</code>'s rather than a separate parameter for each virtual type,
each of which is <code>static_cast</code>-ed to <code>shape*</code> before
the <code>dynamic_cast</code> is attempted. This is to enable generic
dispatch code to be used for different virtual functions.]</p>

<p>The second step requires that the inheritance tree for each dynamic type
is known. The dispatch code can then compare the types taken by each matching
multimethod implementation, and select the multimethod implementation for
which each virtual parameter is no less derived than any other matching
multimethod implementation's virtual parameter. As discussed earlier, this
corresponds to the way conventional overloading works at compile time.</p>
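<p>The selection rule above can be sketched in ordinary C++. This is only an
illustration, not Cmm's dispatch code: the names <code>chain</code>,
<code>signature</code> and <code>select_best</code> are hypothetical, and each
virtual parameter type is assumed to have a single, non-empty inheritance
chain listed base first.</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical representation (not Cmm's): each virtual parameter type is
// described by its inheritance chain, base first, e.g. {"shape", "square"}.
typedef std::vector<std::string> chain;
typedef std::vector<chain>       signature;  // one chain per virtual parameter

// True if the type described by `a` is, or derives from, the type of `b`.
bool derives_from(const chain& a, const chain& b)
{
    return std::find(a.begin(), a.end(), b.back()) != a.end();
}

// True if every parameter of `x` is at least as derived as that of `y`.
// Both signatures are assumed to have the same number of parameters.
bool at_least_as_derived(const signature& x, const signature& y)
{
    for (std::size_t i = 0; i < x.size(); ++i)
        if (!derives_from(x[i], y[i])) return false;
    return true;
}

// Returns the index of the matching implementation that is at least as
// derived as every other one, or -1 if there is no such implementation
// (an ambiguous or unmatched call).
int select_best(const std::vector<signature>& matches)
{
    for (std::size_t i = 0; i < matches.size(); ++i)
    {
        bool best = true;
        for (std::size_t j = 0; j < matches.size() && best; ++j)
            best = at_least_as_derived(matches[i], matches[j]);
        if (best) return static_cast<int>(i);
    }
    return -1;
}
```

<p>Given matching implementations taking (shape, shape), (square, shape) and
(square, triangle), this sketch selects the last one. Given only (square,
shape) and (shape, triangle), neither is at least as derived as the other, so
it reports the ambiguity by returning -1.</p>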

<p>The information about the inheritance trees is encoded in C-style strings
using a mangling scheme similar to that used by C++ compilers when generating
symbol names. This allows static initialisation to be used to register
multimethod implementations at runtime.</p>

<p>[Cmm can also register multimethod implementations at build time by
requiring a separate link-style invocation, but this made builds very
complicated and slow, and precludes use with dynamic loading of code. The
only advantages of this scheme are that dispatch time may be slightly faster,
and all multimethod implementations are usable by static initialisation
code.]</p>

<p>Finally, the generic dispatch code calls the actual multimethod
implementation via a wrapper function that takes the base parameters, casts
them directly to the derived types, and calls the multimethod implementation.
Again, this function name is mangled:</p>
<pre><code>// Cmm-generated call function for the function
// bool overlap_( square&amp; a, triangle&amp; b);
bool overlap_cmm_call_XYZ( shape&amp; a, shape&amp; b)
{
    return overlap_(
        *static_cast&lt; square*  &gt;( &amp;a),
        *static_cast&lt; triangle*&gt;( &amp;b));
}</code></pre>

<p>This function's precondition is that the derived types are correct and so
the <code>static_cast</code>'s are legal. Using this wrapper function enables
the dispatch code to work in terms of generic function pointers even if
multimethod implementations use derived classes in anonymous namespaces.</p>

<p>[The function should use <code>dynamic_cast</code> rather than
<code>static_cast</code> when <code>derived</code> inherits virtually from
<code>base</code>, but this hasn't been implemented yet.]</p>

<h3>9.2 <a name="1.9.2">Registering multimethods using static
initialisation</a></h3>

<p>For each implementation, Cmm generates a global variable whose
initialisation registers the implementation with the dispatch function:</p>
<pre><code>static cmm_implementation_holder
    overlap_XYZ(
        "7overlap2_1_5shape1_5shape",
           // the multimethod
        "8overlap_2_2_5shape_6square2_5shape_8triangle",
           // the multimethod implementation
        overlap_cmm_match_XYZ,
        overlap_cmm_call_XYZ);</code></pre>

<p><code>cmm_implementation_holder</code> is a class defined in
<code>dispatch.h</code>/<code>dispatch.cpp</code>, whose constructor
de-mangles the first two textual parameters to extract complete information
about the inheritance tree for each virtual parameter taken by the virtual
function and the implementation function. Together with the
<code>overlap_cmm_match</code> functions, this is sufficient to enable
multimethod dispatch to be performed.</p>

<p>In this example, the first mangled string means: "A function called
<em>overlap</em> with 2 virtual parameters, the first a class containing one
item in its inheritance tree, <em>shape</em>, and the second also containing
the same single class in its inheritance tree, <em>shape</em>". The second
mangled string means: "A function called <em>overlap_</em> with 2 virtual
parameters, the first one being a class with 2 items in its inheritance tree,
<em>shape</em> followed by <em>square</em>, while the second parameter's type
also contains 2 items in its inheritance tree, the first one being
<em>shape</em>, and the second <em>triangle</em>".</p>
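<p>The two strings shown above suggest a simple length-prefixed layout. The
following sketch decodes that layout. The grammar is inferred only from these
two examples and is therefore an assumption, and <code>decoded_fn</code> and
<code>decode</code> are hypothetical names rather than part of Cmm. The
sketch assumes well-formed input and does no error checking.</p>

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical result type: the function name plus, for each virtual
// parameter, its inheritance chain listed base first.
struct decoded_fn
{
    std::string name;
    std::vector<std::vector<std::string> > params;
};

// Reads a decimal number starting at s[pos] and advances pos past it.
static std::size_t read_number(const std::string& s, std::size_t& pos)
{
    std::size_t n = 0;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
        n = n * 10 + static_cast<std::size_t>(s[pos++] - '0');
    return n;
}

// Reads a length-prefixed identifier such as "5shape".
static std::string read_name(const std::string& s, std::size_t& pos)
{
    std::size_t len = read_number(s, pos);
    std::string name = s.substr(pos, len);
    pos += len;
    return name;
}

// Assumed layout: <len><name><nparams>'_' then, for each parameter,
// <nclasses> followed by nclasses entries of the form '_'<len><class>.
decoded_fn decode(const std::string& s)
{
    decoded_fn out;
    std::size_t pos = 0;
    out.name = read_name(s, pos);
    std::size_t nparams = read_number(s, pos);
    ++pos;  // skip the '_' that follows the parameter count
    for (std::size_t p = 0; p < nparams; ++p)
    {
        std::size_t nclasses = read_number(s, pos);
        std::vector<std::string> classes;
        for (std::size_t k = 0; k < nclasses; ++k)
        {
            ++pos;  // skip the '_' that precedes each class name
            classes.push_back(read_name(s, pos));
        }
        out.params.push_back(classes);
    }
    return out;
}
```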

<p>This use of static initialisers to register implementations allows
dynamically loaded code to automatically register new implementations with
the dispatch functions. Furthermore, the destructor of the
<code>cmm_implementation_holder</code> class unregisters the implementations,
so one can load/unload code at will.</p>
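<p>The register-on-construction, unregister-on-destruction idea can be
sketched as follows. The names <code>registry</code> and
<code>implementation_holder</code> are hypothetical stand-ins, not Cmm's
actual code, and the sketch stores only the mangled string, whereas the real
<code>cmm_implementation_holder</code> also takes the match and call
functions.</p>

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical global registry. A function-local static avoids depending on
// the order in which globals in different compilation units are initialised.
std::vector<std::string>& registry()
{
    static std::vector<std::string> impls;
    return impls;
}

// Hypothetical holder: one static instance per implementation function.
class implementation_holder
{
public:
    explicit implementation_holder(const std::string& mangled)
    : mangled_(mangled)
    {
        registry().push_back(mangled_);  // runs when the code is loaded
    }
    ~implementation_holder()
    {
        std::vector<std::string>& r = registry();
        std::vector<std::string>::iterator it =
            std::find(r.begin(), r.end(), mangled_);
        if (it != r.end()) r.erase(it);  // runs when the code is unloaded
    }
private:
    std::string mangled_;
    // Non-copyable, so that each registration is undone exactly once.
    implementation_holder(const implementation_holder&);
    implementation_holder& operator=(const implementation_holder&);
};
```

<p>A holder declared at namespace scope in a shared library is constructed
when the library is loaded and destroyed when it is unloaded, which is what
keeps the registry in step with the loaded code.</p>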

<p>The handling of implementation functions in dynamically loaded code has
been successfully tested on OpenBSD 3.2, OpenBSD 3.3, Slackware Linux 8.1 and
Cygwin/Windows, using the <code>dlopen()</code> and <code>dlclose()</code>
functions.</p>

<h3>9.3 <a name="1.9.3">Dispatch caches</a></h3>

<p>Figuring out which of a set of implementation functions to call for a
particular set of dynamic types is very slow, so some sort of caching scheme
is required. Caching is performed by the code in the
<code>dispatch.cpp</code> library. Currently this uses a
<code>std::map</code> to map from a <code>std::vector</code> of
<code>std::type_info</code>'s to a function pointer. This gives a dispatch
speed of O(log N), where N is the number of different combinations of dynamic
types that have been encountered (some of which may be mapped to the same
function pointer). On OpenBSD 3.2 with gcc 2.95, the dispatch time for two
virtual parameters is around 10-100 times slower than a conventional virtual
function call.</p>
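<p>A cache of this kind might look like the following sketch. It is not taken
from <code>dispatch.cpp</code>: the names <code>type_key</code>,
<code>dispatch_cache</code> and <code>cache_find</code> are hypothetical, and
the key is assumed to hold pointers to <code>std::type_info</code> objects
(which cannot themselves be copied), ordered with
<code>std::type_info::before()</code>.</p>

```cpp
#include <algorithm>
#include <map>
#include <typeinfo>
#include <vector>

// Hypothetical key: one type_info pointer per virtual parameter.
typedef std::vector<const std::type_info*> type_key;
typedef void (*generic_fn)();

// Orders keys lexicographically using std::type_info::before().
struct type_key_less
{
    static bool before(const std::type_info* a, const std::type_info* b)
    {
        return a->before(*b);
    }
    bool operator()(const type_key& x, const type_key& y) const
    {
        return std::lexicographical_compare(
            x.begin(), x.end(), y.begin(), y.end(), &type_key_less::before);
    }
};

typedef std::map<type_key, generic_fn, type_key_less> dispatch_cache;

// Returns the cached function for `key`, or 0 on a cache miss, in which
// case the caller would fall back to the slow lookup and store the result.
generic_fn cache_find(const dispatch_cache& cache, const type_key& key)
{
    dispatch_cache::const_iterator it = cache.find(key);
    return it == cache.end() ? 0 : it->second;
}
```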

<p>It would probably be better to have special cache support for multimethods
with one or two virtual parameters, using a <code>std::map</code> with key
types <code>std::type_info[1]</code> and <code>std::type_info[2]</code>. No
doubt templates could be used to do this with maximum obfuscation.</p>

<h3>9.4 <a name="1.9.4">Dispatch functions</a></h3>

<p>The actual virtual dispatch function is very simple, because it uses code
in <code>dispatch.cpp</code> to do all the real work. This means that it is
practical to generate a separate copy in all compilation units as an inline
function, looking like:</p>
<pre><code>inline bool overlap( shape&amp; a, shape&amp; b)
{
    static cmm_virtualfn&amp;   virtualfn =
        cmm_get_virtualfn( "7overlap2_1_5shape1_5shape");
    typedef bool(*cmm_fntype)( shape&amp;, shape&amp;);
    
    const void*           params[] = { &amp;a, &amp;b};
    const std::type_info* types[]  =
        { &amp;typeid( a), &amp;typeid( b)};
    
    cmm_fntype cmm_fn = reinterpret_cast&lt; cmm_fntype&gt;(
            cmm_lookup( virtualfn, params, types));
    
    return cmm_fn( a, b);
}</code></pre>

<p>The <code>cmm_lookup</code> function uses <code>types</code> as an index
into the internal <code>std::map</code> dispatch cache. If this fails, the
actual parameters <code>params</code> are used in the slow lookup algorithm
described earlier. It returns a generic function pointer, which has to be
cast into the correct type using <code>reinterpret_cast</code>.</p>

<h3>9.5 <a name="1.9.5">Raw pointer to implementation function</a></h3>

<p>Cmm provides an extra dispatch function that doesn't actually call the
implementation. Instead, it returns a pointer to the best implementation
function. This enables client code that calls a multimethod on parameters of
the same dynamic types many times in a loop to cache the function pointer and
so avoid any dispatch overhead.</p>

<p>This extra dispatch function has the same name as the virtual function,
with a suffix <code>_cmm_getimpl</code>. Using the <code>overlap</code>
example, if you have one collection of shapes that you know are all squares,
and you want to search for an overlap with a particular shape, you would
usually do:</p>
<pre><code>void Fn( std::vector&lt; square*&gt; squares, shape&amp; s)
{
    std::vector&lt; square*&gt;::iterator it;
    for( it=squares.begin(); it!=squares.end(); ++it)
    {
        if ( overlap( **it, s)) {...}
    }
}</code></pre>

<p>With the generated <code>overlap_cmm_getimpl</code> function, you can
avoid the multimethod dispatch overhead in each iteration:</p>
<pre><code>void Fn( std::vector&lt; square*&gt; squares, shape&amp; s)
{
    std::vector&lt; square*&gt;::iterator it;
    if ( squares.empty()) return;
    
    bool (*fn)( shape&amp;, shape&amp;) =
        overlap_cmm_getimpl( *squares[0], s);
    
    for( it=squares.begin(); it!=squares.end(); ++it)
    {
        if ( fn( **it, s)) {...}
    }
}</code></pre>

<h3>9.6 <a name="1.9.6">Constant-time dispatch speed</a></h3>

<p>As discussed earlier, it is possible to get constant-time dispatch speed
if all types are assigned a unique small integer, by looking in a
multi-dimensional array using the small integers as indices. Cmm has a
command-line switch that makes the generated code use this technique.</p>

<p>In pseudo code, the dispatch for a function with two virtual parameters
looks like:</p>
<pre><code>void foo( virtual base&amp; a, virtual base&amp; b)
{
    int a_smallint = a.get_small_integer();
    int b_smallint = b.get_small_integer();
    fn = cache[a_smallint][b_smallint];
    return fn( a, b);
}</code></pre>

<p>In this case, <code>cache</code> is essentially of type
<code>fn_ptr**</code>.</p>

<p>To reduce memory usage, Cmm's <code>dispatch.cpp</code> contains an
implementation of this dispatch method that allocates the array lazily so
that memory is only allocated for those rows that are actually used.</p>
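<p>Lazy row allocation of this kind can be sketched as below. This is an
illustration under assumed names (<code>lazy_cache</code>, <code>get</code>,
<code>set</code>), not the code in <code>dispatch.cpp</code>, and it uses
<code>std::vector</code> rather than raw arrays for brevity.</p>

```cpp
#include <cstddef>
#include <vector>

typedef void (*generic_fn)();

// Hypothetical two-parameter cache whose rows are allocated on first use.
class lazy_cache
{
public:
    // Returns the cached pointer, or 0 if nothing is stored for (a, b).
    generic_fn get(std::size_t a, std::size_t b) const
    {
        if (a >= rows_.size() || b >= rows_[a].size()) return 0;
        return rows_[a][b];
    }

    // Stores fn for (a, b), growing only the row that is actually used.
    void set(std::size_t a, std::size_t b, generic_fn fn)
    {
        if (a >= rows_.size()) rows_.resize(a + 1);  // new rows start empty
        if (b >= rows_[a].size())
            rows_[a].resize(b + 1, static_cast<generic_fn>(0));
        rows_[a][b] = fn;
    }

    // Number of rows that currently own cell storage.
    std::size_t allocated_rows() const
    {
        std::size_t n = 0;
        for (std::size_t i = 0; i < rows_.size(); ++i)
            if (!rows_[i].empty()) ++n;
        return n;
    }

private:
    std::vector<std::vector<generic_fn> > rows_;
};
```

<p>Storing an entry for the small integers (3, 2) allocates cells for row 3
only. Rows 0 to 2 remain empty until a first parameter with one of those
integers is seen.</p>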

<p>Getting a unique small integer for each dynamic type is slightly tricky.
In an ideal world, the compiler and linker would conspire to make space
available in the vtable, which would enable very fast lookup. Cmm can't do
this though, so instead it inserts an inline virtual function body into all
polymorphic classes, containing a static int to enable fast access to the
unique integers:</p>
<pre><code>class base
{
    // Next function inserted by Cmm:
    virtual int cmm_get_small_integer() const
    {
        static int id=0;
        if ( !id) id =
            cmm_create_small_integer( typeid( *this));
        return id;
    }
};</code></pre>

<p>The function <code>cmm_create_small_integer()</code> is in the Cmm library
<code>dispatch.cpp</code> along with all of the other support functions. It
maintains an internal map of <code>std::type_info</code>'s to
<code>int</code>s so that it returns the same integer if called more than
once for the same type. This is required to make things work when the C++
environment doesn't implement the static int <code>id</code> correctly; for
example, under OpenBSD 3.2 and 3.3, each compilation unit that contains the
inline function <code>cmm_get_small_integer()</code> will have its own
copy.</p>
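<p>A library function with this behaviour might be sketched as follows. The
name <code>create_small_integer</code> and the comparator are hypothetical,
and this is not Cmm's source. Integers start at 1 here on the assumption that
0 is reserved to mean "not yet assigned", as in the inline function shown
earlier.</p>

```cpp
#include <map>
#include <typeinfo>

// Orders type_info pointers using std::type_info::before().
struct type_info_less
{
    bool operator()(const std::type_info* a, const std::type_info* b) const
    {
        return a->before(*b);
    }
};

// Hypothetical sketch: returns the same small integer every time it is
// called for the same type, however many copies of the caller exist.
int create_small_integer(const std::type_info& t)
{
    typedef std::map<const std::type_info*, int, type_info_less> id_map;
    static id_map ids;
    static int next = 1;  // 0 is assumed to mean "not yet assigned"

    id_map::iterator it = ids.find(&t);
    if (it != ids.end()) return it->second;  // already assigned
    ids[&t] = next;
    return next++;
}
```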

<p>Cmm's constant-time dispatch system is not robust. It adds a virtual
function to all polymorphic classes, but this only works if all code is
passed through Cmm. Other code, such as code in libraries that are linked
into the final executable, may break at runtime because it assumes a
different vtable layout.  To avoid breaking code in system libraries, Cmm
doesn't insert the function into polymorphic classes defined in the
<code>std</code> namespace, but of course this means that you cannot do
constant-time multimethod dispatch on base classes that are defined in
<code>std</code>, such as <code>std::exception</code>.</p>
</body>
</html>