<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd"> <html lang="en"> <head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">

<title>Mutex, Lock, Condition Variable Rationale</title>

	<style>
	p {text-align:justify}
	li {text-align:justify}
	blockquote.note
	{
		background-color:#E0E0E0;
		padding-left: 15px;
		padding-right: 15px;
		padding-top: 1px;
		padding-bottom: 1px;
	}
	ins {background-color:#A0FFA0}
	del {background-color:#FFA0A0}
	</style>

</head>

<body>

<address align=right>
Document number: N2406=07-0266<br>
<br>
<a href="mailto:hinnant@twcny.rr.com">Howard E. Hinnant</a><br>
2007-09-09
</address>
<hr>
<h1 align=center>Mutex, Lock, Condition Variable Rationale</h1>

<h2>Contents</h2>

<ul>
<li><a href="#Introduction">Introduction</a></li>

<li><a href="#mutex">Unique Ownership Mutexes and Locks</a>
<ul>
<li><a href="#mutex_synopsis"><tt>&lt;mutex&gt;</tt> Synopsis</a></li>
<li><a href="#mutex_rationale">Mutex Rationale</a></li>
<li><a href="#mutex_imp">Reference <tt>mutex</tt> implementation on POSIX</a></li>
<li><a href="#lock_rationale">Lock Rationale</a></li>
<li><a href="#gen_lock_rationale">Rationale for generic locking algorithms</a></li>
</ul>
</li>

<li><a href="#condition">Condition Variables</a>
<ul>
<li><a href="#cond_var_synop"><tt>&lt;cond_var&gt;</tt> Synopsis</a></li>
<li><a href="#condition_rationale">Condition Variable Rationale</a>
<ul>
<li><a href="#cond_var"><tt>cond_var</tt></a></li>
<li><a href="#gen_cond_var"><tt>gen_cond_var</tt></a></li>
<li><a href="#constrained_cv">Constrained condition variables</a></li>
</ul>
</li>
</ul>
</li>

<li><a href="#shared_mutex">Shared / Unique Ownership Mutexes and Locks</a>
<ul>
<li><a href="#shared_mutex_synop"><tt>&lt;shared_mutex&gt;</tt> Synopsis</a></li>
<li><a href="#shared_mutex_rationale"><tt>shared_mutex</tt> Rationale</a></li>
<li><a href="#shared_mutex_imp"><tt>shared_mutex</tt> Reference Implementation</a></li>
<li><a href="#shared_lock_rationale"><tt>shared_lock</tt> Rationale</a></li>
<li><a href="#upgrade_mutex_rationale"><tt>upgrade_mutex</tt> Rationale</a></li>
<li><a href="#upgrade_mutex_imp"><tt>upgrade_mutex</tt> Implementation</a></li>
<li><a href="#upgrade_lock_rationale"><tt>upgrade_lock</tt> Rationale</a></li>
<li><a href="#transfer_lock_rationale"><tt>transfer_lock</tt> Rationale</a></li>
<li><a href="#transfer_lock_imp"><tt>transfer_lock</tt> Implementation</a></li>
</ul>
</li>

<li><a href="#Acknowledgements">Acknowledgements</a></li>
</ul>

<h2><a name="Introduction"></a>Introduction</h2>

<p>
This paper adds rationale for the design decisions made for mutexes, locks and
condition variables.  The TR2-targeted shared mutexes and locks are included here
only to help show how the complete package works together.  Namespace tr2 is used
to indicate the targeting intent.
</p>

<p>
Threads, thread pools and futures are not addressed in this paper.
</p>

<h2><a name="mutex"></a>Unique Ownership Mutexes and Locks</h2>

<p>
Below is the proposed synopsis of the header <tt>&lt;mutex&gt;</tt>.  Following the
synopsis, an informal description and rationale are given for each of the components
and design decisions.
</p>

<h3><a name="mutex_synopsis"></a><tt>&lt;mutex&gt;</tt> Synopsis</h3>

<blockquote><pre>
namespace std {

<font color="#C00000">// A basic mutex</font>
class mutex
{
public:
    mutex();
    ~mutex();

    mutex(const mutex&amp;) = delete;
    mutex&amp; operator=(const mutex&amp;) = delete;

    void lock();
    bool try_lock();
    void unlock();

    typedef <i>unspecified</i> native_handle_type;  // conditionally present.  example: pthread_mutex_t*
    native_handle_type native_handle();      // conditionally present
};

<font color="#C00000">// Add recursive functionality to the basic mutex</font>
class recursive_mutex
{
public:
    recursive_mutex();
    ~recursive_mutex();

    recursive_mutex(const recursive_mutex&amp;) = delete;
    recursive_mutex&amp; operator=(const recursive_mutex&amp;) = delete;

    void lock();
    bool try_lock();
    void unlock();

    typedef <i>unspecified</i> native_handle_type;  // conditionally present.  example: pthread_mutex_t*
    native_handle_type native_handle();      // conditionally present
};

<font color="#C00000">// Add timed lock functionality to the basic mutex</font>
class timed_mutex
{
public:
    timed_mutex();
    ~timed_mutex();

    timed_mutex(const timed_mutex&amp;) = delete;
    timed_mutex&amp; operator=(const timed_mutex&amp;) = delete;

    void lock();
    bool try_lock();
    bool timed_lock(nanoseconds rel_time);
    void unlock();

    typedef <i>unspecified</i> native_handle_type;  // conditionally present.  example: pthread_mutex_t*
    native_handle_type native_handle();      // conditionally present
};

<font color="#C00000">// Add timed lock functionality to the recursive mutex</font>
class recursive_timed_mutex
{
public:
    recursive_timed_mutex();
    ~recursive_timed_mutex();

    recursive_timed_mutex(const recursive_timed_mutex&amp;) = delete;
    recursive_timed_mutex&amp; operator=(const recursive_timed_mutex&amp;) = delete;

    void lock();
    bool try_lock();
    bool timed_lock(nanoseconds rel_time);
    void unlock();

    typedef <i>unspecified</i> native_handle_type;  // conditionally present.  example: pthread_mutex_t*
    native_handle_type native_handle();      // conditionally present
};

<font color="#C00000">// An exception type for lock related errors</font>
class lock_error
    : public exception
{
public:
    virtual const char* what() const throw();
};

<font color="#C00000">// Tag types for lock construction options</font>
struct do_not_lock_type    {};
struct try_to_lock_type    {};
struct already_locked_type {};

extern do_not_lock_type    do_not_lock;
extern try_to_lock_type    try_to_lock;
extern already_locked_type already_locked;


<font color="#C00000">// RAII wrapper for locking and unlocking a mutex within a scope.</font>
<font color="#C00000">// It <b>always</b> references a mutex with the lock owned.</font>
template &lt;class Mutex&gt;
class scoped_lock
{
public:
    typedef Mutex mutex_type;

    explicit scoped_lock(mutex_type&amp; m);
    scoped_lock(mutex_type&amp; m, already_locked_type);
    ~scoped_lock();

    scoped_lock(scoped_lock const&amp;) = delete;
    scoped_lock&amp; operator=(scoped_lock const&amp;) = delete;
};

<font color="#C00000">// Forward declaration of proposed TR2 locks</font>
namespace tr2 {

template &lt;class Mutex&gt; class shared_lock;
template &lt;class Mutex&gt; class upgrade_lock;

} // tr2

<font color="#C00000">// A movable (multi-scope) RAII wrapper for locking and unlocking mutexes.</font>
<font color="#C00000">// It may or may not reference a mutex, and if it does, the
//    lock may not be owned.</font>
template &lt;class Mutex&gt;
class unique_lock
{
public:
    typedef Mutex mutex_type;

    unique_lock();
    explicit unique_lock(mutex_type&amp; m);
    unique_lock(mutex_type&amp; m, do_not_lock_type);
    unique_lock(mutex_type&amp; m, try_to_lock_type);
    unique_lock(mutex_type&amp; m, already_locked_type);
    unique_lock(mutex_type&amp; m, nanoseconds rel_t);
    ~unique_lock();

    unique_lock(unique_lock const&amp;) = delete;
    unique_lock&amp; operator=(unique_lock const&amp;) = delete;

    unique_lock(unique_lock&amp;&amp; u);
    unique_lock&amp; operator=(unique_lock&amp;&amp; u);

// TR2 constructors begin
    <font color="#C00000">// convert upgrade ownership to unique ownership</font>
    explicit unique_lock(tr2::upgrade_lock&lt;mutex_type&gt;&amp;&amp;);
    explicit unique_lock(const tr2::upgrade_lock&lt;mutex_type&gt;&amp;) = delete;

    unique_lock(tr2::upgrade_lock&lt;mutex_type&gt;&amp;&amp;, try_to_lock_type);
    unique_lock(const tr2::upgrade_lock&lt;mutex_type&gt;&amp;, try_to_lock_type) = delete;

    unique_lock(tr2::upgrade_lock&lt;mutex_type&gt;&amp;&amp;, nanoseconds rel_t);
    unique_lock(const tr2::upgrade_lock&lt;mutex_type&gt;&amp;, nanoseconds rel_t) = delete;

    <font color="#C00000">// convert shared ownership to unique ownership</font>
    unique_lock(tr2::shared_lock&lt;mutex_type&gt;&amp;&amp;, try_to_lock_type);
    unique_lock(const tr2::shared_lock&lt;mutex_type&gt;&amp;, try_to_lock_type) = delete;

    unique_lock(tr2::shared_lock&lt;mutex_type&gt;&amp;&amp;, nanoseconds rel_t);
    unique_lock(const tr2::shared_lock&lt;mutex_type&gt;&amp;, nanoseconds rel_t) = delete;
// TR2 constructors end

    void lock();
    bool try_lock();
    bool timed_lock(nanoseconds rel_time);
    void unlock();

    bool owns_lock() const;
    mutex_type* mutex() const;

    void swap(unique_lock&amp;&amp; u);
    mutex_type* release();
};

template &lt;class Mutex&gt; void swap(unique_lock&lt;Mutex&gt;&amp;  x, unique_lock&lt;Mutex&gt;&amp;  y);
template &lt;class Mutex&gt; void swap(unique_lock&lt;Mutex&gt;&amp;&amp; x, unique_lock&lt;Mutex&gt;&amp;  y);
template &lt;class Mutex&gt; void swap(unique_lock&lt;Mutex&gt;&amp;  x, unique_lock&lt;Mutex&gt;&amp;&amp; y);

<font color="#C00000">// Generic locking algorithms used to avoid deadlock.</font>
template &lt;class L1, class L2, class ...L3&gt; int try_lock(L1&amp;, L2&amp;, L3&amp;...);
template &lt;class L1, class L2, class ...L3&gt; void lock(L1&amp;, L2&amp;, L3&amp;...);
}  // std
</pre></blockquote>

<h3><a name="mutex_rationale"></a>Mutex Rationale</h3>

<p>
The overall intent of the above design is to allow user defined mutexes to work with
standard defined locks, and to also allow standard defined mutexes to work with
user defined locks.  The standard defined mutexes and locks create mutex and lock
concepts, which if followed allow for the interoperability between user defined and
standard defined components (much like our existing containers and algorithms interoperate
with user defined containers and algorithms).
</p>

<p>
Examples of user-defined mutexes might include debugging mutexes, mutexes which
atomically lock and destroy within the destructor to aid in orderly shut down, and
read / write mutexes with different priority policies for readers and writers.
</p>
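<p>
As an illustration of a user-defined mutex driven by a standard lock, here is a
minimal sketch of a debugging mutex.  The names <tt>counting_mutex</tt> and
<tt>lock_twice</tt> are hypothetical, and today's <tt>std::lock_guard</tt> is assumed
as a stand-in for the proposed <tt>scoped_lock</tt>.
</p>

```cpp
#include <cassert>
#include <mutex>

// Hypothetical user-defined "debugging" mutex: it forwards to std::mutex
// and counts how many times it has been locked.  Providing lock(),
// try_lock() and unlock() is all a standard lock needs to drive it.
class counting_mutex
{
    std::mutex    mut_;
    unsigned long lock_count_ = 0;   // modified only while mut_ is held
public:
    counting_mutex() = default;
    counting_mutex(const counting_mutex&) = delete;
    counting_mutex& operator=(const counting_mutex&) = delete;

    void lock() {mut_.lock(); ++lock_count_;}

    bool try_lock()
    {
        if (!mut_.try_lock())
            return false;
        ++lock_count_;
        return true;
    }

    void unlock() {mut_.unlock();}

    // Call only while holding the lock.
    unsigned long lock_count() const {return lock_count_;}
};

// A standard lock (std::lock_guard, assumed here in place of the proposed
// scoped_lock) locks and unlocks the user-defined mutex.
unsigned long lock_twice(counting_mutex& m)
{
    unsigned long first;
    {
        std::lock_guard<counting_mutex> guard(m);   // m.lock()
        first = m.lock_count();
    }                                               // m.unlock()
    std::lock_guard<counting_mutex> guard(m);       // m.lock() again
    assert(m.lock_count() == first + 1);
    return m.lock_count();
}
```

<p>
The standard lock needs no knowledge of <tt>counting_mutex</tt> beyond its three
member functions.
</p>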

<p>
Examples of user-defined locks might include the bundling of mutexes and files to
create <i>locked files</i> which both lock and open on <tt>lock()</tt> and close
and unlock on <tt>unlock()</tt>.
</p>

<p>
There are four standard-defined mutexes:
</p>

<ol>
<li><tt>mutex</tt></li>
<li><tt>recursive_mutex</tt></li>
<li><tt>timed_mutex</tt></li>
<li><tt>recursive_timed_mutex</tt></li>
</ol>

<p>
Like boost, and unlike POSIX, this design separates these functionalities into several distinct
types which form a hierarchy of concepts.  The rationale for the different types is to keep the
most common and simplest mutex type as small and efficient as possible.  Each successive concept
adds functionality and, potentially, expense.  As mutexes are a very low level facility, it
is very important that efficiency on a variety of platforms be a priority.
</p>

<blockquote>
<p>
High level facilities
can build on this low level, adding functionality (and expense) as required.  But higher level layers
can not subtract expense from the lower level layers.
</p>
</blockquote>

<p>
Unlike boost, <tt>try_mutex</tt> is not separated out, because adding the
try-functionality adds no expense to any known mutex implementation.  Furthermore the <tt>try_lock</tt>
functionality is known to be valuable and heavily used in practice.  Therefore the <tt>try_lock</tt>
functionality is included in all of the mutex types (as part of the base Mutex concept).
</p>
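<p>
A typical use of <tt>try_lock</tt> is to do some work only if the lock can be had
without waiting.  The sketch below uses <tt>std::mutex</tt>; the function names
<tt>try_increment</tt> and <tt>second_thread_is_turned_away</tt> are hypothetical.
</p>

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Update a shared counter only if the lock can be had without waiting.
// Returns true if the update was made, false if the mutex was busy.
bool try_increment(std::mutex& mut, int& counter)
{
    if (!mut.try_lock())
        return false;      // the lock was not obtained; do not block
    ++counter;
    mut.unlock();
    return true;
}

// Demonstration: while this thread holds the lock, a second thread's
// attempt fails immediately instead of blocking.
bool second_thread_is_turned_away()
{
    std::mutex mut;
    int  counter = 0;
    bool updated = true;

    mut.lock();
    std::thread other([&] {updated = try_increment(mut, counter);});
    other.join();
    mut.unlock();

    assert(updated || counter == 0);   // no update without the lock
    return !updated;
}
```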

<p>
Unlike boost, the mutexes have public member functions for <tt>lock()</tt>, <tt>unlock()</tt>, etc.
This is necessary to support one of the primary goals:  User defined mutexes can be used with
standard defined locks.  If there were no interface for the user defined mutex to implement, there
would be no way for a standard defined lock to communicate with the user defined mutex.
</p>

<p>
Like both boost and POSIX, the mutex types are neither copyable nor movable.  Because
several threads will access the mutex simultaneously, it would be bad for one thread
to move it while another is locking or unlocking it.
</p>

<h3><a name="mutex_imp"></a>Reference <tt>mutex</tt> implementation on POSIX</h3>

<p>
A reference implementation is given to show the intent of the proposal.  Part of the
intent is to be a thin wrapper around OS primitives (such as <tt>pthread_mutex_t</tt>).
For brevity, only a <tt>mutex</tt> implementation is shown, and not an implementation
for the other three mutex types.  Also for brevity, the code is inlined.  Asserts are
used to indicate requirements on the client (for example, a locked <tt>mutex</tt> must not be destroyed).
</p>

<blockquote><pre>
class mutex
{
    pthread_mutex_t mut_;
public:
    mutex()
    {
        error_code::value_type ec = pthread_mutex_init(&amp;mut_, 0);
        if (ec)
            throw system_error(ec, native_category, "mutex constructor failed");
    }

    ~mutex()
    {
        int e = pthread_mutex_destroy(&amp;mut_);
        assert(e == 0);
    }

    mutex(const mutex&amp;) = delete;
    mutex&amp; operator=(const mutex&amp;) = delete;

    void lock()
    {
        error_code::value_type ec = pthread_mutex_lock(&amp;mut_);
        if (ec)
            throw system_error(ec, native_category, "mutex lock failed");
    }

    bool try_lock()
    {
        return pthread_mutex_trylock(&amp;mut_) == 0;
    }

    void unlock()
    {
        int e = pthread_mutex_unlock(&amp;mut_);
        assert(e == 0);
    }

    typedef pthread_mutex_t* native_handle_type;
    native_handle_type native_handle() {return &amp;mut_;}
};
</pre></blockquote>

<h3><a name="lock_rationale"></a>Lock Rationale</h3>

<p>
The lock types are used as RAII devices for the <i>locked state</i> of a mutex.
That is, the locks don't own the mutexes they reference.  They just own the
lock on the mutex.
</p>

<p>
Unlike boost, the locks are namespace scope class templates, which enables user defined mutexes
to work with standard defined locks.  If the locks existed only as nested types of
the standard mutex types, then user defined mutexes would have to reinvent the lock
type for every new mutex type.
</p>

<p>
Boost has several lock concepts:
</p>

<ul>
<li>Lock Concept</li>
<li>ScopedLock Concept</li>
<li>TryLock Concept</li>
<li>ScopedTryLock Concept</li>
<li>TimedLock Concept</li>
<li>ScopedTimedLock Concept</li>
</ul>

<p>
This proposal simplifies the above list of "lock concepts" down to two:
</p>

<ul>
<li><tt>scoped_lock</tt></li>
<li><tt>unique_lock</tt></li>
</ul>

<p>(Two more locks are proposed for TR2 to support read/write locks).</p>

<p>
The boost list of locks is necessary because each mutex specifies its own list of lock types.
In this design the locks are namespace scope class templates, parameterized on the mutex type, and
designed to work with a wide variety of mutex types.  Thus each lock template
is designed not for a specific mutex, but for a specific use case.
</p>

<p>
The <tt>scoped_lock</tt> is meant to address the most common use case, which is also
the simplest use case:  Lock the mutex at the beginning of a scope and unlock it at the
end of the scope.  That mutex could be a <tt>mutex</tt>, or a <tt>recursive_timed_mutex</tt>,
or a <tt>MySpecialMutex</tt>.  <tt>scoped_lock</tt> does not care.  All it requires is
that the mutex template parameter provide <tt>lock()</tt> and <tt>unlock()</tt>.  The
<tt>scoped_lock</tt> constructor <tt>lock()</tt>'s the mutex, and the <tt>scoped_lock</tt>
destructor <tt>unlock()</tt>'s the mutex.  It is as simple as that, and nothing more.
</p>

<blockquote><pre>
std::mutex mut;

void foo()
{
    std::scoped_lock&lt;std::mutex&gt; _(mut);  // mut.lock() called here
    // do protected work here
}   // mut.unlock() called here, no matter how foo() is exited
</pre></blockquote>

<p>
An invariant of <tt>scoped_lock</tt> is that it <i>always</i> owns the lock on the
referenced mutex.  The boost::mutex::scoped_lock does not have this invariant.  An
advantage of this invariant is that there is no <tt>if</tt> statement in
<tt>~scoped_lock()</tt> which tests whether the mutex needs to be unlocked or not.
In this design <tt>~scoped_lock()</tt> <i>always</i> unlocks the mutex because it
<i>always</i> owns the lock on the mutex.
</p>

<p>
On rare occasions, the client may have a mutex whose lock it already owns,
and want to transfer the ownership of that lock into a <tt>scoped_lock</tt>.
This functionality can be added to <tt>scoped_lock</tt> with zero cost.  One simply
adds an extra constructor which does not lock the mutex (because it is already locked):
</p>

<blockquote><pre>
std::mutex mut;

void foo()
{
    // for whatever reasons this thread already owns the lock on mut
    std::scoped_lock&lt;std::mutex&gt; _(mut, std::already_locked);  // mut.lock() not called here
    // do protected work here
}   // mut.unlock() called here, no matter how foo() is exited
</pre></blockquote>

<p>
See the implementation of <tt>gen_cond_var</tt> later in this paper for a real-life use
case of the <tt>already_locked</tt> <tt>scoped_lock</tt> constructor.
</p>

<p>
Despite the fact that <tt>scoped_lock</tt> efficiently covers most of the use cases
for locking and unlocking a mutex, it does not cover all of the use cases.  To cover
the rest of these use cases <tt>unique_lock</tt> is introduced.  This adds both functionality
and a small amount of expense.  It no longer has an invariant that it always owns the lock
on the referenced mutex.  Indeed it may not even reference a mutex.  And if it does reference
a mutex, it may or may not own the lock on it, so <tt>unique_lock</tt> must add a bool
data member and test that before it tries to unlock the mutex (such as in <tt>~unique_lock()</tt>).
</p>
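<p>
The extra state can be sketched as follows.  <tt>tiny_unique_lock</tt> is a
hypothetical, stripped-down model written only to show the pointer, the bool, and
the test in the destructor; it is not the proposed <tt>unique_lock</tt>.
</p>

```cpp
#include <cassert>
#include <mutex>

// Hypothetical, stripped-down model of a unique_lock-style class.
template <class Mutex>
class tiny_unique_lock
{
    Mutex* m_;      // may be null: no referenced mutex
    bool   owns_;   // true only if this object owns the lock on *m_
public:
    tiny_unique_lock() : m_(nullptr), owns_(false) {}

    explicit tiny_unique_lock(Mutex& m) : m_(&m), owns_(false)
    {
        m_->lock();
        owns_ = true;
    }

    ~tiny_unique_lock()
    {
        if (owns_)          // the test that scoped_lock does not need
            m_->unlock();
    }

    tiny_unique_lock(const tiny_unique_lock&) = delete;
    tiny_unique_lock& operator=(const tiny_unique_lock&) = delete;

    void unlock()
    {
        assert(owns_);      // requirement on the client
        m_->unlock();
        owns_ = false;
    }

    bool   owns_lock() const {return owns_;}
    Mutex* mutex()     const {return m_;}
};
```

<p>
Compared with a lock that always owns its mutex, the cost is one bool of storage
and one branch in the destructor.
</p>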

<p>
<tt>unique_lock</tt> is more similar to the boost::mutex::scoped_lock, but adds functionality
without adding expense over and above that required for boost::mutex::scoped_lock.  Part of
this added functionality is move semantics, heretofore unavailable to boost code.  Thus
<tt>unique_lock</tt> can be returned from factory functions and stored in containers.
</p>
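<p>
A short sketch of both uses, written against today's <tt>std::unique_lock</tt>.
The names <tt>registry_mut</tt>, <tt>lock_registry</tt> and
<tt>hold_in_container</tt> are hypothetical.
</p>

```cpp
#include <cassert>
#include <mutex>
#include <vector>

std::mutex registry_mut;   // hypothetical mutex guarding some shared registry

// Factory function: lock the mutex and hand lock ownership to the caller.
std::unique_lock<std::mutex> lock_registry()
{
    std::unique_lock<std::mutex> lk(registry_mut);
    return lk;             // lock ownership moves out with the return value
}

// Store the returned lock in a container.
bool hold_in_container()
{
    std::vector<std::unique_lock<std::mutex>> held;
    held.push_back(lock_registry());
    assert(held.size() == 1);
    return held.back().owns_lock();
}   // destroying the vector destroys the lock, which unlocks registry_mut
```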

<p>
<tt>unique_lock</tt> can perform the functionality of <tt>scoped_lock</tt>, but with the added
expense of the <tt>if</tt> statement in the destructor, and with the added expense imposed upon
the code reviewer who now has to scan the entire function to know if mutex lock ownership is
transferred away from this <tt>unique_lock</tt> or not.  When your use case fits into the
limited functionality of <tt>scoped_lock</tt> it is best to use <tt>scoped_lock</tt> not
only for efficiency reasons, but also for documentation reasons.  You more clearly state what
you are doing with the mutex.
</p>

<p>
<tt>unique_lock</tt> has an array of constructors allowing the client to choose among:
</p>

<ul>
<li>lock the mutex (like <tt>scoped_lock</tt>)</li>
<li>do not lock the mutex (with <tt>do_not_lock</tt>)</li>
<li>try-lock the mutex (with <tt>try_to_lock</tt>)</li>
<li>transfer in lock ownership of the mutex from an external source (with <tt>already_locked</tt>)</li>
<li>timed-lock the mutex (passing in the maximum wait time)</li>
<li>transfer lock ownership from another rvalue <tt>unique_lock</tt></li>
</ul>
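<p>
The choices listed above can be sketched with today's <tt>std::unique_lock</tt>.
This assumes that <tt>std::defer_lock</tt> and <tt>std::adopt_lock</tt> play the
roles of the proposed <tt>do_not_lock</tt> and <tt>already_locked</tt>, and that a
<tt>std::chrono</tt> duration plays the role of <tt>nanoseconds rel_t</tt>.  The
function name <tt>constructor_tour</tt> is hypothetical.
</p>

```cpp
#include <cassert>
#include <chrono>
#include <mutex>
#include <utility>

// Returns false only if a try-lock or timed-lock attempt did not succeed.
bool constructor_tour()
{
    typedef std::unique_lock<std::timed_mutex> lock_type;
    std::timed_mutex m;

    {   // lock the mutex
        lock_type lk(m);
        assert(lk.owns_lock());
    }
    {   // reference the mutex, but do not lock it yet
        lock_type lk(m, std::defer_lock);
        assert(!lk.owns_lock());
        lk.lock();
    }
    {   // try-lock the mutex
        lock_type lk(m, std::try_to_lock);
        if (!lk.owns_lock())
            return false;
    }
    {   // transfer in ownership of a lock obtained elsewhere
        m.lock();
        lock_type lk(m, std::adopt_lock);   // m.lock() is not called again
        assert(lk.owns_lock());
    }
    {   // timed-lock the mutex, then move the lock into another unique_lock
        lock_type lk(m, std::chrono::milliseconds(10));
        if (!lk.owns_lock())
            return false;
        lock_type lk2(std::move(lk));
        assert(!lk.owns_lock() && lk2.owns_lock());
    }
    return true;
}
```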

<p>
It is proposed for TR2 that the list of <tt>unique_lock</tt> constructors be
expanded to include converting shared or upgrade ownership to unique ownership
(in support of converting read/write mutexes).
</p>

<p>
Like boost, this lock supports <tt>lock()</tt>, <tt>try_lock()</tt>,
<tt>unlock()</tt> and <tt>timed_lock()</tt> member functions.  Unlike boost,
these are all in the same lock "concept" whether the mutex type supports
this functionality or not.  If the mutex doesn't support <tt>timed_lock</tt>
(for example), no harm is done unless the <tt>timed_lock</tt> member function
of <tt>unique_lock</tt> is instantiated.  This is no different than <tt>vector</tt>
only requiring its <tt>value_type</tt> to be <tt>DefaultConstructible</tt> if certain
<tt>vector</tt> members are instantiated.
</p>
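<p>
This works because member functions of a class template are instantiated only when
they are used.  A minimal sketch, with the hypothetical names <tt>lazy_lock</tt> and
<tt>lazy_lock_demo</tt>, and with <tt>try_lock_for</tt> from today's
<tt>std::timed_mutex</tt> assumed in place of the proposed <tt>timed_lock</tt>:
</p>

```cpp
#include <chrono>
#include <mutex>

// Hypothetical lock whose timed_lock() member is valid only for mutexes
// which support timed locking.
template <class Mutex>
class lazy_lock
{
    Mutex& m_;
public:
    explicit lazy_lock(Mutex& m) : m_(m) {}

    void lock()   {m_.lock();}
    void unlock() {m_.unlock();}

    // Compiled only if called: std::mutex has no try_lock_for().
    bool timed_lock(std::chrono::nanoseconds rel_time)
    {
        return m_.try_lock_for(rel_time);
    }
};

bool lazy_lock_demo()
{
    std::mutex plain;
    lazy_lock<std::mutex> a(plain);       // fine: timed_lock() is never used
    a.lock();
    a.unlock();

    std::timed_mutex timed;
    lazy_lock<std::timed_mutex> b(timed);
    bool got_it = b.timed_lock(std::chrono::milliseconds(10));
    if (got_it)
        b.unlock();
    return got_it;
}
```

<p>
Calling <tt>a.timed_lock(...)</tt> in the sketch above would be a compile-time error,
which is the desired diagnostic.
</p>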

<p>
<tt>owns_lock()</tt> tests whether the <tt>unique_lock</tt> owns the lock on
the mutex or not.  Boost calls this member <tt>locked()</tt>.  I prefer
<tt>owns_lock()</tt> because this member doesn't tell whether the mutex
is locked or not.  Another thread may own the lock on the mutex at the time.
This member function only tells whether <i>this</i> <tt>unique_lock</tt>
owns the lock on the mutex or not.
</p>
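<p>
The distinction can be sketched with today's <tt>std::unique_lock</tt>: the mutex
below is locked the whole time, yet the second lock reports that it does not own
that lock.  The function name is hypothetical.
</p>

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Returns what owns_lock() reports for a unique_lock that tried, and
// failed, to lock a mutex which another thread has locked.
bool owns_lock_when_mutex_is_held_elsewhere()
{
    std::mutex mut;
    std::unique_lock<std::mutex> holder(mut);   // this thread owns the lock

    bool other_owns = true;
    std::thread t([&]
    {
        std::unique_lock<std::mutex> attempt(mut, std::try_to_lock);
        other_owns = attempt.owns_lock();       // mut is locked, but not by attempt
    });
    t.join();

    assert(holder.owns_lock());
    return other_owns;
}
```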

<p>
The <tt>mutex()</tt> member returns a pointer to the referenced mutex (or null if 
there is no referenced mutex).  Some algorithms may need access to the referenced
mutex, say for example, to take its address for mutex ordering or error
checking purposes.
</p>

<p>
The <tt>swap</tt> member is included for full move semantics support.
</p>

<p>
The <tt>release()</tt> member releases the lock ownership of the mutex to
the client, without unlocking the mutex (transfers lock ownership out of the
<tt>unique_lock</tt>).  This is necessary for transferring lock ownership from
a <tt>unique_lock</tt> to a user-defined lock.
</p>
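<p>
A minimal sketch of such a transfer, using today's <tt>std::unique_lock</tt>.  The
user-defined lock <tt>adopting_lock</tt> and the function <tt>hand_over</tt> are
hypothetical.
</p>

```cpp
#include <cassert>
#include <mutex>

// Hypothetical user-defined lock which takes over a lock that is
// already held, and unlocks the mutex when it goes out of scope.
class adopting_lock
{
    std::mutex& m_;
public:
    explicit adopting_lock(std::mutex& locked) : m_(locked) {}
    ~adopting_lock() {m_.unlock();}

    adopting_lock(const adopting_lock&) = delete;
    adopting_lock& operator=(const adopting_lock&) = delete;
};

void hand_over(std::mutex& mut)
{
    std::unique_lock<std::mutex> lk(mut);   // mut.lock() called here
    std::mutex* m = lk.release();           // lk forgets mut; mut stays locked
    assert(m == &mut);
    assert(!lk.owns_lock() && lk.mutex() == nullptr);
    adopting_lock user_lock(*m);            // the user-defined lock now owns the lock
}   // ~adopting_lock() unlocks mut; ~unique_lock() does nothing
```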

<h3><a name="gen_lock_rationale"></a>Rationale for generic locking algorithms</h3>

<p>
Occasionally a programmer finds it necessary to lock two (or more) mutexes at
once.  If such an operation is not done properly there exists a risk of
deadlock.  By using <tt>std::lock(l1, l2, ...)</tt> the programmer can
safely lock multiple locks or mutexes without fear of deadlock.  The only
requirement on these lock types is that they support <tt>lock()</tt>,
<tt>unlock()</tt>, and <tt>try_lock()</tt>.  The locks need not all be
of the same type.  This algorithm can be used to easily build
lock classes which refer to multiple mutexes (say <tt>Lock2</tt>).  The <tt>lock()</tt> member
function of <tt>Lock2</tt> might use <tt>std::lock</tt> to lock all of the
mutexes.  One could then use <tt>Lock2</tt> in a (generalized)
condition variable wait statement.  This is another good example of a
user defined lock (which works with standard defined mutexes).
</p>

<blockquote><pre>
template &lt;class M1, class M2&gt;
class Lock2
{
    M1&amp; m1_;
    M2&amp; m2_;
public:
    Lock2(M1&amp; m1, M2&amp; m2) : m1_(m1), m2_(m2)
    {
        lock();
    }

    ~Lock2() {unlock();}

    Lock2(const Lock2&amp;) = delete;
    Lock2&amp; operator=(const Lock2&amp;) = delete;

    void lock() {<b>std::lock(m1_, m2_);</b>}

    void unlock()
    {
        m1_.unlock();
        m2_.unlock();
    }
};
</pre></blockquote>

<p>
Generalizing <tt>Lock2&lt;M1, M2&gt;</tt> to <tt>MultiLock&lt;M1, M2, ...Mn&gt;</tt>
is an interesting exercise and the existence of <tt>std::lock</tt> (and <tt>tuple</tt>) greatly
eases the implementation burden of <tt>MultiLock&lt;M1, M2, ...Mn&gt;</tt>.
</p>
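<p>
One possible shape for that exercise is sketched below.  It assumes a C++17
compiler (<tt>std::apply</tt>, generic lambdas and fold expressions), none of which
was available when this paper was written, and the helper
<tt>multilock_demo</tt> is hypothetical.
</p>

```cpp
#include <mutex>
#include <tuple>

// Sketch of MultiLock: a lock which refers to two or more mutexes,
// possibly of different types, and locks them all with std::lock.
template <class M1, class M2, class... Mn>
class MultiLock
{
    std::tuple<M1&, M2&, Mn&...> mutexes_;
public:
    MultiLock(M1& m1, M2& m2, Mn&... mn) : mutexes_(m1, m2, mn...)
    {
        lock();
    }

    ~MultiLock() {unlock();}

    MultiLock(const MultiLock&) = delete;
    MultiLock& operator=(const MultiLock&) = delete;

    // Lock every mutex using the deadlock-avoiding algorithm.
    void lock()
    {
        std::apply([](auto&... m) {std::lock(m...);}, mutexes_);
    }

    // Unlock every mutex, in declaration order.
    void unlock()
    {
        std::apply([](auto&... m) {(m.unlock(), ...);}, mutexes_);
    }
};

// Usage: hold three mutexes of different types for one scope.
bool multilock_demo()
{
    std::mutex a;
    std::recursive_mutex b;
    std::timed_mutex c;
    {
        MultiLock<std::mutex, std::recursive_mutex, std::timed_mutex> all(a, b, c);
        b.lock();      // recursive: the owning thread may lock it again
        b.unlock();
    }                  // all three mutexes are unlocked here
    if (!a.try_lock())
        return false;
    a.unlock();
    return true;
}
```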

<h2><a name="condition"></a>Condition Variables</h2>

<p>
Below is the proposed synopsis of the header <tt>&lt;cond_var&gt;</tt>.  Following the
synopsis, an informal description and rationale are given for each of the components
and design decisions.
</p>

<h3><a name="cond_var_synop"></a><tt>&lt;cond_var&gt;</tt> Synopsis</h3>

<blockquote><pre>
namespace std {

<font color="#C00000">// A basic condition variable.</font>
<font color="#C00000">// It can wait only on a unique_lock&lt;mutex&gt;.</font>
class cond_var
{
public:
   
    cond_var();
    ~cond_var();

    cond_var(const cond_var&amp;) = delete;
    cond_var&amp; operator=(const cond_var&amp;) = delete;

    void notify_one();
    void notify_all();
    void wait(unique_lock&lt;mutex&gt;&amp; lock);
    template &lt;class Predicate&gt;
        void wait(unique_lock&lt;mutex&gt;&amp; lock, Predicate pred);
    bool timed_wait(unique_lock&lt;mutex&gt;&amp; lock, const utc_time&amp; abs_time);
    template &lt;class Predicate&gt;
        bool timed_wait(unique_lock&lt;mutex&gt;&amp; lock, const utc_time&amp; abs_time, Predicate pred);
};

<font color="#C00000">// A generalized condition variable.</font>
<font color="#C00000">// It can wait on anything which supports lock() and unlock().</font>
class gen_cond_var
{
public:
   
    gen_cond_var();
    ~gen_cond_var();

    gen_cond_var(const gen_cond_var&amp;) = delete;
    gen_cond_var&amp; operator=(const gen_cond_var&amp;) = delete;

    void notify_one();
    void notify_all();
    template &lt;class Lock&gt;
        void wait(Lock&amp; lock);
    template &lt;class Lock, class Predicate&gt;
        void wait(Lock&amp; lock, Predicate pred);
    template &lt;class Lock&gt;
        bool timed_wait(Lock&amp; lock, const utc_time&amp; abs_time);
    template &lt;class Lock, class Predicate&gt;
        bool timed_wait(Lock&amp; lock, const utc_time&amp; abs_time, Predicate pred);
};

}  // std
</pre></blockquote>

<h3><a name="condition_rationale"></a>Condition Variable Rationale</h3>

<p>
The condition variable has been a most difficult class.  While it is
an indispensable part of the multithreaded tool set, it is also extremely
low level, and has an inherent complexity to its interface.  This interface
is not of my choosing, but is the product of several decades of research and experience in the
multithreaded arena, independent of C++.  Condition variables are arguably lower
level than the mutex.  Because of their low level, efficiency is a major concern:
</p>

<blockquote>
<p>
High level facilities
can build on this low level, adding functionality (and expense) as required.  But higher level layers
can not subtract expense from the lower level layers.
</p>
</blockquote>

<p>
For those platforms which offer a native condition variable (e.g. <tt>pthread_cond_t</tt>),
C++ should offer as thin a wrapper as possible around that OS functionality.
</p>

<p>
At the same time, the condition variable is very powerful if generalized.
A generalized (but more expensive) condition variable could offer the programmer
a sufficiently higher level of abstraction to enable significantly more powerful
multithreaded programming paradigms (analogous to moving from assembly to C).
</p>

<p>
I have studied condition variables extensively over several years within the context
of C++.  This study has included <tt>boost::condition</tt> and several variations of it.
I even implemented a version of the <tt>boost::condition</tt> variables for the CodeWarrior
(Metrowerks/Motorola/Freescale) C++ library.  I have come to the following conclusions:
</p>

<ol>
<li>I do not believe user defined condition variables are practical except
as adaptors layered over the standard supplied condition variable.</li>

<li>Standard supplied condition variables can interoperate with user
defined mutexes/locks, but this functionality will definitely add expense
to the condition variable over a native condition variable (e.g. <tt>pthread_cond_t</tt>).</li>

<li>The semantics of <tt>boost::condition</tt> do not interoperate with user defined
mutexes/locks.</li>

<li>The syntactic interface of <tt>boost::condition</tt> can be specified to
interoperate with user defined mutexes/locks but at an added expense over
the native condition variable (e.g. <tt>pthread_cond_t</tt>).</li>

<li>
Well over 90% of the condition variable use cases in current day code require
only semantics equivalent to those of <tt>pthread_cond_t</tt>.
</li>

<li>
It is not an intrinsic error to wait on a single condition variable with
multiple mutexes unless more than one mutex is waited upon at the same time.
Multiple threads can wait on the same condition variable with the same mutex
at the same time.
</li>

<li>
There exist use cases for waiting on the same condition variable with multiple
mutexes.  The simplest example involves only one waiting thread which uses a different
mutex on each wait.
</li>

<li>
There exist use cases for waiting on the same condition variable with the
same mutex but locked in different ways simultaneously.  For example one
thread might want to wait on a <tt>shared_mutex</tt> locked with unique
ownership, while another thread waits on the same <tt>shared_mutex</tt>
locked with shared ownership.  A third thread might not care if it is
signaling a reader or a writer.
</li>
</ol>
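<p>
The last use case in the list above can be sketched with facilities from today's
standard library.  This assumes that <tt>std::condition_variable_any</tt>,
<tt>std::shared_mutex</tt> and <tt>std::shared_lock</tt> (C++17) stand in for the
generalized condition variable and the TR2-targeted components discussed in this
paper.  The function name <tt>wake_reader_and_writer</tt> is hypothetical.
</p>

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Two threads wait on the same condition variable with the same mutex,
// one holding shared ownership and one holding unique ownership.
// Returns the number of waiting threads that were woken.
int wake_reader_and_writer()
{
    std::shared_mutex mut;
    std::condition_variable_any cv;   // waits on anything with lock()/unlock()
    bool ready = false;               // protected by mut
    std::atomic<int> woken(0);

    std::thread reader([&]
    {
        std::shared_lock<std::shared_mutex> lk(mut);   // shared ownership
        cv.wait(lk, [&] {return ready;});
        assert(lk.owns_lock());       // wait() returns with the lock re-acquired
        ++woken;
    });

    std::thread writer([&]
    {
        std::unique_lock<std::shared_mutex> lk(mut);   // unique ownership
        cv.wait(lk, [&] {return ready;});
        assert(lk.owns_lock());
        ++woken;
    });

    {
        std::unique_lock<std::shared_mutex> lk(mut);
        ready = true;
    }
    cv.notify_all();   // the notifier need not know who is waiting, or how

    reader.join();
    writer.join();
    return woken;
}
```

<p>
A condition variable restricted to one lock type could not express both waits.
</p>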

<p>
Because of these conclusions I believe we need to support both a razor
thin layer over OS supplied condition variables, and also supply a more
generalized condition variable which can work with user defined mutexes/locks.
These are at least two different types of condition variables.
</p>

<p>
I have experimented
with templating the condition variable but have discovered problems with this approach.
</p>

<ul>
<li>
If the condition is templated on lock type, then the wait functions are not templated.
This destroys the ability to simultaneously wait on a <tt>unique_lock&lt;shared_mutex&gt;</tt>
and a <tt>shared_lock&lt;shared_mutex&gt;</tt> on the same <tt>shared_mutex</tt>.
</li>
<li>
If the condition is templated on mutex type, then the wait functions can be templated on lock
type, solving the previous problem.  However one is still depending on a specialization of
this condition to provide the razor thin layer over the OS condition variable (e.g.
<tt>pthread_cond_t</tt>).  That specialization can not reliably have its wait functions
templated on lock type.  Such a lock would be required to do nothing but lock/unlock the
mutex, which would outlaw user defined lock types such as the <i>locked file</i> example
mentioned in the mutex rationale.  The specialization must only allow waiting on a
standard lock type (i.e. <tt>unique_lock&lt;mutex&gt;</tt>).
</li>
</ul>

<p>
Because the condition specialized on the native mutex type can not have the same
interface as the primary condition template (it can't wait on <i>any</i> lock
type), specialization is not appropriate for this application (reference the <tt>vector&lt;bool&gt;</tt>
example).
</p>

<p>
The only conclusion I can come to which supports both a razor thin layer over the
native OS condition variable, and a generalized condition variable which works with
user defined mutexes/locks (such as the <tt>Lock2</tt> example) is two distinct
types:
</p>

<blockquote>
<table border="0">
<tr>
<td>&bull; <tt>cond_var</tt></td>
<td>: A condition variable that can wait on nothing but <tt>unique_lock&lt;mutex&gt;</tt> (or perhaps <tt>mutex</tt>).</td>
</tr>
<tr>
<td>&bull; <tt>gen_cond_var</tt></td>
<td>: A condition variable that can wait on anything which supports <tt>lock()</tt> and <tt>unlock()</tt>.</td>
</tr>
</table>
</blockquote>

<p>
Boost uses the name <tt>condition</tt> for a condition variable.  With our recent addition of
<tt>conditional</tt> to the type traits library I fear that using <tt>condition</tt> will be
confusing.  <tt>cond_var</tt> is a readable abbreviation of <tt>condition_variable</tt>.
</p>

<h4><a name="cond_var"></a><tt>cond_var</tt></h4>

<p>
Below is an example implementation of <tt>cond_var</tt> on top of <tt>pthread_cond_t</tt>.
The reference implementation is meant to demonstrate how thinly <tt>cond_var</tt> maps to
<tt>pthread_cond_t</tt> (or whatever native OS condition variable is available).
</p>

<blockquote><pre>
class cond_var
{
    pthread_cond_t cv_;
public:
   
    cond_var()
    {
        error_code::value_type ec = pthread_cond_init(&amp;cv_, 0);
        if (ec)
            throw system_error(ec, native_category, "cond_var constructor failed");
    }

    ~cond_var()
    {
        int ec = pthread_cond_destroy(&amp;cv_);
        assert(ec == 0);
    }

    cond_var(const cond_var&amp;) = delete;
    cond_var&amp; operator=(const cond_var&amp;) = delete;

    void notify_one()
    {
        error_code::value_type ec = pthread_cond_signal(&amp;cv_);
        if (ec)
            throw system_error(ec, native_category, "cond_var notify_one failed");
    }

    void notify_all()
    {
        error_code::value_type ec = pthread_cond_broadcast(&amp;cv_);
        if (ec)
            throw system_error(ec, native_category, "cond_var notify_all failed");
    }

    void wait(unique_lock&lt;mutex&gt;&amp; lock)
    {
        error_code::value_type ec = pthread_cond_wait(&amp;cv_, lock.mutex()-&gt;native_handle());
        if (ec)
            throw system_error(ec, native_category, "cond_var wait failed");
    }

    template &lt;class Predicate&gt;
        void wait(unique_lock&lt;mutex&gt;&amp; lock, Predicate pred)
        {
            while (!pred())
                wait(lock);
        }

    bool timed_wait(unique_lock&lt;mutex&gt;&amp; lock, const utc_time&amp; abs_time)
    {
        timespec tm = convert_to_timespec(abs_time);
        error_code::value_type ec = pthread_cond_timedwait(&amp;cv_, lock.mutex()-&gt;native_handle(), &amp;tm);
        if (ec != 0 &amp;&amp; ec != ETIMEDOUT)
            throw system_error(ec, native_category, "cond_var timed_wait failed");
        return ec == 0;
    }

    template &lt;class Predicate&gt;
        bool timed_wait(unique_lock&lt;mutex&gt;&amp; lock, const utc_time&amp; abs_time, Predicate pred)
        {
            while (!pred())
                if (!timed_wait(lock, abs_time))
                    return pred();
            return true;
        }
};
</pre></blockquote>

<p>
The above assumes that either there is no cancellation / interruption in the C++ threading API,
or that it is handled by <tt>pthread_cancel</tt>.  If we do have interruption which needs to be
handled independently of <tt>pthread_cancel</tt> on some platform, the above wait function would
simply need to register the <tt>cond_var</tt> with thread local data set aside so that other
threads would know which <tt>cond_var</tt> to notify in case they needed to interrupt the waiting
thread.  Perhaps something like:
</p>

<blockquote><pre>
struct done_with_cv
{
    void operator()(thread_state* ts) {ts-&gt;done_with_cv();}
};

void wait(unique_lock&lt;mutex&gt;&amp; lock)
{
    thread_state* ts = get_thread_state();
    if (ts)
        ts-&gt;wait_on_cv(&amp;cv_);
    unique_ptr&lt;thread_state, done_with_cv&gt; _(ts);
    error_code::value_type ec = pthread_cond_wait(&amp;cv_, lock.mutex()-&gt;native_handle());
    if (ec)
        throw system_error(ec, native_category, "cond_var wait failed");
}
</pre></blockquote>

<p>
For those operating systems which do not support a native condition variable, but do support mutexes
and semaphores, one has to use Alexander Terekhov's "algorithm 8a".  This requires two
semaphores, a mutex and three counters.
</p>

<h4><a name="gen_cond_var"></a><tt>gen_cond_var</tt></h4>

<p>
Once a minimalist but portable layer over a condition variable exists, then it becomes
possible to build higher-level layers on top of <tt>cond_var</tt>.  Those clients who
don't need or want this higher-level functionality should program directly to the
lowest-level <tt>cond_var</tt> to avoid unwanted costs associated with the higher-level
functionality (e.g. see the <tt>shared_mutex</tt> reference implementation below).
</p>

<p>
The first higher-level layer I would like to explore is <tt>gen_cond_var</tt>.  This class
is capable of waiting on any object that supports <tt>lock()</tt> and <tt>unlock()</tt>.
For example, given the earlier user-defined <tt>Lock2</tt> example, one could use <tt>gen_cond_var</tt> to compose
two mutexes with a single lock and wait on that lock:
</p>

<blockquote><pre>
std::mutex m1;
std::timed_mutex m2;
std::gen_cond_var cv;
...

void foo()
{
    Lock2&lt;std::mutex, std::timed_mutex&gt; lk(m1, m2);
    // m1 and m2 locked here
    while (not_ready_to_proceed())
        cv.wait(lk);  // m1 and m2 unlocked while sleeping
    // m1 and m2 locked here
    ...
}   // m1 and m2 unlocked here
</pre></blockquote>
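<p>
For readers who want to run the pattern above, here is a minimal sketch under stated assumptions.
It uses <tt>std::condition_variable_any</tt> from later C++ standards in the role of
<tt>gen_cond_var</tt>, and a simplified, hypothetical <tt>Lock2</tt> that may differ from the one
shown earlier in this paper.  The globals and the functions <tt>consumer</tt>, <tt>producer</tt>
and <tt>run_demo</tt> are illustrative only.
</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for the paper's Lock2: owns two mutexes at once.
template <class M1, class M2>
class Lock2
{
    M1& m1_;
    M2& m2_;
public:
    Lock2(M1& m1, M2& m2) : m1_(m1), m2_(m2) { lock(); }
    ~Lock2() { unlock(); }

    Lock2(const Lock2&) = delete;
    Lock2& operator=(const Lock2&) = delete;

    void lock()   { std::lock(m1_, m2_); }        // deadlock-avoiding lock of both
    void unlock() { m1_.unlock(); m2_.unlock(); }
};

std::mutex                  m1;
std::timed_mutex            m2;
std::condition_variable_any cv;       // plays the role of gen_cond_var
bool ready   = false;                 // protected by m1 and m2 together
int  payload = 0;                     // protected by m1 and m2 together

int consumer()
{
    Lock2<std::mutex, std::timed_mutex> lk(m1, m2);
    while (!ready)
        cv.wait(lk);                  // m1 and m2 unlocked while sleeping
    return payload;                   // m1 and m2 locked here
}

void producer(int value)
{
    {
        Lock2<std::mutex, std::timed_mutex> lk(m1, m2);
        payload = value;
        ready = true;
    }                                 // m1 and m2 unlocked here
    cv.notify_one();
}

int run_demo()
{
    int result = 0;
    std::thread t([&result] { result = consumer(); });
    producer(42);
    t.join();
    return result;
}
```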

<p>
The first question one may reasonably ask about <tt>gen_cond_var</tt> is:  This looks great.  But why
does this rise to the level of needing to be standardized?  Can't users portably create <tt>gen_cond_var</tt>
themselves?
</p>

<p>
I'd like to answer the second part of that question first:  Yes, anyone could portably create <tt>gen_cond_var</tt>
himself, layered on top of <tt>cond_var</tt>.  That is the good news.  The bad news is that while the implementation
is relatively brief, the code is exceptionally hard to get right.  Here's a first cut:
</p>

<blockquote><pre>
class gen_cond_var
{
    cond_var cv_;
    mutex    mut_;
public:
   
    gen_cond_var() = default;
    ~gen_cond_var() = default;

    gen_cond_var(const gen_cond_var&amp;) = delete;
    gen_cond_var&amp; operator=(const gen_cond_var&amp;) = delete;

    void notify_one()
    {
        cv_.notify_one();
    }

    void notify_all()
    {
        cv_.notify_all();
    }

    template &lt;class Lock&gt;
        void wait(Lock&amp; lock)
        {
            unique_lock&lt;mutex&gt; lk(mut_);
            lock.unlock();
            cv_.wait(lk);
            lock.lock();
        }

    template &lt;class Lock, class Predicate&gt;
        void wait(Lock&amp; lock, Predicate pred)
        {
            while (!pred())
                wait(lock);
        }

    template &lt;class Lock&gt;
        bool timed_wait(Lock&amp; lock, const utc_time&amp; abs_time)
        {
            unique_lock&lt;mutex&gt; lk(mut_);
            lock.unlock();
            bool notified = cv_.timed_wait(lk, abs_time);
            lock.lock();
            return notified;
        }

    template &lt;class Lock, class Predicate&gt;
        bool timed_wait(Lock&amp; lock, const utc_time&amp; abs_time, Predicate pred)
        {
            while (!pred())
                if (!timed_wait(lock, abs_time))
                    return pred();
            return true;
        }
};
</pre></blockquote>

<p>
The first problem is fairly obvious:  <tt>cv_.wait(lk)</tt> in <tt>wait</tt> might throw
and that would result in the external <tt>lock</tt> being incorrectly left unlocked as
<tt>wait</tt> exited exceptionally.  The fix is simple, and the same for both <tt>wait</tt>
and <tt>timed_wait</tt>:
</p>

<blockquote><pre>
struct lock_it
{
    template &lt;class Lock&gt;
        void operator()(Lock* lk) {lk-&gt;lock();}
};

...

    template &lt;class Lock&gt;
        void wait(Lock&amp; lock)
        {
            unique_lock&lt;mutex&gt; lk(mut_);
            lock.unlock();
            <b>unique_ptr&lt;Lock, lock_it&gt; _(&amp;lock);</b>
            cv_.wait(lk);
        }
</pre></blockquote>

<p>
Now we're exception safe, and I would expect most programmers to get this far.  But there looms a
more subtle multithread related problem.  Consider two threads executing nearly simultaneously
within <tt>wait()</tt> and <tt>notify_one()</tt>.
</p>

<blockquote>
<table border="1">

<tr>
<th>Thread A</th><th>Thread B</th>
</tr>

<tr>
<td><tt>lock.lock()</tt></td> <td>&nbsp;</td>
</tr>

<tr>
<td>check predicate, decide to wait</td> <td>&nbsp;</td>
</tr>

<tr>
<td>enter <tt>gen_cond_var::wait</tt></td> <td>&nbsp;</td>
</tr>

<tr>
<td><tt>unique_lock&lt;mutex&gt; lk(mut_)</tt></td> <td>&nbsp;</td>
</tr>

<tr>
<td><tt>lock.unlock();</tt></td> <td>&nbsp;</td>
</tr>

<tr>
<td>&nbsp;</td> <td><tt>lock.lock()</tt></td>
</tr>

<tr>
<td>&nbsp;</td> <td>change predicate, wake Thread A</td>
</tr>

<tr>
<td>&nbsp;</td> <td>enter <tt>gen_cond_var::notify_one</tt></td>
</tr>

<tr>
<td>&nbsp;</td> <td><tt>cv_.notify_one()</tt></td>
</tr>

<tr>
<td>&nbsp;</td> <td>exit <tt>gen_cond_var::notify_one</tt></td>
</tr>

<tr>
<td>&nbsp;</td> <td><tt>lock.unlock()</tt></td>
</tr>

<tr>
<td><tt>cv_.wait(lk)</tt></td> <td>&nbsp;</td>
</tr>

<tr>
<td><table align="center"><tr><td>&bull;</td></tr><tr><td>&bull;</td></tr><tr><td>&bull;</td></tr></table></td> <td>&nbsp;</td>
</tr>

<tr>
<td>Thread A never wakes!</td> <td>Thread B believes it has notified Thread A</td>
</tr>

</table>
</blockquote>

<p>
Thread A waits forever for Thread B to change the predicate, even though Thread B has already
changed the predicate and notified A.  The
problem is that the <tt>unlock/wait</tt> sequence is no longer atomic.  Providing that atomicity is exactly
what the condition variable was invented for in the first place.  To make the <tt>unlock/wait</tt> sequence atomic, it is important
to realize that it only needs to be atomic with respect to the notify functions.  Having the notify functions
lock the internal mutex accomplishes this.  Now we have:
</p>

<blockquote><pre>
struct lock_it
{
    template &lt;class Lock&gt;
        void operator()(Lock* lk) {lk-&gt;lock();}
};

class gen_cond_var
{
    cond_var cv_;
    mutex    mut_;
public:
   
    gen_cond_var() = default;
    ~gen_cond_var() = default;

    gen_cond_var(const gen_cond_var&amp;) = delete;
    gen_cond_var&amp; operator=(const gen_cond_var&amp;) = delete;

    void notify_one()
    {
        <b>scoped_lock&lt;mutex&gt; _(mut_);</b>
        cv_.notify_one();
    }

    void notify_all()
    {
        <b>scoped_lock&lt;mutex&gt; _(mut_);</b>
        cv_.notify_all();
    }

    template &lt;class Lock&gt;
        void wait(Lock&amp; external)
        {
            unique_lock&lt;mutex&gt; lk(mut_);
            external.unlock();
            unique_ptr&lt;Lock, lock_it&gt; lock_last(&amp;external);
            cv_.wait(lk);
        }

    template &lt;class Lock, class Predicate&gt;
        void wait(Lock&amp; lock, Predicate pred)
        {
            while (!pred())
                wait(lock);
        }

    template &lt;class Lock&gt;
        bool timed_wait(Lock&amp; external, const utc_time&amp; abs_time)
        {
            unique_lock&lt;mutex&gt; lk(mut_);
            external.unlock();
            unique_ptr&lt;Lock, lock_it&gt; lock_last(&amp;external);
            return cv_.timed_wait(lk, abs_time);
        }

    template &lt;class Lock, class Predicate&gt;
        bool timed_wait(Lock&amp; lock, const utc_time&amp; abs_time, Predicate pred)
        {
            while (!pred())
                if (!timed_wait(lock, abs_time))
                    return pred();
            return true;
        }
};
</pre></blockquote>

<p>
We are now exception safe.  And we have atomic  <tt>unlock/wait</tt>
with respect to the <tt>notify</tt> functions.  Now we're good to go,
right?  <b>Wrong!</b>  I personally got this far, but two
experts I highly respect didn't, at least on the first try.  And the
above code is still buggy.  It took me another year of study before I
even knew I still had a bug.  My test cases would <i>very</i> rarely
deadlock.  At times I wrote it off as an operating system bug.  But
persistent analysis turned the blame for this
very-difficult-to-reproduce behavior on myself.  Consider:
</p>

<blockquote>
<table border="1">

<tr>
<th>Thread A</th><th>Thread B</th>
</tr>

<tr>
<td><tt>external.lock()</tt></td> <td>&nbsp;</td>
</tr>

<tr>
<td>check predicate, decide to wait</td> <td>&nbsp;</td>
</tr>

<tr>
<td>enter <tt>gen_cond_var::wait</tt></td> <td>&nbsp;</td>
</tr>

<tr>
<td><tt>unique_lock&lt;mutex&gt; lk(mut_)</tt></td> <td>&nbsp;</td>
</tr>

<tr>
<td><tt>external.unlock();</tt></td> <td>&nbsp;</td>
</tr>

<tr>
<td><table><tr><td rowspan="2"><tt>cv_.wait(lk)</tt></td><td><tt>mut_.unlock()</tt></td></tr><tr><td bgcolor="#A0FFA0"><tt>mut_.lock()</tt></td></tr></table></td> <td bgcolor="#FFA0A0"><tt>external.lock()</tt></td>
</tr>

<tr>
<td bgcolor="#FFA0A0"><tt>external.lock()</tt></td> <td><tt>check predicate, decide to wait</tt></td>
</tr>

<tr>
<td rowspan="3"><table align="center"><tr><td>&bull;</td></tr><tr><td>&bull;</td></tr><tr><td>&bull;</td></tr></table></td> <td>enter <tt>gen_cond_var::wait</tt></td>
</tr>

<tr>
 <td bgcolor="#A0FFA0"><tt>unique_lock&lt;mutex&gt; lk(mut_)</tt></td>
</tr>

<tr>
 <td><table align="center"><tr><td>&bull;</td></tr><tr><td>&bull;</td></tr><tr><td>&bull;</td></tr></table></td>
</tr>

<tr>
<td>Thread A deadlocked!</td> <td>Thread B deadlocked!</td>
</tr>

</table>
</blockquote>

<p>
In a nutshell, as Thread A exits the <tt>wait</tt> it is locking the internal mutex, and then the external lock.  If
Thread B begins a <tt>wait</tt> at about the same time, it is locking these two mutexes in the opposite order:
external and then internal.  This is the classic recipe for deadlock!
</p>

<p>
The fix is to unlock the internal mutex first as you exit the <tt>wait</tt> and then lock the external mutex.  The
final correct code is as follows:
</p>

<blockquote><pre>
struct lock_it
{
    template &lt;class Lock&gt;
        void operator()(Lock* lk) {lk-&gt;lock();}
};

class gen_cond_var
{
    cond_var cv_;
    mutex    mut_;
public:
   
    gen_cond_var() = default;
    ~gen_cond_var() = default;

    gen_cond_var(const gen_cond_var&amp;) = delete;
    gen_cond_var&amp; operator=(const gen_cond_var&amp;) = delete;

    void notify_one()
    {
        scoped_lock&lt;mutex&gt; _(mut_);
        cv_.notify_one();
    }

    void notify_all()
    {
        scoped_lock&lt;mutex&gt; _(mut_);
        cv_.notify_all();
    }

    template &lt;class Lock&gt;
        void wait(Lock&amp; external)
        {
            unique_lock&lt;mutex&gt; internal(mut_);
            external.unlock();
            unique_ptr&lt;Lock, lock_it&gt;       lock_last(&amp;external);
            <b>scoped_lock&lt;unique_lock&lt;mutex&gt;&gt; unlock_next(internal, already_locked);</b>
            cv_.wait(internal);
        }  <font color="#C00000">// mut_.unlock(), external.lock()</font>

    template &lt;class Lock, class Predicate&gt;
        void wait(Lock&amp; lock, Predicate pred)
        {
            while (!pred())
                wait(lock);
        }

    template &lt;class Lock&gt;
        bool timed_wait(Lock&amp; external, const utc_time&amp; abs_time)
        {
            unique_lock&lt;mutex&gt; internal(mut_);
            external.unlock();
            unique_ptr&lt;Lock, lock_it&gt;       lock_last(&amp;external);
            <b>scoped_lock&lt;unique_lock&lt;mutex&gt;&gt; unlock_next(internal, already_locked);</b>
            return cv_.timed_wait(internal, abs_time);
        }  <font color="#C00000">// mut_.unlock(), external.lock()</font>

    template &lt;class Lock, class Predicate&gt;
        bool timed_wait(Lock&amp; lock, const utc_time&amp; abs_time, Predicate pred)
        {
            while (!pred())
                if (!timed_wait(lock, abs_time))
                    return pred();
            return true;
        }
};
</pre></blockquote>

<p>
Finally we have:
</p>

<ul>
<li>Exception safety</li>
<li>Atomic <tt>unlock</tt>/<tt>wait</tt> with respect to <tt>notify</tt></li>
<li>Proper ordering of operations as <tt>wait</tt> is exited to avoid deadlock</li>
</ul>

<p>
I'm sure there are people more experienced with condition variables than I am for whom this tutorial
was a fairly boring and obvious exercise.  However, I am personally
impressed that <tt>gen_cond_var</tt> is sufficiently difficult for the
average programmer to get right that standardizing this adaptor is
appropriate.  <tt>gen_cond_var</tt> is arguably very useful. There are
no obvious alternative options for implementing <tt>gen_cond_var</tt>.
And reinventing it is extremely error prone, even for experts.
</p>
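<p>
The three properties listed above can be tried out in a compilable sketch, assuming
<tt>std::condition_variable</tt> and <tt>std::mutex</tt> from later C++ standards in place of
<tt>cond_var</tt> and <tt>mutex</tt>.  Here <tt>std::lock_guard</tt> with <tt>std::adopt_lock</tt>
stands in for <tt>scoped_lock</tt> with <tt>already_locked</tt>, and a small hypothetical
<tt>relock_guard</tt> stands in for the <tt>unique_ptr</tt> / <tt>lock_it</tt> pair.  The names
<tt>gen_cond_var_sketch</tt> and <tt>run_gen_cond_var_demo</tt> are illustrative, and only the
untimed <tt>wait</tt> is shown.
</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

class gen_cond_var_sketch
{
    std::condition_variable cv_;
    std::mutex              mut_;

    // Relocks the external lock when destroyed (the paper's lock_it idea).
    template <class Lock>
    struct relock_guard
    {
        Lock& lock_;
        explicit relock_guard(Lock& l) : lock_(l) {}
        ~relock_guard() { lock_.lock(); }
    };

public:
    void notify_one()
    {
        std::lock_guard<std::mutex> _(mut_);   // atomic with respect to wait
        cv_.notify_one();
    }

    void notify_all()
    {
        std::lock_guard<std::mutex> _(mut_);
        cv_.notify_all();
    }

    template <class Lock>
    void wait(Lock& external)
    {
        std::unique_lock<std::mutex> internal(mut_);
        external.unlock();
        // Locals are destroyed in reverse order of construction, so on exit
        // (normal or exceptional) unlock_next runs first, then lock_last.
        relock_guard<Lock> lock_last(external);             // external.lock()
        std::lock_guard<std::unique_lock<std::mutex>>
            unlock_next(internal, std::adopt_lock);         // mut_.unlock()
        cv_.wait(internal);
    }
};

int run_gen_cond_var_demo()
{
    std::mutex m;                 // any type with lock()/unlock() would do
    gen_cond_var_sketch cv;
    bool ready = false;           // protected by m
    int  seen  = 0;

    std::thread waiter([&] {
        std::unique_lock<std::mutex> lk(m);
        while (!ready)
            cv.wait(lk);
        seen = 1;
    });
    {
        std::lock_guard<std::mutex> lk(m);
        ready = true;
    }
    cv.notify_one();
    waiter.join();
    return seen;
}
```

<p>
Swapping the declaration order of <tt>lock_last</tt> and <tt>unlock_next</tt> would reintroduce
the lock-ordering deadlock described earlier, which is why the order is commented in the sketch.
</p>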

<h4><a name="constrained_cv"></a>Constrained condition variables</h4>

<p>
The majority of use cases with condition variables involve only one mutex/lock
for the lifetime of the condition variable.  Thus there is a desire to explicitly
bind the desired mutex to the condition variable, and either check that the correct
mutex is passed to <tt>wait</tt>, or just eliminate the parameter to <tt>wait</tt>
altogether.
</p>

<p>
It is important to recognize that this one-to-one binding between condition variable
and mutex is a very common use case, but is not the <i>only</i> use case.  Therefore
it would be inappropriate for the <i>constrained condition variable</i> to be the
C++ client's only option.
</p>

<p>
A constrained condition variable might be templated on the condition variable
type so that it could work with a <tt>cond_var</tt> or <tt>gen_cond_var</tt>
or even another condition variable adaptor.  It also might be templated on
the mutex type to which it will be bound.  Below is a first cut implementation
of a constrained condition variable:
</p>

<blockquote><pre>
template &lt;class CondVar, class Mutex&gt;
class constrained_cond_var
{
    CondVar cv_;
    Mutex&amp; mut_;

    void on_error()
    {
        throw std::runtime_error("cv - mutex mismatch");
    }
public:
    explicit constrained_cond_var(Mutex&amp; mut) : mut_(mut) {}

    void notify_one() {cv_.notify_one();}
    void notify_all() {cv_.notify_all();}

    template &lt;class Lock&gt;
        void wait(Lock&amp; lock)
        {
            if (lock.mutex() != &amp;mut_)
                on_error();
            cv_.wait(lock);
        }

    template &lt;class Lock, class Predicate&gt;
        void wait(Lock&amp; lock, Predicate pred)
        {
            if (lock.mutex() != &amp;mut_)
                on_error();
            cv_.wait(lock, pred);
        }

    template &lt;class Lock&gt;
        bool timed_wait(Lock&amp; lock, const std::utc_time&amp; abs_time)
        {
            if (lock.mutex() != &amp;mut_)
                on_error();
            return cv_.timed_wait(lock, abs_time);
        }

    template &lt;class Lock, class Predicate&gt;
        bool timed_wait(Lock&amp; lock, const std::utc_time&amp; abs_time, Predicate pred)
        {
            if (lock.mutex() != &amp;mut_)
                on_error();
            return cv_.timed_wait(lock, abs_time, pred);
        }
};
</pre></blockquote>

<p>
There are several things to notice about the above code:
</p>

<ul>
<li>
The code itself is straightforward.  There are no subtleties.  It is easy to get
right.  You simply check for the desired error, and if everything is ok, just
forward to the <tt>CondVar</tt>.
</li>

<li>
<p>
It is not obvious that the design choices I made above are the correct ones.  Certainly
they will serve a large number of use cases well.  But here are some equally valid
alternative choices:
</p>
<ul>
<li>One might want to <tt>assert</tt> on error instead of throw.</li>
<li>One might want to eliminate the <tt>Lock</tt> parameter from
the <tt>wait</tt> and pass in the reference to the mutex (or lock)
that you already have.</li>
<li>One might not care to template on <tt>CondVar</tt> and just choose
<tt>gen_cond_var</tt> since it will handle everything.</li>
<li>One might take a minimalist approach, and target only <tt>cond_var</tt>.</li>
<li>One might want to create a combined condition variable/mutex object which you can both lock
and wait on.</li>
</ul>
</li>
</ul>

<p>
There are several alternatives, and they are all fairly easy to build on top of
<tt>cond_var</tt> and/or <tt>gen_cond_var</tt>.  Therefore I do not think it is
appropriate to standardize a constrained condition variable.  Nor do I believe we
could gain consensus on what it should look like, at least not at this time.
</p>
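<p>
To illustrate how little code one of these alternatives takes, here is a sketch of the last one
listed: a combined condition variable/mutex object which can be both locked and waited on.  It
assumes <tt>std::mutex</tt> and <tt>std::condition_variable</tt> from later C++ standards in place
of <tt>mutex</tt> and <tt>cond_var</tt>.  The class name <tt>monitor</tt> and the function
<tt>run_monitor_demo</tt> are hypothetical and are not proposed here.
</p>

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical "monitor": a mutex and its one condition variable in one object.
class monitor
{
    std::mutex              mut_;
    std::condition_variable cv_;

    // Makes the temporary unique_lock give up ownership without unlocking,
    // so the caller still holds the mutex when wait returns or throws.
    struct releaser
    {
        std::unique_lock<std::mutex>& lk_;
        ~releaser() { lk_.release(); }
    };

public:
    void lock()       { mut_.lock(); }
    void unlock()     { mut_.unlock(); }
    void notify_one() { cv_.notify_one(); }
    void notify_all() { cv_.notify_all(); }

    // Precondition: the calling thread holds this monitor's lock.
    template <class Predicate>
    void wait(Predicate pred)
    {
        std::unique_lock<std::mutex> lk(mut_, std::adopt_lock);
        releaser keep_locked{lk};     // destroyed before lk
        cv_.wait(lk, pred);
    }
};

int run_monitor_demo()
{
    monitor mon;
    int value    = 0;                 // protected by mon
    int observed = 0;

    std::thread consumer([&] {
        std::lock_guard<monitor> lk(mon);       // monitor is lockable
        mon.wait([&] { return value != 0; });   // no mutex argument to mismatch
        observed = value;
    });
    {
        std::lock_guard<monitor> lk(mon);
        value = 7;
    }
    mon.notify_one();
    consumer.join();
    return observed;
}
```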

<h2><a name="shared_mutex"></a>Shared / Unique Ownership Mutexes and Locks</h2>

<p>
This section explores and motivates the TR2-targeted shared (read/write) mutexes.
The reason for including these in this paper is to impress upon the reader how the
entire package works together.  Some of the design choices for the C++0X-targeted
components are not obvious until one considers these TR2 components (e.g. <tt>cond_var</tt>,
<tt>gen_cond_var</tt> and namespace scope templated locks).
</p>

<h3><a name="shared_mutex_synop"></a><tt>&lt;shared_mutex&gt;</tt> Synopsis</h3>

<blockquote><pre>
namespace std {
namespace tr2 {

<font color="#C00000">// A mutex supporting both unique (write) and shared (read) ownership</font>
class shared_mutex
{
public:

    shared_mutex();
    ~shared_mutex();

    shared_mutex(const shared_mutex&amp;) = delete;
    shared_mutex&amp; operator=(const shared_mutex&amp;) = delete;

<font color="#C00000">// Unique ownership</font>

    void lock();
    bool try_lock();
    bool timed_lock(nanoseconds rel_time);
    void unlock();

<font color="#C00000">// Shared ownership</font>

    void lock_shared();
    bool try_lock_shared();
    bool timed_lock_shared(nanoseconds rel_time);
    void unlock_shared();
};

<font color="#C00000">// A mutex supporting both unique and shared ownership,</font>
<font color="#C00000">//   and the ability to convert between unique and shared ownership.</font>
<font color="#C00000">// Upgrade ownership is exclusive among unique and upgrade ownerships, but shares with</font>
<font color="#C00000">//   shared ownership.  It alone can perform a non-timed, non-try conversion to unique ownership.</font>
class upgrade_mutex
{
public:

    upgrade_mutex();
    ~upgrade_mutex();

    upgrade_mutex(const upgrade_mutex&amp;) = delete;
    upgrade_mutex&amp; operator=(const upgrade_mutex&amp;) = delete;

<font color="#C00000">// Unique ownership</font>

    void lock();
    bool try_lock();
    bool timed_lock(nanoseconds rel_time);
    void unlock();

<font color="#C00000">// Shared ownership</font>

    void lock_shared();
    bool try_lock_shared();
    bool timed_lock_shared(nanoseconds rel_time);
    void unlock_shared();

<font color="#C00000">// Upgrade ownership</font>

    void lock_upgrade();
    bool try_lock_upgrade();
    bool timed_lock_upgrade(nanoseconds rel_time);
    void unlock_upgrade();

<font color="#C00000">// Shared &lt;-&gt; Unique</font>

    bool try_unlock_shared_and_lock();
    bool timed_unlock_shared_and_lock(nanoseconds rel_time);
    void unlock_and_lock_shared();

<font color="#C00000">// Shared &lt;-&gt; Upgrade</font>

    bool try_unlock_shared_and_lock_upgrade();
    bool timed_unlock_shared_and_lock_upgrade(nanoseconds rel_time);
    void unlock_upgrade_and_lock_shared();

<font color="#C00000">// Upgrade &lt;-&gt; Unique</font>

    void unlock_upgrade_and_lock();  <font color="#C00000">// This conversion is unique to upgrade ownership</font>
    bool try_unlock_upgrade_and_lock();
    bool timed_unlock_upgrade_and_lock(nanoseconds rel_time);
    void unlock_and_lock_upgrade();
};

<font color="#C00000">// A movable (multi-scope) RAII wrapper for share-locking and share-unlocking mutexes supporting
//   shared ownership such as shared_mutex and upgrade_mutex.  unique_lock is used for locking and unlocking
//   such mutexes for unique (write) ownership.</font>
template &lt;class Mutex&gt;
class shared_lock
{
public:
    typedef Mutex mutex_type;

    shared_lock();
    explicit shared_lock(mutex_type&amp; m);
    shared_lock(mutex_type&amp; m, do_not_lock_type);
    shared_lock(mutex_type&amp; m, try_to_lock_type);
    shared_lock(mutex_type&amp; m, already_locked_type);
    shared_lock(mutex_type&amp; m, nanoseconds rel_t);
    ~shared_lock();

    shared_lock(shared_lock const&amp;) = delete;
    shared_lock&amp; operator=(shared_lock const&amp;) = delete;

    shared_lock(shared_lock&amp;&amp; u);
    shared_lock&amp; operator=(shared_lock&amp;&amp; u);

    <font color="#C00000">// convert upgrade ownership to shared ownership</font>
    explicit shared_lock(upgrade_lock&lt;mutex_type&gt;&amp;&amp;);
    explicit shared_lock(const upgrade_lock&lt;mutex_type&gt;&amp;) = delete;

    <font color="#C00000">// convert unique ownership to shared ownership</font>
    explicit shared_lock(unique_lock&lt;mutex_type&gt;&amp;&amp;);
    explicit shared_lock(const unique_lock&lt;mutex_type&gt;&amp;) = delete;

    void lock();                         <font color="#C00000">// calls lock_shared()</font>
    bool try_lock();                     <font color="#C00000">// calls try_lock_shared()</font>
    bool timed_lock(nanoseconds rel_t);  <font color="#C00000">// calls timed_lock_shared()</font>
    void unlock();                       <font color="#C00000">// calls unlock_shared()</font>

    bool owns_lock() const;
    mutex_type* mutex() const;

    void swap(shared_lock&amp;&amp; u);
    mutex_type* release();
};

template &lt;class Mutex&gt; void swap(shared_lock&lt;Mutex&gt;&amp;  x, shared_lock&lt;Mutex&gt;&amp;  y);
template &lt;class Mutex&gt; void swap(shared_lock&lt;Mutex&gt;&amp;&amp; x, shared_lock&lt;Mutex&gt;&amp;  y);
template &lt;class Mutex&gt; void swap(shared_lock&lt;Mutex&gt;&amp;  x, shared_lock&lt;Mutex&gt;&amp;&amp; y);

<font color="#C00000">// A movable (multi-scope) RAII wrapper for upgrade-locking and upgrade-unlocking mutexes supporting
//   upgrade ownership such as upgrade_mutex.  unique_lock is used for locking and unlocking
//   such mutexes for unique (write) ownership.  shared_lock is used for locking and unlocking
//   such mutexes for shared ownership.</font>
template &lt;class Mutex&gt;
class upgrade_lock
{
public:
    typedef Mutex mutex_type;

    upgrade_lock();
    explicit upgrade_lock(mutex_type&amp; m);
    upgrade_lock(mutex_type&amp; m, do_not_lock_type);
    upgrade_lock(mutex_type&amp; m, try_to_lock_type);
    upgrade_lock(mutex_type&amp; m, already_locked_type);
    upgrade_lock(mutex_type&amp; m, nanoseconds rel_t);
    ~upgrade_lock();

    upgrade_lock(upgrade_lock const&amp;) = delete;
    upgrade_lock&amp; operator=(upgrade_lock const&amp;) = delete;

    upgrade_lock(upgrade_lock&amp;&amp; u);
    upgrade_lock&amp; operator=(upgrade_lock&amp;&amp; u);

    <font color="#C00000">// convert shared ownership to upgrade ownership</font>
    upgrade_lock(shared_lock&lt;mutex_type&gt;&amp;&amp;, try_to_lock_type);
    upgrade_lock(const shared_lock&lt;mutex_type&gt;&amp;, try_to_lock_type) = delete;

    upgrade_lock(shared_lock&lt;mutex_type&gt;&amp;&amp;, nanoseconds rel_t);
    upgrade_lock(const shared_lock&lt;mutex_type&gt;&amp;, nanoseconds rel_t) = delete;

    <font color="#C00000">// convert unique ownership to upgrade ownership</font>
    explicit upgrade_lock(unique_lock&lt;mutex_type&gt;&amp;&amp;);
    explicit upgrade_lock(const unique_lock&lt;mutex_type&gt;&amp;) = delete;

    void lock();                         <font color="#C00000">// calls lock_upgrade()</font>
    bool try_lock();                     <font color="#C00000">// calls try_lock_upgrade()</font>
    bool timed_lock(nanoseconds rel_t);  <font color="#C00000">// calls timed_lock_upgrade()</font>
    void unlock();                       <font color="#C00000">// calls unlock_upgrade()</font>

    bool owns_lock() const;
    mutex_type* mutex() const;

    void swap(upgrade_lock&amp;&amp; u);
    mutex_type* release();
};

template &lt;class Mutex&gt; void swap(upgrade_lock&lt;Mutex&gt;&amp;  x, upgrade_lock&lt;Mutex&gt;&amp;  y);
template &lt;class Mutex&gt; void swap(upgrade_lock&lt;Mutex&gt;&amp;&amp; x, upgrade_lock&lt;Mutex&gt;&amp;  y);
template &lt;class Mutex&gt; void swap(upgrade_lock&lt;Mutex&gt;&amp;  x, upgrade_lock&lt;Mutex&gt;&amp;&amp; y);

template &lt;class ToLock, class FromLock&gt;
class transfer_lock
{
    ToLock to_lock_;       // exposition only
    FromLock&amp; from_lock_;  // exposition only
public:
    typedef typename FromLock::mutex_type mutex_type;

    explicit transfer_lock(FromLock&amp; fl);  <font color="#C00000">// Transfers ownership from FromLock to ToLock</font>
    ~transfer_lock();                      <font color="#C00000">// Transfers ownership from ToLock to FromLock</font>

    transfer_lock(const transfer_lock&amp;) = delete;
    transfer_lock&amp; operator=(const transfer_lock&amp;) = delete;

    void lock();                 <font color="#C00000">// Transfers ownership from FromLock to ToLock</font>
    void try_lock();             <font color="#C00000">// Tries to transfer ownership from FromLock to ToLock</font>
    void unlock();               <font color="#C00000">// Transfers ownership from ToLock to FromLock</font>

    bool owns_lock() const;
    mutex_type* mutex() const;
};

}  // tr2
}  // std
</pre></blockquote>

<h3><a name="shared_mutex_rationale"></a><tt>shared_mutex</tt> Rationale</h3>

<p>
<tt>shared_mutex</tt> is roughly equivalent to <tt>pthread_rwlock_t</tt>.  It is a
"read/write" mutex.  The reason the name <i>shared</i> was chosen over <i>read</i>
or <i>read-write</i> is to emphasize the behavior of the mutex, instead of an example
use case of the mutex.  There are many more reasons for a group of threads to share
ownership of a resource other than "reading it".
</p>

<p>
There are more naming differences between <tt>shared_mutex</tt> and <tt>pthread_rwlock_t</tt>,
largely due to generic programming concerns.  Here is a comparison of <tt>shared_mutex</tt>
names and their POSIX counterparts:
</p>

<blockquote>
<table border="1">
<tr>
<th><tt>shared_mutex</tt></th> <th><tt>pthread_rwlock_t</tt></th> <th>Semantics</th>
</tr>
<tr>
<td><tt>lock</tt></td> <td><tt>wrlock</tt></td> <td>Lock the mutex in unique ownership mode</td> 
</tr>
<tr>
<td><tt>try_lock</tt></td> <td><tt>trywrlock</tt></td> <td>Try lock the mutex in unique ownership mode</td> 
</tr>
<tr>
<td><tt>timed_lock</tt></td> <td><tt>timedwrlock</tt></td> <td>Timed lock the mutex in unique ownership mode</td> 
</tr>
<tr>
<td><tt>unlock</tt></td> <td><tt>unlock</tt></td> <td>unlock the mutex from unique ownership mode</td> 
</tr>
<tr>
<td><tt>lock_shared</tt></td> <td><tt>rdlock</tt></td> <td>Lock the mutex in shared ownership mode</td> 
</tr>
<tr>
<td><tt>try_lock_shared</tt></td> <td><tt>tryrdlock</tt></td> <td>Try lock the mutex in shared ownership mode</td> 
</tr>
<tr>
<td><tt>timed_lock_shared</tt></td> <td><tt>timedrdlock</tt></td> <td>Timed lock the mutex in shared ownership mode</td> 
</tr>
<tr>
<td><tt>unlock_shared</tt></td> <td><tt>unlock</tt></td> <td>unlock the mutex from shared ownership mode</td> 
</tr>
</table>
</blockquote>

<p>
The most important point to notice in the above table is that the names for locking a <tt>shared_mutex</tt>
in unique ownership mode are identical to the names used for the unique ownership mutexes (<tt>mutex</tt>, 
<tt>recursive_mutex</tt>, <tt>timed_mutex</tt>, <tt>recursive_timed_mutex</tt>).  This means that <tt>shared_mutex</tt>
can be used in generic mutex code which assumes unique ownership semantics.  A prime example of
such generic code is <tt>unique_lock</tt>.  The recommended  technique for locking a <tt>shared_mutex</tt>
in unique ownership mode is to construct a <tt>unique_lock&lt;shared_mutex&gt;</tt>.
</p>
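<p>
As an illustration of this naming choice, the sketch below uses <tt>std::shared_mutex</tt> and
<tt>std::shared_lock</tt> as they later appeared in standard C++ (an assumption relative to the
TR2-targeted types in this paper, whose details may differ).  The generic <tt>unique_lock</tt>
takes unique ownership through <tt>lock()</tt> / <tt>unlock()</tt>, while <tt>shared_lock</tt>
takes shared ownership through <tt>lock_shared()</tt> / <tt>unlock_shared()</tt>.  The class
<tt>counter</tt> and the function <tt>run_counter_demo</tt> are hypothetical.
</p>

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <vector>

class counter
{
    mutable std::shared_mutex mut_;
    int value_ = 0;
public:
    void increment()
    {
        // Generic unique-ownership wrapper: calls mut_.lock() / mut_.unlock().
        std::unique_lock<std::shared_mutex> lk(mut_);
        ++value_;
    }

    int get() const
    {
        // Shared-ownership wrapper: calls mut_.lock_shared() / mut_.unlock_shared().
        std::shared_lock<std::shared_mutex> lk(mut_);
        return value_;
    }
};

int run_counter_demo()
{
    counter c;
    std::vector<std::thread> writers;
    for (int i = 0; i < 4; ++i)
        writers.emplace_back([&c] {
            for (int j = 0; j < 1000; ++j)
                c.increment();
        });
    for (auto& t : writers)
        t.join();
    return c.get();   // 4 writers * 1000 increments each
}
```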

<p>
A second difference to notice is that <tt>shared_mutex</tt> uses different names for unlocking
from unique ownership as opposed to unlocking from shared ownership.  POSIX uses the same name
for both operations.  The different names are desired because these are two distinct operations.
Client code either knows how the mutex is locked (and thus how to unlock it), or is using an adaptor/lock wrapped around
the <tt>shared_mutex</tt> that knows how it is locked (e.g. a <tt>shared_lock</tt> or <tt>unique_lock</tt>).
</p>

<p>
Additionally, some operating systems (e.g. Windows) have different names for unlocking shared vs unlocking
unique as well.  Use of different names in the C++ API allows for a more efficient binding to such OS
APIs.
</p>

<p>
When viewed in terms of concepts, <tt>timed_mutex</tt> is a <tt>mutex</tt> that adds one more signature
(<tt>timed_lock</tt>).  And <tt>shared_mutex</tt> is a <tt>timed_mutex</tt> that adds four more
signatures (for shared locking/unlocking).
</p>

<p>
Like POSIX, <tt>shared_mutex</tt> allows recursive shared mode locking, but not recursive unique
ownership mode locking.  A recursive shared mutex could be built upon <tt>shared_mutex</tt> if
desired, but this is not proposed.
</p>

<h3><a name="shared_mutex_imp"></a><tt>shared_mutex</tt> Reference Implementation</h3>

<p>
<tt>shared_mutex</tt> can certainly be implemented on top of an OS supplied read-write mutex.
However a portable subset of the implementation is shown here for the purpose of motivating the existence
of <tt>cond_var</tt>:  the <b>razor-thin</b> layer over the OS condition variable.
</p>

<p>
A secondary
motivation is to explain the lack of reader-writer priority policies in <tt>shared_mutex</tt>.
This is due to an algorithm credited to Alexander Terekhov which lets the OS decide which thread
is the next to get the lock without caring whether a unique lock or shared lock is being sought.
This results in a complete lack of reader or writer starvation.  It is simply fair.
</p>

<p>
Below is most of the implementation of <tt>shared_mutex</tt> demonstrating that with this
proposal, very high quality synchronization devices can be easily coded, and with the same
efficiency as if they were coded straight to the OS.
The timed-locking functions are omitted only for brevity.  They are similar to the locking
functions but make use of <tt>cond_var::timed_wait</tt>.  Dependence on the lower-level
C++ API is highlighted.
</p>

<blockquote><pre>
class shared_mutex
{
    <b>mutex    mut_;
    cond_var gate1_;
    cond_var gate2_;</b>
    unsigned state_;

    static const unsigned write_entered_ = 1U &lt;&lt; (sizeof(unsigned)*CHAR_BIT - 1);
    static const unsigned n_readers_ = ~write_entered_;

public:

    shared_mutex() : state_(0) {}

<font color="#C00000">// Exclusive ownership</font>

    void lock();
    bool try_lock();
    bool timed_lock(nanoseconds rel_time);
    void unlock();

<font color="#C00000">// Shared ownership</font>

    void lock_shared();
    bool try_lock_shared();
    bool timed_lock_shared(nanoseconds rel_time);
    void unlock_shared();
};

<font color="#C00000">// Exclusive ownership</font>

void
shared_mutex::lock()
{
    std::this_thread::disable_interruption _;
    <b>unique_lock&lt;mutex&gt; lk(mut_);</b>
    while (state_ &amp; write_entered_)
        <b>gate1_.wait(lk);</b>
    state_ |= write_entered_;
    while (state_ &amp; n_readers_)
        <b>gate2_.wait(lk);</b>
}

bool
shared_mutex::try_lock()
{
    <b>unique_lock&lt;mutex&gt; lk(mut_, try_to_lock);</b>
    if (<b>lk.owns_lock()</b> &amp;&amp; state_ == 0)
    {
        state_ = write_entered_;
        return true;
    }
    return false;
}

void
shared_mutex::unlock()
{
    {
    <b>scoped_lock&lt;mutex&gt; _(mut_);</b>
    state_ = 0;
    }
    <b>gate1_.notify_all();</b>
}

<font color="#C00000">// Shared ownership</font>

void
shared_mutex::lock_shared()
{
    std::this_thread::disable_interruption _;
    <b>unique_lock&lt;mutex&gt; lk(mut_);</b>
    while ((state_ &amp; write_entered_) || (state_ &amp; n_readers_) == n_readers_)
        <b>gate1_.wait(lk);</b>
    unsigned num_readers = (state_ &amp; n_readers_) + 1;
    state_ &amp;= ~n_readers_;
    state_ |= num_readers;
}

bool
shared_mutex::try_lock_shared()
{
    <b>unique_lock&lt;mutex&gt; lk(mut_, try_to_lock);</b>
    if (<b>lk.owns_lock()</b> &amp;&amp; !(state_ &amp; write_entered_) &amp;&amp; (state_ &amp; n_readers_) != n_readers_)
    {
        <font color="#C00000">// state_ is read only after mut_ is known to be held</font>
        unsigned num_readers = (state_ &amp; n_readers_) + 1;
        state_ &amp;= ~n_readers_;
        state_ |= num_readers;
        return true;
    }
    return false;
}

void
shared_mutex::unlock_shared()
{
    <b>scoped_lock&lt;mutex&gt; _(mut_);</b>
    unsigned num_readers = (state_ &amp; n_readers_) - 1;
    state_ &amp;= ~n_readers_;
    state_ |= num_readers;
    if (state_ &amp; write_entered_)
    {
        if (num_readers == 0)
            <b>gate2_.notify_one();</b>
    }
    else
    {
        if (num_readers == n_readers_ - 1)
            <b>gate1_.notify_one();</b>
    }
}
</pre></blockquote>
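<p>
The omitted timed-locking functions might look like the following sketch.  It is not the
reference implementation.  It ports the two-gate layout above to the C++11 names
<tt>std::mutex</tt> and <tt>std::condition_variable</tt>, uses <tt>wait_until</tt> in place of
<tt>cond_var::timed_wait</tt>, and omits <tt>disable_interruption</tt>.  The class name
<tt>two_gate_shared_mutex</tt> and the member name <tt>try_lock_for</tt> are assumptions.  The
point of interest is the second gate:  a writer that times out while waiting for readers to
drain has to withdraw its <tt>write_entered_</tt> claim and reopen the first gate.
</p>

```cpp
#include <cassert>
#include <chrono>
#include <climits>
#include <condition_variable>
#include <mutex>

class two_gate_shared_mutex
{
    std::mutex              mut_;
    std::condition_variable gate1_;
    std::condition_variable gate2_;
    unsigned                state_ = 0;

    static constexpr unsigned write_entered_ = 1U << (sizeof(unsigned)*CHAR_BIT - 1);
    static constexpr unsigned n_readers_ = ~write_entered_;

public:
    // Exclusive ownership, giving up once rel_time has elapsed.
    template <class Rep, class Period>
    bool try_lock_for(const std::chrono::duration<Rep, Period>& rel_time)
    {
        const auto deadline = std::chrono::steady_clock::now() + rel_time;
        std::unique_lock<std::mutex> lk(mut_);
        // Gate 1: wait until no other writer has entered.
        if (!gate1_.wait_until(lk, deadline,
                               [this] { return (state_ & write_entered_) == 0; }))
            return false;
        state_ |= write_entered_;
        // Gate 2: wait for the readers already inside to drain.
        if (!gate2_.wait_until(lk, deadline,
                               [this] { return (state_ & n_readers_) == 0; }))
        {
            // Timed out: withdraw the claim and wake threads parked at gate 1.
            state_ &= ~write_entered_;
            lk.unlock();
            gate1_.notify_all();
            return false;
        }
        return true;
    }

    void unlock()
    {
        {
        std::lock_guard<std::mutex> _(mut_);
        state_ = 0;
        }
        gate1_.notify_all();
    }

    void lock_shared()
    {
        std::unique_lock<std::mutex> lk(mut_);
        while ((state_ & write_entered_) || (state_ & n_readers_) == n_readers_)
            gate1_.wait(lk);
        unsigned num_readers = (state_ & n_readers_) + 1;
        state_ &= ~n_readers_;
        state_ |= num_readers;
    }

    void unlock_shared()
    {
        std::lock_guard<std::mutex> _(mut_);
        assert((state_ & n_readers_) != 0);  // caller must hold shared ownership
        unsigned num_readers = (state_ & n_readers_) - 1;
        state_ &= ~n_readers_;
        state_ |= num_readers;
        if (state_ & write_entered_)
        {
            if (num_readers == 0)
                gate2_.notify_one();
        }
        else
        {
            if (num_readers == n_readers_ - 1)
                gate1_.notify_one();
        }
    }
};
```

<p>
With one shared owner present, <tt>try_lock_for</tt> passes the first gate, waits at the second
until the deadline, and then returns <tt>false</tt> leaving the mutex usable by further readers.
</p>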

<h3><a name="shared_lock_rationale"></a><tt>shared_lock</tt> Rationale</h3>

<p>
<tt>shared_lock</tt> is a movable RAII wrapper for locking a mutex in shared lock mode.  It is
closely analogous to <tt>unique_lock</tt>.  The chief difference is that when you <tt>lock()</tt>
a <tt>shared_lock</tt> it calls <tt>lock_shared()</tt> on the referenced mutex.
</p>

<p>
The reason that the member functions of <tt>shared_lock</tt> use the member functions
<tt>lock</tt>, <tt>try_lock</tt>, <tt>timed_lock</tt>, and <tt>unlock</tt>, as opposed
to <tt>lock_shared</tt>, <tt>try_lock_shared</tt>, <tt>timed_lock_shared</tt>, and <tt>unlock_shared</tt>,
is to facilitate generic code for locks.  For example the standard defined generic locking algorithm
which locks multiple locks without deadlock can just as easily work on a <tt>shared_lock&lt;shared_mutex&gt;</tt>
as a <tt>unique_lock&lt;mutex&gt;</tt>.
</p>
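<p>
The effect of this naming choice can be seen in a small sketch.  Here <tt>std::unique_lock</tt>
and <tt>std::shared_lock</tt> from the C++14 standard library stand in for the proposed locks,
and <tt>tracing_mutex</tt>, <tt>lock_then_unlock</tt> and <tt>trace_both_locks</tt> are
hypothetical names introduced only for illustration.  The same generic function drives both
locks through <tt>lock()</tt> and <tt>unlock()</tt>, and each lock forwards to the mutex
operations for its own ownership mode.
</p>

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <string>

// Hypothetical mutex that records which of its members were called.
// It does no real locking and is only meant for single-threaded illustration.
struct tracing_mutex
{
    std::string log;

    void lock()            { log += "lock "; }
    bool try_lock()        { log += "try_lock "; return true; }
    void unlock()          { log += "unlock "; }

    void lock_shared()     { log += "lock_shared "; }
    bool try_lock_shared() { log += "try_lock_shared "; return true; }
    void unlock_shared()   { log += "unlock_shared "; }
};

// Generic code written once against the homogeneous lock interface.
template <class Lock>
void lock_then_unlock(Lock& lk)
{
    lk.lock();
    lk.unlock();
}

// Drives a unique_lock and then a shared_lock through the same generic
// function and returns the sequence of mutex calls that resulted.
inline std::string trace_both_locks()
{
    tracing_mutex m;
    std::unique_lock<tracing_mutex> unique_lk(m, std::defer_lock);
    std::shared_lock<tracing_mutex> shared_lk(m, std::defer_lock);
    lock_then_unlock(unique_lk);  // forwards to m.lock(), m.unlock()
    lock_then_unlock(shared_lk);  // forwards to m.lock_shared(), m.unlock_shared()
    assert(!unique_lk.owns_lock() && !shared_lk.owns_lock());
    return m.log;
}
```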

<p>
The <tt>shared_lock</tt> constructors which convert from <tt>unique_lock</tt> and from
<tt>upgrade_lock</tt> will not compile for <tt>shared_lock&lt;shared_mutex&gt;</tt>.  In
order to use these constructors one must instantiate <tt>shared_lock</tt> with a mutex
type which supports <tt>unlock_and_lock_shared</tt> and <tt>unlock_upgrade_and_lock_shared</tt>
respectively.  <tt>upgrade_mutex</tt> is a mutex which meets these requirements.
</p>

<p>
Similarly, <tt>shared_mutex</tt> does not support the interface required to convert a <tt>shared_lock</tt>
to a <tt>unique_lock</tt>.  Again, this is what <tt>upgrade_mutex</tt> is designed to do.
</p>

<p>
Here is example code which locks a <tt>shared_mutex</tt> in both unique and shared ownership modes:
</p>

<blockquote><pre>
std::tr2::shared_mutex mut;

void example_reader()
{
    std::tr2::shared_lock&lt;std::tr2::shared_mutex&gt; _(mut);
    <font color="#C00000">// mut is now shared-locked</font>
    <font color="#C00000">// ...</font>
}   <font color="#C00000">// mut is now unlocked</font>

void example_writer()
{
    std::scoped_lock&lt;std::tr2::shared_mutex&gt; _(mut);
    <font color="#C00000">// mut is now unique-locked</font>
    <font color="#C00000">// ...</font>
}   <font color="#C00000">// mut is now unlocked</font>
</pre></blockquote>

<p>
Here is example code which waits on a <tt>shared_mutex</tt> in both unique and shared ownership modes:
</p>

<blockquote><pre>
std::tr2::shared_mutex mut;
std::gen_cond_var cv;

void wait_in_shared_ownership_mode()
{
    std::tr2::shared_lock&lt;std::tr2::shared_mutex&gt; shared_lk(mut);
    <font color="#C00000">// mut is now shared-locked</font>
    <font color="#C00000">// ...</font>
    while (not_ready_to_proceed())
        cv.wait(shared_lk);  // shared-lock released while waiting
    <font color="#C00000">// mut is now shared-locked</font>
    <font color="#C00000">// ...</font>
}   <font color="#C00000">// mut is now unlocked</font>

void wait_in_unique_ownership_mode()
{
    std::unique_lock&lt;std::tr2::shared_mutex&gt; lk(mut);
    <font color="#C00000">// mut is now unique-locked</font>
    <font color="#C00000">// ...</font>
    while (not_ready_to_proceed())
        cv.wait(lk);  // unique-lock released while waiting
    <font color="#C00000">// mut is now unique-locked</font>
    <font color="#C00000">// ...</font>
}   <font color="#C00000">// mut is now unlocked</font>
</pre></blockquote>

<p>
Here is example code which implements a copy assignment operator for a class, shared-locking
the source (which it will not modify), and unique-locking the target.  Each instance of the
class carries a member <tt>shared_mutex</tt> to protect the instance data.  Care must be
taken to avoid deadlock if one thread assigns <tt>a1 = a2;</tt> at the same time that
another thread assigns  <tt>a2 = a1;</tt>.
</p>

<blockquote><pre>
class A
{
    typedef std::tr2::shared_mutex         mutex_t;
    typedef std::tr2::shared_lock&lt;mutex_t&gt; ReadLock;
    typedef std::unique_lock&lt;mutex_t&gt;      WriteLock;

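    <font color="#C00000">// mutable so that operator= can lock the mutex of a const source</font>
    mutable mutex_t mut_;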
    <font color="#C00000">// ... more data ...</font>
public:
    // ...
    A&amp; operator=(const A&amp; a)
    {
        if (this != &amp;a)
        {
            WriteLock this_lock(mut_, do_not_lock);
            ReadLock that_lock(a.mut_, do_not_lock);
            std::lock(this_lock, that_lock);  // lock both locks "atomically" (without deadlock)
            <font color="#C00000">// mut_ is now unique-locked and a.mut_ is now share-locked</font>
            <font color="#C00000">// ... now safe to assign data ...</font>
        }   <font color="#C00000">// mut_ and a.mut_ now unlocked, even if there was an exception</font>
        return *this;
    }
    // ...
};
</pre></blockquote>

<h3><a name="upgrade_mutex_rationale"></a><tt>upgrade_mutex</tt> Rationale</h3>

<p>
In the mutex concept hierarchy, <tt>upgrade_mutex</tt> is a <tt>shared_mutex</tt> that adds
a third ownership mode (unique ownership, shared ownership and now upgrade ownership) and
the ability to convert between these three ownership modes.  The reason <tt>upgrade_mutex</tt>
exists at all is to facilitate converting shared ownership to unique ownership without
unlocking.
</p>

<p>
The value in converting shared ownership to unique ownership without unlocking is that
there may have been considerable work done under shared ownership that would have to be
redone if the shared lock is released.  In this use case the client needs to be sure that
he is the next owner with unique ownership rights.  To understand why a third ownership
mode is needed to achieve this, consider:
</p>

<blockquote>
<table border="1">
<tr>
<th>Thread A</th> <th>Thread B</th> <th>Thread C</th>
</tr>
<tr>
<td><tt>mut.lock_shared();</tt></td> <td><tt>mut.lock_shared();</tt></td> <td><tt>mut.lock_shared();</tt></td>
</tr>
<tr>
<td><tt>mut.unlock_shared_and_lock();</tt></td> <td><tt>mut.unlock_shared_and_lock();</tt></td>  <td><tt>mut.unlock_shared_and_lock();</tt></td>
</tr>
<tr>
<td><table align="center"><tr><td>blocked</td></tr><tr><td align="center">&bull;</td></tr><tr><td align="center">&bull;</td></tr></table></td> <td><table align="center"><tr><td>blocked</td></tr><tr><td align="center">&bull;</td></tr><tr><td align="center">&bull;</td></tr></table></td> <td><table align="center"><tr><td>blocked</td></tr><tr><td align="center">&bull;</td></tr><tr><td align="center">&bull;</td></tr></table></td>
</tr>
<tr>
<td>Thread A deadlocked</td> <td>Thread B deadlocked</td>  <td>Thread C deadlocked</td>
</tr>
</table>
</blockquote>

<p>
We can promise Thread A <b>or</b> Thread B <b>or</b> Thread C that they
are the next one to get unique ownership rights, but we can't make that
promise to all threads (or even two).  Therefore the hypothetical
<tt>unlock_shared_and_lock()</tt> member shown above <b>does not
exist</b>.  So how does a third ownership mode (upgrade ownership) help this?
</p>

<p>
Upgrade ownership is a <i>privileged</i> form of shared ownership.  It shares ownership with
other threads that have shared ownership of the mutex.  But it does not share ownership
with other threads wanting upgrade ownership (much less unique ownership).  This allows
the programmer to choose a single shared ownership thread and designate it as the one
thread among all other shared ownership threads that can convert its ownership to
unique:
</p>

<blockquote>
<table border="1">
<tr>
<th>Thread A</th> <th>Thread B</th> <th>Thread C</th>
</tr>
<tr>
<td><tt>mut.lock_shared();</tt></td> <td><tt>mut.lock_shared();</tt></td> <td><tt>mut.lock_upgrade();</tt></td>
</tr>
<tr>
<td><table align="center"><tr><td>working</td></tr><tr><td align="center">&bull;</td></tr><tr><td align="center">&bull;</td></tr></table></td> <td><table align="center"><tr><td>working</td></tr><tr><td align="center">&bull;</td></tr><tr><td align="center">&bull;</td></tr></table></td> <td><table align="center"><tr><td>working</td></tr><tr><td align="center">&bull;</td></tr><tr><td align="center">&bull;</td></tr></table></td>
</tr>
<tr>
<td align="center">&bull;</td> <td align="center">&bull;</td>  <td><tt>mut.unlock_upgrade_and_lock();</tt></td>
</tr>
<tr>
<td><table align="center"><tr><td align="center">&bull;</td></tr><tr><td><tt>mut.unlock_shared()</tt></td></tr><tr><td align="center">working without ownership</td></tr></table></td> <td><table align="center"><tr><td align="center">&bull;</td></tr><tr><td align="center">&bull;</td></tr><tr><td><tt>mut.unlock_shared();</tt></td></tr></table></td> <td><table align="center"><tr><td>blocked</td></tr><tr><td align="center">&bull;</td></tr><tr><td align="center">&bull;</td></tr></table></td>
</tr>
<tr>
<td><table align="center"><tr><td>&bull;</td></tr><tr><td>&bull;</td></tr><tr><td>&bull;</td></tr></table></td> <td><table align="center"><tr><td>working without ownership</td></tr><tr><td align="center">&bull;</td></tr><tr><td align="center">&bull;</td></tr></table></td> <td><table align="center"><tr><td>working with unique ownership</td></tr><tr><td align="center">&bull;</td></tr><tr><td align="center">&bull;</td></tr></table></td>
</tr>
</table>
</blockquote>

<p>
The next question one may reasonably ask is:  If only one thread has shared ownership, why can't I then convert it
to unique ownership?  The answer is you can.  This is the purpose of the <tt>try_unlock_shared_and_lock()</tt> member
of <tt>upgrade_mutex</tt>.  We can't promise to give the owner of shared ownership the next unique ownership.  But we
can try.  Additionally there is even a <tt>timed_unlock_shared_and_lock()</tt> member of <tt>upgrade_mutex</tt> in
case you want to wait a little while before giving up.  Naturally this is plenty of rope for the client to hang
himself if he abuses this interface with long wait times (this is C++, where the tools have good handles, but are
sharp nonetheless).
</p>
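<p>
The following sketch shows how such a try-conversion could sit on top of the <tt>state_</tt>
word of the <tt>shared_mutex</tt> reference implementation, with one extra bit reserved for an
upgrade owner.  It is an illustration under stated assumptions, not the proposed
implementation:  <tt>std::mutex</tt> and <tt>std::condition_variable</tt> stand in for
<tt>mutex</tt> and <tt>cond_var</tt>, the class name <tt>mini_upgrade_mutex</tt> is hypothetical,
and only the members needed for the demonstration are present.  The conversion succeeds only
when the caller is the sole owner of any kind, so it never has to wait for another owner.
</p>

```cpp
#include <cassert>
#include <climits>
#include <condition_variable>
#include <mutex>

class mini_upgrade_mutex
{
    std::mutex              mut_;
    std::condition_variable gate1_;
    std::condition_variable gate2_;
    unsigned                state_ = 0;

    static constexpr unsigned write_entered_ = 1U << (sizeof(unsigned)*CHAR_BIT - 1);
    static constexpr unsigned upgradable_entered_ = write_entered_ >> 1;
    static constexpr unsigned n_readers_ = ~(write_entered_ | upgradable_entered_);

public:
    void lock_shared()
    {
        std::unique_lock<std::mutex> lk(mut_);
        while ((state_ & write_entered_) || (state_ & n_readers_) == n_readers_)
            gate1_.wait(lk);
        unsigned num_readers = (state_ & n_readers_) + 1;
        state_ &= ~n_readers_;
        state_ |= num_readers;
    }

    void unlock_shared()
    {
        std::lock_guard<std::mutex> _(mut_);
        assert((state_ & n_readers_) != 0);  // caller must hold shared ownership
        unsigned num_readers = (state_ & n_readers_) - 1;
        state_ &= ~n_readers_;
        state_ |= num_readers;
        if (state_ & write_entered_)
        {
            if (num_readers == 0)
                gate2_.notify_one();
        }
        else
        {
            if (num_readers == n_readers_ - 1)
                gate1_.notify_one();
        }
    }

    // Shared -> unique, without waiting for other owners to leave.
    bool try_unlock_shared_and_lock()
    {
        std::lock_guard<std::mutex> _(mut_);
        if (state_ == 1)  // exactly one shared owner (the caller) and nobody else
        {
            state_ = write_entered_;
            return true;
        }
        return false;     // the caller keeps its shared ownership
    }

    void unlock()
    {
        {
        std::lock_guard<std::mutex> _(mut_);
        state_ = 0;
        }
        gate1_.notify_all();
    }
};
```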

<p>
In summary with <tt>upgrade_mutex</tt>, we have three ownership modes, and the ability to convert among all three
of these modes:
</p>

<blockquote><pre>
unique
  |   \
  |    \
  |    upgrade
  |    /
  |   /
shared
</pre></blockquote>

<p>
Ownership conversions heading down this graph are non-blocking.  There are try and timed-conversions heading up the graph.
And <i>only</i> between <tt>upgrade</tt> and <tt>unique</tt> there is an upward <i>indefinitely blocking</i>
ownership conversion.  At any one time, there can be only one unique ownership owner, which shares ownership with no
one else.  Or there can be one upgrade ownership owner which will share with zero or more shared ownership owners.
Or there can be zero or more shared ownership owners.
</p>
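<p>
The sharing rules in the preceding paragraph reduce to a small compatibility predicate.  The
sketch below is illustrative only.  The names <tt>ownership</tt> and <tt>can_coexist</tt> are
hypothetical and not part of the proposal.
</p>

```cpp
// Hypothetical names, used only to restate the sharing rules as code.
enum class ownership { shared, upgrade, unique };

// True if two different threads may hold the same mutex in modes a and b
// at the same time.
constexpr bool can_coexist(ownership a, ownership b)
{
    // Unique shares with no one, and at most one owner may hold upgrade,
    // so a pair is compatible only when one side is shared and the other
    // side is not unique.
    return (a == ownership::shared && b != ownership::unique) ||
           (b == ownership::shared && a != ownership::unique);
}

static_assert( can_coexist(ownership::shared,  ownership::shared),  "many shared owners");
static_assert( can_coexist(ownership::upgrade, ownership::shared),  "upgrade shares with shared");
static_assert(!can_coexist(ownership::upgrade, ownership::upgrade), "only one upgrade owner");
static_assert(!can_coexist(ownership::unique,  ownership::shared),  "unique shares with no one");
static_assert(!can_coexist(ownership::unique,  ownership::upgrade), "unique shares with no one");
static_assert(!can_coexist(ownership::unique,  ownership::unique),  "only one unique owner");
```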

<p>
This tool is only useful when there are multiple shared ownership owners, and only a few of those desire the ability
to guarantee conversion to unique ownership.  If all of the shared ownership owners require this guarantee, then the
system degenerates into one of mutually exclusive ownership, and <tt>mutex</tt> becomes the tool of choice.
</p>

<h3><a name="upgrade_mutex_imp"></a><tt>upgrade_mutex</tt> Implementation</h3>

<p>
So this sounds very powerful, but how much does it cost?
</p>

<p>
Remarkably, the cost is virtually identical to the
reference implementation shown for <tt>shared_mutex</tt>. The Terekhov
algorithm needs only minor tweaking to adjust for the introduction of
the upgrade ownership mode, and all of the conversions.  There are no
subtleties (this is a cake walk compared to <tt>gen_cond_var</tt>!). 
Below is shown the <tt>upgrade_mutex</tt> data layout, and a few of the
member functions.  Note that the data layout is <b>identical</b> to that
of <tt>shared_mutex</tt>!  Only a single bit from the <tt>unsigned
state_</tt> is taken to record the presence of an upgrade owner.  The
result is that one can have only half as many simultaneous shared owners
(1,073,741,823 instead of 2,147,483,647 with a 32 bit <tt>unsigned
state_</tt>).  The common member functions of <tt>shared_mutex</tt> and
<tt>upgrade_mutex</tt> are no more expensive in <tt>upgrade_mutex</tt>. 
Compile-time constants are changed, but the complexity and expense of
the member functions do not change.  The expense of locking in the
upgrade ownership mode is comparable to locking in shared ownership
mode.  And the expense of converting among the ownership modes is
comparable to locking and unlocking <tt>shared_mutex</tt> in its two
ownership modes (actually faster).
</p>

<blockquote><pre>
class upgrade_mutex
{
    mutex    mut_;
    cond_var gate1_;
    cond_var gate2_;
    unsigned state_;

    static const unsigned write_entered_ = 1U &lt;&lt; (sizeof(unsigned)*CHAR_BIT - 1);
    static const unsigned upgradable_entered_ = write_entered_ &gt;&gt; 1;
    static const unsigned n_readers_ = ~(write_entered_ | upgradable_entered_);

public:

    upgrade_mutex() : state_(0) {}
    // ...
};

void
upgrade_mutex::lock()
{
    std::this_thread::disable_interruption _;
    unique_lock&lt;mutex&gt; lk(mut_);
    while (state_ &amp; (write_entered_ | upgradable_entered_))
        gate1_.wait(lk);
    state_ |= write_entered_;
    while (state_ &amp; n_readers_)
        gate2_.wait(lk);
}

void
upgrade_mutex::unlock()
{
    {
    scoped_lock&lt;mutex&gt; _(mut_);
    state_ = 0;
    }
    gate1_.notify_all();
}

void
upgrade_mutex::lock_shared()
{
    std::this_thread::disable_interruption _;
    unique_lock&lt;mutex&gt; lk(mut_);
    while ((state_ &amp; write_entered_) || (state_ &amp; n_readers_) == n_readers_)
        gate1_.wait(lk);
    unsigned num_readers = (state_ &amp; n_readers_) + 1;
    state_ &amp;= ~n_readers_;
    state_ |= num_readers;
}

void
upgrade_mutex::lock_upgrade()
{
    std::this_thread::disable_interruption _;
    unique_lock&lt;mutex&gt; lk(mut_);
    while ((state_ &amp; (write_entered_ | upgradable_entered_)) || 
           (state_ &amp; n_readers_) == n_readers_)
        gate1_.wait(lk);
    unsigned num_readers = (state_ &amp; n_readers_) + 1;
    state_ &amp;= ~n_readers_;
    state_ |= upgradable_entered_ | num_readers;
}

void
upgrade_mutex::unlock_upgrade_and_lock()
{
    std::this_thread::disable_interruption _;
    unique_lock&lt;mutex&gt; lk(mut_);
    unsigned num_readers = (state_ &amp; n_readers_) - 1;
    state_ &amp;= ~(upgradable_entered_ | n_readers_);
    state_ |= write_entered_ | num_readers;
    while (state_ &amp; n_readers_)
        gate2_.wait(lk);
}
</pre></blockquote>
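<p>
The downward (non-blocking) conversions are not shown above.  Under the same data layout they
could plausibly be written as in the following sketch, which again substitutes
<tt>std::mutex</tt> and <tt>std::condition_variable</tt> for the proposed <tt>mutex</tt> and
<tt>cond_var</tt>.  The class name <tt>downgrade_demo_mutex</tt> is hypothetical, the three
conversion bodies are a guess at one reasonable implementation rather than the code of this
paper, and the <tt>try_</tt> functions are included only so that the resulting state can be
observed from a single thread.
</p>

```cpp
#include <cassert>
#include <climits>
#include <condition_variable>
#include <mutex>

class downgrade_demo_mutex
{
    std::mutex              mut_;
    std::condition_variable gate1_;
    std::condition_variable gate2_;
    unsigned                state_ = 0;

    static constexpr unsigned write_entered_ = 1U << (sizeof(unsigned)*CHAR_BIT - 1);
    static constexpr unsigned upgradable_entered_ = write_entered_ >> 1;
    static constexpr unsigned n_readers_ = ~(write_entered_ | upgradable_entered_);

public:
    bool try_lock()
    {
        std::lock_guard<std::mutex> _(mut_);
        if (state_ != 0)
            return false;
        state_ = write_entered_;
        return true;
    }

    bool try_lock_shared()
    {
        std::lock_guard<std::mutex> _(mut_);
        unsigned num_readers = state_ & n_readers_;
        if ((state_ & write_entered_) || num_readers == n_readers_)
            return false;
        state_ &= ~n_readers_;
        state_ |= num_readers + 1;
        return true;
    }

    void unlock_shared()
    {
        std::lock_guard<std::mutex> _(mut_);
        assert((state_ & n_readers_) != 0);  // caller must hold shared ownership
        unsigned num_readers = (state_ & n_readers_) - 1;
        state_ &= ~n_readers_;
        state_ |= num_readers;
        if (state_ & write_entered_)
        {
            if (num_readers == 0)
                gate2_.notify_one();
        }
        else
        {
            if (num_readers == n_readers_ - 1)
                gate1_.notify_one();
        }
    }

    // unique -> upgrade: the caller becomes the single upgrade owner and is
    // counted as one reader, matching what lock_upgrade() records.
    void unlock_and_lock_upgrade()
    {
        {
        std::lock_guard<std::mutex> _(mut_);
        state_ = upgradable_entered_ | 1;
        }
        gate1_.notify_all();
    }

    // upgrade -> shared: drop the privilege, keep the reader count.
    void unlock_upgrade_and_lock_shared()
    {
        {
        std::lock_guard<std::mutex> _(mut_);
        state_ &= ~upgradable_entered_;
        }
        gate1_.notify_all();
    }

    // unique -> shared: the caller becomes the only reader.
    void unlock_and_lock_shared()
    {
        {
        std::lock_guard<std::mutex> _(mut_);
        state_ = 1;
        }
        gate1_.notify_all();
    }
};
```

<p>
None of these conversions waits on a condition variable.  Each one only relaxes the state and
then notifies <tt>gate1_</tt> so that threads blocked on entry can re-examine it.
</p>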

<h3><a name="upgrade_lock_rationale"></a><tt>upgrade_lock</tt> Rationale</h3>

<p>
<tt>upgrade_lock</tt> is very similar to <tt>shared_lock</tt> and <tt>unique_lock</tt>.
It exists to serve as a RAII wrapper to lock a mutex satisfying the concept of an upgrade
mutex in upgrade ownership mode (just like <tt>shared_lock</tt> only locks its mutex in
shared ownership mode).  If you want to lock an <tt>upgrade_mutex</tt> in unique ownership
mode, use <tt>unique_lock&lt;upgrade_mutex&gt;</tt>.  If you want to lock an
<tt>upgrade_mutex</tt> in shared ownership mode, use <tt>shared_lock&lt;upgrade_mutex&gt;</tt>.
At the end of the day, an <tt>upgrade_mutex</tt> <i>is</i> a <tt>mutex</tt>.  And it also
<i>is</i> a <tt>shared_mutex</tt>.  It just also happens to be a little bit more.
</p>

<p>
Also recall that the reason for <tt>upgrade_mutex</tt> to exist is to facilitate conversions
among the various ownership modes.  The locks (<tt>unique_lock</tt>, <tt>shared_lock</tt>,
<tt>upgrade_lock</tt>) represent the ownership modes.  One could use <tt>upgrade_mutex</tt>
without locks and manually convert among the ownership modes.  But this is tedious and
error prone (especially within the context of exceptions).  The locks exist to homogenize the syntax for locking and unlocking various
types of mutexes, <b>and</b> for converting among the different types of ownership with
a uniform syntax.  This not only makes the <tt>upgrade_mutex</tt> easier to use, and enables
<i>generic locking algorithms</i> (such as <tt>std::lock</tt>), it also
enables <i>generic lock <b>conversion</b> algorithms</i> (such as <tt>std::tr2::transfer_lock</tt>).
</p>

<p>
If this all sounds complicated, that is only because I am a poor writer.
 The key drivers here are: learn one lock, and you now know them <i>all</i>. 
Learn how to convert between two locks, and you now know how to convert
between <i>any</i> two locks.  The <tt>upgrade_mutex</tt> interface is
complicated.  The locks greatly simplify dealing with this interface.  
Example code follows:
</p>

<blockquote><pre>
typedef std::tr2::upgrade_mutex       Mutex;
typedef std::tr2::shared_lock&lt;Mutex&gt;  ReadLock;
typedef std::tr2::upgrade_lock&lt;Mutex&gt; UpgradeLock;
typedef std::unique_lock&lt;Mutex&gt;       WriteLock;

Mutex mut;

void reader()
{
    ReadLock read_lock(mut);
    <font color="#C00000">// mut is now share-locked</font>
    <font color="#C00000">// ...</font>
}   <font color="#C00000">// mut is now unlocked</font>

void writer()
{
    WriteLock write_lock(mut);
    <font color="#C00000">// mut is now unique-locked</font>
    <font color="#C00000">// ...</font>
}   <font color="#C00000">// mut is now unlocked</font>

void reader_writer()
{
    UpgradeLock read_lock(mut);
    <font color="#C00000">// mut is now share-locked, but with privilege to upgrade</font>
    <font color="#C00000">// ...</font>
    WriteLock write_lock(std::move(read_lock));
    <font color="#C00000">// mut is now unique-locked</font>
    <font color="#C00000">// ...</font>
}   <font color="#C00000">// mut is now unlocked</font>

void reader_writer_reader()
{
    UpgradeLock upgrade_lock(mut);
    <font color="#C00000">// mut is now share-locked, but with privilege to upgrade</font>
    <font color="#C00000">// ...</font>
    WriteLock write_lock(std::move(upgrade_lock));
    <font color="#C00000">// mut is now unique-locked</font>
    <font color="#C00000">// ...</font>
    upgrade_lock = std::move(write_lock);
    <font color="#C00000">// mut is now share-locked, but with privilege to upgrade</font>
    <font color="#C00000">// ...</font>
    ReadLock read_lock(std::move(upgrade_lock));
    <font color="#C00000">// mut is now share-locked, without the privilege to upgrade</font>
    <font color="#C00000">// ...</font>
}   <font color="#C00000">// mut is now unlocked</font>
</pre></blockquote>

<p>
The only clients I expect to have to deal with the
<tt>upgrade_mutex</tt> interface directly are those wishing to write
user-defined mutexes satisfying this interface. Everyone else can use
the homogeneous interface of the locks demonstrated above. Clients
writing user-defined mutexes with the <tt>upgrade_mutex</tt>
interface will not have to replicate the lock infrastructure.  All
of the standard-defined locks will work with their mutex exactly as they
do with <tt>std::tr2::upgrade_mutex</tt>, thanks to the standardized
interface for <tt>std::tr2::upgrade_mutex</tt>.
</p>

<h3><a name="transfer_lock_rationale"></a><tt>transfer_lock</tt> Rationale</h3>

<p>
In the above description of <tt>upgrade_lock</tt> and converting ownership as a function
progresses, no regard was given to <tt>if</tt> statements.  That is, sometimes you might
want to conditionally convert ownership for a portion of a function, and then revert
that ownership conversion later in the function.  And it all has to be exception safe.
Here's a preliminary sketch:
</p>

<blockquote><pre>
typedef std::tr2::upgrade_mutex       Mutex;
typedef std::tr2::shared_lock&lt;Mutex&gt;  ReadLock;
typedef std::tr2::upgrade_lock&lt;Mutex&gt; UpgradeLock;
typedef std::unique_lock&lt;Mutex&gt;       WriteLock;

Mutex mut;

void foo()
{
    <font color="#C00000">// Here I need shared ownership of mut</font>
    <font color="#C00000">// ...</font>
    if (some_predicate())
    {
        <font color="#C00000">// Here I need unique ownership of mut</font>
        <font color="#C00000">// ...</font>
    }
    <font color="#C00000">// Here I need shared ownership of mut</font>
    <font color="#C00000">// ...</font>
    if (some_other_predicate())
    {
        <font color="#C00000">// Here I need unique ownership of mut</font>
        <font color="#C00000">// ...</font>
    }
    <font color="#C00000">// Here I need shared ownership of mut, and will not need further upgradability</font>
    <font color="#C00000">// ...</font>
}   <font color="#C00000">// Here mut needs to be unlocked, no matter what</font>
</pre></blockquote>

<p>
This is the job <tt>transfer_lock</tt> is designed for.  It is a very simple lock
adaptor.  Its constructor takes an <b>lvalue</b> lock and transfers ownership (via
the homogeneous lock ownership transfer interface) to another embedded lock.  On
destruction it transfers ownership back to the lock received in the constructor.
This adaptor is only simple because of the homogenized lock conversion interface
outlined earlier.  <tt>transfer_lock</tt> is a prime example of 
<i>generic lock <b>conversion</b> algorithms</i>.
</p>

<p>
Here is how the above <tt>foo</tt> might be coded:
</p>

<blockquote><pre>
typedef std::tr2::upgrade_mutex       Mutex;
typedef std::tr2::shared_lock&lt;Mutex&gt;  ReadLock;
typedef std::tr2::upgrade_lock&lt;Mutex&gt; UpgradeLock;
typedef std::unique_lock&lt;Mutex&gt;       WriteLock;

Mutex mut;

void foo()
{
    UpgradeLock upgrade_lock(mut);
    <font color="#C00000">// Here I need shared ownership of mut</font>
    <font color="#C00000">// ...</font>
    if (some_predicate())
    {
        std::tr2::transfer_lock&lt;WriteLock, UpgradeLock&gt; _(upgrade_lock);
        <font color="#C00000">// Here I need unique ownership of mut</font>
        <font color="#C00000">// ...</font>
    }
    <font color="#C00000">// Here I need shared ownership of mut</font>
    <font color="#C00000">// ...</font>
    if (some_other_predicate())
    {
        std::tr2::transfer_lock&lt;WriteLock, UpgradeLock&gt; _(upgrade_lock);
        <font color="#C00000">// Here I need unique ownership of mut</font>
        <font color="#C00000">// ...</font>
    }
    ReadLock _(std::move(upgrade_lock));
    <font color="#C00000">// Here I need shared ownership of mut, and will not need further upgradability</font>
    <font color="#C00000">// ...</font>
}   <font color="#C00000">// Here mut needs to be unlocked, no matter what</font>
</pre></blockquote>

<h3><a name="transfer_lock_imp"></a><tt>transfer_lock</tt> Implementation</h3>

<p>
The implementation of <tt>transfer_lock</tt> is presented just for the purpose of demonstrating
how simple it is.  And the only reason it is simple is because of the generic lock conversion
syntax adopted by the locks.
</p>

<blockquote><pre>
template &lt;class ToLock, class FromLock&gt;
class transfer_lock
{
    ToLock to_lock_;
    FromLock&amp; from_lock_;
public:
    typedef typename ToLock::mutex_type mutex_type;

    explicit transfer_lock(FromLock&amp; fl) : to_lock_(std::move(fl)), from_lock_(fl) {}
    ~transfer_lock() {from_lock_ = FromLock(std::move(to_lock_));}

    transfer_lock(const transfer_lock&amp;) = delete;
    transfer_lock&amp; operator=(const transfer_lock&amp;) = delete;

    void lock()     {to_lock_ = ToLock(std::move(from_lock_));}
    bool try_lock() {to_lock_ = ToLock(std::move(from_lock_), std::try_to_lock); return to_lock_.owns_lock();}
    void unlock()   {from_lock_ = FromLock(std::move(to_lock_));}

    bool owns_lock() const {return to_lock_.owns_lock();}
    mutex_type* mutex() const {return to_lock_.mutex();}
};
</pre></blockquote>

<h2><a name="Acknowledgements"></a>Acknowledgements</h2>

<p>
This document is the result of years' worth of study. Many people and organizations have contributed.
I would like to especially acknowledge my former employer Freescale (formerly Motorola, formerly
Metrowerks) for their support in the early stages of this work.  And I would like to acknowledge
Apple for their continued support of this work.  Without this strong and continuous visionary support,
this work would not be presented here for your consideration.
</p>

</body>
</html>

