<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://www.w3.org/TR/html4/strict.dtd">
<!-- saved from url=(0051)http://gromer.mtv.corp.google.com:8080/P0561R1.html -->
<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">

<style type="text/css">

body { color: #000000; background-color: #FFFFFF; max-width: 60em}
del { text-decoration: line-through; color: #8B0040; }
ins { text-decoration: underline; color: #005100; }

p.example { margin-left: 2em; }
pre.example { margin-left: 2em; }
div.example { margin-left: 2em; }

code.extract { background-color: #F5F6A2; }
pre.extract { margin-left: 2em; background-color: #F5F6A2;
  border: 1px solid #E1E28E; }

p.function { }
.attribute { margin-left: 2em; }
.attribute dt { float: left; font-style: italic;
  padding-right: 1ex; }
.attribute dd { margin-left: 0em; }

blockquote.std { color: #000000; background-color: #F1F1F1;
  border: 1px solid #D1D1D1;
  padding-left: 0.5em; padding-right: 0.5em; }
blockquote.stddel { text-decoration: line-through;
  color: #000000; background-color: #FFEBFF;
  border: 1px solid #ECD7EC;
  padding-left: 0.5empadding-right: 0.5em; ; }

blockquote.stdins { text-decoration: underline;
  color: #000000; background-color: #C8FFC8;
  border: 1px solid #B3EBB3; padding: 0.5em; }

blockquote pre em { font-family: normal }

table { border: 1px solid black; border-spacing: 0px;
  margin-left: auto; margin-right: auto; }
th { text-align: left; vertical-align: top;
  padding-left: 0.8em; border: none; }
th.multicol { text-align: center; vertical-align:top;
  padding-left: 0.8em; border: none; }
td { text-align: left; vertical-align: top;
  padding-left: 0.8em; border: none; border-top: 1px solid black; }

ul.function { list-style-type: none; margin-top: 0.5ex }
ul.function > li { padding-top:0.25ex; padding-bottom:0.25ex }

pre.function { margin-bottom:0.5ex }

pre span.comment { font-family: serif; font-style: italic }

caption { font-size: 1.25em; font-weight: bold; padding-bottom: 3px;}
</style>

<title>An RAII Interface for Deferred Reclamation</title>
</head>

<body>

<p><b>Document number:</b> P0561R1
<br><b>Date:</b> 2017-06-15
<br><b>Reply to:</b>
Geoff Romer &lt;<a href="mailto:gromer@google.com">gromer@google.com</a>&gt;,
Andrew Hunter &lt;<a href="mailto:ahh@google.com">ahh@google.com</a>&gt;
<br><b>Audience:</b> Concurrency Study Group
</p>

<h1>An RAII Interface for Deferred Reclamation</h1>

<ol>
  <li><a href="http://gromer.mtv.corp.google.com:8080/P0561R1.html#background">Background</a></li>
  <li><a href="http://gromer.mtv.corp.google.com:8080/P0561R1.html#design">Design Overview</a></li>
  <ol>
    <li><a href="http://gromer.mtv.corp.google.com:8080/P0561R1.html#read">Read API</a></li>
    <li><a href="http://gromer.mtv.corp.google.com:8080/P0561R1.html#update">Update API</a></li>
    <li><a href="http://gromer.mtv.corp.google.com:8080/P0561R1.html#implementability">Implementability</a></li>
  </ol>
  <li><a href="http://gromer.mtv.corp.google.com:8080/P0561R1.html#revision">Revision History</a></li>
  <li><a href="http://gromer.mtv.corp.google.com:8080/P0561R1.html#acknowledgements">Acknowledgements</a></li>
</ol>

<h2 id="background">Background</h2>

<p>For purposes of this paper, <dfn>deferred reclamation</dfn>
  refers to a pattern of concurrent data sharing with two components:
  <em>readers</em> access the data while holding reader locks,
  which guarantee that the data will remain live while the lock is held.
  Meanwhile one or more <em>updaters</em> update the data by replacing it
  with a newly-allocated value. All subsequent readers will see
  the new value, but the old value is not destroyed until all readers
  accessing it have released their locks. Readers never block the updater
  or other readers, and the updater never blocks readers. Updates are
  inherently costly, because they require allocating and constructing
  new data values, so they are expected to be rare compared to reads.</p>

<p>This pattern can be implemented in several ways, including reference
  counting, RCU, and hazard pointers. Several of these have been proposed
  for standardization (see e.g.
  <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0233r3.pdf">P0233R3</a>,
  <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0279r1.pdf">P0279R1</a>,
  and <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0461r1.pdf">P0461R1</a>),
  but the proposals have so far exposed these techniques through fairly
  low-level APIs.</p>

<p>In this paper, we will propose a high-level RAII API for deferred
  reclamation, which emphasizes safety and usability rather than
  fidelity to low-level primitives, but still permits highly efficient
  implementation. This proposal is based on an API which is used internally
  at Google, and our experience with it demonstrates the value of providing
  a high-level API: in our codebase we provide both a high-level API like the
  one proposed here, and a more bare-metal RCU API comparable to
  <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2017/p0461r1.pdf">P0461</a>,
  and while the high-level API has over 200 users, the low-level API has
  one.</p>

<p>This proposal is intended to complement, rather than conflict with,
  proposals for low-level APIs for RCU, hazard pointers, etc, and it
  can be standardized independently of whether and when we standardize those
  other proposals.</p>

<h2 id="design">Design Overview</h2>

<p>We begin with a simple example of how this API can be used: a
  <code>Server</code> class which handles requests using some config data,
  and can receive new versions of that config data (e.g. from a thread which
  polls the filesystem). Each request is handled using the latest config data
  available when the request is handled (i.e. updates do not affect
  requests already in flight). No synchronization is required between
  any of these operations.</p>

<pre>  class Server {
   public:

    void SetConfig(Config new_config) {
      config_.update(std::make_unique&lt;const Config&gt;(std::move(new_config)));
    }

    void HandleRequest() {
      snapshot_ptr&lt;const Config&gt; config = config_.get_snapshot();

      // Use `config` like a unique_ptr&lt;const Config&gt;

    }

   private:
    cell&lt;Config&gt; config_;
  };
</pre>

<p>The centerpiece of this proposal is the <code>cell&lt;T&gt;</code>
  class template, a wrapper which is either empty or holds a single object
  of type <code>T</code> (the name is intended to suggest a storage cell, but
  we expect it to be bikeshedded). Rather than providing direct access to the
  stored object, it allows the user to obtain a <em>snapshot</em> of the
  current state of the object:</p>

<pre>  template &lt;typename T, typename Alloc = allocator&lt;T&gt;&gt;
  class basic_cell {
  public:
    // Not copyable or movable
    cell(cell&amp;&amp;) = delete;
    cell&amp; operator=(cell&amp;&amp;) = delete;
    cell(const cell&amp;) = delete;
    cell&amp; operator=(const cell&amp;) = delete;

    cell(nullptr_t = nullptr, const Alloc&amp; alloc = Alloc());
    cell(std::unique_ptr&lt;T&gt; ptr, const Alloc&amp; alloc = Alloc());

    void update(nullptr_t);
    void update(unique_ptr&lt;T&gt; ptr);

    snapshot_ptr&lt;T&gt; get_snapshot() const;
  };

  template <typename t="">
  using cell = basic_cell&lt;<i>see below</i>&gt;;

  template &lt;typename T&gt;
  class snapshot_ptr {
   public:
    // Move only
    snapshot_ptr(snapshot_ptr&amp;&amp;) noexcept;
    snapshot_ptr&amp; operator=(snapshot_ptr&amp;&amp;) noexcept;
    snapshot_ptr(const snapshot_ptr&amp;) = delete;
    snapshot_ptr&amp; operator=(const snapshot_ptr&amp;) = delete;

    constexpr snapshot_ptr(nullptr_t = nullptr);

    // Converting operations, enabled if U* is convertible to T*
    template &lt;typename U&gt;
    snapshot_ptr(snapshot_ptr&lt;U&gt;&amp;&amp; rhs) noexcept;
    template &lt;typename U&gt;
    snapshot_ptr&amp; operator=(snapshot_ptr&lt;U&gt;&amp;&amp; rhs) noexcept;

    T* get() const noexcept;
    T&amp; operator*() const;
    T* operator-&gt;() const noexcept;

    explicit operator bool() const noexcept;
    operator std::shared_ptr&lt;T&gt;() &amp;&amp;;

    void swap(snapshot_ptr&amp; other);
  };

  template &lt;typename T&gt;
  void swap(snapshot_ptr&lt;T&gt;&amp; lhs, snapshot_ptr&lt;T&gt;&amp; rhs);

  template &lt;typename T&gt;
  bool operator==(const snapshot_ptr&lt;T&gt;&amp; lhs, const snapshot_ptr&lt;T&gt;&amp; rhs);
  // Similar overloads for !=, &lt;, &gt;, &lt;=, and &gt;=, and mixed
  // comparison with nullptr_t.

  template &lt;typename T&gt;
  struct hash&lt;snapshot_ptr&lt;T&gt;&gt;;
</typename></pre>

<p>We intend most users to use <code>cell</code> most of the time, although
  <code>basic_cell</code> is more fundamental. The problem with
  <code>basic_cell</code> is that, like so many other things in C++, it has
  the wrong default: a <code>basic_cell&lt;std::string&gt;</code>
  enables multiple threads to share <em>mutable</em> access to a
  <code>std::string</code> object, which is an open invitation to data races.
  Users can make the shared data immutable, but they have to opt into
  it by adding <code>const</code> to the type. We could add the
  <code>const</code> in the library, but then users would have no way to opt
  out if they geuninely do want to share mutable data (this is a reasonable
  thing to want, if the data has a race-free type).</p>

<p>This library is intended for routine use by non-expert programmers,
  so in our view safety <em>must</em> be the default, not something
  users have to opt into. Consequently, we provide a fully general but
  less safe API under the slightly more awkward name <code>basic_cell</code>,
  while reserving the name <code>cell</code> for a safe-by-default
  alias.</p>

<p>Specifically, <code>cell&lt;T&gt;</code> is usually an alias for
  <code>basic_cell&lt;const T&gt;</code>, but if the trait
  <code>is_race_free_v&lt;T&gt;</code> is true, signifying that <code>T</code>
  can be mutated concurrently without races, then <code>cell&lt;T&gt;</code>
  will be an alias for <code>basic_cell&lt;T&gt;</code>. Thus, <code>cell</code>
  exposes the most powerful interface that can be provided safely, while
  <code>basic_cell</code> provides users an opt-out, such as for cases where
  <code>T</code> is not inherently race-free, but the user will ensure it
  is used in a race-free manner. <code>basic_cell</code> thus acts as a marker
  of potentially unsafe code that warrants closer scrutiny, much like
  e.g. <code>const_cast</code>.</p>

<p>The major drawback we see in this approach is that it means that
  constness is somewhat less predictable: if <code>c</code> is a
  <code>cell&lt;T&gt;</code>, <code>c.get_snapshot()</code> might return
  a <code>snapshot_ptr&lt;T&gt;</code> or a
  <code>snapshot_ptr&lt;const T&gt;</code>, depending on <code>T</code>.
  We don't expect this to be a serious problem, because <code>const T</code>
  will be both the most common case by far, and the safe choice if the
  user is uncertain (<code>snapshot_ptr&lt;T&gt;</code> implicitly converts
  to <code>snapshot_ptr&lt;const T&gt;</code>). The problem can also be
  largely avoided through judicious use of <code>auto</code>.</p>

<p><b>Alternative approach:</b> we could unconditionally define
  <code>cell&lt;T&gt;</code> as an alias for
  <code>basic_cell&lt;const T&gt;</code>. This would be substantially simpler,
  but <code>basic_cell</code> would not be able to act as a marker of
  code that requires close scrutiny, since many if not most uses of it
  (e.g. <code>basic_cell&lt;atomic&lt;int&gt;&gt;</code>) would be safe
  by construction. That said, any use of <code>basic_cell&lt;T&gt;</code>
  would still warrant some scrutiny, since shared mutable data usually
  carries a risk of race conditions, even if it is immune to data races.</p>

<p>Other than construction and destruction, all operations on
  <code>basic_cell</code> behave as atomic operations for purposes of
  determining a data race. One noteworthy consequence of this is that
  <code>basic_cell</code> is not movable, because there are plausible extensions
  of this design (e.g. to support user-supplied RCU domains) under which we
  believe that move assignment cannot be made atomic without degrading the
  performance of <code>get_snapshot</code>.</p>

<p><code>cell</code>'s destructor is non-blocking, and
  does not require all outstanding <code>snapshot_ptr</code>s to be destroyed
  first. This is motivated by the principle that concurrency APIs should strive
  to avoid coupling between client threads that isn't mediated by the
  API; concurrency APIs should solve thread coordination problems, not
  create new ones. That said, destruction of the <code>cell</code> must
  already be coordinated with the threads that actually read it,
  so also coordinating with the threads that hold <code>snapshot_ptr</code>s
  to it may not be much additional burden (particularly since
  <code>snapshot_ptr</code>s cannot be passed across threads).</p>

<p>Some deferred reclamation libraries are built around the concept of
  "domains", which can be used to isolate unrelated operations from each
  other (for example with RCU, long-lived reader locks can delay all
  reclamation within a domain, but do not affect other domains). This
  proposal does not include explicit support for domains, so effectively
  all users of this library would share a single global domain. So far
  our experience has not shown this to be a problem. If necessary
  domain support can be added, by adding the domain as a constructor
  parameter of <code>cell</code> (with a default value, so client code
  can ignore domains if it chooses), but it is difficult to see how to
  do so without exposing implementation details (e.g. RCU vs. hazard
  pointers).</p>

<h3 id="read">Read API</h3>

<p><code>snapshot_ptr&lt;T&gt;</code>'s API is closely modeled on
  <code>unique_ptr&lt;T&gt;</code>, and indeed it could often be implemented
  as an alias for <code>unique_ptr</code> with a custom deleter, except that
  we don't want to expose operations such as <code>release()</code> or
  <code>get_deleter()</code> that could violate API invariants or leak
  implementation details.</p>

<p>A <code>snapshot_ptr</code> is either null, or points to a live object of
  type <code>T</code>, and it is only null if constructed from
  <code>nullptr</code>, moved-from, or is the result of invoking
  <code>get_snapshot()</code> on an empty <code>basic_cell</code> (in
  particular, a <code>snapshot_ptr</code> cannot spontaneously become null due
  to the actions of other threads). The guarantee that the object is live means
  that calling <code>get_snapshot()</code> is equivalent to acquiring a reader
  lock, and destroying the resulting <code>snapshot_ptr</code> is equivalent to
  releasing the reader lock.</p>

<p>All operations on a <code>snapshot_ptr</code> are non-blocking.</p>

<p>We require the user to destroy a <code>snapshot_ptr</code> in the same
  thread where it was obtained, so that this library can be implemented in terms
  of libraries that require reader lock acquire/release operations to
  happen on the same thread. Note that <code>unique_lock</code> implicitly
  imposes the same requirement, so this is not an unprecedented restriction.
  There are plausible use cases for transferring a <code>snapshot_ptr</code>
  across threads, and some RCU implementations can support it efficiently,
  but based on SG1 discussions we think it's safer to start with the
  more restrictive API, and broaden it later as needed.</p>

<p>The const semantics of <code>get_snapshot()</code> merit closer
  scrutiny. The proposed API permits users who have only const access
  to a <code>basic_cell&lt;T&gt;</code> to obtain non-const access to
  the underlying <code>T</code>. This is similar to the "shallow const"
  semantics of pointers, but unlike the "deep const" semantics of other
  wrapper types such as <code>optional</code>. In essence, the problem is
  that this library naturally supports three distinct levels of access
  (read-only, read-write, and read-write-update), but the const system can
  only express two. Our intuition (which SG1 in Kona generally seemed to share)
  is that the writer/updater distinction is more fundamental than the
  reader/writer distinction, so const should capture the former rather than
  the latter, but it's a close call.</p>

<p>We had considered providing <code>snapshot_ptr</code> with an aliasing
  constructor comparable to the one for <code>shared_ptr</code>:</p>

<pre>  template &lt;typename U&gt;
  snapshot_ptr(snapshot_ptr&lt;U&gt;&amp;&amp; other, T* ptr);
</pre>

<p>This would enable the user, given a <code>snapshot_ptr</code> to an
  object, to construct a <code>snapshot_ptr</code> to one of its members.
  However, it would mean we could no longer guarantee that a
  <code>snapshot_ptr</code> is either null or points to a live object.
  SG1's consensus in Kona was to omit this feature, and we agree:
  we shouldn't give up that guarantee without a compelling use case.</p>

<p><code>snapshot_ptr&lt;T&gt;</code> rvalues, like
  <code>unique_ptr&lt;T&gt;</code> rvalues, are convertible to
  <code>shared_ptr&lt;T&gt;</code>. However, even if two
  <code>snapshot_ptr</code>s point to the same data, the
  <code>shared_ptr</code>s they generate may not "share ownership" in
  the sense of [util.smartptr.shared], and so <code>use_count()</code> and
  <code>unique()</code> are not guaranteed to reflect the number of readers
  of the underlying snapshot. In principle this may cause confusion, but we
  don't see any reason for users to want to use those methods in this context
  in the first place, especially given all the caveats they carry.
  Perhaps more problematic is the fact that these <code>shared_ptr</code>s
  must still obey the same-thread destruction requirement (or at least,
  the last copy must). It may be easier to accidentally violate this
  requirement with <code>shared_ptr</code>s, because the restriction is
  not marked by the type. We propose to provide this conversion despite
  these concerns because it provides potentially useful functionality that
  users cannot easily recreate (because we provide no access to the
  deleter).</p>

<h3 id="update">Update API</h3>

<p>The update side is more complex. It consists of two parallel sets of
  overloads, constructors and <code>update()</code>, which respectively
  initialize the <code>cell</code> with a given value, and update the
  <code>cell</code> to store a given value. <code>update()</code> is
  always non-blocking, and in particular does not wait for the old data
  value to be destroyed.</p>

<p>The constructor and <code>update()</code> overload taking
  <code>nullptr_t</code> respectively initialize and set the <code>cell</code>
  to the empty state. The fact that a <code>cell</code> can be empty is in
  some ways unfortunate, since it's generally more difficult to reason about
  types with an empty or null state, and users could always say
  <code>cell&lt;optional&lt;T&gt;&gt;</code> if they explicitly want
  an empty state. However, given that <code>snapshot_ptr</code> must have
  a null state in order to be movable, eliminating the empty state would
  not simplify user code much. Furthermore, forbidding the empty state
  when we support initialization from a nullable type would actually
  complicate the API.</p>

<p>The constructor and <code>update()</code> overload that accept a
  <code>unique_ptr&lt;T&gt; ptr</code> take ownership of it, and respectively
  initialize and set the current value of the cell to <code>*ptr</code>.</p>

<p>Internally, <code>cell</code> must maintain some sort of data structure
  to hold its previous values until it can destroy them, and sometimes
  this will require allocating memory. In
  <a href="http://wg21.link/P0561R0">revision 0</a> of this paper, we
  discussed a possible mechanism by which the user could consolidate
  those allocations with their own allocation of the <code>T</code> data.
  However, this would effectively couple the library to a particular
  implementation, and greatly complicate the interface. Furthermore,
  its value is questionable, because those allocations can be made rare and
  small in normal usage (when <code>snapshot_ptr</code>s are destroyed within
  bounded time). In Kona, the SG1 consensus (which we agree with) was that
  such a mechanism is not necessary.</p>

<p>It also bears mentioning that this library may need to impose
  some restrictions on the allocators it supports. In particular,
  it may need to require that the allocator's <code>pointer</code>
  type is a raw pointer, or can safely be converted to one, since
  the implementation layer is unlikely to be able to accept "fancy
  pointers".</p>

<p>We have opted not to provide an emplacement-style constructor or
  <code>update</code> function, for several reasons. First of all,
  it provides no additional functionality; it's syntactic sugar for
  <code>update(make_unique&lt;T&gt;(...))</code> that might on some
  implementations be slightly more efficient (if it can consolidate
  allocations a la <code>make_shared</code>). Second, it would not be
  fully general; sometimes users need to perform some sort of setup
  in the interval between constructing the object and publishing it.
  Finally, it conflicts with another possible feature, support for
  custom deleters.</p>

<p>Currently, <code>unique_ptr</code>s passed to this library must use
  <code>std::default_delete</code>, but it's natural to ask if we could
  support other deleters. There are two ways we could go about that:
  we could make the deleter a template parameter of <code>cell</code>,
  or of the individual methods. Parameterizing the individual methods
  would be more flexible, but it would require some sort of type erasure,
  which would risk bloating the <code>cell</code> object (which can currently
  be as small as <code>sizeof(T*)</code>), and/or degrading
  performance on the read path (which needs to be fast). Parameterizing
  the whole class avoids type erasure, but precludes us from supporting
  emplace-style operations, because there's no way for the library
  to know how to allocate and construct an object so that it can be
  cleaned up by an arbitrary deleter.</p>

<p>Given the uncertainties around both features, the conflict between
  them, and the lack of strong motivation to add them, we have opted to
  omit them both for the time being.</p>

<p>One noteworthy property of <code>update()</code> is that there is
  no explicit support for specifying in the <code>update()</code> call
  how the old value is cleaned up, which we are told is required in some
  RCU use cases in the Linux kernel. It is possible for the user to
  approximate support for custom cleanup by using a custom destructor
  whose behavior is controlled via a side channel, but this
  is a workaround, and an awkward one at that. We've opted not to
  include this feature because use cases for it appear to be quite
  rare (our internal version of this API lacks this feature, and
  nobody has asked for it), and because it would substantially
  complicate the API. It would add an extra <code>update()</code>
  parameter which most users don't need, and which would break the
  symmetry between constructors and <code>update()</code> overloads.
  More fundamentally, it would raise difficult questions about the
  relationship between the user-supplied cleanup logic and the original
  deleter: does the cleanup logic run instead of the deleter,
  or before the deleter? Neither option seems very satisfactory.</p>

<h3 id="implementability">Implementability</h3>

<p>This proposal is designed to permit implementation via RCU, hazard
  pointers, or reference counting (or even garbage collection, we suppose).
  It is also designed to permit implementations that perform reclamation
  on background threads (which can enable <code>update()</code> to be
  nonblocking and lock-free regardless of <code>T</code>), as well
  as implementations that reclaim any eligible retired values during
  <code>update()</code> calls (which can ensure that <code>update()</code>
  is truly wait-free if <code>~T()</code> is, and ensure a bound on the
  number of unreclaimed values). These two techniques appear to
  be mutually exclusive, and neither seems dramatically superior
  to the other: this API is not intended for cases where
  <code>update()</code> is a performance bottleneck, and in practice the
  number of retired but unreclaimed values should be tightly bounded
  in normal use, even if it is theoretically unbounded. Consequently, we
  propose to permit either implementation strategy, by not bounding the
  number of live old values and permitting <code>update()</code> to delete
  unreclaimed old values.</p>

<p>We tentatively propose to also allow the implementation to perform
  reclamation during <code>~snapshot_ptr</code>, because that's the most
  natural choice for reference counting, but we're concerned that this risks
  adding latency to the read path (e.g. if <code>~T</code> is slow or blocking).
  The alternative would be to require reference-counting implementations to
  defer reclamation to a subsequent <code>update()</code> call or a
  separate thread, but that would probably be slower when <code>T</code>
  is trivially destructible. We are opting not to impose this constraint
  because it will be easier to add later than to remove.</p>

<p>We do not intend to support the trivial implementation strategy of never
  performing any reclamation at all, but it is not yet clear if we will
  be able to disallow it without also disallowing other more reasonable
  implementation strategies. If we are not able to make this implementation
  nonconforming, we will non-normatively discourage it as strongly
  as possible.</p>

<p>This proposal cannot be implemented in terms of an RCU library that requires
  user code to periodically enter a "quiescent" state where no reader locks
  are held. We see no way to satisfy such a requirement in a general-purpose
  library, since it means that any use of the library, no matter how local,
  imposes constraints on the global structure of the threads that use it (even
  though the top-level thread code may otherwise be completely unaware of the
  library). This would be quite onerous to comply with, and likely a source of
  bugs. Neither <a href="http://liburcu.org/">Userspace RCU</a>,
  Google's internal RCU implementation, nor <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0233r2.pdf">P0233R2</a>'s hazard pointer API
  impose this requirement, so omitting it does not appear to be a major
  implementation burden.</p>

<p>This proposal also does not require user code to register and unregister
  threads with the library, for more or less the same reasons: it
  would cause local uses of the library to impose global constraints on
  the program, creating an unacceptable usability and safety burden.
  <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0233r2.pdf">P0233R2</a>
  and Google's internal RCU do not impose this requirement, and
  Userspace RCU provides a library that does not (although at some performance
  cost). Furthermore, the standard library can satisfy this requirement if
  need be, without exposing users to it, by performing the necessary
  registration in <code>std::thread</code>.</p>

<p>A previous version of this paper stated that we did not think this
  API could be implemented in terms of hazard pointers, but that was an
  error. We are aware of no obstacles to implementing this
  library in terms of hazard pointers. However, we expect RCU to be the
  preferred implementation strategy in practice, because it can provide
  superior performance on the read side.</p>

<h2 id="revision">Revision History</h2>

<h3>Since P0561R0:</h3>

<ul>
  <li>Introduced <code>basic_cell</code> and altered the role of
    <code>is_race_free</code>, in order to provide a per-instance
    (as well as per-type) opt-out of default thread-safety.</li>
  <li><code>update()</code> calls are now guaranteed not to race with each
    other, because it simplifies the API: it's easier to remember that
    <code>cell</code> is always race-free than to try to keep track of which
    operations can race with which. As noted above, updates are expected
    to be relatively rare, so additional locking in <code>update()</code>
    should not matter, and in any event common implementations should be able
    to support this without locking.</li>
  <li>Added some discussion of why <code>cell</code> is not movable.</li>
  <li>Documented decision to omit an aliasing constructor.</li>
  <li>Required <code>snapshot_ptr</code>s to be destroyed in
    the same thread where they are created; see the main text for the
    rationale.</li>
  <li>Simplified discussion of allocation, and documented decision not
    to provide a mechanism like <code>cell_init</code>.</li>
  <li>Documented decision to omit emplacement-style operations and
    support for custom deleters.</li>
  <li>Documented decision that const access to <code>cell&lt;T&gt;</code>
    grants ability to mutate the <code>T</code>.</li>
  <li>Documented concerns with <code>use_count()</code> and
    <code>unique()</code> on <code>shared_ptr</code>s created from
    <code>snapshot_ptr</code>s.</li>
  <li>Dropped claim that this library cannot be implemented in terms of
    hazard pointers, which doesn't appear to be true.</li>
  <li>Added discussion of tradeoff between bounded space and fast
    <code>update()</code>.</li>
  <li>Added a usage example.</li>
</ul>

<h2 id="acknowledgements">Acknowledgements</h2>

<p>Thanks to Paul McKenney and Maged Michael for valuable feedback on
  drafts of this paper.</p>


</body><style type="text/css">embed[type*="application/x-shockwave-flash"],embed[src*=".swf"],object[type*="application/x-shockwave-flash"],object[codetype*="application/x-shockwave-flash"],object[src*=".swf"],object[codebase*="swflash.cab"],object[classid*="D27CDB6E-AE6D-11cf-96B8-444553540000"],object[classid*="d27cdb6e-ae6d-11cf-96b8-444553540000"],object[classid*="D27CDB6E-AE6D-11cf-96B8-444553540000"]{	display: none !important;}</style></html>