<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>
<head>
	<meta name="viewport"
	content="width=device-width, initial-scale=1.0, maximum-scale=1.0, user-scalable=0">
	<meta content="True" name="HandheldFriendly">
	<meta http-equiv="content-type" content="text/html; charset=iso-8859-1">
	<title>Introduction of std::hive to the standard library</title>
	<style type="text/css">
	 .column {
		 float: left;
		 width: 25%;
	 }

	 /* Clear floats after the columns */
	 .row:after {
		 content: "";
		 display: table;
		 clear: both;
	 }
			pre {
				 overflow-x: auto;
				 white-space: pre-wrap;
				 word-wrap: break-word;
			}
			body {
				 font-size: 12pt;
				 font-weight: normal;
				 font-style: normal;
				 font-family: serif;
				 color: black;
				 background-color: white;
				 line-height: 1.2em;
				 margin-left: 4em;
				 margin-right: 2em;
			}
			/* paragraphs */

			p {
				 padding: 0;
				 line-height: 1.3em;
				 margin-top: 1.2em;
				 margin-bottom: 1em;
				 text-align: left;
			}

			table  {
				 margin-top: 3.8em;
				 margin-bottom: 2em;
				 text-align: left;
				 table-layout:fixed;
				 width:100%;
			}
			td {
				 overflow: auto;
				 word-wrap: break-word;
			}

			/* headings */

			h1 {
				 font-size: 200%;
				 font-weight: bold;
				 font-style: normal;
				 font-variant: small-caps;
				 line-height: 1.6em;
				 text-align: left;
				 padding: 0;
				 margin-top: 3.5em;
				 margin-bottom: 1.7em;
			}
			h2 {
				 font-size: 152%;
				 font-weight: bold;
				 font-style: normal;
				 text-decoration: underline;
				 padding: 0;
				 margin-top: 4.5em;
				 margin-bottom: 1.1em;
			}
			h3 {
				 font-size: 125%;
				 font-weight: bold;
				 font-style: normal;
				 text-decoration: underline;
				 padding: 0;
				 margin-top: 4em;
				 margin-bottom: 1.7em;
			}
			h4 {
				 font-size: 113%;
				 font-weight: bold;
				 font-style: normal;
				 padding: 0;
				 margin-top: 4em;
				 margin-bottom: 1.7em;
			}
			h5 {
				 font-size: 100%;
				 font-weight: bold;
				 font-style: italic;
				 padding: 0;
				 margin-top: 3em;
				 margin-bottom: 1em;
			}
			h6 {
				 font-size: 88%;
				 font-weight: bold;
				 font-style: normal;
				 padding: 0;
				 margin-top: 3em;
				 margin-bottom: 1em;
			}
			/* divisions */

			div {
				 padding: 0;
				 margin-top: 0em;
				 margin-bottom: 0em;
			}
			ul {
				 margin: 12pt 0pt 22pt 18pt;
				 padding: 0pt 0pt 0pt 0pt;
				 list-style-type: square;
				 font-size: 98%;
			}
			ol {
				 margin: 12pt 0pt 38pt 17pt;
				 padding: 0pt 0pt 0pt 0pt;
			}
			li {
				 margin: 0pt 0pt 10.5pt 0pt;
				 padding: 0pt 0pt 0pt 0pt;
				 text-indent: 0pt;
				 display: list-item;
			}
			/* inline */

			strong {
				 font-weight: bold;
			}
			sup,
			sub {
				 vertical-align: baseline;
				 position: relative;
				 top: -0.4em;
				 font-size: 70%;
			}
			sub {
				 top: 0.4em;
			}
			em {
				 font-style: italic;
			}
			code {
				 font-family: "Courier New", Courier, monospace;
				 font-size: 90%;
				 padding: 0;
				 word-wrap: break-word;
			}
			ins {
				 background-color: #A0FFA0;
				 text-decoration: underline;
			}
			del {
				 background-color: #FFA0A0;
				 text-decoration: line-through;
			}
			a:hover {
				 color: #4398E1;
			}
			a:active {
				 color: #4598E1;
				 text-decoration: none;
			}
			a:link.review {
				 color: #AAAAAF;
			}
			a:hover.review {
				 color: #4398E1;
			}
			a:visited.review {
				 color: #444444;
			}
			a:active.review {
				 color: #AAAAAF;
				 text-decoration: none;
			}
	</style>
</head>

<body>
Audience: LEWG, SG14, WG21<br>
Document number: D0447R28<br>
Date: 2024-12-13<br>
Project: Introduction of std::hive to the standard library<br>
Reply-to: Matthew Bentley &lt;mattreecebentley@gmail.com&gt;<br>


<h1>Introduction of std::hive to the standard library</h1>

<h2>Table of Contents</h2>
<ol type="I">
	<li><a href="#introduction">Introduction</a></li>
	<li><a href="#definitions">Definitions</a></li>
	<li><a href="#motivation">Motivation and Scope</a></li>
	<li><a href="#impact">Impact On the Standard</a></li>
	<li><a href="#design">Design Decisions</a></li>
	<li><a href="#technical">Technical Specification</a></li>
	<li><a href="#Acknowledgments">Acknowledgments</a></li>
	<li>Appendices:
		<ol type="A">
			<li><a href="#basicusage">Basic usage examples</a></li>
			<li><a href="#benchmarks">Original reference implementation benchmarks,
				differences from current reference, and links</a></li>
			<li><a href="#faq">Frequently Asked Questions</a></li>
			<li><a href="#sg14gameengine">Typical game engine requirements</a></li>
			<li><a href="#users">User experience reports</a></li>
			<li><a href="#container_guide">Brief guide for selecting an appropriate
				container based on usage and performance</a></li>
			<li><a href="#constraints_summary">Hive constraints summary</a></li>
			<li><a href="#external_prior_art">Links to prior art</a></li>
			<li><a href="#non_reference_implementations_info">Information on
				non-reference-implementation hive designs</a></li>
		</ol>
 </li>
</ol>

<h2><a id="revisions"></a>Revision history</h2>
<ul>
<li>R28: LWG telecon 7-12-2024 + reflector changes: corrections to move assignment/constructor, notes, emplace/emplace_hint, insert (singular), trim_capacity, fill insert, initializer_list insert, range insert, erase, sort. Updated FAQ 19 to also cover shrink_to_fit and move assignment/construction when allocators are unequal. Minor correction to non-reference-implementation appendix.</li>

<li>R27: Correction/optimization to bitset+skipfield approach to element skipping. Updated tab:containers.summary and sequence.reqmts to match current draft C++ standard. Added [diff] and [library] sections to tech spec. Other nitpicks from first LWG meeting processed. Correction to possessive usage of 'its'. Placeholder numbers in tech spec removed. "may be the back of the container when no erasures have occurred, or " removed in synopsis. Added move assignment definition as it conditionally transfers current-limits from source to destination so is not covered by blanket wording, and copy assignment definition as it does not transfer current-limits. Added comment sections to synopsis. Added note to overview re: hard limits being constrained by allocator_traits&lt;Alloc&gt;::max_size(). Update to appendix info for supporting small types, given that I found a better/cheaper alternative. Changed all references to 'bitfield' to 'bitset' to avoid ambiguity in the context of C++. Removed references to non-literal types being a problem for constexpr in the FAQ, since Ville's paper P2242 removes that hurdle. Further editing to hive.overview.3. Update Appendix B. Added a FAQ entry on why we don't support a fully-functional insert(iterator position, T value).<br><br>

LWG meeting 19-11-24 changes: added hinted insert overloads. Remove "If an exception is thrown other than by the move constructor of a non-Cpp17CopyInsertable T, both x and *this are left in a usable state." from move assignment. Remove throws paragraphs in move assignment, shrink_to_fit and reshape. Move reshape, block_capacity_hard_limits and block_capacity_default_limits to [hive.capacity]. Modification to exception info for shrink_to_fit based on removing support for copy-if-non-noexcept-move. Rebase reserve and sort 'Throws' wording around the more recent basic_string wording.<br><br>

LWG meeting 20-11-24 changes: Updates to tech spec overview, removal of 5.5. Added definitions for copy constructors because blanket wording for copy constructors requires operator == on container. Nitpicks. Changes to move assignment, copy assignment, sort, reserve, unique, capacity and shrink_to_fit definitions. Remove sort allocation notes based on info that allocators are not intended for transitory storage in containers.<br><br>

LWG meeting 21-11-24 changes: Added FAQ entry explaining why erasing the back element invalidates the past-the-end iterator. Emplace_hint added and emplace/insert functions redefined because they don't inherit from sequence containers due to lack of <code>position</code> parameter. Corrections to sort, reshape, reserve, trim_capacity, move assign, sort, erase/erase_if, splice, reshape, copy constructor, copy assign. Move constructor definition added.<br><br>

LWG meeting 22-11-24 changes: Added FAQ entries explaining under what circumstances insert/emplace/sort can invalidate the past-the-end iterator. Overview 5.5 removed as replicates information already in function definitions. Fixes to reserve, insert_range, emplace/emplace-hint, insert(x), insert_hint(x), insert(n, x), insert(start, end), insert_range, insert(il), move constructor, move assignment, copy constructor, initializer-list constructor, sort, trim_capacity, splice, unique, erase. Corrections to FAQ.

Post-Wroclaw reflector changes: hive_limits constructor constexpr now, nitpicks.

</li>
	<li>R26: Tech spec changes: kona final meeting changes applied: is_active()
		removed. Precondition added to get_iterator, Effects 'return value of end()
		when pointer is not in <code>*this</code>' clause is removed. Heading corrections. Formatting corrections.<br>
		Non-tech-spec changes: Update google groups links for new archived google groups format. Bring prose sections of the paper up to date with
		the finalized technical specification and reduce those sections'
		repetition. Move alternative implementation guidelines from Design
		Decisions to the alt implementation appendix. Frequently Asked Questions
		and "responses to specific questions from the committee" combined into
		single appendix. Benchmarks and notes re: original reference implementation
		vs current implementation combined into a single appendix. "Questions for
		the committee" section removed, replaced with (brief) Definitions section.
		Links moved from plflib.org to external, potentially more-permanent sites
		where possible. Some FAQ entries consolidated/removed. References to
		distance() being able to deal with negative values removed (user must use
		iterator comparison operators and swap where necessary). Some corrections
		to FAQ. Info about lack of resize() moved from Design Decisions to FAQ.
		Time complexity appendix removed and info moved into Design Decisions
		section for individual functions, to avoid repetition. Changed 'over-alignment' to 'artificial widening the storage' as it is more accurate. Changed "bitset + jump-counting" approach in alt implementation appendix to cover smaller types. Added 3 better approaches to 'supporting very small types' under alt implementation appendix. Corrected some links. Add section about comparison to slot maps to FAQ.</li>
	<li>R25: Publication of mid-Kona intra-conference draft updates (based on
		committee feedback) for book-keeping: Update to reshape's exception
		guarantees. Add 'invalidates past-the-end iterators' for splice. Add
		constexpr to definition of block_capacity_limits(). Reduction/clarification
		of when element reordering may occur for reshape, shrink_to_fit. Slight
		rewording of sort() to be more in-keeping with list::sort(). Minor
		editorial corrections ("shall be" to "is", "into hive" to "into
		<code>*this</code>") to match current standard ie. "T shall be
		Cpp17MoveInsertable into hive" becomes "T is Cpp17MoveInsertable into
		<code>*this</code>". Added "This operation may change capacity()" to
		remarks in reshape. sort: Throws section removed as this is covered by
		blanket wording and by Remarks and the post-Remarks note. Specifically the
		following line is removed: "<i>Throws:</i> <code>bad_alloc</code> if it fails to
		allocate any memory necessary for this operation. <code>comp</code> may
		also throw." Note: forward_list::sort and list::sort do not include such a
		line.</li>
	<li>R24: Corrected HTML errors (some highlighting incorrect in tech
		specification, other places). Overhauled info for supporting small types
		without overalignment, in the alt implementation details appendix. Minor
		corrections to tech spec. Slight rewording of shrink_to_fit tech spec to be
		closer in wording to vector's.</li>
	<li>R23: Correction/update to constexpr usage section in appendices.
		Correction of bit_cast to reinterpret_cast in Design Decisions. Added
		section within Design Decisions-&gt;Erased-element location recording
		mechanism, detailing the various approaches possible for keeping track of
		which blocks contain erasures. Correction to Design
		Decisions-&gt;Collection of element blocks + metadata. Addition of
		explanatory infographic to intro, courtesy of Victor Reverdy's suggestion.
		&lt;=&gt; exclusion note removed from tech overview. Growth factor "greater
		than 1" removed from overview and "which need not be integral" added to
		bring in line with other mentions of growth factor in Standard. Note1
		'poem' removal in hive.overview taken out, referred to LWG directly for
		review. LWG feedback on time complexity wording: make maximum required by
		current implementation, if at later point it needs to be adjusted it can,
		future ABI breakage is more a concern than time complexity. Time complexity
		for erasure-handling removed as it is possible to make it O(1) for all
		types without overaligning small types (see end of alt implementation
		appendix for details). Tim Song and Jens Maurer signed off on
		erasure-handling time complexity wording (x + y where x = in elements, y =
		in element blocks, this approach is mirrored in parts of the standard).
		Further updates based on Ben Craig's feedback &amp; private review group
		feedback. Other corrections to time complexity. Updates to alt
		implementation details appendix. Removed clear() description from tech spec
		as is covered by the containers blanket wording and the overview blanket
		wording. Erase descriptions also reduced as portions were covered by
		overview blanket wording.</li>
	<li>R22: Addition of Hive constraints summary in appendices. Addition of
		prior art info to appendices. Additional information around alternative
		(vector-of-pointer based) implementations added in Appendices and said
		information in Design Decisions modified. Some other appendix items
	updated.</li>
	<li>R21: Included note in Design Decisions section regarding conditions under
		which block capacity limits are copied between hives, and formalized this
		in the Technical Specification. Corrections to time complexity appendix.
		Correction to title in Design Decisions section.</li>
	<li>R20: Removal of == != and &lt;=&gt; container operators. Reasoning for
		this added to the FAQ appendix. Addition of reference implementation
		licensing compatibility with other licensing to FAQ appendix. Minor
		corrections. Removal of complexity specification for sort() to allow for
		different algorithms to be used. Iterator invalidation information moved
		into tech spec function descriptions. Tech spec overhaul via Ben Craig's
		feedback. C++20 ranges overloads added. Tech spec numbering removed,
		replaced with tags. Addition of 'block_capacity_hard_limits' function. More
		FAQ entries. Remove priority template parameter and reasoning added to FAQ.
		Tech spec overhaul via Jonathan Wakely's feedback. Removal of memory() (see
		FAQ), trim() renamed to trim_capacity(). Conditions for functions with
		block limits as an argument changed from throwing when not satisfying
		requirements, to undefined behavior (now that requirements can always be
		satisfied by user via calling block_capacity_hard_limits()). Addition of
		unique() functions since optimal implementation is non-intuitive.
		Clarification of erase() iterator invalidation rules. iterator
		get_iterator(pointer p) changed to iterator get_iterator(const_pointer p).
		Corrections to synopsis. Removal of advance/distance/next/prev overloads
		from tech spec (this allows them to be specialisations within those
		functions for implementors). Removed copy constructors and operator= from
		[hive.cons] as these are covered in [sequence.req] (this is reflected in
		other sequence container [cons]). hive_limits constructor changed to
		constexpr to allow for constexpr calling from block_capacity_hard_limits().
		is_active(const_iterator) added. reverse_iterator and
		const_reverse_iterator now equivalent to
		std::reverse_iterator&lt;iterator/const_iterator&gt; in the tech spec.
		Corrections/additions to time complexity details in tech spec. Tech spec
		update based on Tim Song's feedback and committee feedback.
		trim_capacity(n) overload added. hive_limit constructor defaults changed to
		separate overloads.</li>
	<li>R19: Correction to intro. Addition of sort() to invalidation rules.
		Removal of questions for the committee based on Ga&scaron;per A&#x17e;man's
		feedback. Minor corrections. Moved constexpr explanation to appendices.
		Addition and removal of questions from appendices.</li>
	<li>R18: Addition of &lt;=&gt; operator. Addition of basic guide for
		container selection within/without the standard library, in appendix.
		Addition of edits to [containers.general] and [sequence.reqmts] in
		technical specification. Update 22.3.14.1. Some rewording.</li>
	<li>R17: Addition of appendix containing reported user experiences. Editing
		of constexpr exploration.</li>
	<li>R16: Explanation of desired clear() behavior added in Design Decisions
		section. References to colony reference implementation changed to refer to
		hive reference implementation (C++20 only). Textual corrections.
		Range-constructor corrected to allow sentinels.</li>
	<li>R15: Added throw details to splice. Further design decisions information
		on reshape and splice. Assign() overload for sentinels (differing iterator
		types) added. Minor text snafu corrections. Colony changed to hive based on
		D2332R0.</li>
	<li>R14: get_iterator_from_pointer changed to get_iterator - the pointer part
		is implied by the fact that it's the only argument. Added const_iterator
		overload for get_iterator - which takes a const_pointer and is a const
		function. Some wording corrections, additional design decisions
		information. HTML corrections.</li>
	<li>R13: Revisions based on committee feedback. Skipfield template parameter
		changed to priority enum in order to not over-specify container
		implementation. Other wording changes to reduce over-specifying
		implementation. Some non-member template functions moved to be friend
		functions. std::limits changed to std::colony_limits. block_limits()
		changed to block_capacity_limits().</li>
	<li>R12: Fill, range and initializer_list inserts changed to void return,
		since the insertions are not guaranteed to be sequential in terms of colony
		order and therefore returning an iterator to the first insertion is not
		useful. Non-default-value fill constructor changed to non-explicit to match
		other std:: containers. Correction to reserve() wording. Other minor
		corrections and clarity improvements.</li>
	<li>R11: Overhaul of technical specification to be more 'wording-like'. Minor
		alterations &amp; clarifications. Additional alternative approach added to
		Design Decisions under skipfield information. Overall rewording. Reordering
		based on feedback. Removal of some easily-replicated 'helper' functions.
		Change to noexcept guarantees. Assign added. get_block_capacity_limits and
		set_block_capacity_limits functions renamed to block_limits and reshape.
		Addition of block-limits default constructors. Reserve() and
		shrink_to_fit() reintroduced. trim(), erase and erase_if overloads
	added.</li>
	<li>R10: Additional information about time complexity requirements added to
		appendix, some minor corrections to time complexity info. The 'bentley
		pattern' (this was always a temporary name) is renamed to the more astute
		'low-complexity jump-counting pattern'. Likewise the 'advanced
		jump-counting skipfield' is renamed to the 'high-complexity jump-counting
		pattern' - for reasoning behind this go <a href="https://plflib.org/blog.htm#whatsinaname">here</a>. Both refer to
		time complexity of operations, as opposed to algorithmic complexity. Some
		other corrections.</li>
	<li>R9: Link to Bentley pattern paper added, and is spellchecked now.</li>
	<li>R8: Correction to SIMD info. Correction to structure (missing appendices
		title, member functions and technical specification were conjoined,
		acknowledgments section had mysteriously gone missing since an earlier
		version, now restored and updated). Update intro. HTML corrections.</li>
	<li>R7: Minor changes to member functions.</li>
	<li>R6: Re-write. Reserve() and shrink_to_fit() removed from
	specification.</li>
	<li>R5: Additional note for reserve, re-write of introduction.</li>
	<li>R4: Addition of revision history and review feedback appendices. General
		rewording. Cutting of some dead wood. Addition of some more dead wood.
		Reversion to HTML, benchmarks moved to external URL, based on feedback.
		Change of font to Times New Roman based on looking at what other papers
		were using, though I did briefly consider Comic Sans. Change to insert
		specifications.</li>
	<li>R3: Jonathan Wakely's extensive technical critique has been actioned on,
		in both documentation and the reference implementation. "Be clearer about
		what operations this supports, early in the paper." - done (V. Technical
		Specifications). "Be clear about the O() time of each operation, early in
		the paper." - done for main operations, see V. Technical Specifications.
		Responses to some other feedbacks included in the foreword.</li>
	<li>R2: Rewording.</li>
</ul>

<h2><a id="introduction"></a>I. Introduction</h2>
<img src="https://archive.org/download/hive_addition/hive_infographic.png"
alt="Explanatory infographic of hive general structure, based on reference implementation"
style="width: 100%; max-width: 21cm; height: auto;">

<p>The purpose of a container in the standard library cannot be to provide the
optimal solution for all scenarios. Inevitably, in fields such as
high-performance trading or gaming, the optimal solution within critical loops
will be a custom-made one which fits that scenario perfectly. However, outside
of the most critical of hot paths, there is a wide range of applications for
more generalized solutions.</p>

<p>Hive is a formalisation, extension and optimization of what is typically
known as a 'bucket array' or 'object pool' container in game programming
circles. Thanks to all the people who've come forward in support of the paper
over the years, I know that similar structures exist in various incarnations
across many fields, including high-performance computing, high-performance
trading, 3D simulation, physics simulation, robotics, server/client
applications and particle simulation (see <a href="https://groups.google.com/a/isocpp.org/g/sg14/c/1iWHyVnsLBQ/m/tEJfuJMvCQAJ">this
Google Groups discussion</a>, the <a href="https://isocpp.org/files/papers/P3011R0.pdf">hive supporting paper #1</a>
and <a href="#external_prior_art">appendix links to prior art</a>).</p>

<p>The concept of a bucket array is as follows: you have multiple memory blocks
of elements, and a boolean token for each element which denotes whether that
element is 'active' or 'erased'; the collection of these tokens is commonly
known as a skipfield. If an element is 'erased', it is skipped over during
iteration. When all elements in a block are erased, the block is removed, so
that iteration does not lose performance by having to skip empty blocks. If an
insertion occurs when all blocks are full, a new memory block is allocated.</p>
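<p>A minimal, hypothetical sketch of this concept (the block size, names and use of <code>int</code> elements are illustrative only, not taken from any real implementation):</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical bucket array: fixed-capacity blocks of ints, each paired
// with a boolean skipfield. All names here are illustrative only.
struct bucket_array {
    static constexpr std::size_t block_size = 4;

    struct block {
        std::array<int, block_size> elements{};
        std::array<bool, block_size> erased{};  // the boolean skipfield
        std::size_t count = 0;                  // slots used so far
    };

    std::vector<block> blocks;

    void push(int value) {
        // If all blocks are full, allocate a new memory block.
        if (blocks.empty() || blocks.back().count == block_size)
            blocks.emplace_back();
        block& b = blocks.back();
        b.elements[b.count++] = value;
    }

    // Visit every non-erased element. Note the per-slot branch: with a
    // boolean skipfield every slot must be tested individually, so
    // iteration is O(n) in capacity rather than in size.
    template <typename F>
    void for_each(F f) const {
        for (const block& b : blocks)
            for (std::size_t i = 0; i != b.count; ++i)
                if (!b.erased[i])
                    f(b.elements[i]);
    }
};
```

<p>Erasure is simply setting a skipfield flag, so no elements are moved and pointers to the remaining elements stay valid; the cost is the per-slot branch during iteration.</p>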

<p>The advantages of this structure are as follows: because a skipfield is
used, no reallocation of elements is necessary upon erasure. Because the
structure uses multiple memory blocks, insertions to a full container also do
not trigger reallocations. This means that element memory locations stay stable
and iterators stay valid regardless of erasure/insertion. This is highly
desirable, for example, <a href="#sg14gameengine">in game programming</a>
because there are usually multiple elements in different containers which need
to reference each other during gameplay, and elements are being inserted or
erased in real time. The only non-associative standard library container which
also has this feature is std::list, but it is undesirable for performance and
memory-usage reasons. This has not stopped it from being used in <a href="https://isocpp.org/files/papers/P3012R0.pdf">many open-source
projects</a>, due to this feature and its splice operations.</p>

<p>Problematic aspects of typical bucket arrays are that they tend to have a
fixed memory block size, tend not to re-use the memory locations of erased
elements, and utilize a boolean skipfield. The fixed block size (as opposed to
block sizes with a growth factor) and the lack of erased-element re-use lead to
far more allocations/deallocations than are necessary, and create memory waste
when memory blocks have many erased elements but are not entirely empty. Given
that allocation is a costly operation in most operating systems, this becomes
important in performance-critical environments. The boolean skipfield makes
iteration time complexity at worst O(n) in <code>capacity()</code>, as there is
no way of knowing ahead of time how many erased elements occur between any two
non-erased elements. This can create variable latency during iteration. It also
requires branching code for each skipfield node, which may cause performance
issues on processors with deep pipelines and high branch-misprediction
penalties.</p>

<p>A hive uses a non-boolean method for skipping erased elements, which allows
for more-predictable iteration performance than a bucket array as well as O(1)
iteration time complexity; the latter means it meets the C++ standard's
requirements for iterators, which a boolean method does not. It has an
(optional, on by default) growth factor for memory blocks and reuses erased
element locations upon insertion, which leads to fewer
allocations/reallocations. Because it reuses erased element memory space, the
exact location of insertion is undefined. Insertion is therefore considered
unordered, but the container is sortable. Lastly, because there is no way of
predicting in advance where erasures ('skips') may occur between non-erased
elements, an O(1) time complexity <code>operator[]</code> is not possible; the
container is therefore bidirectional but not random-access.</p>
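<p>To illustrate how a non-boolean skipfield removes the per-slot branching limitation, here is a simplified, hypothetical sketch loosely based on the jump-counting idea (the reference implementation's actual encoding and free-list mechanics differ): the skipfield nodes of an erased run record how far to jump, so advancing an iterator becomes a single unconditional addition instead of a per-slot boolean test.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified jump-counting sketch (illustrative only, not the reference
// implementation): skipfield[i] == 0 means slot i is active; within an
// erased run, each node stores the distance to the first slot past the run.
struct jump_counted {
    std::vector<int> elements;
    std::vector<std::size_t> skipfield;  // one node per element, plus sentinel

    explicit jump_counted(std::vector<int> v)
        : elements(std::move(v)), skipfield(elements.size() + 1, 0) {}

    // Mark [first, first + length) erased by recording jump distances.
    void erase_run(std::size_t first, std::size_t length) {
        for (std::size_t i = 0; i != length; ++i)
            skipfield[first + i] = length - i;
    }

    // Index of the first active element (elements.size() if none).
    std::size_t begin_index() const { return skipfield[0]; }

    // Advance: one increment plus one addition crosses any erased run.
    std::size_t next(std::size_t i) const {
        ++i;
        return i + skipfield[i];
    }
};
```

<p>Because an entire run of erasures is crossed in a single addition, iterator increment is O(1) and contains no per-slot branch, which is what a boolean skipfield cannot provide.</p>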

<p>There are two patterns for accessing stored elements in a hive: the first is
to iterate over the container and process each element (or skip some elements
using advance/prev/next/iterator ++/-- functions). The second is to store an
iterator returned by insert() (or a pointer derived from the iterator) in some
other structure and access the inserted element in that way. To better
understand how insertion and erasure work in a hive, see the following
diagrams.</p>
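<p>The second pattern can be sketched with a small hypothetical pool (not std::hive itself): erased slots are chained into a free list and reused by later insertions, so a handle obtained at insertion time remains valid regardless of subsequent insertions and erasures. Here indices stand in for the iterators/pointers a real hive would return; hive's non-reallocating element blocks are what make actual pointers stable.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative pool, names hypothetical: erased slots are recorded in a
// free list and reused by insert(), so insertion position is unpredictable
// ('unordered') but existing handles never move.
struct pool {
    std::vector<int> slots;
    std::vector<std::size_t> free_list;  // indices of erased slots

    std::size_t insert(int value) {
        if (!free_list.empty()) {        // reuse an erased location first
            std::size_t i = free_list.back();
            free_list.pop_back();
            slots[i] = value;
            return i;
        }
        slots.push_back(value);          // otherwise grow at the back
        return slots.size() - 1;
    }

    void erase(std::size_t i) { free_list.push_back(i); }
};
```

<p>A handle stored in some other structure (the second access pattern above) stays usable across unrelated insertions and erasures, which is the property that makes this family of containers attractive for heavily-interlinked data.</p>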

<h4>Insertion to back</h4>

<p>The following demonstrates how insertion works in a hive compared to a
vector when size == capacity.</p>
<img src="https://archive.org/download/hive_addition/vector_addition.gif"
alt="Visual demonstration of inserting to a full vector"
style="max-width: 21cm; height: auto;"><br><img
src="https://archive.org/download/hive_addition/hive_addition.gif"
alt="Visual demonstration of inserting to a full hive"
style="max-width: 21cm; height: auto;">

<h4>Non-back erasure</h4>

<p>The following images demonstrate how non-back erasure works in a hive
compared to a vector.</p>
<img src="https://archive.org/download/hive_addition/vector_erasure.gif"
alt="Visual demonstration of randomly erasing from a vector"
style="max-width: 21cm; height: auto;"><br><img
src="https://archive.org/download/hive_addition/hive_erasure.gif"
alt="Visual demonstration of randomly erasing from a hive"
style="max-width: 21cm; height: auto;">

<p>There is additional introductory information about the container's structure
in <a href="https://www.youtube.com/watch?v=wBER1R8YyGY">this CppCon talk</a>,
though much of its information is out of date (hive no longer uses a stack but
a <a href="https://en.wikipedia.org/wiki/Free_list">free list</a> instead,
benchmark data is out of date, etcetera), and more detailed implementation
information is available in <a href="https://www.youtube.com/watch?v=V6ZVUBhls38">this C++Now talk</a>. Both
talks discuss the precursor to std::hive, called <a href="https://github.com/mattreecebentley/plf_colony">plf::colony</a>.</p>

<h2><a id="definitions"></a>II. Definitions</h2>

<p>For the purposes of the non-technical-specification sections of this document, the following terms
are defined:</p>
<ul>
	<li>Link: denotes any form of referencing between elements whether it be via
		ids/iterators/pointers/indexes/references or anything else.</li>
	<li>Active blocks: memory blocks of type T, which are interacted with when iterating over the sequence in a multiple-memory-block
		container.</li>
	<li>Reserved blocks: memory blocks of type T, which do not contain elements but which are
		retained by a multiple-memory-block container for future insertions.</li>
	<li>Element blocks: both active and reserved blocks.</li>
	<li>Skipfield: an array of integers or a bitset, used to skip over certain
		objects in an accompanying data structure during iteration or processing.
		In the context of a container, it is typically used to indicate erased elements.</li>
	<li>Skipblock: a run of elements which are contiguous in memory and designated to be skipped
		during iteration. In the context of a container, contiguous erased elements.</li>
</ul>
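<p>To make the last two terms concrete, the following hypothetical eight-slot block pairs elements with an integer skipfield; slots 2&ndash;4 are erased and together form a single skipblock (the particular non-zero encoding shown is illustrative only):</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>

// Hypothetical data: slots 2, 3 and 4 are erased. In this illustrative
// encoding a skipfield value of 0 means 'active'; the non-zero values mark
// the erased run (the skipblock), counting down towards its end.
constexpr std::array<int, 8> elements  = {5, 6, 0, 0, 0, 9, 2, 4};
constexpr std::array<int, 8> skipfield = {0, 0, 3, 2, 1, 0, 0, 0};

// Count the active (non-skipped) elements.
inline std::size_t active_count() {
    std::size_t n = 0;
    for (int s : skipfield)
        if (s == 0) ++n;
    return n;
}
```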

<h2><a id="motivation"></a>III. Motivation and Scope</h2>

<p>There are situations where data is heavily interlinked, iterated over
frequently, and changing often. An example is the typical video game engine.
Most games will have a central generic 'entity' or 'actor' class, regardless of
their overall schema (an entity class does not imply an <a href="https://en.wikipedia.org/wiki/Entity-component-system">ECS</a>).
Entity/actor objects tend to be 'has a'-style objects rather than 'is a'-style
objects, which link to, rather than contain, shared resources like sprites,
sounds and so on. Those shared resources are usually located in separate
containers/arrays so that they can be re-used by multiple entities. Entities are
in turn referenced by other structures within a game engine, such as
quadtrees/octrees, level structures, and so on.</p>

<p>Entities may be erased at any time (for example, a wall is destroyed and no
longer needs to be processed by the game's engine, so is erased) and
new entities inserted (for example, a new enemy is spawned). While this is all
happening, the links between entities, resources and superstructures such as
levels and quadtrees must stay valid in order for the game to run. The order
of the entities and resources themselves within the containers is, in the
context of a game, typically unimportant, so an unordered container is okay.
More specific requirements for game engines are listed in the <a href="#sg14gameengine">appendices</a>.</p>

<p>But the non-fixed-size container with the best iteration performance in the
standard library, vector, loses pointer validity to elements within it upon
insertion, and pointer/index validity upon erasure. This leads to
sophisticated and often restrictive workarounds when developers attempt to
utilize vector or similar containers under the above circumstances.</p>

<p>std::list and the like are not suitable due to their poor memory locality,
which leads to poor cache performance during iteration. This <a href="https://isocpp.org/files/papers/P3012R0.pdf">does not stop them</a> from
being used extensively. This is however an ideal situation for a container such
as hive, which has a high degree of memory locality. Even though that locality
can be punctuated by gaps from erased elements, it still works out better in
terms of iteration performance than all other standard library containers other
than deque/vector, regardless of the ratio of erased to non-erased elements
(see <a href="#benchmarks">benchmarks</a>). It is also in most cases faster for
insertion and (non-back) erasure than current standard library containers.</p>

<p>As another example, particle simulation (weather, physics etcetera) often
involves large clusters of particles which interact with external objects and
each other. The particles each have individual properties (eg. spin, speed,
direction etc) and are created and destroyed continuously. The order of the
particles is therefore unimportant; what matters is the speed of
erasure and insertion. No current standard library container has both strong
insertion and non-back erasure performance, so again this is a good match for
hive.</p>

<p><a href="https://groups.google.com/a/isocpp.org/g/sg14/c/1iWHyVnsLBQ/m/tEJfuJMvCQAJ">Reports
from other fields</a> suggest that, because most developers aren't aware of
containers such as this, they often end up using solutions which are sub-par
for iterative performance such as std::map and std::list in order to preserve
pointer validity, when most of their processing work is actually
iteration-based. Introducing this container would therefore both provide a
convenient solution to these situations and increase awareness of this approach.
It would also ease communication across fields, as opposed to the current scenario
where each field uses a similar container but each has a different name for it
(object pool, bucket array, etcetera).</p>

<h2><a id="impact"></a>IV. Impact On the Standard</h2>

<p>This is purely a library addition, requiring no changes to the language.</p>

<h2><a id="design"></a>V. Design Decisions</h2>

<h3>Core aspects</h3>

<p>The three core aspects of a hive from an abstract perspective are:</p>
<ol>
	<li>A collection of element blocks + metadata, to prevent reallocation
		during insertion (as opposed to a single element block).</li>
	<li>A method of skipping erased elements in O(1) time during iteration (as
		opposed to reallocating subsequent elements during erasure).</li>
	<li>An erased-element location recording mechanism, to enable the re-use of
		memory from erased elements in subsequent insertions, which in turn
		increases cache locality and reduces the number of block
		allocations/deallocations.</li>
</ol>

<p>Each element block houses multiple elements. The metadata about each block
may or may not be allocated with the blocks themselves and could be contained in a
separate structure. This metadata must include, <a href="#constraints_summary">at a minimum</a>, the number of
non-erased elements within each block and the block's capacity - which allows
the container to know when the block is empty and needs to be removed from the
sequence, and also allows iterators to judge when the end of a block has been
reached, given the starting point of the block.</p>

<p>It should be noted that most of the data associated with the skipping and
erased-element recording mechanisms should be per-element-block and independent of
subsequent/previous element blocks, as otherwise erasure and insertion would create unacceptably variable
latency <a href="https://lists.isocpp.org/mailman/listinfo.cgi/sg14/">for any fields involving timing sensitivity</a>. Specifically, with a
global data set for either mechanism, erase would likely require all data subsequent to a
given element block's data to be reallocated when that element block is removed from the
iterative sequence, and insert would likewise require reallocation of all data to a larger
memory space when hive capacity expanded.</p>

<p>In the <a href="https://github.com/mattreecebentley/plf_colony">original</a>
reference implementation (current reference implementation is <a href="https://github.com/mattreecebentley/plf_hive">here</a>) the specific
structure and mechanisms have changed many times over the course of
development; however, the interface to the container and its time complexity
guarantees have remained largely unchanged. So it is likely that, regardless of
specific implementation, it will be possible to maintain this interface without
precluding future improvements.</p>

<p>The current reference implementation implements the three core aspects as
follows. Information about known alternative ways to implement these is
available in <a href="#non_reference_implementations_info">the
appendices</a>.</p>

<h4>1. Collection of element blocks + metadata</h4>

<p>In the reference implementation this is essentially a doubly-linked list of
'group' structs containing (a) a dynamically-allocated element block,
(b) element block metadata and (c) a dynamically-allocated skipfield. The element
blocks and skipfields have a growth factor of 2. The
metadata includes information necessary for an iterator to iterate over hive
elements, such as that already mentioned and
information useful to specific functions, such as the group's sequence order number (used for
iterator comparison operations). This linked-list approach keeps the operation
of removing empty element blocks from the sequence at O(1) time complexity.</p>

<h4>2. A method of skipping erased elements in O(1) time during iteration</h4>

<p>The reference implementation uses a skipfield pattern called the <a href="https://archive.org/details/matt_bentley_-_the_low_complexity_jump-counting_pattern">low complexity jump-counting pattern</a>. This encodes the length of runs of
contiguous erased elements (skipblocks) into a skipfield which allows for O(1) time
complexity during iteration (see the paper above for details). Since
there is no branching involved in iterating over the skipfield aside from
end-of-block checks, it is less problematic computationally than a boolean
skipfield (which has to branch on every skipfield read), particularly on CPUs which
don't handle branching or branch-prediction failure efficiently (eg. Core2).</p>
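<p>A minimal sketch of within-block iteration under this pattern follows. It is simplified: only the first node of a skipblock is ever read during iteration, a node value of 0 means 'not erased', and the first node of a run of erased elements stores the run's length:</p>

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch: visit the indexes of all non-erased elements in a
// block, given its jump-counting skipfield. The additions are unconditional -
// a boolean skipfield would instead need a branch (or loop) per erased node.
std::vector<std::size_t> iterate(const std::vector<unsigned>& skipfield)
{
    std::vector<std::size_t> visited;
    std::size_t i = skipfield.empty() ? 0 : skipfield[0]; // skip a leading skipblock, if any
    while (i < skipfield.size())
    {
        visited.push_back(i);      // i always lands on a non-erased element
        ++i;                       // advance by one
        if (i < skipfield.size())
            i += skipfield[i];     // add the node's value (0 if not erased)
    }
    return visited;
}
```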

<h4>3. Erased-element location recording mechanism</h4>

<p>The reference implementation utilizes the memory space of erased elements to form a per-element-block index-based
doubly-linked <a href="https://en.wikipedia.org/wiki/Free_list">free list</a> of skipblocks, which is used during subsequent insertion.
Each element block has a 'free list head' as a metadata member. The free lists are index-based rather
than pointer-based in order to reduce the amount of space necessary to store
the 'previous' and 'next' list links in an erased element's memory. The
beginning and end of the free lists are marked using
<code>numeric_limits&lt;skipfield_type&gt;::max()</code> in the 'previous'
and 'next' indexes, respectively. If the free list head is equal to this number,
there are no erasures in that element block. Since this number is reserved, element block capacities cannot be larger than <code>numeric_limits&lt;skipfield_type&gt;::max()</code> ie. 255 elements rather than 256 for 8-bit skipfield types; otherwise the free list would be unable to address a skipblock comprised solely of the last element in the block.</p>

<p>These per-element-block free lists are combined with a doubly-linked pointer-based intrusive list of
blocks with erased elements in them, the head of which is stored as a member variable of the hive. Together these two structures allow re-use of erased-element
memory space in O(1) time.</p>


<p>More information on these approaches, and alternative approaches to the
3 core aspects, is available to read in <a href="#non_reference_implementations_info">the alt implementation
appendix</a>.</p>

<h3>Iterator classes</h3>

<p>Iterators are bidirectional in hive but also provide constant time
complexity &gt;, &lt;, &gt;=, &lt;= and &lt;=&gt; operators for convenience
(eg. in <code>for</code> loops when skipping over multiple elements per loop
and there is a possibility of going past a pre-determined end element). This is
achieved by keeping a record of the relative order of element blocks. In the
reference implementation this is done by assigning a number to each memory
block in its metadata. In an implementation using a vector of pointers to
groups instead of a linked list, one can simply use the position of the
pointers within the vector to determine this. Comparing the relative order of
two iterators' blocks, then comparing the memory locations of the elements
which the iterators point to (if they happen to be within the same memory
block), is enough to implement all comparisons.</p>
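<p>A sketch of how such a comparison might be implemented (member names are illustrative only):</p>

```cpp
#include <cstddef>
#include <functional>

// Illustrative sketch of constant-time iterator ordering: compare block
// order numbers first, then element addresses within the same block.
struct iterator_sketch
{
    std::size_t group_number;  // the block's sequence order number (metadata)
    const void* element;       // address of the element pointed to

    friend bool operator<(const iterator_sketch& a, const iterator_sketch& b)
    {
        if (a.group_number != b.group_number)
            return a.group_number < b.group_number;            // different blocks
        return std::less<const void*>{}(a.element, b.element); // same block
    }
};

// Helper for demonstration purposes:
inline bool precedes(std::size_t ga, const void* ea, std::size_t gb, const void* eb)
{
    return iterator_sketch{ga, ea} < iterator_sketch{gb, eb};
}
```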

<p>Iterator implementations are dependent on the approach taken to core aspects
1 and 2 as described above. The reference implementation's iterator stores a
pointer to the current 'group' struct, plus a pointer to the current element
and a pointer to its corresponding skipfield node. It is possible to replace
the element and skipfield pointers with a single index value, but benchmarks
have shown this to be slower despite the increased memory cost.</p>

<p>The reference implementation's ++ operation is as shown below, following
the low-complexity jump-counting pattern's algorithm:</p>
<ol>
	<li>Add 1 to the existing element and skipfield pointers in the iterator.</li>
	<li>Dereference skipfield pointer to get the value of the skipfield node,
		then add that value to both the skipfield pointer and the element pointer.
		If the node indicates an erased element, its value will be a positive
		integer indicating the number of nodes until the next non-erased node. If
		not erased it will be zero.</li>
	<li>If the element pointer is now beyond the end of the element block,
		change the group pointer to the next group in the linked list, the element
		pointer to the start of that group's element block, and the skipfield
		pointer to the start of that group's skipfield. In case there is a skipblock at the beginning of this element block, again dereference the
		skipfield pointer to get the value of the skipfield node and add that value
		to both the skipfield pointer and the element pointer. There is no need to
		repeat the end-of-block check, because if the block were empty of
		elements it would've been removed from the sequence.</li>
</ol>
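<p>Using indexes rather than the reference implementation's three pointers, and a simplified end check, the steps above can be sketched as:</p>

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch of ++ over a chain of blocks (not the reference
// implementation, which uses element/skipfield pointers rather than indexes).
struct block_sketch
{
    std::vector<unsigned> skipfield;      // jump-counting skipfield, one node per slot
    const block_sketch*   next = nullptr;
};

struct cursor { const block_sketch* group; std::size_t index; };

inline void increment(cursor& c)
{
    ++c.index;                                   // step 1: advance by one
    if (c.index < c.group->skipfield.size())
        c.index += c.group->skipfield[c.index];  // step 2: skip any skipblock
    if (c.index >= c.group->skipfield.size() && c.group->next != nullptr)
    {                                            // step 3: transition to next block
        c.group = c.group->next;
        c.index = c.group->skipfield[0];         // 2nd skipfield read: leading skipblock
    }                                            // (no re-check: empty blocks are removed)
}

// Demonstration: block 1 has slots {live, erased, erased, live},
// block 2 has slots {erased, live, live}. Each visit is encoded as
// block-number * 10 + slot index.
inline std::vector<int> walk()
{
    static const block_sketch g2{{1, 0, 0}, nullptr};
    static const block_sketch g1{{0, 2, 2, 0}, &g2};
    std::vector<int> out;
    for (cursor c{&g1, g1.skipfield[0]};
         !(c.group->next == nullptr && c.index >= c.group->skipfield.size());
         increment(c))
        out.push_back((c.group == &g1 ? 10 : 20) + static_cast<int>(c.index));
    return out;
}
```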

<p>The -- operation is the same, except that steps 1 and 2 involve subtraction
rather than addition, and step 3 checks whether the element pointer is now
before the beginning of the element block instead of beyond the end of it. If
it is, the iterator traverses to the back element of the previous group's
element block and subtracts the value of the back skipfield node from both the
element pointer and the skipfield pointer.</p>

<p>We can see from the above that every so often iteration will involve a
transition to the next/previous element block in the hive's sequence of active blocks,
depending on whether we are doing ++ or --. For each such
block transition, 2 reads of the skipfield are necessary instead of 1.</p>

<h3>Specific functions</h3>
<ul>
	<li>Non-member function specializations for <code
		style="font-weight: bold;">advance, prev and next</code><br>
		<i>(between O(1) and O(n))</i>
		<p>For these functions, complexity is dependent on the state of the hive
		instance, position of the iterator and the amount of distance to travel,
		but in many cases will be less than linear, and may be constant. To
		explain: it is necessary in a hive to store, for each element block, both <i>capacity</i> metadata (for the purpose of iteration) and metadata about how many
		non-erased elements are present (ie. <i>size</i>, for the purpose of
		removing blocks from the iterative chain once they become empty). For this
		reason, intermediary blocks between the iterator's initial block and its
		final destination block (if these are not the same block, and are not
		immediately adjacent) can be skipped rather than iterated linearly across,
		by using the <i>size</i> metadata.</p>
		<p>This means that the only linear time operations are any iterations
		within the initial block and the final block. However if either the initial
		or final block have no erased elements (as determined by comparing whether
		the block's <i>capacity</i> and <i>size</i> metadata are equal), linear iteration can be skipped for that
		block and pointer/index math used instead to determine distances, reducing
		complexity to constant time. Finally, if the iterator points to the first element in that element block, and distance is greater-or-equal-to the block's <i>size</i>, we can treat it as an intermediary block and just skip it, subtracting <i>size</i> from the distance we want to travel. Hence the best case for this operation is
		constant time, the worst is linear in the distance.</p>
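		<p>The block-skipping portion can be sketched as follows. Offsets here count non-erased elements, abstracting away the within-block skipfield traversal a real implementation performs when a block contains erasures:</p>

```cpp
#include <cstddef>
#include <vector>

// Illustrative sketch: advance a position by n elements across blocks,
// consuming each whole intermediary block via its 'size' metadata in O(1).
struct pos { std::size_t block, offset; }; // offset counts non-erased elements

inline pos advance_sketch(const std::vector<std::size_t>& sizes, pos p, std::size_t n)
{
    while (p.block < sizes.size() && n >= sizes[p.block] - p.offset)
    {
        n -= sizes[p.block] - p.offset;  // skip the rest of this block via metadata
        ++p.block;                       // move to the next block
        p.offset = 0;
    }
    p.offset += n;  // remaining distance falls inside the final block
    return p;
}

// Helper for demonstration: encode the result as block * 100 + offset.
inline std::size_t advance_encoded(const std::vector<std::size_t>& sizes,
                                   std::size_t block, std::size_t offset, std::size_t n)
{
    const pos p = advance_sketch(sizes, pos{block, offset}, n);
    return p.block * 100 + p.offset;
}
```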
 </li>
	<li>Non-member function specialization for <code
		style="font-weight: bold;">distance(first, last)</code><br>
		<i>(between O(1) and O(n))</i>
		<p>The same considerations which apply to advance, prev and next also apply
		to distance - intermediary element blocks between first and last's blocks can be
		skipped in constant time and their <i>size</i> metadata
		added to the cumulative distance count, while first's block and last's
		block (if they are not the same block) must be linearly iterated across
		unless either block has no erased elements, in which case the operation
		becomes pointer/index math and is reduced to constant time for that block.
		If first and last are in the same block but are in the first and last
		element slots in the block, distance can again be calculated from the
		block's <i>size</i> metadata in constant time. If they are not in the same block but first points to the first element in its block, the first block can be skipped and its <i>size</i> added to the distance travelled. Likewise if last points to the last element in its block, the last block can also be skipped and its <i>size</i> added.</p>
 </li>
	<li><code style="font-weight: bold;">iterator insert/emplace</code><br>
		<i>(O(1) amortized)</i>
		<p>Insertion can re-use previously-erased element memory locations when
		available, so position of insertion is effectively random unless no
		previous erasures have occurred, in which case all elements will likely be
		inserted linearly to the back of the container in the majority of
		implementations. If inserting to the back, this invalidates iterators
		pointing to end(). Insertion could also potentially occur before begin(), if
		erasures have occurred at the beginning of the container.</p>
		<p>While it is not mandated to do so, hive implementations will
		generally insert into existing element blocks when able, and create a new
		element block only when all existing element blocks are full.</p>
		<p>If hive is implemented as a vector of pointers to element blocks instead of
		a linked list of element blocks, creation of a new element block would
		occasionally involve expanding the pointer vector, itself O(n) in the number of
		blocks, but this is within amortized limits since it is only
		occasional.</p>
 </li>
	<li><code style="font-weight: bold;">void insert</code><br>
		<i>(O(n))</i>
		<p>For range, fill and initializer_list insertion, it is not possible to
		guarantee that all the elements inserted will be sequential in the hive's
		sequence, and so it is not considered useful to return an iterator to the
		first inserted element. There is a precedent for this in the various std::
		map containers. Therefore these functions return void.</p>
		<p>The same considerations regarding iterator invalidation for singular
		insertion above also apply to these insertion styles.</p>
		<p>For multiple insertions an implementation can call reserve()
		in advance, reducing the number of allocations
		necessary (whereas repeated singular insertions would generally follow the
		implementation's block growth factor, and possibly allocate more and smaller element blocks than
		necessary). This has no effect on time complexity, which is still linear in the number of elements
		inserted.</p>
 </li>
	<li><code style="font-weight: bold;">iterator erase(const_iterator
		position)</code><br>
		<i>(O(1) amortized)</i>
		<p>Erasure is a simple matter of destructing the element in question and
		updating whatever data is associated with the erased-element skipping
		mechanism. No reallocation of subsequent elements is
		necessary and hence the process is O(1). Updates to the erased-element recording
		and skipping mechanisms are also required to be O(1).</p>
		<p>When an element block becomes empty of non-erased elements it must be
		freed to the OS (or reserved for future insertions, depending on
		implementation) and removed from the hive's sequence of active blocks. If
		it were not, we would end up with non-O(1) iteration, since there would be
		no way to predict how many empty element blocks were between the
		current element block being iterated over, and the next element block with
		non-erased elements in it.</p>
<p>In a linked-list-of-blocks style of implementation this removal is always O(1).
		However if the hive were implemented as vector of pointers to element
		blocks, this could, depending on implementation, trigger an O(n) relocation
		of subsequent block pointers in the vector (a smart implementation would
		only do this occasionally, using erase_if - see the <a href="#non_reference_implementations_info">alt implementation appendix</a>).
	 Hence this operation is O(1) amortized.</p>
		<p>Under what circumstances element blocks are reserved rather than
		deallocated is implementation-defined - however given that small memory
		blocks have low cache locality compared to larger ones, from a performance
		perspective it is best to only reserve the largest blocks currently
		allocated in the hive. In my benchmarking, reserving both the back and
		2nd-to-back element blocks while ignoring the actual capacity of the blocks
		themselves seemed to have the most beneficial performance
		characteristics of the techniques attempted.</p>
		<p>There are three main performance advantages to retaining back blocks as
		opposed to just any block - the first is that these will be, under most
		circumstances, the largest blocks in the hive (given the growth factor). An
		exception to this is when splice is used, which may result in a smaller
		block following a larger block (implementation-dependent). The second
		advantage is that in situations where erasures and insertions are occurring
		at the back of the hive (this assumes no erased element locations in other memory
		blocks, which would most likely be used for the insertions) continuously and in
		quick succession, retaining the back block avoids large numbers of
		deallocations/reallocations. The third advantage is that deallocations of
		these larger blocks can, in part, be moved by the user to non-critical code regions via
		trim_capacity(). Though ultimately if the user wants total control of when
		allocations and deallocations occur they would need to use a custom
		allocator.</p>
<p>Lastly, the reason for returning an iterator is: if an erasure empties an element block of elements, the block will be deallocated or reserved - in either case, it's no longer part of the iterative sequence and an iterator pointing into it, such as <code>position</code>, can no longer be used for iteration. This is important for erasing inside a loop.</p>
 </li>
	<li><code style="font-weight: bold;">iterator erase(const_iterator first,
		const_iterator last)</code><br>
		<i>(O(n) in the number of blocks within the range, and O(n) in the
		number of elements erased)</i>
		<p>The same considerations for singular erasure above also apply for
		range-erasure. In addition, ranged erasure is O(n) if elements are
		non-trivially-destructible. If they are trivially-destructible, we can
		follow similar logic to the distance specialization above. Which is to say,
		for the first and last element blocks in the range, if the number of
		elements in either block is equal to its capacity, there are no erasures in the
		block and we may be able to - depending on the
		erased-element-skipping-mechanism - simply notate a new skipblock without needing to deal with any existing skipblocks. If there are erasures in that element block, we would (implementation-dependent) likely need to identify whether the range we're
		erasing contains erased elements in between the non-erased elements, in
		order to update metadata (such as number of non-erased elements in the
		block) correctly.</p>
		<p>For intermediary blocks between the first and last blocks, for
		trivially-destructible types we can simply deallocate or reserve these
		without calling the destructors of elements or dealing with the
		erased-element skipping/recording mechanisms for those blocks. As with distance, if the first iterator points to the first element in its element block, the first block can be treated like an intermediary block - likewise for the last block, if the last iterator points to the last element in its element block. Hence for trivially-destructible types, the entire operation can be linear in the
		number of blocks contained within the range or linear in the number of elements
		contained within the range, or somewhere in between.</p>
		<p>As with singular erasure, in a vector-of-pointers-to-blocks style of
		implementation, there may be a need to reallocate element block pointers backward
		when blocks become empty of elements.</p>
		<p>Lastly, specifying a return iterator for range-erase may seem pointless,
		as no reallocation of elements occurs in erase for hive, so the return
		iterator will almost always be the <code>last</code> const_iterator of the
		<code>first, last</code> pair. However if
		<code>last</code> was <code>end()</code>, the new value of
		<code>end()</code> will be
		returned. In this case either the user intentionally submitted <code>end()</code> as
		<code>last</code>, or they incremented an iterator pointing to the final
		element in the hive and submitted that as <code>last</code>. The latter is
		the only valid reason to return an iterator from the function, as it may
		occur as part of a loop which is erasing elements which ends when
		<code>end()</code> is reached. If <code>end()</code> is changed by the
		erasure, but the iterator used in the loop
		does not accurately reflect <code>end()</code>'s new value, that iterator
		could iterate past <code>end()</code> and the loop would never end.</p>
 </li>
	<li><code style="font-weight: bold;">void reshape(std::hive_limits
		block_limits)</code><br>
		<i>(between O(1) and O(n) in the number of elements reallocated + O(n) in the number of element blocks)</i>
		<p>This function updates the block capacity limits in the hive with
		user-defined ones and, if necessary, changes any active blocks which fall outside
		of those limits to be within the limits (and deallocates any reserved blocks outside of the limits - although an implementation could choose to allocate new reserved blocks instead). A program will not compile if the function is used with non-copyable/non-movable types. It will invalidate
		pointers/iterators/references to elements if reallocation of elements to other element blocks occurs.</p>
		<p>The order of elements post-reshape is not guaranteed to be stable, in
		order to allow for optimizations. Specifically: in the instance where a
		given element block does not fit within the limits supplied, the elements within that block could be reallocated to
		previously-erased element locations in other element blocks which <i>do</i> fit
		within the limits supplied. Or they could be reallocated to the back of the
		final element block, if it fits within the limits, or into reserved blocks
		if they fit within the limits.</p>
		<p>If the existing current limits fit within the new user-supplied ones, no
		checking of block capacities is needed and the operation is O(1).
		If they do not but existing blocks <i>may</i> fit within the limits, all blocks
		need to be checked, making the operation O(n) in the number of blocks (both
		active and reserved). If any blocks containing elements don't fit within
		the supplied limits reallocation will occur and the operation is at worst O(n) in
		<code>capacity()</code>.</p>
 </li>
	<li><code style="font-weight: bold;">static constexpr std::hive_limits block_capacity_hard_limits() noexcept</code><br>
		<i>(O(1))</i>
		<p>As opposed to block_capacity_limits() which returns the current min/max
		element block capacities for a given instance of hive, this allows the user
		to get any implementation's min/max 'hard' lower/upper limits for element
		block capacities ie. the limits which any user-supplied limits must
		fit within. For example, if an implementation's hard limit is 3 elements
		min, 1 million elements max, all user-supplied limits must be &gt;= 3 and
		&lt;= 1 million.</p>
		<p>This is useful for 2 reasons:</p>
		<ol type="a">
			<li>An implementation may have default block capacity limits which are
				different from its hard limits.</li>
			<li>A user must have a mechanism for determining what user-defined limits
				they can supply before supplying them to a constructor, to avoid triggering an exception. This is most important when working with projects on multiple platforms.</li>
		</ol>
 </li>
	<li><code style="font-weight: bold;">static constexpr std::hive_limits block_capacity_default_limits() noexcept</code><br>
		<i>(O(1))</i>
		<p>Likewise, this returns the default block capacity limits for a given hive type/allocator combination.</p>
		<p>This is useful for 2 reasons:</p>
		<ol type="a">
			<li>Finding the defaults easily without having to interpret source code (if the user even has access to it).</li>
			<li>Reshaping hives with user-defined block capacity limits to match the defaults, such that they can be spliced into other hives which are constructed without user-defined limits.</li>
		</ol>
 </li>
	<li><code style="font-weight: bold;">void clear()</code><br>
		<i>(O(n) in the number of elements)</i>
		<p>User expectation was that clear() would erase all elements but not
		deallocate element blocks. Therefore all active blocks are emptied
		of elements and become reserved blocks. If deallocation of memory
		blocks is desired, a clear() call can be followed by a trim_capacity()
		call. For trivially-destructible types element destruction can be skipped and, depending on implementation, the process may be O(1).</p>
 </li>
	<li><code style="font-weight: bold;">iterator get_iterator(const_pointer p) noexcept<br>
		const_iterator get_iterator(const_pointer p) const noexcept</code><br>
		<i>(O(n) in the number of active blocks)</i>
		<p>Because hive iterators could be large, potentially storing three
		pieces of data - eg. pointers to: current element block, current element
		and current skipfield node - a program storing many links to
		elements within a hive may opt to dereference iterators to get pointers and
		store those instead of iterators, to save memory and improve performance
		via reduced cache use. This function reverses that process, giving an
		iterator which can then be used for operations such as <code>erase</code>.
		A get_const_iterator function was fielded as a workaround for the
		possibility of someone wanting to supply a non-const pointer and get a
		const_iterator back, however <code>as_const</code> fulfills this same role
		when supplied to <code>get_iterator</code> and doesn't require expanding
		the interface of hive. Likewise it was decided to use const_pointer because, if a user wants to supply a non-const pointer, they can use as_const,
		whereas there is no meaningful equivalent process to convert a const_pointer to a pointer.</p>
		<p>Note that this function is only guaranteed to return an iterator that
		corresponds to the pointer supplied - it makes no checks to see whether the
		element which <code>p</code> originally pointed to is the same element
		which <code>p</code> now points to (eg. from an ABA scenario). Resolving
		this problem is down to the end user and could involve having a unique id
		within elements or similar (more info in the <a href="#faq">frequently-asked questions appendix</a>).</p>
		<p>Technically, a precondition of the function is that <code>p</code> points to an
		element in <code>*this</code> and does not point to an erased element; otherwise behaviour is undefined. This is
		due to the <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2021/p1726r5.pdf">"lifetime
		pointer zap" issue</a> ie. reading the value of a pointer to an erased
		element is undefined behaviour in C++. In practice this is usually non-problematic and many fields are fine with this situation. The reference implementation returns
		<code>end()</code> when <code>p</code> is not an element in
		<code>*this</code> and it is possible that other implementations may do the
		same. LEWG decided to remove this as an Effect of the function due to the
		UB mentioned.</p>
		<p><i>Note 1</i>: in order to check whether a given element is erased when
		an implementation is using the <a href="https://archive.org/details/matt_bentley_-_the_low_complexity_jump-counting_pattern">low-complexity
		jump-counting pattern</a>, the additional operations specified under
		"Parallel processing" in that paper must be followed.</p>
		<p><i>Note 2</i>: get_iterator compares pointers against the start and end
		memory locations of the active blocks in
		<code>*this</code>. There was some confusion that this would be problematic
		due to obscure rules in the standard which state that a given platform may
		allow addresses inside of a given memory block to essentially not be
		contiguous, at least in terms of the std::less/std::greater/&gt;/&lt;/etc
		operators. According to Jens Maurer, these difficulties can be bypassed via
		hidden channels between the library implementation and the compiler.</p>
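		<p>The scan itself can be sketched as below, using std::less/std::greater_equal for the pointer comparisons as per the note above (names are illustrative):</p>

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Illustrative sketch of get_iterator's block scan, with int as element type.
struct block_range { const int* start; std::size_t capacity; };

// Returns the index of the active block containing p, or the block count if
// p is not inside any active block (analogous to returning end()).
inline std::size_t owning_block(const std::vector<block_range>& blocks, const int* p)
{
    for (std::size_t i = 0; i != blocks.size(); ++i)                 // O(n) in active blocks
        if (std::greater_equal<const int*>{}(p, blocks[i].start) &&
            std::less<const int*>{}(p, blocks[i].start + blocks[i].capacity))
            return i;
    return blocks.size();
}

// Demonstration with two static 'element blocks' and one unrelated object:
inline bool demo()
{
    static int a[4], b[8], c;
    const std::vector<block_range> blocks{{a, 4}, {b, 8}};
    return owning_block(blocks, &b[3]) == 1
        && owning_block(blocks, &a[0]) == 0
        && owning_block(blocks, &c) == 2;  // not in any block: 'end()'
}
```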
 </li>
	<li><code style="font-weight: bold;">void shrink_to_fit()</code><br>
		<i>(at worst O(n) in the number of elements)</i>
		<p>A decision had to be made as to whether this function should, in the
		context of hive, be allowed to reallocate elements (as std::vector
		implementations tend to do) or simply trim off reserved blocks (as
		std::deque implementations tend to do). Given that a large active block
		could be left with as few as one remaining element after a series of
		erasures, it makes little sense to only trim reserved
		blocks, so instead shrink_to_fit reallocates all elements into
		as few active blocks as possible in order to increase cache locality during
		iteration and reduce memory usage. It cannot guarantee that <code>size() ==
		capacity()</code> after the operation, because the min/max block capacity
		limits of <code>*this</code> may prevent that.</p>
		<p>One potential implementation is fairly brute-force - create a new
		temporary hive, reserve(size() of original hive), copy/move all elements
		from the original hive into the temporary, then operator = &amp;&amp; the
		temporary into the original. A more astute implementation might allocate a
		temporary array detailing the full capacity and unused capacity of each
		block, then use some procedure to move elements out of some blocks and into as few of the
		existing blocks as possible, filling up any erased element locations and/or
		unused space at the back of the hive and only allocating new element blocks as-necessary.
The latter approach is also why the order of elements post-shrink_to_fit is not guaranteed to be stable.
	 </p>
 </li>
	<li><code style="font-weight: bold;">void trim_capacity()<br>
		void trim_capacity(size_type n)</code><br>
		<i>(O(n) in the number of reserved blocks deallocated)</i>
		<p>The trim_capacity() function was introduced as a way to free reserved blocks which had been previously created via reserve() or
		transformed from active blocks to reserved blocks via erase(), without
		reallocating elements and invalidating iterators as shrink_to_fit() does.
		The second overload was introduced as a way of allowing the user to say "I
		want to retain at least n capacity while freeing reserved blocks, so that I
		have room for future insertions without having to allocate again". This
		means the user doesn't have to know how much unused capacity is in (a)
		unused element memory space in the back block, (b) unused
		element memory space from prior erasures, or (c) reserved blocks. They
		just say how much they want to retain, and the implementation will free as
		much of the remainder (<code>capacity() - n</code>) as possible if
		there are suitable reserved blocks available to deallocate.</p>
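		<p>The retention logic can be sketched in isolation. The helper below is hypothetical (not part of the proposed interface) and models reserved blocks as a list of their capacities: it greedily frees reserved blocks so long as total capacity stays at or above the retention target <code>n</code>:</p>

```cpp
#include <cstddef>
#include <vector>

// Hypothetical model of trim_capacity(n): free reserved blocks one at a
// time while the remaining total capacity stays >= n. Blocks too large
// to free without dropping below n are kept. Returns the new capacity.
std::size_t trim_reserved_blocks(std::size_t total_capacity,
                                 std::size_t n,
                                 std::vector<std::size_t>& reserved_block_capacities)
{
    auto it = reserved_block_capacities.begin();
    while (it != reserved_block_capacities.end())
    {
        if (total_capacity - *it >= n)  // freeing this block keeps capacity >= n
        {
            total_capacity -= *it;      // "deallocate" the reserved block
            it = reserved_block_capacities.erase(it);
        }
        else
        {
            ++it;                       // keep this block; freeing it would drop below n
        }
    }
    return total_capacity;
}
```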
 </li>
	<li><code style="font-weight: bold;">void sort()</code><br>
		<i>(O(n log n))</i>
		<p>Although the container has unordered insertion, there may be
		circumstances where sorting is desired. Because hive uses bidirectional
		iterators, using std::sort or other random access sort techniques is not
		possible. Therefore an internal sort routine is supplied, bringing it in
		line with std::list. An implementation of the sort routine used in the
		reference implementation of hive can be found in a non-container-specific
		form <a href="https://github.com/mattreecebentley/plf_indiesort">here</a> -
		see that page for the technique's advantages over the usual sort algorithms
		for non-random-access containers. An allowance is made for sort to allocate
		memory if necessary, so that algorithms such as this can be used. Erased-element memory space and reserved blocks may also be used as temporary sorting memory instead of, or in addition to, allocating.
		Since memory allocation is unspecified but permitted for std::list::sort and std::forward_list::sort, stating this explicitly is not strictly necessary, merely a courtesy.</p>
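		<p>A simplified sketch of the pointer-based approach used by such sort techniques, applied here to std::list as a stand-in (the linked indiesort avoids the final value copy by following permutation cycles to move elements back in place):</p>

```cpp
#include <algorithm>
#include <cstddef>
#include <list>
#include <vector>

// Pointer-based sort for a bidirectional container: gather pointers to
// the elements, sort the pointers by pointee with a random-access sort,
// then write the values back through the container in sorted order.
template <typename T>
void pointer_sort(std::list<T>& container)
{
    std::vector<T*> pointers;
    pointers.reserve(container.size());
    for (T& element : container)
        pointers.push_back(&element);

    std::sort(pointers.begin(), pointers.end(),
              [](const T* a, const T* b) { return *a < *b; });

    std::vector<T> sorted_values;
    sorted_values.reserve(pointers.size());
    for (const T* p : pointers)
        sorted_values.push_back(*p);    // capture values in sorted order

    std::size_t i = 0;
    for (T& element : container)
        element = sorted_values[i++];   // write back in iteration order
}
```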
 </li>
	<li><code style="font-weight: bold;">void unique();<br>
		template &lt;class BinaryPredicate&gt;<br>
		size_type unique(BinaryPredicate binary_pred); </code><br>
		<i>(between O(1) and O(n - 1))</i>
		<p>Likewise, if a container can be sorted, unique() may be of use
		post-sort. An optimal implementation of unique calls the
		range-erase function where possible, since range-erase is potentially
		constant-time depending on the state of the blocks involved, whereas
		calling single-element erase() repeatedly would be at worst O(n - 1).</p>
 </li>
	<li><code style="font-weight: bold;">void splice(hive &amp;x)<br>
		void splice(hive &amp;&amp;x) </code><br>
		<i>(at best O(1), at worst O(n) in the number of blocks in <code>*this</code>
		+ the number of blocks in <code>x</code>)</i>
		<p>Whether <code>x</code>'s active blocks are transferred to the beginning
		or end of <code>*this</code>'s sequence of active blocks, or interlaced in
		some way (for example to order blocks by their capacity from small to
		large) is implementation-defined. Better performance may be gained in some
		cases by allowing the source's active blocks to go to the front rather than
		the back, depending on how full the final active block in <code>x</code>
		is. This is because unused elements that are not at the back of hive's
		iterative sequence will need to be marked as skipped in some way, and
		skipping over large numbers of elements will incur a small performance
		disadvantage during iteration compared to skipping over a small number of
		elements, due to memory locality.</p>
		<p>This function may throw in three ways - the first is a
		length_error exception if any of the capacities of
		<code>x</code>'s active blocks are outside of <code>*this</code>'s block capacity
		limits. The second is an exception if the allocators of the
		two hives are different. Third is a potential bad_alloc in the case of a
		<a href="#vector_implementations_info">vector-of-pointers-to-blocks</a> style of implementation, where an
		allocation may be made if <code>*this</code>'s pointer vector isn't of
		sufficient capacity to accommodate the pointers to <code>x</code>'s active blocks.</p>
		<p>For that scenario the time complexity (to expand the vector and reallocate all pointers) is linear in the number of the element blocks in <code>*this</code> + the number of active blocks in <code>x</code>. But regardless of implementation, a check needs to be
		made as to whether the <code>x</code>'s active blocks are within <code>*this</code>'s current block limits. This check may be O(1) or O(n) in the number of <code>x</code>'s active blocks depending on the values of <code>*this</code>'s and <code>x</code>'s current limits (same logic as reshape() above).</p>
		<p>Final note: reserved blocks in <code>x</code> are not transferred into <code>*this</code>. LEWG decided this behaviour should not be left implementation-defined, in
		order to avoid unexpected behaviour when moving code from one implementation to another. An implementation may need to count the amount of capacity stored in the reserved blocks of the two hive instances in order to correct the total capacity values of both, which may involve a traversal of reserved blocks.<br>
	 </p>
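		<p>The O(1)-versus-O(n) distinction in the block-limits check can be sketched with stand-in types (a hypothetical helper, not part of the proposed interface): when the source hive's own limits already fall within the destination's, every source block is necessarily within bounds and no per-block traversal is needed:</p>

```cpp
#include <cstddef>
#include <vector>

struct limits { std::size_t min, max; }; // stand-in for hive_limits

// Splice precondition check: O(1) fast path when the source's limits are
// nested within the destination's, O(n) per-block inspection otherwise.
bool blocks_within_limits(const limits& destination, const limits& source,
                          const std::vector<std::size_t>& source_block_capacities)
{
    if (source.min >= destination.min && source.max <= destination.max)
        return true;                                     // O(1): all blocks must fit

    for (std::size_t capacity : source_block_capacities) // O(n) fallback
        if (capacity < destination.min || capacity > destination.max)
            return false;
    return true;
}
```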
 </li>
</ul>

<h2><a id="technical"></a>VI. Technical Specification</h2>

<p>Suggested location of hive in the standard is Sequence Containers.</p>


<h3>Library introduction [library]</h3>
<h4>Headers [headers]</h4>
<h5>C++ library headers [tab:headers.cpp]</h5>
<div class="row">
	<div class="column">
	<code>
 &lt;algorithm&gt;
&lt;any&gt;
&lt;array&gt;
&lt;atomic&gt;
&lt;barrier&gt;
&lt;bit&gt;
&lt;bitset&gt;
&lt;charconv&gt;
&lt;chrono&gt;
&lt;codecvt&gt;
&lt;compare&gt;
&lt;complex&gt;
&lt;concepts&gt;
&lt;condition_variable&gt;
&lt;coroutine&gt;
&lt;deque&gt;
&lt;exception&gt;
&lt;execution&gt;
&lt;expected&gt;
&lt;filesystem&gt;
&lt;flat_map&gt;
&lt;flat_set&gt;
	</code>
	</div>
	<div class="column">
	<code>
 &lt;format&gt;
&lt;forward_list&gt;
&lt;fstream&gt;
&lt;functional&gt;
&lt;future&gt;
&lt;generator&gt;
&lt;hazard_pointer&gt;
<ins>&lt;hive&gt;</ins>
&lt;initializer_list&gt;
&lt;inplace_vector&gt;
&lt;iomanip&gt;
&lt;ios&gt;
&lt;iosfwd&gt;
&lt;iostream&gt;
&lt;istream&gt;
&lt;iterator&gt;
&lt;latch&gt;
&lt;limits&gt;
&lt;list&gt;
&lt;locale&gt;
&lt;map&gt;
&lt;mdspan&gt;
&lt;memory&gt;
&lt;memory_resource&gt;

	</code>
	</div>
	<div class="column">
	<code>
&lt;mutex&gt;
&lt;new&gt;
&lt;numbers&gt;
&lt;numeric&gt;
&lt;optional&gt;
&lt;ostream&gt;
&lt;print&gt;
&lt;queue&gt;
&lt;random&gt;
&lt;ranges&gt;
&lt;ratio&gt;
&lt;regex&gt;
&lt;scoped_allocator&gt;
&lt;semaphore&gt;
&lt;set&gt;
&lt;shared_mutex&gt;
&lt;source_location&gt;
&lt;span&gt;
&lt;spanstream&gt;
&lt;sstream&gt;
&lt;stack&gt;
&lt;stacktrace&gt;
	</code>

	</div>
	<div class="column">
	<code>
&lt;stdexcept&gt;
&lt;stdfloat&gt;
&lt;stop_token&gt;
&lt;streambuf&gt;
&lt;string&gt;
&lt;string_view&gt;
&lt;strstream&gt;
&lt;syncstream&gt;
&lt;system_error&gt;
&lt;thread&gt;
&lt;tuple&gt;
&lt;type_traits&gt;
&lt;typeindex&gt;
&lt;typeinfo&gt;
&lt;unordered_map&gt;
&lt;unordered_set&gt;
&lt;utility&gt;
&lt;valarray&gt;
&lt;variant&gt;
&lt;vector&gt;
&lt;version&gt;
	</code>

	</div>
</div>


<h3>Header &lt;version&gt; synopsis [version.syn]</h3>

<p><code><ins>#define __cpp_lib_hive ?????? // also in
&lt;hive&gt;</ins></code></p>


<h3>General [containers.general]</h3>

<h4>Containers library summary [tab:containers.summary]</h4>

<table>
	<tbody>
		<tr>
			<td>Subclause</td>
			<td>Header</td>
		</tr>
		<tr>
			<td>Requirements</td>
			<td><br>
			</td>
		</tr>
		<tr>
			<td>Sequence containers</td>
			<td>&lt;array&gt;, &lt;deque&gt;, &lt;forward_list&gt;, <ins>&lt;hive&gt;</ins>, &lt;inplace_vector&gt;, &lt;list&gt;,
				&lt;vector&gt;</td>
		</tr>
		<tr>
			<td>Associative containers</td>
			<td>&lt;map&gt;, &lt;set&gt;</td>
		</tr>
		<tr>
			<td>Unordered associative containers</td>
			<td>&lt;unordered_map&gt;, &lt;unordered_set&gt;</td>
		</tr>
		<tr>
			<td>Container adaptors</td>
			<td>&lt;flat_map&gt;, &lt;flat_set&gt;, &lt;queue&gt;, &lt;stack&gt;</td>
		</tr>
		<tr>
			<td>Views</td>
			<td>&lt;span&gt;, &lt;mdspan&gt;</td>
		</tr>
	</tbody>
</table>

<h3>Sequence containers [sequence.reqmts]</h3>
<ol>
	 <li>A sequence container organizes a finite set of objects, all of the same type, into a strictly linear arrangement.
The library provides the following basic kinds of sequence containers: vector, inplace_vector, forward_list, list, and deque.
In addition, array <del>is provided as a sequence container which provides limited sequence operations because it has a fixed number of elements</del> <ins>and hive are provided as sequence containers which provide limited sequence operations, in array's case because it has a fixed number of elements, and in hive's case because insertion order is unspecified</ins>.
The library also provides container adaptors that make it easy to construct abstract data types, such as stacks, queues, flat_maps, flat_multimaps, flat_sets, or flat_multisets, out of the basic sequence container kinds (or out of other program-defined sequence containers).</li>
</ol>

<h3>Header <code>&lt;hive&gt;</code> synopsis [hive.syn]</h3>

 <div style="background: #ffffff; overflow:auto; width:auto; border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;">
	<pre style="margin: 0; line-height: 125%">
// [hive] class template hive

#include &lt;initializer_list&gt; // see [initializer.list.syn]
#include &lt;compare&gt; // see [compare.syn]

namespace std {

  struct hive_limits
  {
    size_t min;
    size_t max;
    constexpr hive_limits(size_t minimum, size_t maximum) noexcept : min(minimum), max(maximum) {}
  };


  // class template hive

  template &lt;class T, class Allocator = allocator&lt;T&gt;&gt; class hive;

  template&lt;class T, class Allocator&gt;
    void swap(hive&lt;T, Allocator&gt;&amp; x, hive&lt;T, Allocator&gt;&amp; y)
      noexcept(noexcept(x.swap(y)));

  template&lt;class T, class Allocator, class U = T&gt;
    typename hive&lt;T, Allocator&gt;::size_type
      erase(hive&lt;T, Allocator&gt;&amp; c, const U&amp; value);

  template&lt;class T, class Allocator, class Predicate&gt;
    typename hive&lt;T, Allocator&gt;::size_type
      erase_if(hive&lt;T, Allocator&gt;&amp; c, Predicate pred);

  namespace pmr {
    template &lt;class T&gt;
      using hive = std::hive&lt;T, polymorphic_allocator&lt;T&gt;&gt;;
  }
}</pre>
</div>

<h3>Class template <code>hive</code> [hive]</h3>

<h4>Overview [hive.overview]</h4>
<ol>
	 <li>A hive is a type of sequence container that provides constant-time insertion and erasure operations. Storage is automatically managed in multiple memory blocks, referred to as <i>element blocks</i>. Insertion position is determined by the container, and may re-use the memory locations of erased elements.</li>
	<li>Element blocks which contain elements are referred to as <i>active blocks</i>, those which do not are referred
		to as <i>reserved blocks</i>. Active blocks which become empty of elements
		are either deallocated or become reserved blocks. Reserved blocks become
		active blocks when they are used to store elements. A user can create additional reserved blocks by calling <code>reserve</code>.</li>
	<li>Erasures use unspecified techniques of constant time complexity to identify the memory locations of erased elements, which are subsequently skipped
during iteration, as opposed to relocating subsequent elements during erasure.</li>
	<li>Active block capacities have an implementation-defined growth factor
		(which need not be integral), for example a new active block's capacity
		could be equal to the summed capacities of the pre-existing active
	blocks.</li>
	<li>Limits can be placed on both the minimum and maximum element capacities
		of element blocks, both by users and implementations.
		<p style="text-indent: -25pt;">(5.1) &mdash; The minimum limit shall be no
		larger than the maximum limit.</p>
		<p style="text-indent: -25pt;">(5.2) &mdash; When limits are not specified
		by a user during construction, the implementation's default limits are
		used.</p>
		<p style="text-indent: -25pt;">(5.3) &mdash; The default limits of an
		implementation are not guaranteed to be the same as the minimum and maximum
		possible capacities for an implementation's element blocks [Note 1: To
		allow latitude for both implementation-specific and user-directed
		optimization. - end note]. The latter are defined as <i>hard limits</i>. The maximum hard limit shall be no
		larger than <code>std::allocator_traits&lt;Allocator&gt;::max_size()</code>.</p>
		<p style="text-indent: -25pt;">(5.4) &mdash; If user-specified limits are
		not within hard limits, or if the specified minimum limit is greater than
		the specified maximum limit, behavior is undefined.</p>
		<p style="text-indent: -25pt;">(5.5) &mdash; An element block is said to be
		<i>within the bounds of</i> a pair of minimum/maximum limits when its
		capacity is greater than or equal to the minimum limit and less than or equal to the maximum limit.</p>
 </li>
	<li>A hive conforms to the requirements for Containers ([container.reqmts]),
		with the exception of operators <code>== and !=</code>. A hive also meets
		the requirements of a reversible container ([container.rev.reqmts]), of an
		allocator-aware container ([container.alloc.reqmts]), and some of the
		requirements of a sequence container, including several of the optional
		sequence container requirements ([sequence.reqmts]). Descriptions are
		provided here only for operations on hive that are not described in that
		table or for operations where there is additional semantic information.</li>
	<li>Hive iterators meet the <i>Cpp17BidirectionalIterator</i> requirements but also
		model <code>three_way_comparable&lt;strong_ordering&gt;</code>.</li>
</ol>

<div style="background: #ffffff; overflow:auto; width:auto; border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;">
<pre style="margin: 0; line-height: 125%">namespace std {

template&lt;class T, class Allocator = allocator&lt;T&gt;&gt;
class hive {
public:

  // types
  using value_type = T;
  using allocator_type = Allocator;
  using pointer = typename allocator_traits&lt;Allocator&gt;::pointer;
  using const_pointer = typename allocator_traits&lt;Allocator&gt;::const_pointer;
  using reference = value_type&amp;;
  using const_reference = const value_type&amp;;
  using size_type = implementation-defined; // see [container.requirements]
  using difference_type = implementation-defined; // see [container.requirements]
  using iterator = implementation-defined; // see [container.requirements]
  using const_iterator = implementation-defined; // see [container.requirements]
  using reverse_iterator = std::reverse_iterator&lt;iterator&gt;; // see [container.requirements]
  using const_reverse_iterator = std::reverse_iterator&lt;const_iterator&gt;; // see [container.requirements]


  // [hive.cons] construct/copy/destroy
  constexpr hive() noexcept(noexcept(Allocator())) : hive(Allocator()) { }
  constexpr explicit hive(const Allocator&amp;) noexcept;
  constexpr explicit hive(hive_limits block_limits) : hive(block_limits, Allocator()) { }
  constexpr hive(hive_limits block_limits, const Allocator&amp;);
  explicit hive(size_type n, const Allocator&amp; = Allocator());
  hive(size_type n, hive_limits block_limits, const Allocator&amp; = Allocator());
  hive(size_type n, const T&amp; value, const Allocator&amp; = Allocator());
  hive(size_type n, const T&amp; value, hive_limits block_limits, const Allocator&amp; = Allocator());
  template&lt;class InputIterator&gt;
    hive(InputIterator first, InputIterator last, const Allocator&amp; = Allocator());
  template&lt;class InputIterator&gt;
    hive(InputIterator first, InputIterator last, hive_limits block_limits, const Allocator&amp; = Allocator());
  template&lt;<i>container-compatible-range</i>&lt;T&gt; R&gt;
    hive(from_range_t, R&amp;&amp; rg, const Allocator&amp; = Allocator());
  template&lt;<i>container-compatible-range</i>&lt;T&gt; R&gt;
    hive(from_range_t, R&amp;&amp; rg, hive_limits block_limits, const Allocator&amp; = Allocator());
  hive(const hive&amp; x);
  hive(hive&amp;&amp;) noexcept;
  hive(const hive&amp; x, const type_identity_t&lt;Allocator&gt;&amp; alloc);
  hive(hive&amp;&amp;, const type_identity_t&lt;Allocator&gt;&amp; alloc);
  hive(initializer_list&lt;T&gt; il, const Allocator&amp; = Allocator());
  hive(initializer_list&lt;T&gt; il, hive_limits block_limits, const Allocator&amp; = Allocator());

  ~hive();
  hive&amp; operator=(const hive&amp; x);
  hive&amp; operator=(hive&amp;&amp; x) noexcept(allocator_traits&lt;Allocator&gt;::propagate_on_container_move_assignment::value || allocator_traits&lt;Allocator&gt;::is_always_equal::value);
  hive&amp; operator=(initializer_list&lt;T&gt;);
  template&lt;class InputIterator&gt;
    void assign(InputIterator first, InputIterator last);
  template&lt;<i>container-compatible-range</i> &lt;T&gt; R&gt;
    void assign_range(R&amp;&amp; rg);
  void assign(size_type n, const T&amp; t);
  void assign(initializer_list&lt;T&gt;);
  allocator_type get_allocator() const noexcept;



  // iterators
  iterator                begin() noexcept;
  const_iterator          begin() const noexcept;
  iterator                end() noexcept;
  const_iterator          end() const noexcept;
  reverse_iterator        rbegin() noexcept;
  const_reverse_iterator  rbegin() const noexcept;
  reverse_iterator        rend() noexcept;
  const_reverse_iterator  rend() const noexcept;

  const_iterator          cbegin() const noexcept;
  const_iterator          cend() const noexcept;
  const_reverse_iterator  crbegin() const noexcept;
  const_reverse_iterator  crend() const noexcept;


  // [hive.capacity] capacity
  bool empty() const noexcept;
  size_type size() const noexcept;
  size_type max_size() const noexcept;
  size_type capacity() const noexcept;
  void reserve(size_type n);
  void shrink_to_fit();
  void trim_capacity() noexcept;
  void trim_capacity(size_type n) noexcept;
  constexpr hive_limits block_capacity_limits() const noexcept;
  static constexpr hive_limits block_capacity_default_limits() noexcept;
  static constexpr hive_limits block_capacity_hard_limits() noexcept;
  void reshape(hive_limits block_limits);


  // [hive.modifiers] modifiers
  template&lt;class... Args&gt; iterator emplace(Args&amp;&amp;... args);
  template&lt;class... Args&gt; iterator emplace_hint(const_iterator hint, Args&amp;&amp;... args);
  iterator insert(const T&amp; x);
  iterator insert(T&amp;&amp; x);
  iterator insert(const_iterator hint, const T&amp; x);
  iterator insert(const_iterator hint, T&amp;&amp; x);
  void insert(initializer_list&lt;T&gt; il);
  template&lt;<i>container-compatible-range</i> &lt;T&gt; R&gt;
    void insert_range(R&amp;&amp; rg);
  template&lt;class InputIterator&gt;
    void insert(InputIterator first, InputIterator last);
  void insert(size_type n, const T&amp; x);

  iterator erase(const_iterator position);
  iterator erase(const_iterator first, const_iterator last);
  void swap(hive&amp;) noexcept(allocator_traits&lt;Allocator&gt;::propagate_on_container_swap::value || allocator_traits&lt;Allocator&gt;::is_always_equal::value);
  void clear() noexcept;


  // [hive.operations] hive operations
  void splice(hive&amp; x);
  void splice(hive&amp;&amp; x);
  template&lt;class BinaryPredicate = equal_to&lt;T&gt;&gt;
    size_type unique(BinaryPredicate binary_pred = BinaryPredicate());

  template&lt;class Compare = less&lt;T&gt;&gt;
    void sort(Compare comp = Compare());

  iterator get_iterator(const_pointer p) noexcept;
  const_iterator get_iterator(const_pointer p) const noexcept;

private:
  hive_limits <i>current-limits</i> = implementation-defined; // <i>exposition only</i>

};


template&lt;class InputIterator, class Allocator = allocator&lt;<i>iter-value-type</i>&lt;InputIterator&gt;&gt;&gt;
  hive(InputIterator, InputIterator, Allocator = Allocator())
    -&gt; hive&lt;<i>iter-value-type</i>&lt;InputIterator&gt;, Allocator&gt;;

template&lt;class InputIterator, class Allocator = allocator&lt;<i>iter-value-type</i>&lt;InputIterator&gt;&gt;&gt;
  hive(InputIterator, InputIterator, hive_limits block_limits, Allocator = Allocator())
    -&gt; hive&lt;<i>iter-value-type</i>&lt;InputIterator&gt;, Allocator&gt;;

template&lt;ranges::input_range R, class Allocator = allocator&lt;ranges::range_value_t&lt;R&gt;&gt;&gt;
  hive(from_range_t, R&amp;&amp;, Allocator = Allocator())
    -&gt; hive&lt;ranges::range_value_t&lt;R&gt;, Allocator&gt;;

template&lt;ranges::input_range R, class Allocator = allocator&lt;ranges::range_value_t&lt;R&gt;&gt;&gt;
  hive(from_range_t, R&amp;&amp;, hive_limits block_limits, Allocator = Allocator())
    -&gt; hive&lt;ranges::range_value_t&lt;R&gt;, Allocator&gt;;
}

</pre>
</div>

<h4>Constructors, copy, and assignment [hive.cons]</h4>

<code style="font-weight:bold">constexpr explicit hive(const Allocator&amp;) noexcept;</code>
<ol>
	<li><i>Effects:</i> Constructs an empty <code>hive</code>, using the specified allocator.</li>
	<li><i>Complexity:</i> Constant.</li>
</ol>


<code style="font-weight:bold">
	constexpr hive(hive_limits block_limits, const Allocator&amp;);
</code>
<ol start="3">
	<li><i>Effects:</i> Constructs an empty <code>hive</code>, using the specified allocator. Initializes <code><i>current-limits</i></code> with <code>block_limits</code>.</li>
	<li><i>Complexity:</i> Constant.</li>
</ol>


<code style="font-weight:bold">
explicit hive(size_type n, const Allocator&amp; = Allocator());<br>
hive(size_type n, hive_limits block_limits, const Allocator&amp; = Allocator());</code>
<ol start="5">
	<li><i>Preconditions:</i> <code>T</code> is <i>Cpp17DefaultInsertable</i> into
		<code>hive</code>.</li>
	<li><i>Effects:</i> Constructs a <code>hive</code> with <code>n</code> default-inserted elements, using
		the specified allocator. If the second overload is called, also initializes <code><i>current-limits</i></code> with <code>block_limits</code>.</li>
	<li><i>Complexity:</i> Linear in <code>n</code>.</li>
</ol>


<code style="font-weight:bold">
hive(size_type n, const T&amp; value, const Allocator&amp; = Allocator());<br>
hive(size_type n, const T&amp; value, hive_limits block_limits, const Allocator&amp; = Allocator());</code>
<ol start="8">
	<li><i>Preconditions:</i> <code>T</code> is <i>Cpp17CopyInsertable</i> into
		<code>hive</code>.</li>
	<li><i>Effects:</i> Constructs a <code>hive</code> with <code>n</code> copies of <code>value</code>, using
		the specified allocator. If the second overload is called, also initializes <code><i>current-limits</i></code> with <code>block_limits</code>.</li>
	<li><i>Complexity:</i> Linear in <code>n</code>.</li>
</ol>


<pre><code style="font-weight:bold">
template&lt;class InputIterator&gt;
  hive(InputIterator first, InputIterator last, const Allocator&amp; = Allocator());
template&lt;class InputIterator&gt;
  hive(InputIterator first, InputIterator last, hive_limits block_limits, const Allocator&amp; = Allocator());</code></pre>
<ol start="11">
	<li><i>Effects:</i> Constructs a <code>hive</code> equal to the range [<code>first, last</code>), using the specified allocator.
 If the second overload is called, also initializes <code><i>current-limits</i></code> with <code>block_limits</code>.</li>
	<li><i>Complexity:</i> Linear in <code>distance(first, last)</code>.</li>
</ol>


<pre><code style="font-weight:bold">
template&lt;<i>container-compatible-range</i>&lt;T&gt; R&gt;
  hive(from_range_t, R&amp;&amp; rg, const Allocator&amp; = Allocator());
template&lt;<i>container-compatible-range</i>&lt;T&gt; R&gt;
  hive(from_range_t, R&amp;&amp; rg, hive_limits block_limits, const Allocator&amp; = Allocator());</code></pre>
<ol start="13">
	<li><i>Effects:</i> Constructs a <code>hive</code> object with the elements of the range <code>rg</code>, using the specified allocator. If the second overload is called, also initializes <code><i>current-limits</i></code> with <code>block_limits</code>.</li>
	<li><i>Complexity:</i> Linear in <code>ranges::distance(rg)</code>.</li>
</ol>


<code style="font-weight:bold">
  hive(const hive&amp; x);<br>
  hive(const hive&amp; x, const type_identity_t&lt;Allocator&gt;&amp; alloc);</code>
<ol start="15">
	<li><i>Preconditions:</i> <code>T</code> is <i>Cpp17CopyInsertable</i> into <code>hive</code>.</li>
	<li><i>Effects:</i> Constructs a <code>hive</code> object with the elements of <code>x</code>. If the second overload is called, uses <code>alloc</code>. Initializes <code><i>current-limits</i></code> with <code>x.<i>current-limits</i></code>.</li>
	<li><i>Complexity:</i> Linear in <code>x.size()</code>.</li>
</ol>


<code style="font-weight:bold">
  hive(hive&amp;&amp; x);<br>
  hive(hive&amp;&amp; x, const type_identity_t&lt;Allocator&gt;&amp; alloc);</code>
<ol start="18">
	<li><i>Preconditions:</i> For the second overload, when <code>allocator_traits&lt;Allocator&gt;::is_always_equal::value</code> is <code>false</code>, <code>T</code> meets the <i>Cpp17MoveInsertable</i> requirements.</li>
	<li><i>Effects:</i> When the first overload is called, or the second overload is called and <code>alloc == x.get_allocator()</code> is <code>true</code>, <code><i>current-limits</i></code> is set to <code>x.<i>current-limits</i></code> and each element block is moved from <code>x</code> into <code>*this</code>. Pointers and references to the elements of <code>x</code> now refer to those same elements but as members of <code>*this</code>. Iterators referring to the elements of <code>x</code> will continue to refer to their elements, but they now behave as iterators into <code>*this</code>.<br>
	If the second overload is called and <code>alloc == x.get_allocator()</code> is <code>false</code>, each element in <code>x</code> is moved into <code>*this</code>. References, pointers and iterators referring to the elements of <code>x</code>, as well as the past-the-end iterator of <code>x</code>, are invalidated.</li>
	<li><i>Postconditions:</i> <code>x.empty()</code> is <code>true</code>.</li>
	<li><i>Complexity:</i> If the second overload is called and <code>alloc == x.get_allocator()</code> is <code>false</code>, linear in <code>x.size()</code>. Otherwise constant.</li>
</ol>


<code style="font-weight:bold">
	hive(initializer_list&lt;T&gt; il, const Allocator&amp; = Allocator());<br>
	hive(initializer_list&lt;T&gt; il, hive_limits block_limits, const Allocator&amp; = Allocator());</code>
<ol start="21">
	<li><i>Preconditions:</i> <code>T</code> is <i>Cpp17CopyInsertable</i> into hive.</li>
	<li><i>Effects:</i> Constructs a <code>hive</code> object with the elements of <code>il</code>, using the specified allocator. If the second overload is called, also initializes <code><i>current-limits</i></code> with <code>block_limits</code>.</li>
	<li><i>Complexity:</i> Linear in <code>il.size()</code>.</li>
</ol>


<code style="font-weight:bold">
  hive&amp; operator=(const hive&amp; x);</code>
<ol start="24">
  <li><i>Preconditions:</i> <code>T</code> is <i>Cpp17CopyInsertable</i> into <code>hive</code> and <i>Cpp17CopyAssignable</i>.</li>
  <li><i>Effects:</i> All elements in <code>*this</code> are either copy-assigned to, or destroyed. All elements in <code>x</code> are copied into <code>*this</code>.<br>
  [Note 1: <code><i>current-limits</i></code> is unchanged. - end note]</li>
  <li><i>Complexity:</i> Linear in <code>size() + x.size()</code>.</li>
</ol>



<code style="font-weight:bold">
	hive&amp; operator=(hive&amp;&amp; x) noexcept(allocator_traits&lt;Allocator&gt;::propagate_on_container_move_assignment::value || allocator_traits&lt;Allocator&gt;::is_always_equal::value);</code>
<ol start="27">
	<li><i>Preconditions:</i> When <code>(allocator_traits&lt;Allocator&gt;::propagate_on_container_move_assignment::value || allocator_traits&lt;Allocator&gt;::is_always_equal::value)</code> is <code>false</code>, <code>T</code> is <i>Cpp17MoveInsertable</i> into <code>hive</code> and <i>Cpp17MoveAssignable</i>.</li>
	<li><i>Effects:</i> Each element in <code>*this</code> is either move-assigned to, or destroyed.<br>
	When <code>(allocator_traits&lt;Allocator&gt;::propagate_on_container_move_assignment::value || get_allocator() == x.get_allocator())</code> is <code>true</code>, <code><i>current-limits</i></code> is set to <code>x.<i>current-limits</i></code> and each element block is moved from <code>x</code> into <code>*this</code>. Pointers and references to the elements of <code>x</code> now refer to those same elements but as members of <code>*this</code>. Iterators referring to the elements of <code>x</code> will continue to refer to their elements, but they now behave as iterators into <code>*this</code>, not into <code>x</code>.<br>
	When <code>(allocator_traits&lt;Allocator&gt;::propagate_on_container_move_assignment::value || get_allocator() == x.get_allocator())</code> is <code>false</code>, each element in <code>x</code> is moved into <code>*this</code>. References, pointers and iterators referring to the elements of <code>x</code>, as well as the past-the-end iterator of <code>x</code>, are invalidated.</li>
	<li><i>Postconditions:</i> <code>x.empty()</code> is <code>true</code>.</li>
	<li><i>Complexity:</i> Linear in <code>size()</code>. If <code>(allocator_traits&lt;Allocator&gt;::propagate_on_container_move_assignment::value || get_allocator() == x.get_allocator())</code> is <code>true</code>, also linear in <code>x.size()</code>.</li>
</ol>



<h4>Capacity [hive.capacity]</h4>

<code style="font-weight: bold;">size_type capacity() const noexcept;</code>
<ol>
	<li><i>Returns:</i> The total number of elements that <code>*this</code> can hold without requiring allocation of more element blocks.</li>
	<li><i>Complexity:</i> Constant.</li>
</ol>


<code style="font-weight: bold;">void reserve(size_type n);</code>
<ol start="3">
	<li><i>Effects:</i> If <code>n &lt;= capacity()</code> is <code>true</code> there are no effects. Otherwise increases <code>capacity()</code> by allocating reserved blocks.</li>
	<li><i>Postconditions:</i> <code>capacity() &gt;= n</code> is <code>true</code>.</li>
	<li><i>Complexity:</i> It does not change the size of the sequence and takes at most
		linear time in the number of reserved blocks allocated.</li>
	<li><i>Throws:</i> <code>length_error</code> if <code>n &gt; max_size()</code>, as well as any exceptions thrown by the allocator.</li>
	<li><i>Remarks:</i> All references, pointers, and iterators referring to elements in <code>*this</code>, as well as the past-the-end iterator, remain valid.</li>
</ol>


<code style="font-weight: bold;">void shrink_to_fit();</code>
<ol start="8">
	<li><i>Preconditions:</i> <code>T</code> is <i>Cpp17MoveInsertable</i> into
		<code>hive</code>.</li>
	<li><i>Effects:</i> <code>shrink_to_fit</code> is a non-binding request to reduce
		<code>capacity()</code> to be closer to <code>size()</code>.<br>
		[ Note: The request is non-binding to allow latitude for
		implementation-specific optimizations. - end note ]<br>
		It does not increase <code>capacity()</code>, but may reduce
		<code>capacity()</code>. It may reallocate elements. If
		<code>capacity()</code> is already equal to <code>size()</code> there are
		no effects. If an exception is thrown during allocation of a new element block,
		<code>capacity()</code> may be reduced and reallocation may occur. Otherwise if an exception is thrown the effects are unspecified.</li>
	<li><i>Complexity:</i> If reallocation happens, linear in the size of the
	sequence.</li>
	<li><i>Remarks:</i> If reallocation happens, the order of the elements in <code>*this</code> may change and all references, pointers, and iterators referring to the elements in <code>*this</code>, as well as the past-the-end iterator, are invalidated.</li>
</ol>


<code style="font-weight: bold;">void trim_capacity() noexcept;<br>
void trim_capacity(size_type n) noexcept; </code>
<ol start="12">
	<li><i>Effects:</i> For the first overload, all reserved blocks are deallocated, and
		<code>capacity()</code> is reduced accordingly. For the second overload,
		<code>capacity()</code> is reduced to no less than <code>n</code>.
 </li>
	<li><i>Complexity:</i> Linear in the number of reserved blocks deallocated.</li>
	<li><i>Remarks:</i> All references, pointers, and iterators referring to elements in <code>*this</code>, as well as the past-the-end iterator,
		remain valid.</li>
</ol>


<code style="font-weight: bold;">constexpr hive_limits block_capacity_limits() const noexcept;</code>
<ol start="15">
	<li><i>Returns:</i> <code><i>current-limits</i></code>.</li>
	<li><i>Complexity:</i> Constant.</li>
</ol>


<code style="font-weight: bold;">static constexpr hive_limits block_capacity_default_limits() noexcept;</code>
<ol start="17">
	<li><i>Returns:</i> A <code>hive_limits</code> struct with the <code>min</code> and
		<code>max</code> members set to the implementation's default limits.</li>
	<li><i>Complexity:</i> Constant.</li>
</ol>


<code style="font-weight: bold;">static constexpr hive_limits block_capacity_hard_limits() noexcept;</code>
<ol start="19">
	<li><i>Returns:</i> A <code>hive_limits</code> struct with the <code>min</code> and
		<code>max</code> members set to the implementation's hard limits.</li>
	<li><i>Complexity:</i> Constant.</li>
</ol>


<code style="font-weight: bold;">void reshape(hive_limits block_limits);</code>
<ol start="21">
	<li><i>Preconditions:</i> <code>T</code> is <i>Cpp17MoveInsertable</i> into <code>hive</code>.
 </li>
	<li><i>Effects:</i> For any active blocks not within the bounds of <code>block_limits</code>, the elements within those active blocks are reallocated to new or existing element blocks which are within the bounds. Any element blocks not within the bounds of <code>block_limits</code> are deallocated.
		If an exception is thrown during allocation of a new element block,
		<code>capacity()</code> may be reduced, reallocation may occur and <code><i>current-limits</i></code> may be assigned a value other than
		<code>block_limits</code>. Otherwise, <code>block_limits</code> is assigned to <code><i>current-limits</i></code>. If any other exception is thrown, the effects are unspecified.</li>
<li><i>Postconditions:</i> <code>size()</code> is unchanged.</li>
	<li><i>Complexity:</i> Linear in the number of element blocks in <code>*this</code>. If reallocation happens, also linear in the number of elements reallocated.</li>
	<li><i>Remarks:</i> This operation may change <code>capacity()</code>. If
		reallocation happens, the order of the elements in <code>*this</code> may
		change. Reallocation invalidates all references, pointers, and
		iterators referring to the elements in <code>*this</code>, as well as the
		past-the-end iterator.<br>
		[Note 1: If no reallocation happens, they remain valid. - end note]</li>
</ol>




<h4>Modifiers [hive.modifiers]</h4>

<pre><code style="font-weight:bold">
template&lt;class... Args&gt; iterator emplace(Args&amp;&amp;... args);
template&lt;class... Args&gt; iterator emplace_hint(const_iterator hint, Args&amp;&amp;... args);</code></pre>
<ol>
<li><i>Preconditions:</i> <code>T</code> is <i>Cpp17EmplaceConstructible</i> into <code>hive</code> from <code>args</code>.</li>
<li><i>Effects:</i> Inserts an object of type <code>T</code> constructed with <code>std::forward&lt;Args&gt;(args)...</code>.<br>
The <code>hint</code> parameter is ignored. If an exception is thrown, there are no effects.<br>
[Note 1: <code>args</code> can directly or indirectly refer to a value in <code>*this</code>. - end note]</li>
<li><i>Returns:</i> An iterator that points to the new element.</li>
<li><i>Complexity:</i> Constant. Exactly one object of type <code>T</code> is constructed.</li>
<li><i>Remarks:</i> Invalidates the past-the-end iterator.</li>
</ol>


<code style="font-weight:bold">
iterator insert(const T&amp; x);<br>
iterator insert(const_iterator hint, const T&amp; x);<br>
iterator insert(T&amp;&amp; x);<br>
iterator insert(const_iterator hint, T&amp;&amp; x);</code>
<ol start="6">
<li><i>Effects:</i> Equivalent to: <code>return emplace(std::forward&lt;decltype(x)&gt;(x));</code><br>
[Note 2: The <code>hint</code> parameter is ignored. - end note]</li>
</ol>


<pre><code style="font-weight:bold">
void insert(initializer_list&lt;T&gt; rg);
template&lt;<i>container-compatible-range</i>&lt;T&gt; R&gt;
  void insert_range(R&amp;&amp; rg);</code></pre>
<ol start="7">
<li><i>Preconditions:</i> <code>T</code> is <i>Cpp17EmplaceConstructible</i> into <code>hive</code> from <code>*ranges::begin(rg)</code>. <code>rg</code> and <code>*this</code> do not overlap.</li>
<li><i>Effects:</i> Inserts copies of elements in <code>rg</code>. Each iterator in the range <code>rg</code> is dereferenced exactly once.</li>
<li><i>Complexity:</i> Linear in the number of elements inserted. Exactly one object of type <code>T</code> is constructed for each element inserted.</li>
<li><i>Remarks:</i> If an element is inserted, invalidates the past-the-end iterator.</li>
</ol>


<code style="font-weight:bold">
void insert(size_type n, const T&amp; x);</code>
<ol start="11">
<li><i>Preconditions:</i> <code>T</code> is <i>Cpp17CopyInsertable</i> into <code>hive</code>.</li>
<li><i>Effects:</i> Inserts <code>n</code> copies of <code>x</code>.</li>
<li><i>Complexity:</i> Linear in <code>n</code>. Exactly one object of type <code>T</code> is constructed for each element inserted.</li>
<li><i>Remarks:</i> If an element is inserted, invalidates the past-the-end iterator.</li>
</ol>


<pre><code style="font-weight:bold">
template&lt;class InputIterator&gt;
  void insert(InputIterator first, InputIterator last);</code></pre>
<ol start="15">
	<li><i>Effects:</i> Equivalent to <code>insert_range(ranges::subrange(first, last))</code>.</li>
</ol>


<code style="font-weight: bold;">iterator erase(const_iterator
position);<br>
iterator erase(const_iterator first, const_iterator last);</code>
<ol start="16">
	<li><i>Complexity:</i> Linear in the number of elements erased. Additionally, if any active blocks become empty of elements as a result of the function call, at worst
		linear in the number of element blocks.</li>
	<li><i>Remarks:</i> Invalidates references, pointers and iterators referring to the erased elements. An
		erase operation that erases the last element in <code>*this</code> also
		invalidates the past-the-end iterator.</li>
</ol>


<code style="font-weight: bold;">void swap(hive&amp; x)
noexcept(allocator_traits&lt;Allocator&gt;::propagate_on_container_swap::value
|| allocator_traits&lt;Allocator&gt;::is_always_equal::value);<br>
</code>
<ol start="18">
	<li><i>Effects:</i> Exchanges the contents, <code>capacity()</code>, and <code><i>current-limits</i></code> of
		<code>*this</code> with those of <code>x</code>.</li>
	<li><i>Complexity:</i> Constant.</li>
</ol>
<br>


<h4>Operations [hive.operations]</h4>
<ol>
	<li>In this subclause, arguments for a template parameter named <code>Predicate</code> or
		<code>BinaryPredicate</code> shall meet the corresponding requirements in
		[algorithms.requirements]. The semantics of <code>i + n</code> and <code>i
		- n</code>, where <code>i</code> is an iterator into the <code>hive</code>
		and <code>n</code> is an integer, are the same as those of <code>next(i,
		n)</code> and <code>prev(i, n)</code>, respectively. For <code>sort</code>,
		the definitions and requirements in [alg.sorting] apply.</li>
</ol>


<code style="font-weight: bold;">void splice(hive&amp; x);<br>
void splice(hive&amp;&amp; x);</code>
<ol start="2">
	<li><i>Preconditions: </i><code>get_allocator() == x.get_allocator()</code> is <code>true</code>.</li>
	<li><i>Effects:</i> If <code>addressof(x) == this</code> is <code>true</code>, the behavior is erroneous and there are no effects.
		Otherwise, inserts the contents of <code>x</code> into <code>*this</code>
		and <code>x</code> becomes empty. Pointers and references to the moved
		elements of <code>x</code> now refer to those same elements but as members
		of <code>*this</code>. Iterators referring to the moved elements
		continue to refer to their elements, but they now behave as iterators into
		<code>*this</code>, not into <code>x</code>.</li>
	<li><i>Complexity:</i> Linear in the number of element blocks in <code>x</code>
		plus the number of element blocks in <code>*this</code>.</li>
	<li><i>Throws:</i> <code>length_error</code> if any of <code>x</code>'s active
		blocks are not within the bounds of
		<code><i>current-limits</i></code>.</li>
	<li><i>Remarks:</i> Reserved blocks in <code>x</code> are not transferred into
		<code>*this</code>. If <code>addressof(x) == this</code> is <code>false</code>, invalidates the past-the-end iterator for both
		<code>x</code> and <code>*this</code>.</li>
</ol>


<pre><code style="font-weight:bold">
template&lt;class BinaryPredicate = equal_to&lt;T&gt;&gt;
  size_type unique(BinaryPredicate binary_pred = BinaryPredicate());</code></pre>
<ol start="7">
	<li><i>Preconditions:</i> <code>binary_pred</code> is an equivalence relation.</li>
	<li><i>Effects:</i> Erases all but the first element from every consecutive group of
		equivalent elements. That is, for a nonempty <code>hive</code>, erases all
		elements referred to by the iterator <code>i</code> in the range
		<code>[begin() + 1, end())</code> for which <code>binary_pred(*i, *(i - 1))</code> is <code>true</code>.</li>
	<li><i>Returns:</i> The number of elements erased.</li>
	<li><i>Throws:</i> Nothing unless an exception is thrown by the predicate.</li>
	<li><i>Complexity:</i> If <code>empty()</code> is <code>false</code>, exactly <code>size() -
		1</code> applications of the corresponding predicate, otherwise no
		applications of the predicate.</li>
	<li><i>Remarks:</i> Invalidates references, pointers and iterators referring to the erased elements. If the last element in <code>*this</code> is erased, also invalidates the past-the-end iterator.</li>
</ol>


<pre><code style="font-weight:bold">
template&lt;class Compare = less&lt;T&gt;&gt;
  void sort(Compare comp = Compare());</code></pre>
<ol start="13">
	<li><i>Preconditions:</i> <code>T</code> is <i>Cpp17MoveInsertable</i> into <code>hive</code>, <i>Cpp17MoveAssignable</i>, and <i>Cpp17Swappable</i>.</li>
	<li><i>Effects:</i> Sorts <code>*this</code> according to the <code>comp</code> function object. If an exception is
		thrown, the order of the elements in <code>*this</code> is unspecified.</li>
	<li><i>Complexity:</i> O(<i>N</i> log <i>N</i>) comparisons, where <i>N</i> is <code>size()</code>.</li>
	<li><i>Remarks:</i> May allocate. References, pointers and iterators referring to elements in <code>*this</code>, as well as the past-the-end iterator, may be invalidated.<br>
		[Note 1: Not required to be stable ([algorithm.stable]). - end note]</li>
</ol>


<code style="font-weight: bold;">iterator get_iterator(const_pointer p) noexcept;<br>
const_iterator get_iterator(const_pointer p) const noexcept;</code>
<ol start="17">
	<li><i>Preconditions:</i> <code>p</code> points to an element in
	<code>*this</code>.</li>
	<li><i>Returns:</i> An <code>iterator</code> or <code>const_iterator</code> pointing
		to the same element as <code>p</code>.</li>
	<li><i>Complexity:</i> Linear in the number of active blocks in
	<code>*this</code>.</li>
</ol>


<h4>Erasure [hive.erasure]</h4>
<pre><code style="font-weight:bold">
template&lt;class T, class Allocator, class U&gt;
  typename hive&lt;T, Allocator&gt;::size_type
    erase(hive&lt;T, Allocator&gt;&amp; c, const U&amp; value);</code></pre>
<ol>
<li><i>Effects:</i> Equivalent to:
<pre><code style="font-weight:bold">return erase_if(c, [&amp;](auto&amp; elem) { return elem == value; });</code></pre>
</li>
</ol>


<pre><code style="font-weight:bold">
template&lt;class T, class Allocator, class Predicate&gt;
  typename hive&lt;T, Allocator&gt;::size_type
    erase_if(hive&lt;T, Allocator&gt;&amp; c, Predicate pred);
</code></pre>
<ol start="2">
<li><i>Effects:</i> Equivalent to:
<pre><code style="font-weight:bold">auto original_size = c.size();
for (auto i = c.begin(), last = c.end(); i != last; ) {
  if (pred(*i)) {
    i = c.erase(i);
  } else {
    ++i;
  }
}
return original_size - c.size();
</code></pre>
</li>
</ol>




<h3>Annex C (informative) Compatibility [diff]</h3>
<h4>C.1.4 Clause 16: library introduction [diff.cpp23.library]</h4>
<p>Affected subclause: [headers]<br>
Change: New headers.<br>
Rationale: New functionality.<br>
Effect on original feature: The following C++ headers are new: &lt;debugging&gt;, &lt;hazard_pointer&gt;, <ins>&lt;hive&gt;, </ins>&lt;inplace_vector&gt;, &lt;linalg&gt;, &lt;rcu&gt;, and &lt;text_encoding&gt;.
Valid C++ 2023 code that #includes headers with these names may be invalid in this revision of C++.</p>



<h2><a id="Acknowledgments"></a>VII. Acknowledgments</h2>

<p>Matt would like to thank: Glen Fernandes and Ion Gaztanaga for restructuring
advice, Robert Ramey for documentation advice, various Boost and SG14/LEWG/LWG
members for support, critiques and corrections, Baptiste Wicht for teaching me
how to construct decent benchmarks, Jonathan Wakely, Sean Middleditch, Jens
Maurer (very nearly a co-author at this point really), Tim Song, Patrice Roy
and Guy Davidson for standards-compliance advice and critiques, support,
representation at meetings and bug reports, Henry Miller for getting me to
clarify why the free list approach to memory location reuse is the most
appropriate, Ville Voutilainen and Ga&scaron;per A&#x17e;man for help with the
colony/hive rename paper, Ben Craig for his critique of the tech spec, that
ex-Lionhead guy for annoying me enough to force me to implement the original
skipfield pattern, Jon Blow for some initial advice and Mike Acton for some
influence, the community at large for giving me feedback and bug reports on the
reference implementation.<br>
Also Nico Josuttis for doing such a great job in terms of explaining the
general format of the structure to the committee.<br>
Dedicated to Melodie.<br>
</p>

<h2>VIII. Appendices</h2>

<h3><a id="basicusage"></a>Appendix A - Basic usage examples</h3>

<p>Using <a href="https://github.com/mattreecebentley/plf_hive">plf::hive
reference implementation</a>.</p>

<div style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;">
<pre style="margin: 0; line-height: 125%"><code><span style="color: #557799">#include &lt;iostream&gt;</span>
<span style="color: #557799">#include &lt;numeric&gt;</span>
<span style="color: #557799">#include "plf_hive.h"</span>

<span style="color: #333399; font-weight: bold">int</span> <span style="color: #0066BB; font-weight: bold">main</span>(<span style="color: #333399; font-weight: bold">int</span> argc, <span style="color: #333399; font-weight: bold">char</span> <span style="color: #333333">**</span>argv)
{
  plf<span style="color: #333333">::</span>hive<span style="color: #333333">&lt;</span><span style="color: #333399; font-weight: bold">int</span><span style="color: #333333">&gt;</span> i_hive;

  <span style="color: #888888">// Insert 100 ints:</span>
  <span style="color: #008800; font-weight: bold">for</span> (<span style="color: #333399; font-weight: bold">int</span> i <span style="color: #333333">=</span> <span style="color: #0000DD; font-weight: bold">0</span>; i <span style="color: #333333">!=</span> <span style="color: #0000DD; font-weight: bold">100</span>; <span style="color: #333333">++</span>i)
  {
    i_hive.insert(i);
  }

  <span style="color: #888888">// Erase half of them:</span>
  <span style="color: #008800; font-weight: bold">for</span> (plf<span style="color: #333333">::</span>hive<span style="color: #333333">&lt;</span><span style="color: #333399; font-weight: bold">int</span><span style="color: #333333">&gt;::</span>iterator it <span style="color: #333333">=</span> i_hive.begin(); it <span style="color: #333333">!=</span> i_hive.end(); <span style="color: #333333">++</span>it)
  {
    it <span style="color: #333333">=</span> i_hive.erase(it);
  }

  std<span style="color: #333333">::</span>cout <span style="color: #333333">&lt;&lt;</span> <span style="background-color: #fff0f0">"Total: "</span> <span style="color: #333333">&lt;&lt;</span> std::accumulate(i_hive.begin(), i_hive.end(), 0) <span style="color: #333333">&lt;&lt;</span> std<span style="color: #333333">::</span>endl;
  std<span style="color: #333333">::</span>cin.get();
  <span style="color: #008800; font-weight: bold">return</span> <span style="color: #0000DD; font-weight: bold">0</span>;
} </code></pre>
</div>

<h4>Example demonstrating pointer stability</h4>

<div
style="background: #ffffff; overflow:auto;width:auto;border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;">
<pre style="margin: 0; line-height: 125%"><code><span style="color: #557799">#include &lt;iostream&gt;</span>
<span style="color: #557799">#include "plf_hive.h"</span>

<span style="color: #333399; font-weight: bold">int</span> <span style="color: #0066BB; font-weight: bold">main</span>(<span style="color: #333399; font-weight: bold">int</span> argc, <span style="color: #333399; font-weight: bold">char</span> <span style="color: #333333">**</span>argv)
{
  plf<span style="color: #333333">::</span>hive<span style="color: #333333">&lt;</span><span style="color: #333399; font-weight: bold">int</span><span style="color: #333333">&gt;</span> i_hive;
  plf<span style="color: #333333">::</span>hive<span style="color: #333333">&lt;</span><span style="color: #333399; font-weight: bold">int</span><span style="color: #333333">&gt;::</span>iterator it;
  plf<span style="color: #333333">::</span>hive<span style="color: #333333">&lt;</span><span style="color: #333399; font-weight: bold">int</span> <span style="color: #333333">*&gt;</span> p_hive;
  plf<span style="color: #333333">::</span>hive<span style="color: #333333">&lt;</span><span style="color: #333399; font-weight: bold">int</span> <span style="color: #333333">*&gt;::</span>iterator p_it;

  <span style="color: #888888">// Insert 100 ints to i_hive and pointers to those ints to p_hive:</span>
  <span style="color: #008800; font-weight: bold">for</span> (<span style="color: #333399; font-weight: bold">int</span> i <span style="color: #333333">=</span> <span style="color: #0000DD; font-weight: bold">0</span>; i <span style="color: #333333">!=</span> <span style="color: #0000DD; font-weight: bold">100</span>; <span style="color: #333333">++</span>i)
  {
    it <span style="color: #333333">=</span> i_hive.insert(i);
    p_hive.insert(<span style="color: #333333">&amp;</span>(<span style="color: #333333">*</span>it));
  }

  <span style="color: #888888">// Erase half of the ints:</span>
  <span style="color: #008800; font-weight: bold">for</span> (it <span style="color: #333333">=</span> i_hive.begin(); it <span style="color: #333333">!=</span> i_hive.end(); <span style="color: #333333">++</span>it)
  {
    it <span style="color: #333333">=</span> i_hive.erase(it);
  }

  <span style="color: #888888">// Erase half of the int pointers:</span>
  <span style="color: #008800; font-weight: bold">for</span> (p_it <span style="color: #333333">=</span> p_hive.begin(); p_it <span style="color: #333333">!=</span> p_hive.end(); <span style="color: #333333">++</span>p_it)
  {
    p_it <span style="color: #333333">=</span> p_hive.erase(p_it);
  }

  <span style="color: #888888">// Total the remaining ints via the pointer hive (pointers will still be valid even after insertions and erasures):</span>
  <span style="color: #333399; font-weight: bold">int</span> total <span style="color: #333333">=</span> <span style="color: #0000DD; font-weight: bold">0</span>;

  <span style="color: #008800; font-weight: bold">for</span> (p_it <span style="color: #333333">=</span> p_hive.begin(); p_it <span style="color: #333333">!=</span> p_hive.end(); <span style="color: #333333">++</span>p_it)
  {
    total <span style="color: #333333">+=</span> <span style="color: #333333">*</span>(<span style="color: #333333">*</span>p_it);
  }

  std<span style="color: #333333">::</span>cout <span style="color: #333333">&lt;&lt;</span> <span style="background-color: #fff0f0">"Total: "</span> <span style="color: #333333">&lt;&lt;</span> total <span style="color: #333333">&lt;&lt;</span> std<span style="color: #333333">::</span>endl;

  <span style="color: #008800; font-weight: bold">if</span> (total <span style="color: #333333">==</span> <span style="color: #0000DD; font-weight: bold">2500</span>)
  {
    std<span style="color: #333333">::</span>cout <span style="color: #333333">&lt;&lt;</span> <span style="background-color: #fff0f0">"Pointers still valid!"</span> <span style="color: #333333">&lt;&lt;</span> std<span style="color: #333333">::</span>endl;
  }

  std<span style="color: #333333">::</span>cin.get();
  <span style="color: #008800; font-weight: bold">return</span> <span style="color: #0000DD; font-weight: bold">0</span>;
} </code></pre>
</div>


<h3><a id="benchmarks"></a>Appendix B - Original reference implementation
benchmarks, differences from current reference implementation, and links</h3>

<p>Benchmark results for plf::colony (its performance and the majority of its code are
identical to the std::hive reference implementation) under GCC 9.2 on an Intel Xeon
E3-1241 (Haswell 2014) are <a href="https://plflib.org/benchmarks_haswell_gcc.htm">here</a>.</p>

<p>Old benchmark results for an earlier version of colony under MSVC 2015
update 3, on an Intel Xeon E3-1241 (Haswell 2014) are <a href="https://plflib.org/benchmarks_haswell_msvc.htm">here</a>. There is no
commentary for the MSVC results.</p>

<p>Even older benchmark results for an even earlier version of colony under GCC
5.1 on an Intel E8500 (Core2 2008) are <a href="https://plflib.org/benchmarks_core2_gcc.htm">here</a>.<br>
</p>

<p>This proposal and its <a href="https://github.com/mattreecebentley/plf_hive">reference
implementation</a>, and the <a href="https://github.com/mattreecebentley/plf_colony">original reference
implementation</a>, have several differences; one is that the original was
named 'colony' (as in a human, ant or bird colony), a name retained both for the
sake of its existing userbase and for differentiation. Other differences
between hive and colony as of the time of writing are:</p>
<ul>
	<li>C++20 and above only.</li>
	<li><code>==</code>, <code>!=</code> and <code>&lt;=&gt;</code> container operators removed, as these
		were considered either too slow as order-agnostic operators, or too
		confusing as order-specific operators (for an unordered-insert
	container).</li>
	<li>priority template parameter removed, as it was not considered useful enough
		to maintain in the standard.</li>
	<li>data() function removed, as it may cause too much implementation
	specificity.</li>
	<li>reset() function removed (use clear() followed by trim_capacity()
		instead), simply to cut down on the user interface.</li>
	<li>memory() removed, as this was considered not in keeping with
		previous container practice.</li>
	<li>No support for user-specified internal sort routines to be used by member
		function sort().</li>
	<li>advance, prev, next and distance are std:: overloads, rather than hidden
		friends selected via ADL. This is only possible under C++20 and above.
		The std::distance overload doesn't support negative distances; the two
		iterators must be swapped prior to the distance call if first &gt; last.</li>
	<li>The following public functions are not present: is_active(),
		block_capacity_default_min(), block_capacity_default_max(),
		block_metadata_memory(), block_allocation_amount(), max_elements_per_allocation().</li>
</ul>

<p>Other differences may appear over time.</p>

<h3><a id="faq"></a>Appendix C - Frequently Asked Questions</h3>
<ol>
	<li><h4>Where is it worth using a hive in place of other std::
		containers?</h4>
		<p>See the <a href="#container_guide">guide to container selection
		appendix</a> for a more intensive answer to this question, however for a
		brief overview, it is worthwhile for performance reasons in situations
		where the order of container elements is not important and:</p>
		<ol type="a">
			<li>Insertion order is unimportant</li>
			<li>Insertions and erasures to the container occur frequently in
				performance-critical code, <i><b>and</b></i></li>
			<li>Links to non-erased container elements may not be invalidated by
				insertion or erasure.</li>
		</ol>
		<p>Under these circumstances a hive will generally out-perform other std::
		containers. In addition, because it never invalidates pointers and references to
		container elements (except those referring to an element which has been
		erased), it may make many programming tasks involving
		inter-relating structures in an object-oriented or modular environment much
		faster, and should be considered in those situations.</p>
		<p>Some ideal situations to use a hive: cellular/atomic simulation,
		persistent octrees/quadtrees, game entities or destructible-objects in a
		video game, particle physics, anywhere where objects are being created and
		destroyed continuously. Also, anywhere where a vector of pointers to
		dynamically-allocated objects or a std::list would typically end up being
		used in order to preserve pointer stability, but where order is
		unimportant.</p>
 </li>
	<li><h4>Is it similar to a deque?</h4>
		<p>A deque is reasonably dissimilar to a hive - being a double-ended queue,
		it requires a different internal framework. In addition, being a
		random-access container, having a growth factor for element blocks in a
		deque is problematic (though not impossible). deque and hive have no
		comparable performance characteristics except for insertion (assuming a
		good deque implementation). Deque erasure performance can vary substantially
		depending on implementation, but is generally similar to vector erasure
		performance. A deque invalidates pointers to subsequent container elements
		when erasing elements, which a hive does not, and guarantees ordered
		insertion.</p>
 </li>
<li><h4>How does a slot map compare to a hive?</h4>
<p>Both a <a href="https://web.archive.org/web/20180121142549/http://seanmiddleditch.com/data-structures-for-game-developers-the-slot-map/">slot map</a> and a hive attempt to create a container with reasonable insertion/erasure/iteration performance while maintaining stable links from external objects to elements within the container. In the case of hive this is done with pointers/iterators; in the case of a slot map it is done with keys, which are separate from iterators (iterators do not stay valid post-erasure/insertion in slot maps). Each approach has some advantages, but the hive approach has more, in my view.</p>
<p>If you use a slot map your external object also needs to store a link to the slot map in order to access an element from its stored key (or a higher-level object accessing the external object needs to store such a link). This prevents splicing, since there is no way to ensure that keys in one slot map are unique globally, as is possible with pointers. With a hive there is no need for the external object to have knowledge of the hive instance, as pointers are sufficient to access the elements and remain valid regardless of insertion/erasure.</p>
<p>One upside of the slot map approach is if you make a duplicate of a slot map + a duplicate of an external object which accesses its elements via keys, and send these to a secondary thread, little work need be done - all the keys stored in the external object duplicate will work with the copied slot map, all you need to do is update the external object to use the duplicate slot map, instead of the original.</p>
<p>If you wish to do this with a hive, pointers to its elements in the duplicate external object will of course not point into the duplicate hive, so any external objects you copy to the secondary thread which you intend to access the hive elements, will need their pointers re-written. The easiest way to do this is by finding the indexes of the elements in the original hive instance via <code>size_type original_index = std::distance(hive.begin(), hive.get_iterator(external_object.pointers[x]));</code>, then using <code>external_object_copy.pointers[x] = &amp;*(std::next(hive_copy.begin(), original_index));</code> to write the new pointer values. Since std::distance/std::next/etc can be overloaded to be very quick for hive, this will not take too much time, but it is an inconvenience.</p>

<p>While I haven't done any benchmarks comparing hive performance to a slot map, I have done extensive benchmarks against a <a href="http://bitsquid.blogspot.com/2011/09/managing-decoupling-part-4-id-lookup.html">packed array</a> implementation, which is arguably simpler than a slot map but has the same characteristics, and hive is faster. From the structure of a slot map it is obvious that slot maps are slower for insertion, erasure, and for referencing elements within the slot map via external objects. This is because of the intermediary interfaces of the key-resolution array and generation counters, which need to be accessed and updated. Slot maps can, however, be faster for iteration, since all their data is typically (this is implementation-defined) stored contiguously and iteration does not use the keys/counters. In addition, contiguous storage means using a slot map with SIMD is more straightforward.</p>
<p>Slot maps use more memory due to the keys/counters. Ignoring element block metadata, a hive implementation can use as little as 2 extra bits of metadata per element (current reference implementation is generally between 10 and 16 bits for performance reasons), but a slot map will typically
 use between 64 and 128 bits per element (or 32-64 on a 32-bit system). This will also lower performance due to higher pressure on the cache and the increased numbers of allocations/deallocations.</p>
</li>
	<li><h4>What are the thread-safe guarantees?</h4>
		<p>Unlike a std::vector, a hive can be read from and inserted-into/erased-from at the
		same time (provided the erased element is not the same as the element being read); however, it cannot be iterated over and inserted-into/erased-from at the same time. If we look at a (non-concurrent implementation of) std::vector's thread-safety matrix to see
		which basic operations can occur at the same time, it reads as follows
		(note that push_back() is the same as insertion in this regard, due to reallocation when size == capacity):</p>

		<table border="1" cellspacing="3">
			<tbody>
				<tr>
					<td><b>std::vector</b></td>
					<td>Insertion</td>
					<td>Erasure</td>
					<td>Iteration</td>
					<td>Read</td>
				</tr>
				<tr>
					<td>Insertion</td>
					<td>No</td>
					<td>No</td>
					<td>No</td>
					<td>No</td>
				</tr>
				<tr>
					<td>Erasure</td>
					<td>No</td>
					<td>No</td>
					<td>No</td>
					<td>No</td>
				</tr>
				<tr>
					<td>Iteration</td>
					<td>No</td>
					<td>No</td>
					<td>Yes</td>
					<td>Yes</td>
				</tr>
				<tr>
					<td>Read</td>
					<td>No</td>
					<td>No</td>
					<td>Yes</td>
					<td>Yes</td>
				</tr>
			</tbody>
		</table>
		<p>In other words, multiple reads can happen
		simultaneously, but the potential reallocation and pointer/iterator
		invalidation caused by insertion/push_back and erasure means those
		operations cannot occur at the same time as anything else. pop_back() is slightly different in that it doesn't cause reallocation, so pop_back() calls and reads can occur at the same time, provided you're not reading from the back(). Likewise, a swap-and-pop operation can occur at the same time as reading if neither the erased location nor back() is what's being read.</p>
		<p>Hive on the other hand does not invalidate pointers/iterators to non-erased elements during insertion and erasure, resulting in the following matrix:</p>

		<table border="1" cellspacing="3">
			<tbody>
				<tr>
					<td><b>hive</b></td>
					<td>Insertion</td>
					<td>Erasure</td>
					<td>Iteration</td>
					<td>Read</td>
				</tr>
				<tr>
					<td>Insertion</td>
					<td>No</td>
					<td>No</td>
					<td>No</td>
					<td>Yes</td>
				</tr>
				<tr>
					<td>Erasure</td>
					<td>No</td>
					<td>No</td>
					<td>No</td>
					<td>Mostly*</td>
				</tr>
				<tr>
					<td>Iteration</td>
					<td>No</td>
					<td>No</td>
					<td>Yes</td>
					<td>Yes</td>
				</tr>
				<tr>
					<td>Read</td>
					<td>Yes</td>
					<td>Mostly*</td>
					<td>Yes</td>
					<td>Yes</td>
				</tr>
			</tbody>
		</table>
		<p><span style="font-size: 10pt;">* Erasures will not invalidate iterators
		or interfere with reads unless the iterator or read refers to the erased element.</span></p>
		<p>In other words, reads may occur at the same time as insertions and
		erasures (provided that the element being erased is not the element being
		read), multiple reads and iterations may occur at the same time, but
		iterations may not occur at the same time as an erasure or insertion, as
		either of these may change the state of the skipfield which is being
		iterated over, if a skipfield is used in the implementation. Note that
		iterators pointing to end() may be invalidated by insertion.</p>
		<p>So, hive could be considered more inherently thread-safe than a
		(non-concurrent implementation of) std::vector, but still has many areas
		which require mutexes or atomics to navigate in a multithreaded
		environment.</p>
 </li>
	<li><h4>What is hive's Abstract Data Type (ADT)?</h4>
		<p>Though I am happy to be proven wrong, I suspect hives/colonies/bucket
		arrays are their own abstract data type. Some have suggested its ADT is of
		type bag; I would somewhat dispute this, as hive does not have typical bag
		functionality such as <a href="http://www.austincc.edu/akochis/cosc1320/bag.htm">searching based on
		value</a> (you can use std::find, but it's O(n)), and adding this
		functionality would slow down other performance characteristics. <a href="https://en.wikipedia.org/wiki/Set_%28abstract_data_type%29#Multiset">Multisets/bags</a>
		are also not sortable (by means other than automatically by key value).
		Hive does not utilize key values, is sortable, and does not provide the
		sort of functionality frequently associated with a bag (e.g. counting the
		number of times a specific value occurs).</p>
 </li>
	<li><h4>Why must active blocks be removed from the iterative sequence when
		empty?</h4>
		<p>Two reasons:</p>
		<ol type="a">
			<li>Standards compliance: if emptied active blocks aren't removed then <code>++</code>
				and <code>--</code> iterator operations become O(n) in the number of
				active blocks, making them non-compliant with the C++ standard. At the
				moment they are O(1): in the reference implementation this
				typically constitutes one update each for the skipfield and element
				pointers, two if a skipfield jump takes the iterator beyond the
				bounds of the current block and into the next block. But if empty
				blocks were allowed, there could be any number of empty blocks between
				the current element and the next element in the sequence. Essentially you would get the same scenario
				as when iterating over a boolean skipfield.</li>
			<li>Performance: iterating over empty blocks is obviously slower than
				not having them present. In addition, if you have to allow for empty
				blocks while iterating, you have to introduce a branching loop into
				every iteration operation, which increases cache misses and code size.
				The strategy of removing blocks when they become empty also
				statistically removes (assuming a randomized erasure pattern) smaller
				blocks from the hive before larger blocks, which has the net result of
				improving iteration, because with a larger block, more iterations
				within the block can occur before the end-of-block condition is reached
				and a jump to the next block (and its subsequent cache miss) occurs.</li>
		</ol>
 </li>
	<li><h4>Why not turn All emptied active blocks into reserved blocks for future use during
		erasure, or None, rather than leaving the decision implementation-defined?</h4>
		<p>Future implementations may find better strategies, and it is best not to
		overly constrain implementations. For the reasons described in
		the <i>Design Decisions->Specific Functions</i> section on erase(), retaining the current two back blocks has
		performance and latency benefits, so reserving no active blocks is
		non-optimal. Meanwhile, reserving All active blocks is bad for performance,
		as many small active blocks will be reserved, which decreases iterative
		performance due to lower cache locality. If a user wants more fine-grained
		control over memory retention, they may use an allocator.</p>
 </li>
	<li><h4>Why is there no default constructor for hive_limits?</h4>
		<p>The user must obtain the block capacity hard limits of the
		implementation (via block_capacity_hard_limits()) prior to supplying their
		own limits as part of a constructor or reshape(), so that they do not
		trigger undefined behavior by supplying limits which are outside of the
		hard limits. Hence it was perceived by LEWG that there would be no reason
		for a hive_limits struct to ever be used with non-user-supplied values, e.g.
		zero.</p>
 </li>
	<li><h4>Element block capacities - what are they based on, how do they
		expand?</h4>
		<p>There are 'hard' capacity limits, 'default' capacity limits, 'current'
		limits and user-defined capacity limits. Current limits are whatever the
		current min/max capacity limits are in a given instance of hive. Default
		limits are what a hive is instantiated with if user-defined capacity limits
		are not supplied. Both default limits and user-defined limits are not allowed to go outside of
		an implementation's hard limits, which represent a fixed upper and lower limit. New element blocks have an
		implementation-defined growth factor, so will expand up to the current max limit.</p>
		<p>While implementations are free to choose their own limits and strategies, in the reference implementation element block sizes start from either
		the dynamically-defined default minimum size (8 elements, larger if the
		type stored is small) or an amount defined by the user (with a minimum of 3
		elements, as fewer than 3 elements is pretty much a linked list with more waste per node).</p>
		<p>Subsequent block capacities in the reference implementation increase the
		<i>total capacity</i> of the hive by a factor of 2 (so, 1st block 8
		elements, 2nd 8 elements, 3rd 16 elements, 4th 32 elements, etcetera) until
		the current maximum block capacity is reached. The default maximum block capacity
		in the reference implementation is 255 (if sizeof(type) is &lt; 10
		bytes) or 8192 otherwise. These values are based on multiple benchmark comparisons between different
		maximum block capacities with different-sized types. For larger-than-10-byte types the skipfield bitdepth is (at least) 16, so the
		maximum capacity 'hard' limit would be 65535 elements in that context; for
		&lt; 10-byte types the skipfield bitdepth is (at least) 8, making the
		maximum capacity hard limit 255. Larger capacities do not necessarily perform better because, given a randomized erasure pattern, a larger block may statistically retain more erased elements (i.e. empty space) before it runs out of elements entirely and is removed from the sequence, and can therefore create slowdown during iteration due to low locality.</p>
 </li>
	<li><h4>What are user-defined element block minimum and maximum capacities
		good for?</h4>
		<p>See the summary in paper P2857R0, which goes into this in detail.</p>
 </li>
	<li><h4>Why are hive_limits specified in constructors and not relegated to a
		secondary function?</h4>
		<ol type="a">
			<li>They have always been required in range/fill constructors for the
				reason that otherwise the user must construct, call reshape and then
				call range-assign/insert. This is slower and more cumbersome for
			users.</li>
			<li>They were not originally in the default constructors due to creating
				overload ambiguity with the fill constructors, but users have asked for this feature
				since 2016. One reason for the request was consistency. Another was
				usage with non-movable/copyable types, which cannot be used with
				reshape() as it must be able to reallocate elements when the
				existing blocks do not fit within the user-supplied range, and will throw
				when it cannot do so (either due to lack of memory or some other
				problem). The non-noexcept status of this function was also an issue
				for some users.</li>
			<li>In 2020 the issue was discussed <a href="https://lists.isocpp.org/sg14/2020/05/index.php">in SG14</a> and
				Jens Maurer <a href="https://lists.isocpp.org/sg14/2020/05/0354.php">suggested</a>
				using an external struct to make the constructor calls unambiguous.
				This has been the ongoing solution. It meets the needs of those using
				non-movable/copyable types and provides an exception-free way for users
				to specify block limits (provided they check their limits using
				block_capacity_hard_limits() first).</li>
			<li>Block capacity limits are something which users have repeatedly been
				thankful for. They are needed and their positive
				characteristics are discussed in P2857R0. As a side-note, it is of
				annoyance to many users that similar functionality was never
				specified for deque.</li>
		</ol>
 </li>
	<li><h4>Can a hive be used with SIMD instructions?</h4>
		<p>Yes if you're careful, no if you're not.<br>
		On platforms which support scatter and gather operations via hardware (e.g.
		AVX512) you can use hive with SIMD as much as you want, using gather to
		load elements from disparate or sequential locations, directly into a SIMD
		register, in parallel. Then use scatter to push the post-SIMD-process
		values back to their original locations.</p>
		<p>In situations where gather and scatter operations are too expensive,
		or where elements are required to be contiguous in memory for SIMD processing, it
		is more complicated. When you have a number of erasures in a hive, there's
		no guarantee that your objects will be contiguous in memory, even though
		they are sequential during iteration, as there may be erased elements in between them. Some of them may also be in different
		element blocks from each other. In these situations, if you want to use SIMD
		with hive, you must do the following:</p>
		<ul>
			<li>Set your minimum and maximum block capacities to multiples of the width of
				your target processor's SIMD registers. If they process 8
				elements at once, set the capacities to multiples of 8.</li>
			<li>Either never erase from the hive, or:<br>

				<ol>
					<li>Shrink-to-fit after you erase (will invalidate all pointers to
						elements within the hive).</li>
					<li>Only erase from the back or front of the hive, and only erase
						elements in multiples of the width of your SIMD instruction e.g. 8
						consecutive elements at once. This will ensure that the element
						boundaries line up with the width of the SIMD instruction, provided
						you've set your min/max block sizes as above.</li>
				</ol>
			</li>
		</ul>
		<p>Generally if you want to use SIMD without gather/scatter, it's easier to use a vector or array.</p>
 </li>
	<li><h4>Why were container operators ==, != and &lt;=&gt; removed?</h4>
		<p>Since this is a container where insertion position is unspecified,
		situations such as the following may occur:<br>
		<code>hive&lt;int&gt; t = {1, 2, 3, 4, 5}, t2 = {6, 1, 2, 3, 4};<br>
		t2.erase(t2.begin());<br>
		t2.insert(5);<br>
		</code></p>
		<p>In this case it is implementation-defined as to whether or not t == t2,
		because insertion position is not specified, if the == operator is
		order-sensitive.<br>
		If the == operator is order-insensitive, there is only one reasonable way
		to compare the two containers, which is with is_permutation. is_permutation
		has a worst-case time complexity of O(n<sup>2</sup>) which, while in
		keeping with how the unordered containers are implemented, was considered
		to be out of place for hive, a container where performance and
		consistent latency are a focus and most operations are O(1) as a result.
		While there are order-insensitive comparison operations which can be done
		in O(n log n) time, these allocate, which again was considered
		inappropriate for an == operator.</p>
		<p>In light of this the bulk of SG14/LEWG considered it more
		appropriate to remove the ==, != and &lt;=&gt; operators entirely, as these
		were unlikely to be used significantly with hive anyway. This gives the
		user the option of using is_permutation if they want an order-insensitive
		comparison, or std::equal if they want an order-sensitive comparison. In
		either case it removes ambiguity about what kind of operation they are
		expecting and the time complexity associated with it.</p>
 </li>
	<li><h4>What hive functions can potentially change sequence order?</h4>
		<p>Aside from functions which erase existing elements, such as clear, erase, ~hive and the = operators, the following may also change ordering: insert/emplace (as position is undefined), reshape, shrink_to_fit and splice. In addition, assigning a range to a hive may not preserve the order of the range specified.</p>
 </li>
	<li><h4>Why was memory() removed?</h4>
		<p>This was a convenience function to allow programmers to find a
		hive's current memory usage without using a debugger or profiler; however, this
		was considered out of keeping with current standard practice, i.e.
		unordered_map also uses a lot of additional memory, but we don't provide
		such a function for it. In addition, the context where it would've been useful in
		realtime (i.e. determining whether or not it's worth calling trim_capacity() or shrink_to_fit())
		is better approached by comparing size() to capacity().</p>
 </li>
	<li><h4>Why was the Priority template parameter removed?</h4>
		<p>This was a hint to the implementation to prioritize for lowered memory
		usage or performance specifically. In other implementations this could've,
		for example, been used to switch between a jump-counting skipfield
		and the bitset+jump-counting approach described in the <a href="#non_reference_implementations_info">alternative implementations
		appendix</a> (reducing memory cost at cost of performance). In the early
		reference implementation, this told the container to switch between 16-bit
		and 8-bit skipfield types (smaller bitdepth skipfield types limit the block
		capacities due to the constraints of the jump-counting skipfield pattern).
		However, prior to a particular LEWG thread there had not been sufficient
		benchmarking and memory testing done on this.</p>
		<p>When more thorough benchmarking, including memory assessments, was done,
		it was found that the vast bulk of unnecessary memory usage came from
		erased elements in hive when an active block was not yet empty (and
		therefore not yet freed to the OS or reserved, depending on implementation), rather
		than from the skipfield type. This meant that making block capacities
		appropriately-sized was more important to performance and cache-friendliness than the skipfield
		type, as - assuming a randomised erasure pattern - smaller blocks were more
		likely to become empty and therefore be removed from the iterative sequence
		than larger blocks, thereby improving element contiguousness and reducing the
		iteration cost of skipblocks. But also, reusing erased-element
		memory space in existing element blocks was much faster than having to
		deallocate/reserve blocks and subsequently allocate or un-reserve new element blocks,
		so having too-small element block capacities was also bad.</p>
		<p>The only place where the sizeof(skipfield_type) turned out to be more
		important than block capacities was when using small types such as scalars,
		where the additional memory use from the skipfield (proportional to the type
		size) was significant; there, reducing the skipfield_type from 16 bits to
		8 bits improved performance due to cache effects, even though it made the max
		block capacities very small (255 elements).</p>
		<p>As a result of all this it was decided to remove the priority parameter
		and let implementations decide when to switch internal mechanisms,
		rather than relying on user guesswork; probably a
		better approach. I think the priority parameter might've been useful for
		additional compile-time decisions, such as what type of block
		retention strategy to use when an active block becomes empty of
		elements. Having a priority tag also gave the ability to specify new
		priority values in future as part of the standard, potentially allowing for
		big changes without breaking ABI, and to switch between an implementation using a pure jump-counting skipfield (max 16 bits per element) and a bitset + jump-counting approach (1 bit per element, as described in the alternative implementations appendix). However,
		it is also a matter of complexity vs simplicity, and simplicity is worthwhile too.</p>
 </li>
	<li><h4>Why does a bidirectional container have iterator operators &gt; &lt;,
		&gt;=, &lt;= and &lt;=&gt;?</h4>
		<p>These are useful for several reasons:</p>
		<ol type="a">
			<li>Where the user is iterating over the container elements and
				adding/subtracting a greater-than-1 value to/from the iterator for each cycle of
				the loop, having != end() or != another_iterator would not
				necessarily work as a loop end condition.</li>
			<li>It can be used by the user to correctly order two iterators obtained from random parts of the sequence, before
				supplying them to a function covering a range eg. range-insertion, std::distance, range-erasure, and so on. This stops the function throwing an exception when <code>it1 &gt; it2</code>.</li>
			<li>Because hive insertion/emplacement location is unspecified, if you
				have a specific range of elements you're interested in (perhaps
				you've calculated the distance between two iterators and that distance
				value is important somehow), iterator ops
				&lt;/&gt;/&lt;=/&gt;=/&lt;=&gt; are the only way to determine whether
				an element you just inserted is now within the range. Likewise if
				external objects/entities are removing elements from a hive via stored
				pointers/iterators, the comparison operations above are the only way to
				determine if the element just erased was within the given range.</li>
		</ol>
 </li>
	<li><h4>Why is the insertion time complexity for singular insert/emplace O(1)
		amortized as opposed to O(1)?</h4>
		<p>In the current reference implementation (at time of
		writing), in the event of needing to allocate a new active block or change a reserved block into an active block, there is an update
		of active block numbers which may occur if the current back block has a
		group number == std::numeric_limits&lt;size_type&gt;::max(). The occurrence
		of this event is 1 in every std::numeric_limits&lt;size_type&gt;::max()
		block allocations/transfers-from-reserved (i.e. basically <i>never</i> in the lifetimes of almost all programs). The number of active blocks at this point could be small or large, depending on how many blocks have been deallocated in the past.</p>
		<p>If a hive were implemented as a vector of pointers to
		blocks instead, this would also necessitate amortized time complexity, as when the vector became full and more blocks were needed, all element block
		pointers would need to be reallocated to a new memory block.</p>
 </li>
	<li><h4>Why does throwing an exception when calling reshape allow for the
		resultant <code><i>current-limits</i></code> to be something other than the original
		<code><i>current-limits</i></code> and the user-supplied block_limits? Also, why does exception handling for shrink_to_fit specify that reallocation of elements may occur if an allocation exception occurs? Finally, what happens to source element blocks during move assignment/construction if the allocators are not equal or propagating?</h4>
		<p>Although it might not sound like it, all three of these aspects are related. We'll start with reshape().</p>
		<p>If reshaping a hive instance from, for example, a <code><i>current-limits</i></code> of min =
		6 elements, max = 12 elements, to a supplied user limit of min = 15
		elements, max = 255 elements, basically none of the original
		blocks fit within the new limits, so all elements have to go
		into new blocks. A very basic strategy would be to allocate all new active blocks before moving elements, then
		deallocate all old active blocks once the elements have been moved over. However, if there are many active blocks, this could potentially lead to out-of-memory exceptions. So an implementation might instead choose
		to allocate blocks one at a time, move elements into each, then deallocate each
		old block as it becomes empty.</p>
		<p>If the latter strategy is pursued, we don't necessarily have the old memory
		blocks to revert to if an exception is thrown when we're halfway through
		moving the elements over. And we wouldn't want to start creating new
		element blocks of the old capacity and moving the elements back to them, as
		this risks yet more exceptions! The best strategy in this scenario is to keep
		all existing blocks, remove duplicate elements if any exist, and change
		<code><i>current-limits</i></code> to values which accommodate both
		the original block capacities and the new block capacities.</p>
		<p>Similar concerns apply for shrink_to_fit() - instead of just creating a new hive with max block sizes then reallocating all elements into it, we could transfer elements block-by-block, allocating each new block then deallocating the old once the elements are transferred. This reduces the likelihood of out-of-memory errors substantially. However if we get halfway through this process and an allocation error occurs, obviously we can't go back to the old blocks as they've been deallocated. So we're stuck with the reallocation and have to follow the same strategy as reshape, retaining all blocks.</p>
		<p>Likewise, with move-constructing/assigning between hive instances with unequal and non-propagating allocators, we can't transfer the element blocks because their allocators are incompatible; however, we can move the elements to new element blocks allocated with <code>*this</code>'s allocator. Like shrink_to_fit() and reshape() above, a valid strategy here - particularly for memory-scarce platforms - would be to allocate blocks equal in capacity to the old ones, one at a time, transferring the elements in, then deallocating each old block after its elements are transferred. But another valid strategy (for less memory-scarce platforms) is for the source hive to retain its element blocks and turn them into reserved blocks after the transfer of elements - this reduces the necessity of further allocations should the source hive be re-used. Hence we do not specify what happens to the element blocks of the source hive when the allocators are unequal and non-propagating, but leave that up to the implementer.</p>
 </li>
	<li><h4>Why does the sort() specification require that T is
		Cpp17MoveInsertable into hive and Cpp17MoveAssignable?</h4>
		<p>Cpp17MoveAssignable is obvious; Cpp17MoveInsertable is there to deal with the event that a specific sorting technique wants to take advantage
		of erased-element memory spaces between active elements. For example, for the hive sequence 4, 3,
		5 - where there is an erased-element memory space before the 4 - the sort
		technique could simply move-construct the 3 into the erased-element memory
		space instead of swapping it with the 4, saving instructions and the creation of a temporary. The sort routine could also choose to use erased-element memory spaces, if they exist, or empty space at the back of the container, to store temporary buffers for swap operations instead of dynamically allocating buffers - this may be useful to both performance and memory use if elements are very large.</p>
 </li>
	<li><h4>Reasoning for container name?</h4>
		<p>See paper P2332R0.</p>
 </li>
	<li><h4>"Unordered and no associative lookup, so this only supports use cases
		where you're going to do something to every element."</h4>
		<p>The container was originally designed for highly object-oriented
		situations where you have many elements in different containers linking to
		many other elements in other containers. This linking can be done with
		pointers or iterators in hive (insert returns an iterator which can be
		dereferenced to get a pointer, pointers can be converted into iterators
		with get_iterator (for use with erase)) and because pointers/iterators stay
		stable regardless of insertion/erasure to other elements, this usage is
		unproblematic. You could say the pointer is equivalent to a key in this
		case, but without the overhead. That is the first access pattern; the second
		is straight iteration over the container. In addition, the container can have
		(typically better-than-O(n)) std::advance/next/prev overloads, so multiple
		elements can be skipped efficiently.</p>
 </li>
	<li><h4>"Prove this is not an allocator"</h4>
		<p>I'm not really sure how to answer this, as I don't see the resemblance,
		unless you count maps, vectors etc as being allocators also. The only
		aspect of it which resembles what an allocator might do, is the memory
		re-use mechanism. It would not be impossible for an allocator to perform a
		similar function while still allowing the container to iterate over the
		data linearly in memory, preserving locality, in the manner described in
		this document.</p>
 </li>
	<li><h4>If this is for games, won't game devs just write their own versions
		for specific types in order to get a 1% speed increase anyway?</h4>
		<p>This is true for many/most AAA game companies who are on the bleeding
		edge, as well as some of the more hardcore indie developers, but they also
		do this for vector etc, so they aren't the target audience of std:: for the
		most part; sub-AAA game companies are more likely to use std::/third
		party/pre-existing tools. Also, this type of structure crops up in <a href="#external_prior_art">many fields</a>, not just game dev. So the
		target audience is probably everyone other than people working on bare
		metal core loops in extreme-high-performance circumstances, but even then,
		it facilitates communication across fields and companies as to what this
		type of container is, giving it a standardized name and understanding.</p>
 </li>
	<li><h4>Is there active research in this problem space? Is it likely to
		change in future?</h4>
		<p>The only analysis done has been around the question of whether it's
		possible for this specification to fail to allow for a better
		implementation in future. Bucket arrays have
		been around since at least the 1990s; there has been no significant
		innovation in them until now. I've been researching/working on hive since
		early 2015, and while I can't say that a better implementation might not be
		possible, I am confident that no change should be necessary to the
		specification to allow for future implementations. If a change were
		necessary, it would likely be a loosening of the specification rather than a breaking
		change. This is because of the <a href="#constraints_summary">C++
		container requirements and how these constrain implementation</a>.</p>
		<p>The requirement of allowing no reallocations upon insertion or erasure
		narrows the possible implementation strategies significantly. Element blocks
		have to be independently allocated so that they can be removed from the iterative sequence (when empty)
		without triggering reallocation of subsequent elements, and there are a limited
		number of ways to do that while keeping track of the element blocks at the same
		time. Erased element locations must be recorded (for future re-use by
		insertion) in a way that doesn't create allocations upon erasure, and
		there are a limited number of ways to do this also. Multiple consecutive
		erased elements have to be skipped in O(1) time in order for the iterator
		to meet the C++ iterator O(1) function requirement, and again there are
		limits to how many ways you can do that. These are the three core aspects
		upon which this specification is based. See the <a href="#non_reference_implementations_info">alt implementation appendix</a> for more information.</p>
 </li>
	<li><h4>We have (full container) splice, unique and sort, like in std::list,
		but not merge?</h4>
		<p>With splice and unique we can retain the guarantee that pointers to
		non-erased elements stay valid (sort does not guarantee this for hive), but
		with merge we cannot, as the function requires an interleaving of elements,
		which is impossible to accomplish without invalidating pointers, unless the
		elements are allocated individually. This is not the case in hive, hence
		including merge may confuse users as to why it doesn't share that same
		property of valid pointers with std::list. std::sort however is known to invalidate
		pointers when used with vectors and deques, so sort() as a member function
		does not necessarily have the association of retaining pointer validity.</p>
 </li>
	<li><h4>Why no resize()?</h4>
		<p>This was a choice by LEWG to avoid confusing the user, as insertion
		position in a hive is implementation-defined. In the case of hive,
		resizing would not necessarily insert new elements at the back of the
		container when the supplied size was larger than the existing size();
		new elements could be inserted into erased elements' memory locations. This
		means the initialization of those non-contiguous elements (if they are POD
		types) cannot be optimized by use of memset, removing the main performance
		reason to include resize(). The lack of ability to specify the position of
		insertion removes the "ease of developer use" reason to include
		resize().</p>
 </li>
	<li><h4>Why not support push_back and push_front?</h4>
		<ol type="a">
			<li>Ordered insertion would create performance penalties due to not
				reusing previously-erased element locations, which in turn increases
				the number of block allocations necessary and reduces iteration speed
				due to wider gaps between active elements and the resultant reduced
				cache locality. This negates the performance benefits of using this
				container.</li>
			<li>Newcomers will get confused and use push_back instead of insert,
				because they will assume this is faster based on their experience of
				other containers, and the function call itself may actually be faster
				in some circumstances. But it will also inhibit performance for the
				reasons above. Further, explaining how the container works and operates
				has proved to be difficult even with C++ committee members, so being
				able to explain it adequately to novices such that they avoid this
				pitfall would not be guaranteed.</li>
			<li>It should be unambiguous as to its interface and how it works, and
				what guarantees are in place. Making insertion guarantees
				straightforward is key to performant usage. Having fewer constraints is
				also important for allowing future, potentially-faster, implementations.</li>
			<li>There are better containers for ordered insertion, e.g. deque/vector/plf::list.</li>
		</ol>
 </li>
	<li><h4>Likewise, why not support a fully-functional insert(iterator position, T value)?</h4>
		<ol type="a">
			<li>The same performance and usage issues identified above apply. Also, this would implicitly enable push_back via insert(hive.end(), value).</li>
			<li>Given that a hive can only insert close to <code>position</code> if there is an erased element memory location near it (or if <code>position</code> is close to <code>end()</code>),
			there is no guarantee that the insertion will be anywhere near <code>position</code>. As such it's best not to provide the user with false promises.</li>
			<li>Even in the case that there is an erased element memory location within the same block as <code>position</code>, blocks could be up to 65535 elements wide (possibly more on another implementation), making the position parameter more-or-less irrelevant. If one were to properly assess the nearest erased location from <code>position</code>, as opposed to just grabbing the first erased location in the block from the erased-elements free list, one would either have to scan the skipfield linearly in both directions from <code>position</code>, or do similar calculations using the free list - in either case an O(n) operation. And again, with no guarantee that the resultant location will be even remotely close to <code>position</code>.</li>
		</ol>
 </li>
	<li><h4>Why not constexpr? (yet)</h4>
		<p>At time of writing, constexpr containers are still relatively new and
		some of the kinks of their usage may yet have to be worked out. Early
		compiler support was not good but this has improved steadily over time. I
		wasn't happy with having to label each and every function as constexpr, but there seem to
		be movements toward labeling classes as a whole as constexpr, so if that
		comes through it will remove that problem. Having said that, one thing to consider is that in the
		reference implementation there is extensive use of reinterpret_cast'ing
		pointers, mainly for two areas:</p>
		<ol>
			<li>The per-block free-list of erased elements for later use by
				insert/assign/emplace/etc.</li>
			<li>Allocating the element block and associated skipfield block in a
				single allocation, increasing performance.</li>
		</ol>
		<p>As reinterpret_cast is not allowed at compile time, (1) could be worked
		around by creating a union between the element type and the free list
		struct/pair. (2) would not be possible at compile time, and the element block
		and skipfield block would have to be allocated separately. So constexpr support is
		<i>possible</i>, though it could be hard work, and may decrease runtime
		performance.</p>
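		<p>As a rough illustration of the union workaround for (1) - a minimal, hypothetical sketch rather than the reference implementation - each slot in a block is a union of the element type and a free-list node, so erased slots can store their 'next free' links without any pointer reinterpretation:</p>

```cpp
#include <cstddef>

// Hypothetical sketch: a block slot which holds either a live element or a
// free-list node recording the index of the next erased slot. The union
// avoids the reinterpret_cast used in the reference implementation, which
// matters because reinterpret_cast is not permitted in constant evaluation.
template <class T>
union slot {
    T element;  // active while the slot holds a live element
    struct { std::size_t next_erased; } free_node;  // active while the slot is erased

    slot() {}   // lifetimes of members are managed manually by the container
    ~slot() {}
};
```

		<p>The container would construct <code>element</code> on insertion and switch the slot to <code>free_node</code> on erasure; for non-trivial element types this requires explicit destructor calls, which is part of the "hard work" referred to above.</p>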
		<p>For the moment I am happier for std::array and std::vector to be the
		<a href="https://en.wikipedia.org/wiki/Sentinel_species">canaries in the coalmine</a> here.</p>
 </li>
	<li><h4>Licensing for the reference implementation (zLib) - is this
		compatible with libstdc++/libc++/MS-STL usage?</h4>
		<p><a href="https://opensource.stackexchange.com/questions/12755/do-i-have-to-remove-the-license-on-zlib-licensed-code-project-in-order-for-it-to/12765#12765">Yes</a>.
		<a href="https://choosealicense.com/licenses/zlib/">zLib</a> license is
		compatible with both <a href="https://www.gnu.org/licenses/license-list.en.html">GPL3</a> and <a href="https://www.mend.io/blog/top-10-apache-license-questions-answered/">Apache</a>
		licenses (libc++/MS-STL). zLib is a more permissive license than all of
		these, only requiring the following:</p>
		<p><code>This software is provided 'as-is', without any express or implied
		warranty. In no event will the authors be held liable for any damages
		arising from the use of this software.</code></p>
		<p><code>Permission is granted to anyone to use this software for any
		purpose, including commercial applications, and to alter it and
		redistribute it freely, subject to the following restrictions:</code></p>
		<ol>
			<li><code>The origin of this software must not be misrepresented; you
				must not claim that you wrote the original software. If you use this
				software in a product, an acknowledgment in the product documentation
				would be appreciated but is not required.</code></li>
			<li><code>Altered source versions must be plainly marked as such, and
				must not be misrepresented as being the original software.</code></li>
			<li><code>This notice may not be removed or altered from any source
				distribution.</code></li>
		</ol>
		<p><i>Please note that "product" in this instance doesn't mean 'source
		code', as in a library, but a program or executable. This is made clear by
		clause 3, which differentiates source distributions from products.</i>
	 </p>
		<p>Representatives from libc++, libstdc++ and
		MS-STL have stated they will likely either use the reference or use it as a
		starting point, and that licensing is unproblematic (with the exception of
		libc++, who stated they would need to run it past LLVM legal reps). However,
		if licensing ever becomes problematic, as the sole author of the
		reference implementation <a href="https://opensource.stackexchange.com/questions/12755/do-i-have-to-remove-the-license-on-zlib-licensed-code-project-in-order-for-it-to/12765#12765">I
		am in a position to grant use of the code</a> under other licenses as needed.</p>
 </li>
	<li><h4>How does hive solve the ABA problem where a given iterator/pointer
		points to a given element, then that element is erased, another element is
		inserted and re-uses the previous element's memory location? We now have
		invalidated iterators/pointers which point to valid elements.</h4>
		<p>It doesn't. Detecting these cases is the end user's responsibility, as it is in
		deque or vector when elements are erased. In the case of hive I would
		recommend the use of a unique ID within the element itself. The end user
		can then build their own "handle" wrapper around a pointer or iterator
		which stores a copy of this ID, then compares it against the element
		itself upon access to see whether it is still the same element.</p>
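		<p>A minimal sketch of such a handle (the names and the <code>entity</code> type are hypothetical - any element type carrying a unique, never-reused ID would do):</p>

```cpp
#include <cstdint>

// Hypothetical element type carrying a unique, never-reused ID:
struct entity {
    std::uint64_t id;
    int data;
};

// A handle which remembers the ID of the element it was created against.
// If that element is later erased and its memory reused by a new element,
// the stored ID no longer matches and the handle reports itself stale.
struct entity_handle {
    entity* ptr;
    std::uint64_t expected_id;

    explicit entity_handle(entity* p) : ptr(p), expected_id(p->id) {}

    bool valid() const { return ptr->id == expected_id; }

    entity* get() const { return valid() ? ptr : nullptr; }
};
```

		<p>In real code the pointer would be an iterator or pointer to an element within the hive, and the check is only meaningful for as long as the memory location itself remains allocated by the container.</p>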
		<p>In terms of guarantees that an element has not been replaced via hive
		usage, replacement may occur if:</p>
		<ol type="a">
			<li>Any number of erasures have occurred, and then at least one insertion
				has occurred.</li>
			<li>clear() has been called and then at least one insertion has
			occurred.</li>
			<li>shrink_to_fit(), reshape(), assign(), sort() or iterator operator =
				have been called.</li>
		</ol>
 </li>
	<li><h4>Aside from it already being a known container type in many domains,
		and its performance vs other standard library containers, what are good
		reasons to standardise hive?</h4>
		<ol type="a">
			<li>"Build it and they will come" - a quote from the movie "Field of
				Dreams" which, loosely, means that people don't always know they want
				something until it's there. Once this is available and people
				understand the advantages (performance, memory, pointer validity, etc.)
				they are likely to use it. Particularly in fields where people are
				doing long-term compatibility/cross-platform work they are unlikely to
				use non-std:: containers, even when there are significant advantages;
				if it's in the standard, however, they may.</li>
			<li>std::list. Developers commonly use this for situations where they
				need stable pointers to elements regardless of insertion/erasure (<a
				href="https://docs.libreoffice.org/basegfx/html/b2drangeclipper_8cxx_source.html">here</a>
				is an example in libreoffice, line 125 - std::list is used extensively
				in 50 libreoffice source files), but it is cache-unfriendly, slow for
				the majority of scenarios and wasteful in terms of memory. Hive is
				cache-friendlier, faster and (in terms of reference implementation)
				uses only one skipfield_type (8/16-bit) per element to maintain its
				functionality, as opposed to 2-pointers per-element for std::list. The
				main advantage of std::list is its ordered non-back/front
			insertion.</li>
			<li>Other languages may beat us to it. I know that one person is currently
				developing a hive-equivalent for Rust.</li>
			<li>Having this as std:: will allow greater communication and consistency
				across domains and organisations.</li>
		</ol>
 </li>
	<li><h4>What are the advantages of the user being able to reserve() when an
		allocator can mitigate some of the effects of allocating smaller blocks
		rather than larger ones?</h4>
		<p>An allocator can only decrease the number of allocation calls to the OS.
		While it might allocate one small block contiguous to another in the order
		of the sequence, it also might not (and likely won't), which decreases
		iteration speed. Further, there is a certain amount of metadata necessary
		for each block (regardless of implementation), which needs to be updated
		when erasures/insertions occur. Hence, by having more blocks than you
		need, you also increase memory overhead. There is also procedural overhead
		associated with each block, in terms of many of the operations like splice,
		reshape and reserve, where the more blocks you have, the more operations
		you incur (though the cost is typically very low).</p>
 </li>
	<li><h4>This seems over-specified.</h4>
		<p>Places to start: read the first paragraph of the introduction to this
		paper, then the Design Decisions section, then the <a href="#constraints_summary">constraints</a> and <a href="#non_reference_implementations_info">alt implementation</a> appendices.<br>
		Most of the specificity comes from the type of container and the C++
		standard's specifications. In terms of this type of container, this paper
		represents to the best of my knowledge the widest scope of implementation
		while still fulfilling the core invariants of the container, maintaining
		reasonable performance, and satisfying the C++ standard requirements.</p>
		<p>There is also a risk of underspecification. The (at time of writing) MS
		STL version of deque allocates blocks with a fixed size of 16 bytes, so any
		element type larger than 8 bytes effectively turns it into a linked list -
		which the standard permits, as it does not allow the user to specify block
		capacities. There are advantages to being more specific once you get to
		something more complex than an array, because specificity encourages good
		implementation practice. As a result this paper attempts to strike a balance between under- and over-specification.</p>
 </li>
	<li><h4>What is the reasoning behind which operations do/don't copy
		<code><i>current-limits</i></code> between hives?</h4>
		<p>These generally follow the pattern for allocators - which makes sense as
		their use may have a relationship with user-supplied allocator constraints.
		They're transferred during move and copy construction, operator =
		&amp;&amp; and swap, but not during splice or operator = &amp;. Unlike
		allocators, they cannot be specified in the copy/move constructors, which
		makes sense for the move constructor since it would have to throw if the
		transferred blocks did not fit within the specified limits. If the user wants to specify new capacity
		limits when copying from another hive, they can do the following instead of
		calling a copy constructor:<br>
		<code>hive&lt;int&gt; h(hive_limits(10,50));<br>
		h = other_hive;</code></p>
		<p>Likewise if the user wants to specify new capacity limits when moving
		from another hive, they could:<br>
		<code>hive&lt;int&gt; h(std::move(other_hive));<br>
		h.reshape(10, 50);</code></p>
 </li>
	<li><h4>Why does erasing the back element, or calling unique() necessarily invalidate the past-the-end iterator?</h4>
	<p>For an iterator to reach end() it must be one ++ iteration past the back element. You could implement this in some more complicated way, but it's best not to slow down the ++ operator, so having the end() location be one element (or one skipblock) past the back element is, in my view, the best approach. If the back block is full, end() is one element past the end of the block (there is allowance for this in the standard, otherwise vector would not work).</p>
<p>Whether an implementation allows there to be erased elements (ie. a skipblock) between the back element and end() is up to the implementation. I allow it because in performance testing this was found to perform better overall than checking whether we've erased the back element and moving end() backwards to behind the new back element. If an implementation chooses not to allow it, then erasing from the back of the container invalidates previous end() values. </p>
<p>But regardless of the style of implementation, the following scenario will invalidate previous end() values: the back block has one element left, and a call to erase() erases that element. As per the hive overview, since the block is now empty it must either be changed to a reserved block or deallocated. Either way it ceases to be a part of the iterable sequence, so the current location of end() cannot be reached by a ++ iteration from the new back element. At this point we must change the location of end() to be one ++ iteration away from the new back element, regardless of whether this is within the new back block, or one-past-the-end of that block (if skipblocks are allowed between the back element and end(), end() will always be one element past the end of the block at this juncture).</p>
<p>You can see from the above that any erasure which erases the back element has the <i>potential</i> to change the location of end(), even if it doesn't necessarily do so. Likewise if unique() ends up erasing the back element, the same considerations apply.</p>
</li>
<li><h4>Under what circumstances do insert/emplace invalidate the past-the-end iterator?</h4>
<p>Obviously if there are no erased element memory locations to insert into, a hive will have to insert to the back of the container, invalidating the end iterator. However even when this is not the case, it is possible for an implementation to change the location of end(). Consider the following scenario: A hive instance is not full to capacity, the back block is not full (ie. end() is not one-element-past-the-end of the back block), and the only block which has erased-element memory locations to insert into is the first block, which is very small (say, 8 elements). The back block is much larger, and the first block only has one element left in it.
Is it better to insert to the first block, knowing that its small size affords low cache locality, or to insert at the back, knowing that this increases the probability that the first, small block will be deallocated (once the final element is erased), increasing overall cache locality and iteration speed?</p>
<p>I would personally opt to just insert to the first block, because assessing the situation requires additional branching and is likely to slow down insertion in general. However, it is a valid (if possibly slower) strategy to not insert into the first block and insert to the back block instead in this case. Or as an alternative, to not insert to any block with only one element left and instead insert to the back block, if it isn't full. The latter two strategies invalidate the past-the-end iterator even if there are erased element memory locations available to insert into.</p>
<p>Hence we need to allow for the idea that any insertion could potentially invalidate the past-the-end iterator, and leave the exact circumstances under which this happens as an implementation detail.</p>
</li>

<li><h4>Under what circumstances could sort invalidate the past-the-end iterator?</h4>
<p>A potential optimization here is moving elements into erased-element memory spaces, if such spaces exist after a given non-erased element, as described in FAQ 20. I think it's unlikely that anyone would apply this optimization, as it's quite specific to the state of a hive instance - but it's possible. If an implementation did, and an element at the back of the container were moved into erased-element memory space in this way, such that its original memory space was not filled by another element and effectively became erased, and it was the only element in the back block, this would trigger deallocation/reserving of the back block, invalidating the end iterator.</p>
<p>Likewise a sort routine could choose to reallocate elements to unused element memory space at the back of the back block or into reserved blocks (when available), instead of swapping elements around, which would also invalidate the end iterator.</p></li>

<li><h4>How is this a sequence container if it doesn't allow position-based insertion?</h4>
<p>When discussing this in LEWG, it became clear that though the container is largely a sequence, its requirement of unspecified insertion location is not shared by sequence containers in general. However, neither is it an associative container, so rather than inventing a whole new category for hive alone, it was considered a better approach to place it among the sequence containers while noting all differences.</p>
</li>



<h3><a id="sg14gameengine"></a>Appendix D - Typical game engine
requirements</h3>

<p>Here are some more specific requirements with regards to game engines,
verified by game developers within SG14:</p>
<ol type="a">
	<li>Elements within data collections link to elements within other data
		collections (through a variety of methods). These
		links must stay valid throughout the course of the game/level. Any
		container where most of its operations cause pointer/index invalidation
		tends to create difficulties and necessitate workarounds.</li>
	<li>The majority of data is simply
		iterated over, transformed, linked to and utilized with no regard to
	order.</li>
	<li>Erasing or otherwise "deactivating" objects occurs frequently in
		performance-critical code. For this reason methods of erasure or deactivation which create
		strong performance penalties are avoided.</li>
	<li>Inserting new objects in performance-critical code (during gameplay) is
		common - for example, a tree drops leaves, or a player spawns in a
		multiplayer game.</li>
	<li>It is not always clear in advance how many elements there will be in a
		container at the beginning of development, or at the beginning of a level
		during play. Genericized game engines in particular have to adapt to
		considerably different user requirements and scopes. For this reason
		extensible containers which can expand and contract in realtime are
	useful.</li>
	<li>Due to the effects of cache on performance, memory storage which is
		more-or-less contiguous is preferred.</li>
	<li>Memory waste is avoided.</li>
</ol>

<p>std::vector in its default state does not meet these requirements due to:</p>
<ol>
	<li>Poor single insertion performance (regardless of insertion position) due
		to the need for reallocation upon reaching capacity.</li>
	<li>Insert invalidates pointers/iterators to all elements.</li>
	<li>Erase invalidates pointers/iterators/indexes to all elements after the
		erased element.</li>
</ol>

<p>Game developers therefore tend to either develop custom solutions for each
scenario or implement workarounds for vector. The most common workarounds are
most likely the following or derivatives thereof:</p>
<ol>
	<li>Using a boolean flag or similar to indicate the inactivity of an object
		(as opposed to actually erasing from the vector). Elements flagged as
		inactive are skipped during iteration. A free list may be used in some
		situations.<br>
		<br>
		Advantages: Fast "deactivation". Easy to manage in multi-threaded
		environments. Memory cost can be low if a bitset is used.<br>
		Disadvantages: Can be slower to iterate due to branching code, potential
		bit access, and unknown number of inactive elements between any two active
		elements. The latter creates variable latency when iterating and violates
		time complexity requirements for C++ iterator operations ++/--.</li>
	<li>Using a deque of elements and a vector of indexes. Since deques don't
		reallocate during insertion, this preserves stable pointers to elements
		post-insertion. When erasing, the erasure occurs only in the vector of
		indexes, not the deque of elements, which retains a 'hole' in the deque
		where the destructed element was. It may use some form of free list in the
		deque in order to re-use erased element memory during later insertions.
		When iterating, one iterates over the vector and accesses the elements from
		the deque via the indexes. Erasure involves destructing the element in the
		deque, adding that element space to the free list, then swapping the
		element's index in the vector with the back index, and popping the new back
		index (the latter avoids slowdown and jitter due to reallocation).<br>
		As an alternative one could store a vector-of-elements instead of a deque,
		but then the external objects storing links to the elements must instead
		store an index into the vector-of-elements (since pointers would be
		invalidated upon insertion when capacity() == size()) and be able to access
		the vector directly.<br>
		<br>
		Advantages: Fast iteration.<br>
		Disadvantages: Insertion to the vector may reallocate indexes which is slow
		- can be replaced with a deque to avoid this, at a cost to iteration.
		Indexes must be at system bit-depth to accommodate all possible capacities, therefore
		the memory waste can be large - particularly for small types.</li>
	<li>Combining a swap-with-back-element-and-pop approach to erasure with some
		form of dereferenced lookup system to enable contiguous element allocation
		(sometimes called a <a href="http://bitsquid.blogspot.ca/2011/09/managing-decoupling-part-4-id-lookup.html">Packed
		array</a>). <br>
		Advantages: Iteration is at standard vector speed.<br>
		Disadvantages: Erasure will be slow if objects are large and/or
		non-trivially copyable, thereby making swap costs large. All link-based
		access to elements incurs additional costs due to the dereferencing system.
		Stable references to elements cannot be maintained via iterators nor
		pointers, but must be made using a third lookup object which refers to the
		lookup table - since the lookup table itself doesn't refer to elements in
		contiguous order and is thereby unsuitable for iteration. Further, the
		lookup table, regardless of whether it uses pointers or indexes, needs to
		be at system bit-depth, which wastes memory - particularly for small types.
 </li>
</ol>
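
<p>Workaround 1 can be sketched as follows (hypothetical names, using std::vector for illustration) - erasure becomes a flag write, at the cost of a branch over an unknown number of inactive elements during every traversal:</p>

```cpp
#include <vector>

// Hypothetical sketch of workaround 1: a boolean 'active' flag per object
// instead of true erasure, so pointers/indexes to other elements stay valid.
struct game_object {
    bool active;
    int value;
};

// "Erase" by deactivation - O(1), no reallocation, no invalidation:
inline void deactivate(game_object& obj) { obj.active = false; }

// Iteration must skip inactive elements - this per-element branch, and the
// variable gaps between active elements, are the performance cost described above:
inline int sum_active(const std::vector<game_object>& objects) {
    int total = 0;
    for (const game_object& obj : objects)
        if (obj.active)
            total += obj.value;
    return total;
}
```

<p>Hive internalises this pattern but uses a skipfield rather than per-element flags, allowing erased runs to be jumped over in O(1) rather than branched over element-by-element.</p>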

<p>Hive brings a more generic solution to these contexts with better
performance according to my benchmarks.</p>

<h3><a id="users"></a>Appendix E - User experience reports</h3>

<h4>Richard, Creative Assembly:</h4>

<p>"I'm the lead of the Editors team at Creative Assembly, where we make tools
for the Total War series of games. The last game we released was Three
Kingdoms, currently doing quite well on Steam. The main tool that I work on is
the map creation editor, kind of our equivalent of Unreal Editor, so it's a big
tool in terms of code size and complexity.</p>

<p>The way we are storing and rendering entities in the tool currently is very
inefficient: essentially we have a quadtree which stores pointers to the
entities, we query that quadtree to get a list of pointers to entities that are
in the frustum, then we iterate through that list calling a virtual draw()
function on each entity. Each part of that process is very cache-unfriendly:
the quadtree itself is a cache-unfriendly structure, with nodes allocated on
the heap, and the entities themselves are all over the place in memory, with a
virtual function call on top.</p>

<p>So, I have made a new container class in which to store the renderable
versions of the entities, and this class has a bunch of colonies inside, one
for each type of 'renderable'. On top of this, instead of a quadtree, I now
have a virtual quadtree. So each renderable contains the index of the quadtree
node that it lives inside. Then, instead of asking the quadtree what entities
are in the frustum, I ask the virtual quadtree for a node mask of the nodes
what are in the frustum, which is just a bit mask. So when rendering, I iterate
through all the renderables and just test the relevant bit of the node mask to
see if the renderable is in the frustum. (Or more accurately, to see if the
renderable has the potential to be in the frustum.) Nice and cache friendly.</p>

<p>When one adds an entity to the container, it returns a handle, which is just
a pointer to the object inside one of the colonies returned as a
std::uintptr_t. So I need this to remain valid until the object is removed,
which is the other reason to use a colony."</p>

<h4>Andrew Shuvalov, MongoDB:</h4>

<p>"I implemented a standalone open source project for the thread liveness
monitor: <a href="https://github.com/shuvalov-mdb/thread-liveness-monitor">https://github.com/shuvalov-mdb/thread-liveness-monitor</a>.
Also, I've made a video demo of the project: <a href="https://youtu.be/uz3uENpjRfA">https://youtu.be/uz3uENpjRfA</a></p>

<p>The benchmarks are in the doc, and as expected the plf::colony was extremely
fast. I do not think it's possible to replace it with any standard container
without significant performance loss. Hopefully, this version will be very
close to what we will put into the MongoDB codebase when this project is
scheduled."</p>

<h4>Daniel Elliot, Weta Digital:</h4>

<p>"I'm using it as backing storage for a volumetric data structure (like
openvdb). Its sparse so each tile is a 512^3 array of float voxels.</p>

<p>I thought that having colony will allow me to merge multiple grids together
more efficiently as we can just splice the tiles and not copy or reallocate
where the tiles dont overlap. Also adding and removing tiles will be fast. Its
kind of like using an arena allocator or memory pool without having to actually
write one." <br>
<i>Note: this is a private project Daniel is working on, not one for Weta
Digital.</i></p>

<h4>Ga&scaron;per A&#x17e;man, Citadel Securities:</h4>

<p>"Internally we use it as a slab allocator for objects with very different
lifetime durations where we want aggressive hot memory reuse. It lets us ensure
the algorithms are correct after the fact by being able to iterate over the
container and verify what's alive.</p>

<p>It's a great single-type memory pool, basically, and it allows iteration for
debugging purposes :)</p>

<p>Where it falls slightly short of expectation is having to
iterate/delete/insert under a lock for multithreaded operation - for those
usecases we had to do something different and lock-free, but for
single-threaded applications it's amazing."</p>

<h3><a id="container_guide"></a>Appendix F - A brief and incomplete guide for
selecting the appropriate container from inside/outside the C++ standard
library, based on performance characteristics, functionality and benchmark
results</h3>

<p>Guides and flowcharts I've seen online have either been performance-agnostic
or incorrect. This is not a perfect guide, nor is it designed to suit all
participants, but it should be largely correct in terms of its focus. Note,
this guide does not cover:</p>
<ol type="a">
	<li>All known C++ containers</li>
	<li>Multithreaded usage/access patterns in any depth</li>
	<li>All scenarios</li>
	<li>The vast variety of map variants and their use-cases</li>
	<li>Examinations of technical nuance (eg. at which sizeof threshold on a
		given processor does a type qualify as large enough to consider not using
		it in a vector if there is non-back erasure?). For this reason the 'very large' and 'large' descriptors in this guide are necessarily approximate.</li>
</ol>

<p>These are broad strokes and should be treated as such. Specific situations with
specific processors and specific access patterns may yield different results.
There may be bugs or missing information. The strong insistence on
arrays/vectors where-possible is for code simplicity, ease of debugging,
and performance via cache locality. For the sake of brevity I am purposefully avoiding any discussion
of the virtues/problems of C-style arrays vs std::array or vector here. The relevance of each assumption is subject to
architecture. The benchmarks this guide is based upon are available <a href="#benchmarks">here</a> and <a href="https://martin.ankerl.com/2019/04/01/hashmap-benchmarks-01-overview/">here</a>.
Some of the map/set data is based on <a href="https://abseil.io/docs/cpp/guides/container">google's abseil library
documentation</a>.</p>

<h4>Start!</h4>

<p>a = yes, b = no</p>

<div style="background: #ffffff; overflow:auto; width:auto; border:solid gray;border-width:.1em .1em .1em .8em;padding:.2em .6em;">
<pre style="font-size: 11pt; font-family: sans-serif;">
0. Is the number of elements you're dealing with a fixed amount?
0a. If so, is all you're doing either pointing to and/or iterating over elements?
0aa. If so, use an array (either static or dynamically-allocated).
0ab. If not, can you change your data layout or processing strategy so that pointing to and/or iterating over elements would be all you're doing?
0aba. If so, do that and goto 0aa.
0abb. If not, goto 1.
0b. If not, is all you're doing inserting-to/erasing-from the back of the container and pointing to elements and/or iterating?
0ba. If so, do you know the largest possible maximum capacity you will ever have for this container, and is the lowest possible maximum capacity not too far away from that?
0baa. If so, use vector and reserve() the highest possible maximum capacity. Or use boost::static_vector for small amounts which can be initialized on the stack.
0bab. If not, use a vector and reserve() either the lowest possible, or most common, maximum capacity. Or boost::static_vector.
0bb. If not, can you change your data layout or processing strategy so that back insertion/erasure and pointing to elements and/or iterating would be all you're doing?
0bba. If so, do that and goto 0ba.
0bbb. If not, goto 1.


1. Is the use of the container stack-like, queue-like or ring-like?
1a. If stack-like, use plf::stack, if queue-like, use plf::queue (both are faster and configurable in terms of memory block sizes). If ring-like, use <a href="https://github.com/WG21-SG14/SG14/blob/master/SG14/ring.h">ring_span</a> or <a href="https://github.com/martinmoene/ring-span-lite">ring_span lite</a>.
1b. If not, goto 2.


2. Does each element need to be accessible via an identifier ie. key? ie. is the data associative.
2a. If so, is the number of elements small and the type sizeof not large?
2aa. If so, is the value of an element also the key?
2aaa. If so, just make an array or vector of elements, and sequentially-scan to lookup elements. Benchmark vs absl:: sets below.
2aab. If not, make a vector or array of key/element structs, and sequentially-scan to lookup elements based on the key. Benchmark vs absl:: maps below.
2ab. If not, do the elements need to have an order?
2aba. If so, is the value of the element also the key?
2abaa. If so, can multiple keys have the same value?
2abaaa. If so, use absl::btree_multiset.
2abaab. If not, use absl::btree_set.
2abab. If not, can multiple keys have the same value?
2ababa. If so, use absl::btree_multimap.
2ababb. If not, use absl::btree_map.
2abb. If no order needed, is the value of the element also the key?
2abba. If so, can multiple keys have the same value?
2abbaa. If so, use std::unordered_multiset or absl::btree_multiset.
2abbab. If not, is pointer stability to elements necessary?
2abbaba. If so, use absl::node_hash_set.
2abbabb. If not, use absl::flat_hash_set.
2abbb. If not, can multiple keys have the same value?
2abbba. If so, use std::unordered_multimap or absl::btree_multimap.
2abbbb. If not, is on-the-fly insertion and erasure common in your use case, as opposed to mostly lookups?
2abbbba. If so, use <a href="https://github.com/Tessil/robin-map">robin-map</a>.
2abbbbb. If not, is pointer stability to elements necessary?
2abbbbba. If so, use absl::flat_hash_map&lt;Key, std::unique_ptr&lt;Value&gt;&gt;. Use absl::node_hash_map if pointer stability to keys is also necessary.
2abbbbbb. If not, use absl::flat_hash_map.
2b. If not, goto 3.

Note: if iteration over the associative container is frequent rather than rare, try the std:: equivalents to the absl:: containers or <a href="https://github.com/Tessil">tsl::sparse_map</a>. Also take a look at <a href="https://martin.ankerl.com/2019/04/01/hashmap-benchmarks-05-conclusion/">this page of benchmark conclusions</a> for more definitive comparisons across more use-cases and hash map implementations.


3. Are stable pointers/iterators/references to elements which remain valid after non-back insertion/erasure required, and/or is there a need to sort non-movable/copyable elements?
3a. If so, is the order of elements important and/or is there a need to sort non-movable/copyable elements?
3aa. If so, will this container often be accessed and modified by multiple threads simultaneously?
3aaa. If so, use forward_list (for its lowered side-effects when erasing and inserting).
3aab. If not, do you require range-based splicing between two or more containers (as opposed to splicing of entire containers, or splicing elements to different locations within the same container)?
3aaba. If so, use std::list.
3aabb. If not, use plf::list.
3ab. If not, use hive.
3b. If not, goto 4.


4. Is the order of elements important?
4a. If so, are you almost entirely inserting/erasing to/from the back of the container?
4aa. If so, use vector, with reserve() if the maximum capacity is known in advance.
4ab. If not, are you mostly inserting/erasing to/from the front of the container?
4aba. If so, use deque.
4abb. If not, is insertion/erasure to/from the middle of the container frequent when compared to iteration or back erasure/insertion?
4abba. If so, is it mostly erasures rather than insertions, and can the processing of multiple erasures be delayed until a later point in processing, eg. the end of a frame in a video game?
4abbaa. If so, try the vector erase_if pairing approach listed at the bottom of this guide, and benchmark against plf::list to see which one performs best. Use deque with the erase_if pairing if the number of elements is very large.
4abbab. If not, goto 3aa.
4abbb. If not, are elements large or is there a very large number of elements?
4abbba. If so, benchmark vector against plf::list, or if there is a very large number of elements benchmark deque against plf::list.
4abbbb. If not, do you often need to insert/erase to/from the front of the container?
4abbbba. If so, use deque.
4abbbbb. If not, use vector.
4b. If not, goto 5.


5. Is non-back erasure frequent compared to iteration?
5a. If so, is the non-back erasure always at the front of the container?
5aa. If so, use deque.
5ab. If not, is the type large, non-trivially copyable/movable or non-copyable/movable?
5aba. If so, use hive.
5abb. If not, is the number of elements very large?
5abba. If so, use a deque with a swap-and-pop approach (to save memory vs vector - assumes standard deque implementation of fixed block sizes) ie. when erasing, swap the element you wish to erase with the back element, then pop_back(). Benchmark vs hive.
5abbb. If not, use a vector with a swap-and-pop approach and benchmark vs hive.
5b. If not, goto 6.


6. Can non-back erasures be delayed until a later point in processing eg. the end of a video game frame?
6a. If so, is the type large or is the number of elements large?
6aa. If so, use hive.
6ab. If not, is consistent latency more important than lower average latency?
6aba. If so, use hive.
6abb. If not, try the erase_if pairing approach listed below with vector, or with deque if the number of elements is large. Benchmark this approach against hive to see which performs best.
6b. If not, use hive.


<i>Vector erase_if pairing approach:</i>
Try pairing the type with a boolean in a vector, marking the boolean for erasure during processing, then using erase_if on the boolean to remove multiple elements at once at the designated later point in processing. Alternatively, if there is a condition in the element itself which identifies it as needing to be erased, use that directly with erase_if and skip the boolean pairing. If the maximum number of elements is known in advance, use vector with reserve().
</pre>
</div>

<h3><a id="constraints_summary"></a>Appendix G - Hive constraints summary</h3>

<p>This is a summary of information already contained within P0447.</p>

<h4>Constraints forced by the C++ Standard:</h4>
<ol type="1">
	<li>Iterator operations must be O(1) amortized
		(iterator.requirements.general), forces:
		<ul>
			<li>Non-boolean skipping mechanism (unlike real-world implementations of
				bucket-lists etc) eg. jump-counting skipfield (only known possibility
				at present point in time) or better. For a boolean skipfield there is
				an undefined number of branching statements (ie. erased elements to
				check for and skip over) for each ++/-- operation, worst case linear in
				capacity. See Design Decisions, 'A method of skipping multiple erased
				elements in O(1) time during iteration'.
				<ul>
					<li>this, in turn, enables faster, non-branching iteration with more
						consistent latency, and faster range-erasure/range-insertion.</li>
				</ul>
			</li>
			<li>Removal of element blocks from iterative sequence once they become
				empty of erased elements, otherwise skipping over an undefined number
				of empty blocks may occur during ++/-- operations, making those
				operations (in worst case) linear in the total number of blocks in
				terms of time complexity.
				<ul>
					<li>this, in turn, forces storage of block metadata detailing either
						(a) number of non-erased elements in block or (b) number of erased
						elements in block, to be matched with block capacity to determine
						block emptiness. Knowledge of each block's capacity is also
						necessary and may need to be recorded for each block.</li>
				</ul>
			</li>
		</ul>
 </li>
	<li>No exceptions allowed upon erase (container.rev.reqmts), forces:
		<ul>
			<li>Free-list of erased elements or free-list of skipblocks
				as the erased-element-location-recording mechanism, as opposed to a
				stack/vector/etc of pointers to erased element locations or similar, as
				the latter creates occasional allocations upon erase.</li>
		</ul>
 </li>
</ol>
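<p>The free-list constraint above can be sketched as follows - a minimal, hypothetical fixed-capacity block where the erased slots themselves store the free list links, so erase() performs no allocation and cannot throw (the names, element type and layout here are illustrative only, not the reference implementation):</p>

```cpp
#include <array>
#include <cstdint>

struct Block
{
	static constexpr std::uint16_t no_slot = UINT16_MAX;

	// Erased slots reuse the element's own memory to hold the free list link:
	union Slot
	{
		double element;          // element type assumed for illustration
		std::uint16_t next_free; // index of the next erased slot, if erased
	};

	std::array<Slot, 256> slots{};
	std::uint16_t free_head = no_slot;

	void erase(std::uint16_t index) noexcept // no allocation, cannot throw
	{
		slots[index].next_free = free_head; // write link into erased memory
		free_head = index;
	}

	// Pop an erased location for reuse by insert, or return 'fallback'
	// (eg. the block's end index) if there are no erasures to reuse.
	std::uint16_t reuse_or(std::uint16_t fallback) noexcept
	{
		if (free_head == no_slot) return fallback;
		const std::uint16_t index = free_head;
		free_head = slots[index].next_free;
		return index;
	}
};
```

<p>A stack/vector of pointers to erased locations would, by contrast, occasionally need to grow (allocate) during erase(), which is what the standard's exception guarantees forbid.</p>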

<h4>Constraints forced by container type (most real-world implementations of
bucket-lists etc):</h4>
<ol type="1">
	<li>Stable element memory locations (the reasons for which are described in the
		Introduction and Motivation sections of this paper), forces:
		<ul>
			<li>no reallocation upon insert or erase, and for most other
			operations.</li>
			<li>only linked-list/tree of individually-allocated elements, linked list
				of element blocks + skipping structure, or container of pointers to
				element blocks + skipping structure (only known possibilities) meet
				this requirement without strong memory use - see Design Decisions
				section '1. Collection of element blocks + metadata', and the <a
				href="#sg14gameengine">game engine requirements appendix</a> for more
				info.</li>
		</ul>
 </li>
	<li>High-speed/predictable-latency iteration, forces:
		<ul>
			<li>element blocks + skipping structure as opposed to linked-list/tree of
				individually-allocated elements, due to better memory locality.</li>
			<li>removal of active blocks from iterative sequence once they become empty of erased
				elements, otherwise there would be skips over potentially many empty blocks during a
				single ++ operation.
			</li>
		</ul>
 </li>
	<li>High-speed/predictable-latency insert, forces:
		<ul>
			<li>element blocks + skipping structure as opposed to linked-list/tree of
				elements due to lower number of allocations.</li>
			<li>per-element-block skipping structures rather than global ones. A
				global one will have vector characteristics and force O(n) reallocation
				when insert triggers the creation of a new element block.</li>
		</ul>
 </li>
	<li>High-speed/predictable-latency erase, forces:
		<ul>
			<li>element blocks + skipfields as opposed to linked-list/tree of
				elements due to lower number of deallocations.</li>
			<li>per-element-block skipping structures rather than a global one. A
				global one will have vector characteristics and force O(n) reallocation
				when removing an element block from the iterative sequence, when that
				element block becomes empty of non-erased elements.</li>
			<li>per-element-block free lists of erased elements as opposed to a
				global free list. A global one would mean that, upon removal of a block
				from the iterative sequence (when it becomes empty of non-erased
				elements), hive would have to traverse the entirety of the global free
				list (O(n) in the number of erased elements within the hive) in order
				to remove all the free list entries from that block.</li>
		</ul>
 </li>
</ol>

<h4>Constraints forced by colony/hive (compared to some bucket array-type
implementations):</h4>
<ol type="1">
	<li>Higher-speed iteration/insert and less memory usage, forces:
		<ul>
			<li>Reuse of erased element locations rather than simply erasing till an
				element block becomes empty, then removing the block (not re-using
				locations lowers element locality, increases number of block
				deallocations/allocations and memory usage). Some bucket-array-like
				structures also do this. See Design Decisions, 'Erased-element location
				recording mechanism'.
				<ul>
					<li>this, in turn, forces random positions for insertions.
						Bucket-array structures without re-use can guarantee insertion at
						back of container, at the cost of iteration speed and memory.</li>
				</ul>
			</li>
			<li>Growth factor for element block capacities (up to limit) unless user
				specifies min/max block capacity limits as equal. This reduces number
				of allocations and increases element locality compared to static block
				capacities, when user does not know their max amount of elements in
				advance of construction.
				<ul>
					<li>this, in turn, forces storage of metadata on capacity for each
						element block.</li>
					<li>It also provides the facility for a user to specify their own
						block limits with minimal change to an implementation.</li>
				</ul>
			</li>
		</ul>
 </li>
	<li>&gt;, &lt;, &gt;=, &lt;= and &lt;=&gt; iterator operators, primarily to
		allow for ease-of-use with loops with greater-than-1 iterator
		increments/decrements, forces:
		<ul>
			<li>Depending on implementation, storage of element block number (the ordering of the active block in the iterative
				sequence) metadata may be necessary. In the reference implementation this number only needs to
				be updated once in every std::numeric_limits&lt;size_type&gt;::max()
				element block allocations, and occasionally during splice. Other styles of implementation might not need this.
			</li>
		</ul>
 </li>
</ol>

<p>So in order to serve the requirements of high performance, stable memory
locations, and the C++ standard, a standard library implementation of this type
of container is quite constrained as to how it can be specified. Ways of meeting
those constraints which deviate from reference implementation are detailed in the <a href="#non_reference_implementations_info">alt
implementations appendix</a>.</p>

<h3><a id="external_prior_art"></a>Appendix H - Prior art links</h3>

<p>In addition to the links below, I have written a supporting paper which attempts
to assess the prevalence of this type of container within the programming industry - around
61% of respondents reported using something like it (see <a href="https://isocpp.org/files/papers/P3011R0.pdf">P3011</a>).</p>

<p>Sean Middleditch talks about 'arrays with holes' on his old blog, which is
similar but using id lookup instead of pointers. There is some code: <a href="http://bitsquid.blogspot.com/2011/09/managing-decoupling-part-4-id-lookup.html">link</a></p>

<p>Jonathan Blow talks about bucket arrays here (note, he says "I call it a bucket
array" which might connote that it's his concept depending on your
interpretation, but it's just an indication that there are many names
for this sort of thing): <a href="https://www.youtube.com/watch?v=COQKyOCAxOQ&amp;t=596s">link</a></p>

<p>A github example (no iteration, lookup-only based on entity id): <a href="https://github.com/zmeadows/ark/blob/master/include/ark/storage/bucket_array.hpp">link</a></p>

<p>This guy describes free-listing and holes in 'deletion strategies': <a href="https://www.gamedeveloper.com/programming/data-structures-part-1-bulk-data">link</a></p>

<p>Similar concept, static array with 64-bit integer as bit-field for skipping:
<a href="https://github.com/lluchs/sparsearray">link</a></p>

<p>Going over old colony emails I found someone whose company had
implemented something like the above but with atomic 64-bit integers for
boolean (bitset) skipfields and multi-blocks for multithreaded use.</p>

<p>While developing this container I initially thought it was a newish concept, but,
particularly after the CppCon talk, more people
kept coming forward saying: yes, we do this, but with <i>X</i> specific difference.</p>

<p>Pool allocators etc are often constructed similarly to hives, at least in
terms of using free lists and multiple memory blocks. However they are not useful
if you have large numbers of elements which require bulk
processing over repeated timeframes, because an allocator doesn't provide
iteration, and manually iterating via, say, a container of pointers to objects
in a pool has the same performance and memory-use issues as linked lists.</p>



<h3><a id="non_reference_implementations_info"></a>Appendix I - Information on non-reference-implementation hive designs</h3>

<h4>Core aspect design alternatives</h4>
<p>See the <a href="#design">Design Decisions</a> section for a revision on these and how the
reference implementation applies them, and below for the alternatives.</p>


<h5>1. Collection of element blocks + metadata</h5>

<p>It is possible to implement a hive via a vector of pointers to
blocks+metadata. Some of the metadata could be stored with the pointers in the
vector. More analysis of this approach is described in the section below this
one, <a href="#vector_implementations_info">A full implementation guide using
the vector-of-pointers-to-blocks approach</a>.</p>

<h5>2. A method of skipping erased elements in O(1) time during iteration</h5>

<p>The low-complexity jump-counting pattern used in the reference
implementation has a lot of similarities to the <a href="https://archive.org/details/matt_bentley_-_the_high_complexity_jump-counting_pattern">high
complexity jump-counting pattern</a>, which was a pattern previously used by
the <a href="https://github.com/mattreecebentley/plf_colony">original reference implementation</a>. Using the high-complexity pattern is an
alternative, though the skipfield update time complexity guarantees for insertion/erasure with that
pattern are at-worst O(n) in the number of erased elements in the block. In practice
the majority of those updates constitute a single memcpy operation which may
resolve to a much smaller number of operations at the hardware level. But it
is still slightly slower than the low-complexity jump-counting pattern (around
1-2% in my benchmarking).</p>

<p>A pure boolean skipfield is not usable because it makes iteration time
complexity at-worst O(n - 2) in the summed capacity of two blocks (ie. a non-erased element at the beginning of one block, with no subsequent non-erased elements till the end of the next block). This can result in thousands of branching statements &amp; skipfield reads for a single ++
operation in the case of many consecutive erased elements. In the
high-performance fields for which this container was initially designed, this
brings with it unacceptably unpredictable latency.</p>

<p>However another strategy combining a low-complexity jump-counting pattern
<i>and</i> a boolean skipfield, which saves memory at the expense of
computational efficiency, is possible while preserving O(1) iterator operations. There is a simpler version of this, and a more complex version - both of which have some advantages.</p>

<h6><a id="bit_plus_jc"></a>Bitset + jump-counting pattern - simple variant:</h6>

<ul>
	<li>Instead of storing the data for the jump-counting pattern in its own
		skipfield, have a boolean bitset indicating which elements are erased - 0 for non-erased, 1 for erased.
		Store the jump-counting data in the erased element's memory space
		(possibly alongside free list data).</li>
	<li>When iterating, check whether the element is erased or not using the
		bitset; if it is not erased, do nothing. If it is erased, read the jump
		value from the erased element's memory space and skip forward or backward
		(depending on the direction of iterator) the appropriate number of nodes,
		both in the element block and the bitset.</li>
</ul>

<p>This approach does reduce the memory overhead of the skipfield to 1 bit per element, but introduces 3 additional sets of operations per iteration: (1) branching operations when checking the bitset, (2) bitmasking + bitshifts to read the bits and (3) additional reads (from the erased element memory space). The operation can be made branch-free (aside from the end-of-block check) by multiplying the bitset entry by the jump value and adding the result to the iterator's current element pointer and bitset index. ++ iteration then becomes:</p>

<ol>
<li>Add 1 to the current element pointer/index and the current bitset index.</li>
<li>Check to see if we are now beyond the end of the element block. If so, go to step 6. If not, go to step 3.</li>
<li>Read the part of the element's memory space which would contain the jump value, if the element were erased. Multiply this value by the current bitset value (0 for non-erased, 1 for erased).</li>
<li>Add this value to the current element pointer/index and the current bitset index.</li>
<li>Check if iterator is at end of block. If so, proceed to step 6, if not ++ operation is finished.</li>
<li>Change iterator to be at start of next group in the hive, and repeat steps 3 and 4 but no others.</li>
</ol>
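<p>The steps above might be sketched as follows for iteration within a single block (a simulation only - for clarity the bitset is stored here as one byte per element and the jump-counting values in a separate array, rather than packed bits and erased-element memory respectively):</p>

```cpp
#include <cstddef>
#include <vector>

// 'erased' holds 1 for erased elements, 0 otherwise; 'jump' stands in for the
// jump-counting value which would be stored in the erased element's memory
// (only meaningful at skipblock start nodes). The skip itself is branch-free:
// the jump value is multiplied by the bit value, adding 0 for non-erased
// elements. Returns the new index (== erased.size() at end-of-block).
std::size_t advance(const std::vector<unsigned char> &erased,
                    const std::vector<std::size_t> &jump,
                    std::size_t index)
{
	++index;                                  // step 1
	if (index >= erased.size()) return index; // step 2: end-of-block check
	index += jump[index] * erased[index];     // steps 3-4: adds 0 if not erased
	return index;                             // (step 5 re-check elided here -
	                                          // block transition not simulated)
}
```

<p>With a skipblock of three erased elements at indexes 1-3 (jump value 3 stored at index 1), advancing from index 0 lands directly on index 4 without any per-erased-element branching.</p>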

<p>Unfortunately this approach also means the type has to be wide enough to fit both the free list indexes and the jump-counting data - which means that, assuming a doubly-linked free list, the type must be at minimum 32 bits wide (assuming max block capacities of 255 elements, meaning the free list indexes can be 8-bit each); otherwise its storage will need to be artificially widened to 32 bits. There is a way around this though:</p>


<h6><a id="bit_plus_jc_complex"></a>Bitset + jump-counting pattern - complex variant:</h6>

<ul>
<li>As with the simple variant, we have a bitset indicating which elements are erased.</li>
<li>If a skipblock contains only one element, there is no need to store jump-counting data - we store only free list data in the erased element memory space of the (singular) erased element.</li>
<li>If a skipblock is &gt; 1 element in size, store the jump-counting data in the memory space of the 2nd erased element + the last erased element in the skipblock. The latter is used for -- iteration; in the case of a 2-element skipblock it is the same as the 2nd erased element.</li>
</ul>

<p>++ iteration becomes:</p>
<ol>
<li>Increment both the current element pointer and bitset index.</li>
<li>Check to see if we are now beyond the end of the element block. If so, go to step 3. If not, go to step 4.</li>
<li>Change element pointer to the start of the next element block, bitset pointer to the next block's bitset, and set bitset index to 0.</li>
<li>Read bitset[index]. If it is 0, ++ is finished.</li>
<li>If it is 1, read bitset[index + 1] - if this is 0 (or beyond the end of the bitset), this indicates a 1-element skipblock, so increment both the element pointer and bitset index, and go to step 7.</li>
<li>If bitset[index + 1] is 1, dereference (element pointer + 1) to find the jump-counting value, and add this value to the element pointer and the bitset index.</li>
<li>If we have already gone forward one block in step 3, ++ is finished. Otherwise check to see if we are now beyond the end of the element block. If so go to step 3. Otherwise ++ operation is finished.</li>
</ol>

<p>-- iteration becomes:</p>
<ol>
<li>Decrement both the current element pointer and bitset index.</li>
<li>Check to see if we are now beyond the beginning of the element block. If so, go to step 3. If no, go to step 4.</li>
<li>Change element pointer to the back element of the previous element block, bitset pointer to the previous block's bitset, and set bitset index to the capacity of that element block - 1.</li>
<li>Read bitset[index]. If it is 0, -- is finished.</li>
<li>If it is 1, read bitset[index - 1] - if this is 0 (or beyond the beginning of the bitset), decrement both the element pointer and bitset index and go to step 7.</li>
<li>If bitset[index - 1] is 1, dereference the element pointer to find the jump-counting value, and subtract this value from the element pointer and the bitset index.</li>
<li>If we have already gone back one block in step 3, -- is finished. Otherwise check to see if we are now beyond the beginning of the element block. If so go to step 3. If not -- operation is finished.</li>
</ol>

<p>This approach involves a greater number of operations than the simple variant, but means that for a 16-bit element type and a hive with max block capacities of 255 elements (meaning the free list indexes can be 8-bit), the type's storage will not need to be artificially widened in order to store both the doubly-linked free list and jump-counting data.</p>


<h5><a id="alt_erased_recording"></a>3. Erased-element location recording mechanism</h5>
There are two known valid approaches here; both involve per-element-block <a href="https://en.wikipedia.org/wiki/Free_list">free
lists</a>, utilizing the memory space of erased elements to form a list of
erased locations. The first approach forms a free list of all erased elements.
The second forms a free list of the first element in each skipblock. The reference implementation currently
uses the second approach, as discussed in <a href="#design">Design Decisions</a>.

<p>One cannot use a stack of pointers (or similar) to erased elements for this
mechanism, as early versions of the reference implementation did, because this
can create allocations during erasure, which violates the exception guarantees
of erase() in the standard.</p>



<h6>Why a global free-list won't work:</h6>

<p>If a global (ie. not per-element-block) free list were used, pointers would be necessary instead of indexes, as finding the location of an erased element based on a (global) index would be O(n) in the number of active blocks (counting each block's capacity as we went). This would increase the minimum bitdepth necessary for the hive element type to sizeof(pointer) * 2. A global free list would also decrease cache locality when traversing the free
list by jumping between element blocks. Lastly, when a block was
removed from the <i>active blocks</i> upon becoming empty, it would force an
O(n) traversal of the free-list to find all erased elements (or skipblocks)
within that particular block in order to remove them from the free list.
Hence a global free list is unacceptable for both performance and latency
reasons.</p>

<h6><a id="singly_linked_free_list"></a>Why the reference implementation uses a doubly-linked free list and where
you can have a singly-linked one:</h6>

<p>Previous versions of the reference implementation used a singly-linked free
list of erased elements instead of a doubly-linked free list of skipblocks.
This is possible with the high complexity jump-counting pattern, but not using
the low complexity jump-counting pattern, as the latter cannot calculate the
location of the start node of a skipblock from the value of a non-start node, but the high complexity variant can (see both of
the jump-counting papers listed earlier for more details). But using free-lists
of skipblocks is more efficient as it requires fewer free list
nodes. In addition, re-using only the start or end nodes of a skipblock is
faster because it never splits a skipblock into two skipblocks.</p>

<p>An example of why a doubly-linked free list is necessary for the low
complexity jump-counting pattern, is erasing an element which happens to be
between two skipblocks. In this case two skipblocks must be combined into one
skipblock, and the previous secondary skipblock must be removed from that
block's free list. If the free list is singly-linked, the hive
must do a linear search through the free list, starting from the free list
head, in order to find the skipblock prior to the secondary skipblock
mentioned, to update that free list node's "next" index link. This is at worst
O(n) in the number of skipblocks within that block. However if a doubly-linked
free list is used, that previous skipblock is linked to from the entry in the
skipblock we have to remove, making the free list update constant-time.</p>

<p>Likewise when an erasure occurs just before the front of a skipblock (where
the free-list data is stored), expanding the skipblock, the same scenario
applies; for a singly-linked free list, one has to traverse the whole free list
starting from the free list head, in order to find that skipblock's 'previous'
free list node in order to update the previous node's 'next' link to point to the new start location of the changed skipblock. If the free
list is doubly-linked we don't have to.</p>

<p>If the high-complexity jump-counting pattern is used, then we can calculate
the start of a skipblock from the value of any erased skipfield node, and from the start node's value we know the length of the skipblock. This
means we can alter skipblocks using the information given to us by any node in the skipblock, not just the back/front nodes. This in turn means we can make a free list from individual erased elements rather than skipblocks, which means there is no need to combine or update previously-existing
free list entries in the examples above, and we can simply use a singly-linked free list instead of a doubly-linked one.</p>

<h6>Keeping track of which blocks have erased elements:</h6>

<p>So far I have largely been talking about how to keep track of erasures
within element blocks, not about which blocks have erasures in them. In the reference implementation
the latter is achieved by keeping an intrusive linked list of the groups whose
blocks contained erasures, as mentioned previously. This increases group
metadata memory usage by two pointers. Alternative methods include:</p>
<ol type="a">
	<li>Storing a vector of pointers to groups whose blocks contain erasures. To
		preserve erase() exception guarantees, this vector would have to be
		expanded upon insertion, not erasure, and thus its capacity should always
		be &gt;= the number of active blocks. When an active block becomes completely empty of
		elements and is removed from the iterative sequence, its pointer in the
		vector would be swapped with the back pointer and popped. This approach reduces
		the additional memory cost down to 1 pointer per active block.</li>
	<li>Storing a record of the last one-or-more groups whose blocks have had
		erasures and then re-using from those groups during insertion, until they
		are full and then (on subsequent insertions) searching consecutive groups
		until a group with erasures is found. The records would be updated every
		time an erasure is made. The assumption is made that groups close to the
		recorded group(s) with erasures are more likely to have erasures, but this
		depends on the erasure pattern. This is the approach used by <a href="https://github.com/mattreecebentley/plf_list">plf::list</a>, but it
		is efficient there because the main structure is a vector of metadata, hence
		the search phase is cache-friendly and fast even though the worst case scenario is O(n)
		in the number of active blocks. In hive it would only be cache-friendly if one took
		the <a href="#vector_implementations_info">vector-of-pointers-to-blocks approach</a> and stored the relevant metadata
		which indicates erasures, with the pointer, in the vector. The
		performance of this approach scales with the number of active blocks, so while it
		reduces the cost of keeping track of which active blocks have erasures down to 2
		pointers per hive, it may reduce performance and increase latency
		variability during insertion due to the search phase, as described.</li>
	<li>If a <a href="#vector_implementations_info">vector-of-pointers-to-blocks approach</a> is taken one could construct a
		low-complexity jump-counting skipfield which skips active blocks which <i>do
		not</i> contain erasures, and use this to find the first block with
		erasures in O(1) time. This reduces the memory cost down to 1
		<code>size_type</code> per active block.</li>
</ol>
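<p>The swap-and-pop removal described in option (a) could be sketched as follows (hypothetical names; the empty <code>Group</code> struct stands in for an element block + its metadata):</p>

```cpp
#include <cstddef>
#include <vector>

struct Group {}; // hypothetical: element block + metadata, fields elided

// Remove a group's entry from the 'blocks with erasures' tracking vector in
// O(1), when that group becomes empty of elements and leaves the iterative
// sequence. No allocation occurs here, so erase() exception guarantees are
// preserved (the vector's capacity was grown during insertion instead).
inline void remove_tracked(std::vector<Group *> &groups_with_erasures,
                           std::size_t position)
{
	groups_with_erasures[position] = groups_with_erasures.back(); // swap...
	groups_with_erasures.pop_back();                              // ...and pop
}
```

<p>Since the tracking vector is unordered, overwriting the removed entry with the back entry is sufficient; no shuffling of subsequent entries is needed.</p>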

<h4>Additional info for supporting small types</h4>

<p>In the (current at time of writing) reference implementation we do not accommodate 1-byte types without artificially widening the type's storage to sizeof(skipfield_type) * 2, ie. 2 bytes. This is in order to accommodate the doubly-linked free list of skipblocks, which is expressed as pairs of prev/next indexes
written to the erased element's memory location. Those indexes have to be able
to address the whole of the specific element block, which means they have to be
the same size as skipfield_type. If an implementation wished to create an overload for 1-byte types such that there was no artificial widening of the element storage and resultant
wasted memory, there are 7 valid approaches I can see. However, the simplest and lowest-memory-cost approach turns out to also be potentially the fastest, so I will only list that approach here. See revision 26 of this paper for the previous (inferior) methods.</p>

<p>Essentially the idea is this: utilize 255- or 256-element blocks, remove the free list from the structure (but retain the intrusive singly-linked list of blocks with erasures in them), and create a bitset for the 256 elements consisting of <code>256 / (sizeof(std::size_t) * 8)</code> std::size_t unsigned integers (ie. 4 for 64-bit platforms, 8 for 32-bit, etc). Store the jump-counting data in the erased element memory, and iterate by using the bitset to determine whether an element is erased, adding the jump-counting value to the current location if so (add 1 to the jump-counting value if using 256-element blocks so that a jump from the first element to the end of the block is possible). If using 255-element blocks we can make this branch-free via the instructions in the <a href="#bit_plus_jc">bitset + jump-counting simple variant</a> section.</p>

<p>When erasing, it is a simple matter of setting a bitset index to 1 and updating any jump-counting values for adjacent skipblocks in the erased element memory space. When inserting into a given block, one does the following: check each std::size_t to see whether it is non-zero. The first non-zero std::size_t found contains the first erasure in that element block. Use std::countr_zero (which typically compiles to a CPU intrinsic) to find the sub-index of the lowest 1 bit in that std::size_t, and combine the std::size_t's index with that sub-index to determine the index, into the element memory block, of the insertion position for the new element. Then set the 1 to 0 in the bitset and update the jump-counting value for any adjacent skipblock. Since we are finding the first 1 starting from the least-significant bit of the first std::size_t to have a 1 in it, any potential skipblock will be on the right of the erased element location, meaning we only have to add the jump-counting value from the erased element's memory space to the insertion point and subtract 1 to find the end node of that skipblock and update it.</p>

<p>Because this method only involves checking 4 (or 8) unsigned ints, and avoids allocating any additional memory to store free list node values (since the free list no longer exists in this implementation), it is simultaneously the simplest, most effective and potentially the fastest option for small types. The additional memory cost per element, ignoring any block metadata, is 1 bit. While the method could potentially be used for larger types as well, with most of those we would ideally like to be storing more than 256 elements per block for cache locality/performance reasons. And while 128 64-bit unsigned ints are enough to create a bitset for 8192 elements, and it is possible that a given system would perform adequately when scanning 128 64-bit ints that are already in the cache, at that point we are starting to get closer to O(n) territory for inserts to erased locations.</p>


<h4><a id="vector_implementations_info"></a>An implementation guide using the
vector-of-pointers-to-blocks approach</h4>

<p>I will give the summary first, then show in detail, how we get there and why
some approaches aren't possible. When I talk about vectors here I'm not really
talking about std::vector, more likely a custom implementation.</p>

<h5>Summary</h5>

<p>As in the reference implementation, there are structs (referred to herein
as 'groups') containing an element array, a skipfield array, and array
metadata such as size, capacity etc. Each group has its own erased-element
free list, just like the reference implementation.</p>

<p>The hive contains a vector of pointers to groups (referred to herein as a
'group-vector'). The group-vector contains 2 extra pointers, one at the front
of the active group pointers and one at the back of the active group pointers,
each of which has its own location in memory as its value (these are referred
to herein as the <i>front</i> and <i>back</i> pointers).</p>

<p>Each allocated group also contains a reverse-lookup pointer in its metadata
which points back to the pointer pointing at it in the group-vector. While this
is used in other operations, it is also used by iterator comparison operators to
determine whether the group that iterator1 is pointing to is later in the
iterative sequence than the group that iterator2 is pointing to.</p>

<p>An iterator maintains a pointer to the group, the current element and the
current skipfield location (or just an index into both the element and
skipfield arrays). When it needs to transition from the end or beginning of the
element array to the next or previous group, it takes the value of the
reverse-lookup pointer in the current group's metadata and increments or
decrements it respectively, then dereferences the new value to find the next/previous group's memory location.</p>

<p>If the value of the memory location pointed to is <code>nullptr</code>, it
increments/decrements again till it finds a non-<code>nullptr</code> pointer -
this is the next block. If the value of the memory location pointed to is equal
to the memory location, the iterator knows it has reached the front or back
pointer in the vector, depending on whether it was decrementing or incrementing
respectively. This is the only purpose of the front and back pointers, to inform the iterator of boundaries (see later for more details).</p>
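<p>The block-transition logic just described might be sketched as follows - a hedged illustration in which <code>group</code> and <code>next_group</code> are hypothetical names, erased entries are <code>nullptr</code>, and the back sentinel stores its own address as its value:</p>

```cpp
#include <cstddef>

struct group {}; // stand-in for the real element-block + metadata struct

// Advance to the next active entry in the group-vector, skipping
// nullptr'ed entries. Returns nullptr once the back pointer (whose
// value is its own location in memory) is reached.
inline group **next_group(group **current)
{
	do
	{
		++current;
		// The back sentinel's value equals its own address:
		if (*current == reinterpret_cast<group *>(current))
			return nullptr; // end of the iterative sequence
	} while (*current == nullptr); // skip entries for removed groups
	return current;
}
```

<p>Decrementing works symmetrically, checking against the front sentinel instead.</p>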

<p>When a group becomes empty of non-erased elements, it is either deallocated
or retained for future insertions by copying its pointer in the group-vector
to an area past the back pointer, depending on implementation. Either way its
original pointer location in the group-vector is <code>nullptr</code>'ed.</p>

<p>There is a hive member counter which counts the number of
<code>nullptr</code>'ed pointers. If it reaches an implementation-specific
threshold, an erase_if operation is called on the vector, removing all
<code>nullptr</code> pointers and consolidating the rest. Subsequently (or as
part of the erase_if operation) the groups whose pointers have been relocated
have their reverse-lookup pointers updated. The threshold prevents (a) iterator
++/-- functions straying too far from being O(1) amortized in terms of number of operations and
(b) too many erase_if operations occurring.</p>

<p>Likewise for any splice operation, when source groups become part of the
destination, the destination group-vector gets pointers to those groups added,
and the reverse-lookup pointers in those groups get updated. All reverse-lookup
pointers get updated when the vector expands and the pointers are
reallocated.</p>

<p>To keep track of groups which currently have erased-element locations ready
to be re-used by insert, we can either keep the reference implementation's
intrusive-list-of-groups-with-erasures approach, or we can remove that metadata
from the group and instead have a secondary vector of size_type with the
same capacity as the group-vector, containing a jump-counting skipfield. </p>

<p>In that skipfield we maintain a record of runs of groups which do <i>not</i>
currently have erased element locations available for reuse, so that if any
such groups are available, a single iteration into this skipfield will take
us to the index corresponding to that group in the group-vector. And if there
are no such groups available, that same iteration will take us to the end of
the skipfield. This approach replaces 2 pointers of metadata per group with one
size_type.</p>
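<p>As a hedged sketch of that lookup: with a low-complexity jump-counting skipfield, the entry at the start of a run of no-erasure groups holds the run's length, so a single read at index 0 either lands on the first group with reusable erasures or runs off the end of the skipfield. Names below are illustrative only:</p>

```cpp
#include <cstddef>
#include <vector>

// skipfield[i] at the start of a run holds the length of the run of
// groups with no reusable erasures; 0 means the group at that index
// itself has erasures. Only the first entry needs to be read.
inline std::size_t first_group_with_erasures(
	const std::vector<std::size_t> &skipfield)
{
	return skipfield.empty() ? 0 : skipfield[0];
}
```

<p>A return value equal to the skipfield's size indicates no group currently has reusable erasures.</p>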

<p>If insertion determines that there are no groups with erasures available, it can
(depending on implementation) either check a hive member counter which counts
the number of <code>nullptr</code>'ed pointers, and if it's not zero,
linear-scan the group-vector to find any <code>nullptr</code> locations and
reuse those to point to a new group - or it could just move the back pointer
forward by 1 and reuse that location to point to a new group (relying on the
occasional erase_if operations to clean up the <code>nullptr</code> pointer
locations instead, and running erase_if itself only if the vector has reached
capacity). If the implementation has retained a pointer to an empty group past
the back pointer (a group made empty by erasures) it could reuse that at this
point.</p>

<h6>Alternative strategies within the above approach:</h6>
<ol>
	<li>One can, instead of storing all block metadata with the block, store some
		of it with the pointer in the vector-of-pointers, or even in its own
		vector (following a struct-of-arrays-style formation).</li>
	<li>For example, one could store capacity and size with the pointer in the
		vector instead of in the group, and set capacity to 0 when a group is
		removed. The iterator could then use the value of capacity, instead of
		nulling the pointer, to indicate a vector entry it needs to skip when
		iterating. The hive could then use the pointers themselves to form a free
		list of empty vector positions, and re-use these when a new group needs to be created.</li>
<li>We could store jump-counting data with the pointer to skip over runs of erased groups in O(1) time, and update this on the fly.</li>
	 <li>We could store the head of each block's free list of erased elements in a
		separate vector, and linearly scan for the first free list head without a
		'no erasures' value (if copying the current implementation, this value is
		numeric_limits&lt;skipfield_type&gt;::max()). Again we run the risk of
		latency due to branching, but this consumes no additional memory in order
		to record which blocks have erasures. This would likely require adjustment of the specification's complexity requirements.</li>
</ol>

<h5>How we get there, starting from the simplest approach</h5>

<p>The simplest idea I had for the alternative (non-reference-implementation)
approach was a vector of pointers to allocated element blocks. In terms of how iteration works, the iterator holds a pointer
to the vector-of-pointers (the structure, not the vector's internal array) and
an index into the vector-of-pointers' array, as well as a pointer/index to the
current element. The hive itself would also store a pointer to the vector
structure, allocating it dynamically, which makes swapping/moving
non-problematic in terms of keeping iterators valid (if the hive stores the
vector as a member, the iterators' pointers to the vector get invalidated on
swap/move).</p>

<p>When the end of a block is reached by the iterator, if it hasn't hit end()
you add 1 to the vector-of-pointers index in the iterator and continue on to
the next block. Since the iterator uses indexes, reallocation of the vector
array upon expansion of the vector-of-pointers doesn't become a problem.
However it <i>is</i> a problem when a block prior to the current block that the
iterator is pointing at, becomes empty of elements and has to be removed from
the iterative sequence. If the pointer-to-that-block gets erased from the
vector-of-pointers, that will cause subsequent pointers to be relocated
backward by one, which in turn will make iterators pointing to elements after
that block invalid (because the relocation invalidates the stored block indexes
in the iterators).</p>

<p>Substituting a swap-and-pop between the erasable pointer and the back
pointer of the vector-of-pointers, instead of erasing/relocating, doesn't solve
this problem, as this produces unintuitive iteration results when an iterator
lies between the back block and the block being erased (suddenly there is a
bunch of elements behind it instead of in front, so forward iteration will miss
those), and it also invalidates iterators pointing to elements in the
(previously) back block.</p>

<p>So at this point we only have two valid approaches, A &amp; B.</p>

<h6>Approach A:</h6>

<p>Here we have to think in terms of what's efficient, not what necessarily
lowers time complexity requirements. Basically instead of erasing pointers to
the erased blocks from the vector, we mark them as <code>nullptr</code> and the
iterator, when it passes to the next block, skips over the <code>nullptr</code>
pointers. This is the opposite of what we try to do with the current approach
in the reference implementation (remove blocks from the iterative
linked-list-of-blocks sequence) because with the current approach, those blocks
represent a latency issue via following pointers to destinations which may not
be within the cache. With a vector approach however, it makes no difference to
latency because the next pointer in the vector chunk already exists in the
cache in, at a guess, 99% of cases. You could potentially get a bad result
when using a CPU with poor branch-prediction-recovery performance, like the Core 2
(because this approach introduces a branching loop), when you have a
close-to-50/50 random distribution of <code>nullptr</code>'s and actual
pointers to blocks. But since blocks are generally going to be many factors
fewer than elements within those blocks, this is not likely to be a major
performance hit like a boolean skipfield over elements would be, even in that
case.</p>

<p>In terms of re-using those <code>nullptr</code> pointers, we can't do a
free list of pointers because then, during iteration, we would have no idea which
pointer was a pointer to a block and which a pointer to another free-list item
- so instead we simply have a size_type counter in the hive metadata which
counts the number of erased pointers currently in the vector-of-pointers. When
we reach the capacity of existing element blocks and need to create a new block
upon insert (ie. when it would otherwise be time to create a new block pointer
at the back of the vector), we check the counter - if it's not zero, scan
manually along the vector-of-pointers until you hit a <code>nullptr</code>,
re-use that (same logic as above as to why this isn't a latency/performance
issue) and decrement the 'erased pointer' counter.</p>
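<p>In code, that reuse path might look like the following sketch (names hypothetical; allocation of the element block itself is elided):</p>

```cpp
#include <cstddef>
#include <vector>

struct group {}; // stand-in for the element block + metadata

// Place a pointer to a newly-created group into the vector-of-pointers:
// if the erased-pointer counter is non-zero, linearly scan for a nullptr
// entry and reuse it; otherwise append at the back. Returns the index used.
inline std::size_t reuse_or_append(std::vector<group *> &ptrs,
                                   std::size_t &erased_count, group *new_group)
{
	if (erased_count != 0)
	{
		for (std::size_t i = 0; i != ptrs.size(); ++i)
		{
			if (ptrs[i] == nullptr)
			{
				ptrs[i] = new_group; // reuse the erased slot
				--erased_count;
				return i;
			}
		}
	}
	ptrs.push_back(new_group); // no erased slots: grow at the back
	return ptrs.size() - 1;
}
```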

<p>Since insertion location is unspecified for hive, inserting a block randomly
into the middle of the iterative sequence causes no fundamental issues, and is
the same as re-using an erased element location during insertion.</p>

<h6>Approach B:</h6>

<p>If one is concerned about strict time complexity, and less concerned about
real world effects of that time complexity, you can basically have a
jump-counting skipfield for the vector-of-pointers (secondary vector of
size_type with a low-complexity jump-counting skipfield).</p>

<p>This means (a) iterators can skip over pointers to erased blocks in O(1)
time and (b) the memory locations of the pointers to erased blocks can be used
to form a free list of reusable pointers. So this eliminates both of the
non-O(1) aspects of Approach A, though whether or not this is faster
in-practice comes down to actual benchmarking.</p>

<h6>Metadata and probable advantages of either approach:</h6>

<p>I've left out block metadata (size, capacity, erased element free list head,
etc) in the descriptions above to simplify explanation, but for approach A we
would probably want block metadata as part of a struct which the
vector-of-pointers is pointing to (struct contains the element block too), so
that the non-O(1) linear scans over the vector-of-pointers are as fast as
possible.</p>

<p>For approach B we would probably want the vector-of-pointers to actually be
a vector-of-struct-metadata, with the pointer to the element block being one of
the pieces of metadata. We could also do a 'struct of arrays' approach instead,
depending on the performance result.</p>

<p>Both approaches eliminate the need for the 'block number' piece of metadata
since we get that from the vector-of-pointers for free. They also eliminate the
need for prev-block/next-block pointers, though this is offset by the need for
the vector of pointers in approach A and for approach B the secondary skipfield
- but still a reduction of 2 pointers to 1 pointer/size_type.</p>

<p>The intrusive-list-of-blocks-with-element-erasures from the reference
implementation could in this approach be replaced with a <a href="https://archive.org/details/matt_bentley_-_the_low_complexity_jump-counting_pattern">low-complexity jump-counting skipfield</a> (an additional vector of <code>size_type</code>) which,
instead of denoting runs of erased blocks, denotes runs of blocks with no
prior element erasures (this would include erased blocks, since these are not usable for insert).
If there were no prior erasures, iterating over this skipfield would jump directly to the end of the skipfield
on the first iteration. This further reduces the memory use per-block of
recording down from 2 pointers in the reference implementation, to 1
size_type.</p>

<p>Alternatively, if we go the vector-of-metadata-structs route and don't mind
doing a non-O(1) linear scan upon insert, we can linearly scan the
erased-element-free-list-head metadata of each block to find blocks with
erasures, eliminating additional memory use for recording
blocks-with-erasures entirely. This approach would benefit from splitting
the vector-of-metadata-structs into a struct-of-vectors for each metadata
item.</p>
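<p>A sketch of that scan, assuming the free-list heads are split into their own vector (struct-of-arrays style) and using the reference implementation's 'no erasures' sentinel value:</p>

```cpp
#include <cstddef>
#include <limits>
#include <vector>

using skipfield_type = unsigned short; // as in the reference implementation
constexpr skipfield_type no_erasures_v =
	std::numeric_limits<skipfield_type>::max();

// Linearly scan the per-block free-list heads; any head not equal to the
// sentinel identifies a block with reusable erased-element locations.
// Returns heads.size() if no block currently has erasures.
inline std::size_t first_block_with_erasures(
	const std::vector<skipfield_type> &heads)
{
	for (std::size_t i = 0; i != heads.size(); ++i)
		if (heads[i] != no_erasures_v)
			return i;
	return heads.size();
}
```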

<h6>But <code>splice()</code> changes our requirements:</h6>

<p>Splice requires that iterators to the source and
destination hive's elements not be invalidated when the source hive's elements
are transferred to the destination hive.</p>

<p>If we take a vector-of-pointers/vector-of-metadata approach, and our
iterators use indexes into that vector, those indexes will be invalidated
during splice, as the source vector's contents must be transferred into the
destination vector. Further, the pointer-to-vector which the iterator must hold
in order to transition between blocks, would also be invalidated during splice
for iterators pointing to source elements - which means that	swapping from
vectors to deques and using pointers instead of indexes within the iterators,
would not help.</p>

<p>The solution is unintuitive but functional: the iterator becomes much the same as in
the reference implementation: either 3 pointers (one to the block+metadata
struct, one to the element and one to the skipfield), or 1 pointer (to the
struct) and one index (into the element and skipfield blocks respectively). We
add a "reverse-lookup" pointer to the element block's metadata which points back at
the pointer pointing to it in the vector (we could have a pointer to the vector block + an index, but this consumes more memory, so we'll just say a pointer to the vector position for the remainder of this text). When the iterator needs to transition
blocks, it follows the pointer out to the vector and increments or decrements as
necessary. When splice() is run, it alters the reverse-lookup pointers in each
of the source's blocks such that they point to the correct pointers in the
destination's vector. If the vector has to expand during splice to accommodate the source blocks, the reverse-lookup pointers will need to be updated for all blocks.</p>
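<p>The splice bookkeeping might be sketched like so (hypothetical names; element-block transfer details omitted). Since the append may reallocate the destination vector, this sketch simply repairs every entry's reverse-lookup pointer:</p>

```cpp
#include <vector>

struct group
{
	group **reverse_lookup = nullptr; // points at our entry in the owning group-vector
};

// Transfer the source hive's group pointers to the destination's
// group-vector and repoint each group's reverse-lookup at its new entry.
inline void splice_group_vectors(std::vector<group *> &destination,
                                 std::vector<group *> &source)
{
	destination.insert(destination.end(), source.begin(), source.end());
	source.clear();
	for (group *&entry : destination)
		if (entry != nullptr) // skip erased (nullptr'ed) entries
			entry->reverse_lookup = &entry;
}
```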

<p>Neither move nor swap are required to update these pointers, since those
operations will simply swap the members of the vector including the pointer to
the dynamically-allocated internal array, and neither the block metadata nor
the iterators contain pointers to the vector itself. As such we no longer need
to dynamically-allocate the vector-of-pointers in the hive and can just have it
as a member.</p>

<p>The solution does not entirely rule out Approach B (vector of metadata
structs) in the above sections, but simply means that the reverse-lookup
pointer <i>must</i> be stored with the element block, while other metadata may
be stored either with the element block, or in the vector, or in separate
vectors (ie. in a struct-of-arrays style), as desired or is determined to
perform well.</p>

<p>Also: this solution allows us to fully erase entries from the vector and relocate
subsequent entries, since we're no longer relying on indexes within the iterators themselves
to keep track of the block pointer location in the vector. An implementation
can choose whether or not to consolidate the vector after a block
erasure, and might want to have a maximum threshold for the number of erased
entries in the vector before doing so (or possibly the number of <i>consecutive</i>
erased entries). This prevents the number of operations per iterator ++/--
operation from becoming too high in terms of skipping erased entries and causing latency. But more
importantly, it keeps iterator ++/-- operations at O(1) amortized (and removes any
performance problems relating to poor branch-prediction-recovery as described
earlier).</p>

<p>The vector erase operation (erase_if, if we're following a threshold approach
and consolidating multiple erased entries at the same time) would process
block metadata and adjust the reverse-lookup pointers for each block to the new
values. Likewise, when insertion triggers an expansion of the vector,
the reverse-lookup pointers get updated (if a deque is used instead of a vector, the latter is unnecessary as no reallocation takes place upon insert).</p>

<p>Lastly, we need a way for the iterator to figure out whether it's at the
beginning or end of the vector-of-pointers. While we could store a
pointer to the vector itself in the block metadata as well, this is
wasteful memory-wise. A less wasteful solution is to have one pointer with a
unique value at each of the beginning and end of the vector-of-pointers, ie.
two pointers per hive instead of one pointer per block. The unique value (since
<code>nullptr</code>'ing the element block pointer is already
taken) could be the address of the pointer location itself, since this is a
unique address. If we take the alternative approach described in the summary of using the capacity metadata to indicate erased blocks instead of the pointer, we can instead use nullptr for the front and back pointers.</p>

<h6>iterator greater/lesser/spaceship operators (&gt;/&lt;/&gt;=/etcetera):</h6>

<p>Now we lack a group_number metadata entry but we also lack a way to obtain
the group number from the vector-of-pointers, since neither the block metadata
nor the iterator currently store a pointer to the vector (and the iterator
can't, since the pointer might get invalidated and the iterator can't get
automatically updated by the container).</p>

<p>Luckily however, we don't need to know the group number for these operations to
work; we only need to know if one group is later in the sequence than the
other, and since we're storing a reverse-lookup pointer to the pointer in the
vector-of-pointers, when comparing to see if <code>it1 &lt; it2</code> we only
need to check whether <code>it1-&gt;block_metadata-&gt;reverse-lookup &lt;
it2-&gt;block_metadata-&gt;reverse-lookup</code>.</p>
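<p>As a sketch (with hypothetical names), the comparison reduces to comparing the reverse-lookup pointers - both point into the same contiguous group-vector buffer - falling back to element indexes when the groups are the same:</p>

```cpp
#include <cstddef>
#include <functional>

struct group
{
	group **reverse_lookup = nullptr; // our entry in the group-vector
};

struct iterator_sketch
{
	group *current_group = nullptr;
	std::size_t element_index = 0; // index into element/skipfield arrays
};

// True if it1 precedes it2 in the iterative sequence. std::less
// provides a total order over the group-vector entry addresses.
inline bool precedes(const iterator_sketch &it1, const iterator_sketch &it2)
{
	if (it1.current_group != it2.current_group)
		return std::less<group **>()(it1.current_group->reverse_lookup,
		                             it2.current_group->reverse_lookup);
	return it1.element_index < it2.element_index;
}
```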

<p>If we use a deque-of-pointers-to-blocks instead of a vector we can't do the above, as the pointers
are not guaranteed to point to locations within the same block. So for a deque
we need to store pointers to the deque in each block in order to get
block numbers, which as mentioned is wasteful of memory. We could however remove the back and front pointers in the
deque-of-pointers at that point, as the iterator could use the pointer to the
deque to find the front and back of the deque, instead.</p>

<p>This is probably not the only approach possible when not using
the reference implementation, but it certainly will work.</p>
</body>
</html>
