<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Issue 1188: Unordered containers should have a minimum load factor as well as a maximum</title>
<meta property="og:title" content="Issue 1188: Unordered containers should have a minimum load factor as well as a maximum">
<meta property="og:description" content="C++ library issue. Status: NAD">
<meta property="og:url" content="https://cplusplus.github.io/LWG/issue1188.html">
<meta property="og:type" content="website">
<meta property="og:image" content="http://cplusplus.github.io/LWG/images/cpp_logo.png">
<meta property="og:image:alt" content="C++ logo">
<style>
  p {text-align:justify}
  li {text-align:justify}
  pre code.backtick::before { content: "`" }
  pre code.backtick::after { content: "`" }
  blockquote.note
  {
    background-color:#E0E0E0;
    padding-left: 15px;
    padding-right: 15px;
    padding-top: 1px;
    padding-bottom: 1px;
  }
  ins {background-color:#A0FFA0}
  del {background-color:#FFA0A0}
  table.issues-index { border: 1px solid; border-collapse: collapse; }
  table.issues-index th { text-align: center; padding: 4px; border: 1px solid; }
  table.issues-index td { padding: 4px; border: 1px solid; }
  table.issues-index td:nth-child(1) { text-align: right; }
  table.issues-index td:nth-child(2) { text-align: left; }
  table.issues-index td:nth-child(3) { text-align: left; }
  table.issues-index td:nth-child(4) { text-align: left; }
  table.issues-index td:nth-child(5) { text-align: center; }
  table.issues-index td:nth-child(6) { text-align: center; }
  table.issues-index td:nth-child(7) { text-align: left; }
  table.issues-index td:nth-child(5) span.no-pr { color: red; }
  @media (prefers-color-scheme: dark) {
     html {
        color: #ddd;
        background-color: black;
     }
     ins {
        background-color: #225522
     }
     del {
        background-color: #662222
     }
     a {
        color: #6af
     }
     a:visited {
        color: #6af
     }
     blockquote.note
     {
        background-color: rgba(255, 255, 255, .10)
     }
  }
</style>
</head>
<body>
<hr>
<p><em>This page is a snapshot from the LWG issues list, see the <a href="lwg-active.html">Library Active Issues List</a> for more information and the meaning of <a href="lwg-active.html#NAD">NAD</a> status.</em></p>
<h3 id="1188"><a href="lwg-closed.html#1188">1188</a>. Unordered containers should have a minimum load factor as well as a maximum</h3>
<p><b>Section:</b> 23.2.8 <a href="https://wg21.link/unord.req">[unord.req]</a>, 23.5 <a href="https://wg21.link/unord">[unord]</a> <b>Status:</b> <a href="lwg-active.html#NAD">NAD</a>
 <b>Submitter:</b> Matt Austern <b>Opened:</b> 2009-08-10 <b>Last modified:</b> 2019-02-26</p>
<p><b>Priority: </b>Not Prioritized
</p>
<p><b>View other</b> <a href="lwg-index-open.html#unord.req">active issues</a> in [unord.req].</p>
<p><b>View all other</b> <a href="lwg-index.html#unord.req">issues</a> in [unord.req].</p>
<p><b>View all issues with</b> <a href="lwg-status.html#NAD">NAD</a> status.</p>
<p><b>Discussion:</b></p>
<p>
Unordered associative containers have a notion of a maximum load factor:
when the number of elements grows large enough, the containers
automatically perform a rehash so that the number of elements per bucket
stays below a user-specified bound. This ensures that the hash table's
performance characteristics don't change dramatically as the size
increases.
</p>
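<p>
The existing maximum load factor behavior can be observed with a small check.
This is only an illustrative sketch; the function name
<code>load_factor_stays_bounded</code> is made up for this example and appears
nowhere in the standard or in this issue.
</p>

<pre><code>#include &lt;cstddef&gt;
#include &lt;unordered_set&gt;

// Inserts 0..n-1 and reports whether the load factor stayed at or below
// the container's maximum load factor after every insertion.
bool load_factor_stays_bounded(std::size_t n, float max_lf) {
    std::unordered_set&lt;std::size_t&gt; s;
    s.max_load_factor(max_lf);  // a hint; the container may adjust it
    for (std::size_t i = 0; i &lt; n; ++i) {
        s.insert(i);
        if (s.load_factor() &gt; s.max_load_factor()) return false;
    }
    return true;
}
</code></pre>

<p>
The comparison is made against <code>s.max_load_factor()</code> rather than
<code>max_lf</code>, because the setter only treats its argument as a hint.
</p>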

<p>
For similar reasons, Google has found it useful to specify a minimum
load factor: when the number of elements shrinks by a large enough amount, the
containers automatically perform a rehash so that the number of elements
per bucket stays above a user-specified bound. This is useful for two
reasons. First, it prevents wasting a lot of memory when an unordered
associative container grows temporarily. Second, it prevents amortized
iteration time from being arbitrarily large; consider the case of a hash
table with a billion buckets and only one element. (This was discussed
even before TR1 was published; it was TR issue 6.13, which the LWG
closed as NAD on the grounds that it was a known design feature.
However, the LWG did not consider the approach of a minimum load
factor.)
</p>
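<p>
The sparse-table situation described above can be reproduced with the current
containers. The following is a sketch under the assumption that the
implementation does not change the bucket count on <code>erase</code>, which
is what the iterator validity rules for <code>erase</code> imply; the helper
name <code>buckets_before_and_after_erase</code> is hypothetical.
</p>

<pre><code>#include &lt;cstddef&gt;
#include &lt;unordered_set&gt;
#include &lt;utility&gt;

// Grows a set to n elements, erases all but one of them, and returns the
// bucket count observed before and after the erasures.
std::pair&lt;std::size_t, std::size_t&gt; buckets_before_and_after_erase(std::size_t n) {
    std::unordered_set&lt;std::size_t&gt; s;
    for (std::size_t i = 0; i &lt; n; ++i) s.insert(i);
    const std::size_t before = s.bucket_count();
    for (std::size_t i = 1; i &lt; n; ++i) s.erase(i);  // keep only element 0
    return {before, s.bucket_count()};
}
</code></pre>

<p>
If both values are equal, the container is left with a single element spread
over a bucket array sized for <code>n</code> elements, so memory use and
iteration cost no longer track <code>size()</code>.
</p>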

<p>
The only interesting question is when shrinking is allowed. In principle
the cleanest solution would be shrinking on erase, just as we grow on
insert. However, that would be a usability problem; it would break a
number of common idioms involving erase. Instead, Google's hash tables
only shrink on insert and rehash.
</p>
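<p>
One idiom of the kind referred to above is erasing while iterating. The
sketch below uses the <code>m.erase(it++)</code> form, which relies on
<code>erase</code> invalidating only the iterator to the erased element; a
rehash triggered by <code>erase</code> would invalidate the advanced iterator
as well. The function name <code>erase_odd_values</code> is chosen for
illustration only.
</p>

<pre><code>#include &lt;cstddef&gt;
#include &lt;unordered_map&gt;

// Removes every entry whose mapped value is odd and returns how many
// entries were removed.
std::size_t erase_odd_values(std::unordered_map&lt;int, int&gt;&amp; m) {
    std::size_t removed = 0;
    for (auto it = m.begin(); it != m.end(); ) {
        if (it-&gt;second % 2 != 0) {
            m.erase(it++);  // 'it' is advanced first and must stay valid
            ++removed;
        } else {
            ++it;
        }
    }
    return removed;
}
</code></pre>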

<p>
The proposed resolution allows, but does not require, shrinking in
rehash, mostly because a postcondition for rehash that involves the
minimum load factor would be fairly complicated. (It would probably have
to involve a number of special cases and it would probably have to
mention yet another parameter, a minimum bucket count.)
</p>

<p>
The current behavior is equivalent to a minimum load factor of 0. If we
specify that 0 is the default, this change will have no impact on
backward compatibility.
</p>
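<p>
For comparison, a minimum load factor can be roughly approximated in user
code today. The helper below is a hypothetical sketch, not the proposed
member function and not Google's implementation: it asks the container to
rehash when the load factor has fallen below a caller-supplied
<code>zmin</code>. Note that <code>rehash(0)</code> is only required to leave
enough buckets for the current elements; whether the bucket count actually
decreases is up to the implementation.
</p>

<pre><code>#include &lt;cstddef&gt;
#include &lt;unordered_set&gt;

// Hypothetical helper: requests a rehash if the load factor is below zmin.
// Returns true if a rehash was requested. With zmin == 0 it never triggers,
// which mirrors the current behavior of the standard containers.
template &lt;class Container&gt;
bool shrink_if_sparse(Container&amp; c, float zmin) {
    if (c.bucket_count() == 0 || c.load_factor() &gt;= zmin) return false;
    c.rehash(0);  // permits, but does not require, fewer buckets
    return true;
}
</code></pre>

<p>
A caller would invoke such a helper at points where invalidating iterators is
acceptable, for example right after an insertion.
</p>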


<p><i>[
2010 Rapperswil:
]</i></p>


<blockquote><p>
This seems to be a useful extension, but is too late for 0x.

Move to Tentatively NAD Future.
</p></blockquote>

<p><i>[
Moved to NAD Future at 2010-11 Batavia
]</i></p>


<p><i>[LEWG Kona 2017]</i></p>

<p>Should there be a shrink_to_fit()? Is it too surprising to shrink on insert()? (We understand that shrinking on erase() is not an option.) Maybe make people call rehash(0) to shrink to the min_load_factor? On clear(), the load factor goes to 0 or undefined (0/0), which is likely to violate min_load_factor(). The wording for min_load_factor(z) should match that of max_load_factor(z), e.g. "May change the container’s maximum load factor". We want a paper exploring whether shrink-on-insert has been surprising. From Titus: Google's experience is that maps don't shrink in the way this would help with. NAD, not worth the time. Write a paper if you can demonstrate a need for this.</p>


<p id="res-1188"><b>Proposed resolution:</b></p>
<p>
Add two new rows, and change rehash's postcondition in the unordered
associative container requirements table in 23.2.8 <a href="https://wg21.link/unord.req">[unord.req]</a>:
</p>

<blockquote>
<table border="1">
<caption>Table 87 &mdash; Unordered associative container requirements
(in addition to container)</caption>

<tr>
<th>Expression</th><th>Return type</th><th>Assertion/note pre-/post-condition</th>
<th>Complexity</th>
</tr>
<tr>
<td><ins>
<code>a.min_load_factor()</code>
</ins></td>
<td><ins>
<code>float</code>
</ins></td>
<td><ins>
Returns a non-negative number that the container attempts to keep the
load factor greater than or equal to. The container automatically
decreases the number of buckets as necessary to keep the load factor
above this number.
</ins></td>
<td><ins>
constant
</ins></td>
</tr>

<tr>
<td><ins><code>a.min_load_factor(z)</code></ins></td>
<td><ins><code>void</code></ins></td>
<td><ins>Pre: <code>z</code> shall be non-negative. Changes the container's minimum
load factor, using <code>z</code> as a hint. [<i>Footnote:</i> the minimum
load factor should be significantly smaller than the maximum. 
If <code>z</code> is too large, the implementation may reduce it to a more sensible value.]
</ins></td>
<td><ins>
constant
</ins></td>
</tr>
<tr>
<td><code>a.rehash(n)</code></td>
<td><code>void</code></td>
<td>
Post: <ins><code>a.bucket_count() &gt;= n</code>, and <code>a.size() &lt;= a.bucket_count()
* a.max_load_factor()</code>. [<i>Footnote:</i> It is intentional that the
postcondition does not mention the minimum load factor.
This member function is primarily intended for cases where the user knows
that the container's size will increase soon, in which case the container's
load factor will temporarily fall below <code>a.min_load_factor()</code>.]</ins>
<del>
<code>a.bucket_count() &gt; a.size() / a.max_load_factor()</code> and <code>a.bucket_count()
&gt;= n</code>.
</del>
</td>
<td>
Average case linear in <code>a.size()</code>, worst case quadratic.
</td>
</tr>
</table>
</blockquote>
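<p>
The revised postcondition for <code>rehash</code> in the table above can be
written as a runtime check. This is a sketch for illustration; the function
name <code>rehash_postcondition_holds</code> is an assumption of this example
and the check deliberately says nothing about a minimum load factor, as the
footnote explains.
</p>

<pre><code>#include &lt;cstddef&gt;
#include &lt;unordered_set&gt;

// Calls c.rehash(n) and evaluates the proposed postcondition:
// bucket_count() &gt;= n and size() &lt;= bucket_count() * max_load_factor().
template &lt;class Container&gt;
bool rehash_postcondition_holds(Container&amp; c, std::size_t n) {
    c.rehash(n);
    return c.bucket_count() &gt;= n &amp;&amp;
           c.size() &lt;= c.bucket_count() * c.max_load_factor();
}
</code></pre>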

<p>
Add a footnote to 23.2.8 <a href="https://wg21.link/unord.req">[unord.req]</a> p12:
</p>

<blockquote>
<p>
The insert members shall not affect the validity of references to
container elements, but may invalidate all iterators to the container.
The erase members shall invalidate only iterators and references to the
erased elements.
</p>

<blockquote><p>
[A consequence of these requirements is that while insert may change the
number of buckets, erase may not. The number of buckets may be reduced
on calls to insert or rehash.]
</p></blockquote>
</blockquote>

<p>
Change paragraph 13:
</p>

<blockquote><p>
The insert members shall not affect the validity of iterators if
<del><code>(N+n) &lt; z * B</code></del> <ins><code>zmin * B &lt;= (N+n) &lt;= zmax * B</code></ins>,
where <code>N</code> is the number of elements in
the container prior to the insert operation, <code>n</code> is the number of
elements inserted, <code>B</code> is the container's bucket count,
<ins><code>zmin</code> is the container's minimum load factor,</ins>
and <code>z<ins>max</ins></code> is the container's maximum load factor.
</p></blockquote>
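<p>
The revised condition in paragraph 13 is plain arithmetic and can be spelled
out as a predicate. The function below is an illustrative sketch; its name
and parameter list are assumptions made for this example, with the parameters
named after the symbols in the paragraph.
</p>

<pre><code>#include &lt;cstddef&gt;

// Evaluates zmin * B &lt;= (N + n) &lt;= zmax * B, where N is the size before the
// insert, n the number of elements inserted, and B the bucket count.
bool insert_keeps_iterators_valid(std::size_t N, std::size_t n, std::size_t B,
                                  float zmin, float zmax) {
    const double total = static_cast&lt;double&gt;(N + n);
    const double buckets = static_cast&lt;double&gt;(B);
    return zmin * buckets &lt;= total &amp;&amp; total &lt;= zmax * buckets;
}
</code></pre>

<p>
For example, with <code>B</code> = 1000, <code>zmin</code> = 0.25 and
<code>zmax</code> = 1.0, inserting one element into a container that holds one
element gives 250 &lt;= 2, which is false, so iterators may be invalidated by a
shrinking rehash. With <code>zmin</code> = 0 the lower bound always holds and
the condition reduces to the upper bound alone.
</p>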

<p>
Add to the <code>unordered_map</code> class synopsis in section 23.5.3 <a href="https://wg21.link/unord.map">[unord.map]</a>,
the <code>unordered_multimap</code> class synopsis
in 23.5.4 <a href="https://wg21.link/unord.multimap">[unord.multimap]</a>, the <code>unordered_set</code> class synopsis in
23.5.6 <a href="https://wg21.link/unord.set">[unord.set]</a>, and the <code>unordered_multiset</code> class synopsis
in 23.5.7 <a href="https://wg21.link/unord.multiset">[unord.multiset]</a>:
</p>

<blockquote><pre><ins>
float min_load_factor() const;
void min_load_factor(float z);
</ins></pre></blockquote>

<p>
In 23.5.3.2 <a href="https://wg21.link/unord.map.cnstr">[unord.map.cnstr]</a>, 23.5.4.2 <a href="https://wg21.link/unord.multimap.cnstr">[unord.multimap.cnstr]</a>, 23.5.6.2 <a href="https://wg21.link/unord.set.cnstr">[unord.set.cnstr]</a>, and
23.5.7.2 <a href="https://wg21.link/unord.multiset.cnstr">[unord.multiset.cnstr]</a>, change:
</p>

<blockquote><p>
... <code>max_load_factor()</code> returns 1.0 <ins>and
<code>min_load_factor()</code> returns 0</ins>.
</p></blockquote>





</body>
</html>
