<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
   "http://www.w3.org/TR/html4/strict.dtd">
<html>
<head>
<meta http-equiv="Content-Type" content="text/html;charset=US-ASCII">

<style type="text/css">

body { color: #000000; background-color: #FFFFFF; }
del { text-decoration: line-through; color: #8B0040; }
ins { text-decoration: underline; color: #005100; }

p.example { margin-left: 2em; }
pre.example { margin-left: 2em; }
div.example { margin-left: 2em; }

code.extract { background-color: #F5F6A2; }
pre.extract { margin-left: 2em; background-color: #F5F6A2;
  border: 1px solid #E1E28E; }

p.function { }
.attribute { margin-left: 2em; }
.attribute dt { float: left; font-style: italic;
  padding-right: 1ex; }
.attribute dd { margin-left: 0em; }

blockquote.std { color: #000000; background-color: #F1F1F1;
  border: 1px solid #D1D1D1;
  padding-left: 0.5em; padding-right: 0.5em; }
blockquote.stddel { text-decoration: line-through;
  color: #000000; background-color: #FFEBFF;
  border: 1px solid #ECD7EC;
  padding-left: 0.5em; padding-right: 0.5em; }

blockquote.stdins { text-decoration: underline;
  color: #000000; background-color: #C8FFC8;
  border: 1px solid #B3EBB3; padding: 0.5em; }

table { border: 1px solid black; border-spacing: 0px;
  margin-left: auto; margin-right: auto; }
th { text-align: left; vertical-align: top;
  padding-left: 0.8em; border: none; }
td { text-align: left; vertical-align: top;
  padding-left: 0.8em; border: none; }

</style>

<title>Clarifying Memory Allocation</title>
</head>
<body>
<h1>Clarifying Memory Allocation</h1>

<p>
ISO/IEC JTC1 SC22 WG21 N3664 - 2013-04-19
</p>

<address>
Lawrence Crowl, crowl@google.com, Lawrence@Crowl.org
<br>
Chandler Carruth, chandlerc@google.com
<br>
Richard Smith, richardsmith@google.com
</address>

<p>
<a href="#Introduction">Introduction</a><br>
<a href="#Problem">Problem</a><br>
<a href="#Solution">Solution</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Memory">Memory</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Races">Data Races</a><br>
<a href="#Wording">Wording</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#expr.new">5.3.4 New [expr.new]</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#expr.delete">5.3.5 Delete [expr.delete]</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#new.delete.dataraces">18.6.1.4 Data races [new.delete.dataraces]</a><br>
<a href="#Revision">Revision History</a><br>
<a href="#References">References</a><br>
</p>


<h2><a name="Introduction">Introduction</a></h2>

<p>
The allocation and deallocation of memory
has become a significant expense in modern systems.
The optimization of that process is important to good performance.
However, it is important to distinguish between
micro-optimization of the calls
and macro-optimization of the allocation strategy.
In particular, good system performance
may well require adapting the allocation strategy
to the dynamic behavior of the application,
or even to hints provided by the application.
</p>


<h2><a name="Problem">Problem</a></h2>

<p>
A strict reading of the current C and C++ standards
may lead one to conclude that 
the allocation strategy shall not consider any information
not derivable from the sequence of new and delete expressions.
In essence, the standards may exclude macro-optimization of allocation.
</p>

<p>
On the other hand,
a strict reading of the standards
may lead one to conclude that
the implementation must make an allocation function call
for each and every new expression.
This reading may exclude micro-optimization of allocation.
</p>


<h2><a name="Solution">Solution</a></h2>

<p>
We propose to replace existing mechanistic wording
with wording more precisely focused on essential requirements.
The intent is to enable behavior
that some existing compilers and memory allocators already have.
For example, see TCMalloc <a href="#TCM">[TCM]</a>.
</p>


<h3><a name="Memory">Memory</a></h3>

<p>
An essential requirement on implementations
is that they deliver usable memory,
not that they have a particular sequence of allocation calls.
We propose to relax the allocation calls
with respect to new expressions.
</p>

<ol>
<li><p>
Within certain constraints, the number of allocation calls
is not part of the observable behavior of the program.
This enables implementations
to reduce the number of allocation calls by avoiding them or fusing them.
</p></li>
<li><p>
When avoiding or fusing allocations,
the amount of space requested does not exceed
that implied by the new expressions,
with the exception of additional padding to meet alignment constraints.
Consequently, avoiding or fusing allocations
does not increase peak allocation.
</p></li>
</ol>
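
<p>
As a rough illustration of the first point,
the sketch below replaces the global allocation function
with a version that counts its calls.
The counter <code>g_new_calls</code>
and the function <code>calls_for_one_pair</code>
are hypothetical names introduced here for illustration;
they are not part of the proposed wording.
Under the relaxation described above,
an implementation could report either 0 or 1 for the pair.
</p>

<pre class="example">
#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;cstdlib&gt;
#include &lt;new&gt;

// Hypothetical counter, for illustration only.
static std::size_t g_new_calls = 0;

// Replacement for the replaceable global allocation function.
void* operator new(std::size_t size) {
    ++g_new_calls;
    if (void* p = std::malloc(size ? size : 1)) return p;
    throw std::bad_alloc();
}

// Matching replacement for the global deallocation function.
void operator delete(void* p) noexcept { std::free(p); }

// Returns how many calls to the replacement were made
// for a single new/delete pair.
std::size_t calls_for_one_pair() {
    std::size_t before = g_new_calls;
    int* p = new int(42);
    int value = *p;
    delete p;
    assert(value == 42);
    return g_new_calls - before;  // 1 if called, 0 if the call was omitted
}
</pre>

<p>
A program that depends on observing exactly one call here
would be relying on behavior that this proposal makes unobservable.
</p>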

<p>
Because C++ class-specific memory allocators
are often tuned to specific class sizes,
we do not apply this relaxation to those allocators.
</p>


<h3><a name="Races">Data Races</a></h3>

<p>
An essential requirement on implementations
is that they be data-race free,
yet the standards do not say so directly.
We propose to replace the current wording with direct wording,
thus explicitly enabling an implementation to consider information
beyond the strict sequence of allocation and deallocation calls.
</p>
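
<p>
As an illustration of this requirement,
the sketch below has two threads allocate and release storage
without any synchronization of their own.
The function names <code>churn</code> and <code>run_two_threads</code>
are hypothetical and chosen only for this example.
The program is free of data races
only if the allocation and deallocation functions
do not introduce one,
whatever internal state (for example, per-thread caches) they share.
</p>

<pre class="example">
#include &lt;cassert&gt;
#include &lt;thread&gt;

// Allocates and releases one int per round; no user-level locking.
int churn(int rounds) {
    assert(rounds &gt;= 0);
    int total = 0;
    for (int i = 0; i != rounds; ++i) {
        int* p = new int(i);
        total += *p;
        delete p;
    }
    return total;
}

// Runs churn concurrently on two threads and combines the results.
int run_two_threads(int rounds) {
    int a = 0;
    int b = 0;
    std::thread t1([&amp;a, rounds] { a = churn(rounds); });
    std::thread t2([&amp;b, rounds] { b = churn(rounds); });
    t1.join();
    t2.join();
    return a + b;
}
</pre>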


<h2><a name="Wording">Wording</a></h2>

<p>
The wording in this section is relative to WG21
<a href="http://www.open-std.org/JTC1/SC22/WG21/docs/papers/2012/n3485.pdf">
N3485</a>.
</p>


<h3><a name="expr.new">5.3.4 New [expr.new]</a></h3>

<p>
Edit paragraph 8 as follows.
</p>

<blockquote class="std">
<p>
A <var>new-expression</var>
<del>obtains</del> <ins>may obtain</ins> storage for the object
by calling an <var>allocation function</var> (3.7.4.1).
If the <var>new-expression</var> terminates by throwing an exception,
it may release storage by calling a deallocation function (3.7.4.2).
If the allocated type is a non-array type,
the allocation function's name is <code>operator new</code> and
the deallocation function's name is <code>operator delete</code>.
If the allocated type is an array type,
the allocation function's name is <code>operator new[]</code> and
the deallocation function's name is <code>operator delete[]</code>.
[<i>Note:</i>
an implementation shall provide default definitions
for the global allocation functions (3.7.4, 18.6.1.1, 18.6.1.2).
A C++ program can provide
alternative definitions of these functions (17.6.4.6)
and/or class-specific versions (12.5).
&mdash;<i>end note</i>]
</p>
</blockquote>

<p>
Paragraph 9 is unchanged.  It specifies allocation function lookup.
</p>

<p>
Add a new paragraph between the existing paragraphs 9 and 10.
</p>

<blockquote class="stdins">
<p>
An implementation is allowed to omit
a call to a replaceable global allocation function (18.6.1.1, 18.6.1.2).
When it does so,
the storage is instead
provided by the implementation or
provided by extending the allocation of another <var>new-expression</var>.
The implementation may extend the allocation
of a <var>new-expression</var> <code>e1</code>
to provide storage for a <var>new-expression</var> <code>e2</code>
if the lifetime of the object allocated by <code>e1</code>
strictly contains the lifetime of
the object allocated by <code>e2</code>,
<code>e1</code> and <code>e2</code> would invoke
the same replaceable global allocation function,
and, for a throwing allocation function,
exceptions in <code>e1</code> and <code>e2</code>
would be first caught in the same handler.
</p>
</blockquote>
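
<p>
The conditions above can be sketched as follows.
This is an informal reading, not normative text,
and the type <code>Node</code>
and the functions <code>sum_nested</code> and <code>sum_mixed</code>
are hypothetical names.
In <code>sum_nested</code>,
the two new-expressions appear to satisfy all three conditions,
so an implementation could extend the first allocation
to provide storage for the second.
In <code>sum_mixed</code>,
the second new-expression would invoke <code>operator new[]</code>
rather than <code>operator new</code>,
so the allocations could not be merged.
</p>

<pre class="example">
struct Node { int value; };

int sum_nested() {
    Node* outer = new Node{1};  // e1
    Node* inner = new Node{2};  // e2: lifetime strictly inside that of e1
    int sum = outer-&gt;value + inner-&gt;value;
    delete inner;               // the object from e2 ends first
    delete outer;               // the object from e1 ends last
    return sum;
}

int sum_mixed() {
    Node* single = new Node{3};  // would invoke operator new
    Node* many = new Node[2];    // would invoke operator new[]
    many[0].value = 4;
    many[1].value = 5;
    int sum = single-&gt;value + many[0].value + many[1].value;
    delete[] many;
    delete single;
    return sum;
}
</pre>

<p>
Neither function contains a handler,
so an exception thrown by either allocation in <code>sum_nested</code>
would first be caught in the same handler, if any, in a caller.
</p>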

<p>
Edit paragraph 10 as follows.
</p>

<blockquote class="std">
<p>
<ins>When a <var>new-expression</var> calls an allocation function
and that allocation has not been extended,
the</ins>
<del>A</del> <var>new-expression</var> passes the amount of space requested
to the allocation function
as the first argument of type <code>std::size_t</code>.
That argument shall be no less than the size of the object being created;
it may be greater than the size of the object being created
only if the object is an array.
For arrays of <code>char</code> and <code>unsigned char</code>,
the difference between the result of the <var>new-expression</var>
and the address returned by the allocation function
shall be an integral multiple of
the strictest fundamental alignment requirement (3.11)
of any object type whose size is no greater than
the size of the array being created.
[<i>Note:</i>
Because allocation functions are assumed to return pointers to storage
that is appropriately aligned for
objects of any type with fundamental alignment,
this constraint on array allocation overhead
permits the common idiom of allocating character arrays
into which objects of other types will later be placed.
&mdash;<i>end note</i>]
</p>
</blockquote>

<p>
Add a new paragraph after paragraph 10 as follows.
</p>

<blockquote class="stdins">
<p>
When a <var>new-expression</var> calls an allocation function
and that allocation has been extended,
the size parameter to the allocation call
shall be no greater than the sum of
the sizes for the omitted calls as specified above,
plus the size for the extended call had it not been extended,
plus any padding necessary to align the allocated objects
within the allocated memory.
</p>
</blockquote>
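
<p>
As a worked example of this bound,
the sketch below computes the largest request
for one extended call that absorbs one omitted call.
It assumes a layout in which the object of the extended call
sits at offset zero
and the object of the omitted call follows at its required alignment;
that layout, and the names <code>padding_for</code>
and <code>max_extended_request</code>,
are assumptions made for illustration only.
</p>

<pre class="example">
#include &lt;cstddef&gt;

// Bytes needed to round offset up to the next multiple of align.
constexpr std::size_t padding_for(std::size_t offset, std::size_t align) {
    return (align - offset % align) % align;
}

// Upper bound on the size argument under the assumed layout:
// size of the extended call, plus alignment padding,
// plus size of the omitted call.
constexpr std::size_t max_extended_request(std::size_t extended_size,
                                           std::size_t omitted_size,
                                           std::size_t omitted_align) {
    return extended_size
         + padding_for(extended_size, omitted_align)
         + omitted_size;
}

// 12 bytes, then 4 bytes of padding to reach 8-byte alignment, then 8 bytes.
static_assert(max_extended_request(12, 8, 8) == 24, "12 + 4 + 8");
</pre>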


<h3><a name="expr.delete">5.3.5 Delete [expr.delete]</a></h3>

<p>
Edit paragraph 7 as follows.
</p>

<blockquote class="std">
<p>
If the value of the operand of the <var>delete-expression</var>
is not a null pointer value, <ins>then:</ins>
</p>
<ul>
<li><p>
<ins>If the allocation call for the <var>new-expression</var>
for the object to be deleted was not omitted (5.3.4),
the <var>delete-expression</var> shall call a deallocation function (3.7.4.2).
The value returned from the allocation call of the <var>new-expression</var>
shall be passed as the first argument to the deallocation function.</ins>
</p></li>
<li><p>
<ins>Otherwise,</ins> the <var>delete-expression</var>
will <ins>not</ins> call a deallocation function (3.7.4.2).
</p></li>
</ul>
<p>
Otherwise, it is unspecified whether the deallocation function will be called.
</p>
<p>
[<i>Note:</i>
The deallocation function is called regardless of whether
the destructor for the object or some element of the array throws an exception.
&mdash;<i>end note</i>]
</p>
</blockquote>
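
<p>
The pairing described above can be illustrated with the following sketch,
which records the pointer returned by a replacement allocation function
and the pointer received by the matching deallocation function.
The variables <code>g_last_allocated</code> and <code>g_last_released</code>,
the type <code>Widget</code>,
and the function <code>pair_is_consistent</code>
are hypothetical names.
Under the proposed wording,
either both calls are made with the same pointer value
or neither call is made.
</p>

<pre class="example">
#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;cstdlib&gt;
#include &lt;new&gt;

// Hypothetical bookkeeping, for illustration only.
static void* g_last_allocated = nullptr;
static void* g_last_released = nullptr;

void* operator new(std::size_t size) {
    void* p = std::malloc(size ? size : 1);
    if (!p) throw std::bad_alloc();
    g_last_allocated = p;
    return p;
}

void operator delete(void* p) noexcept {
    g_last_released = p;
    std::free(p);
}

struct Widget { int id; };

// True when the pair was omitted entirely (both records stay null)
// or when the deallocation function received the value
// returned by the allocation function.
bool pair_is_consistent() {
    g_last_allocated = nullptr;
    g_last_released = nullptr;
    Widget* w = new Widget{7};
    int id = w-&gt;id;
    delete w;
    assert(id == 7);
    return g_last_allocated == g_last_released;
}
</pre>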

<h3><a name="new.delete.dataraces">18.6.1.4 Data races [new.delete.dataraces]</a></h3>

<p>
Edit paragraph 1 as follows.
</p>

<blockquote class="std">
<p>
For purposes of determining the existence of data races,
the library versions of operator <code>new</code>,
user replacement versions of global operator <code>new</code>,
<del>and</del> the C standard library functions
<code>calloc</code> and <code>malloc</code>
<del>
shall behave as though they accessed and modified
only the storage referenced by the return value.
</del>
<del>The</del> <ins>, the</ins> library versions of
operator <code>delete</code>,
user replacement versions of operator <code>delete</code>,
<del>and</del> the C standard library function <code>free</code>
<del>
shall behave as though they accessed and modified
only the storage referenced by their first argument.
</del>
<del>The</del> <ins>, and the</ins> C standard library function
<code>realloc</code>
<del>shall behave as though it accessed and modified
only the storage
referenced by its first argument and by its return value.</del>
<ins>shall not introduce a data race (17.6.5.9 [res.on.data.races]).</ins>
Calls to these functions
that allocate or deallocate a particular unit of storage
shall occur in a single total order,
and each such deallocation call shall happen before
<ins>(1.10 [intro.multithread])</ins>
the next allocation (if any) in this order.
</p>
</blockquote>


<h2><a name="Revision">Revision History</a></h2>

<p>
This paper revises N3537 - 2013-03-12 as follows.
</p>

<ul>

<li><p>
Switch from changing the effect of allocation and deallocation calls
to changing the effect of new and delete expressions.
Revise the introduction accordingly.
</p></li>

<li><p>
Reduce redundancy in resulting paragraph on data races.
</p></li>

<li><p>
Remove unchanged paragraphs in the standard.
</p></li>

</ul>

<p>
N3537 revised N3433 - 2012-09-23 as follows.
</p>

<ul>

<li><p>
Clarify that class-specific allocation operators are unaffected.
</p></li>

<li><p>
Clarify that placement new is unaffected.
</p></li>

<li><p>
Clarify that array and non-array allocations are not merged.
</p></li>

<li><p>
Clarify the happens-before constraints
on programs' allocations and deallocations.
</p></li>

<li><p>
Change terminology from "nominal" calls to "abstract" calls,
in analogy with the abstract machine.
Likewise, change the terminology from "external" calls to "implementation" calls.
</p></li>

<li><p>
Remove wording for the C standard.
The C committee has decided to make no changes.
</p></li>

<li><p>
Add a 'Revision History' section.
</p></li>

</ul>


<h2><a name="References">References</a></h2>

<dl>

<dt><a name="TCM">[TCM]</a></dt>
<dd>
<cite>TCMalloc : Thread-Caching Malloc</cite>,
<a href="http://goog-perftools.sourceforge.net/doc/tcmalloc.html">
http://goog-perftools.sourceforge.net/doc/tcmalloc.html</a>.
</dd>

</dl>


</body>
</html>
