<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN"
 "http://www.w3.org/TR/html4/strict.dtd">
<html><head>
<meta http-equiv="Content-Type" content="text/html; charset=US-ASCII">

<style type="text/css">

body { color: #000000; background-color: #FFFFFF; }
del { text-decoration: line-through; color: #8B0040; }
ins { text-decoration: underline; color: #005100; }

p.example { margin-left: 2em; }
pre.example { margin-left: 2em; }
div.example { margin-left: 2em; }

code.extract { background-color: #F5F6A2; }
pre.extract { margin-left: 2em; background-color: #F5F6A2;
  border: 1px solid #E1E28E; }

p.function { }
.attribute { margin-left: 2em; }
.attribute dt { float: left; font-style: italic;
  padding-right: 1ex; }
.attribute dd { margin-left: 0em; }

blockquote.std { color: #000000; background-color: #F1F1F1;
  border: 1px solid #D1D1D1;
  padding-left: 0.5em; padding-right: 0.5em; }
blockquote.stddel { text-decoration: line-through;
  color: #000000; background-color: #FFEBFF;
  border: 1px solid #ECD7EC;
  padding-left: 0.5em; padding-right: 0.5em; }

blockquote.stdins { text-decoration: underline;
  color: #000000; background-color: #C8FFC8;
  border: 1px solid #B3EBB3; padding: 0.5em; }

table { border: 1px solid black; border-spacing: 0px;
  margin-left: auto; margin-right: auto; }
th { text-align: left; vertical-align: top;
  padding-left: 0.4em; padding-right: 0.4em; border: 1px solid black; }
td { text-align: left; vertical-align: top;
  padding-left: 0.4em; padding-right: 0.4em; border: 1px dotted grey; }

</style>

<title>Handling Disappointment in C++</title>
</head>

<body>
<h1>Handling Disappointment in C++</h1>

<p>
ISO/IEC JTC1 SC22 WG21 EWG P0157R0 - 2015-11-07
</p>

<p>
Lawrence Crowl, Lawrence@Crowl.org
</p>

<p>
<a href="#Introduction">Introduction</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#Kinds">Kinds of Disappointments</a><br>
<a href="#Traditional">Traditional Approaches</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#retstatus">Return Status</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#special">Intrusive Special Value</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#outstatus">Status via Out Parameter</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#pair">Return a Pair</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#longjump">Long Jump</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#exception">Throw Exception</a><br>
<a href="#Analysis">Analysis</a><br>
<a href="#Problem">Problem</a><br>
<a href="#Recent">Recent Approaches</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#both">Provide Two functions</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#pair">Expected or Unexpected</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#statusvalue">Status and Optional Value</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;<a href="#comparison">Comparison</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#advisory">Advisory Information</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#efficiency_of_return">Efficiency of Return</a><br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<a href="#ease_of_return">Ease of Return</a><br>
<a href="#Standard">Recommendation to the Standard</a><br>
<a href="#Programmers">Recommendation to Programmers</a><br>
</p>


<h2><a name="Introduction">Introduction</a></h2>

<p>
When a function fails to do what we want, we are disappointed.
How do we report that disappointment to callers?
How do we handle that disappointment in the caller?
</p>

<p>
In the discussion of a couple of new approaches to handling disappointment,
the Evolution Working Group wanted general advice to programmers
on how to answer those questions for their application.
This paper provides that advice.
</p>


<h3><a name="Kinds">Kinds of Disappointments</a></h3>

<p>
There are many kinds of disappointments
and programmers will want to report and handle them differently.
</p>

<ul>
<li>
The system implementation failed,
e.g. out of memory.
</li>
<li>
Something passed to the function failed,
e.g. an exception thrown from a callback.
</li>
<li>
The function detected an inappropriate environment,
e.g. null pointer parameter.
</li>
<li>
Data structure bounds prevent completion,
e.g. pushing a full queue.
</li>
<li>
The disappointment is expected,
e.g. popping an empty queue.
</li>
</ul>

<p>
Of these examples, the last two are not errors.
Hence, we use the term disappointment instead of error.
</p>


<h2><a name="Traditional">Traditional Approaches</a></h2>

<p>
There are traditional C and C++ approaches
to reporting and handling disappointments.
</p>


<h3><a name="retstatus">Return Status</a></h3>

<p>
The most common C approach is to return a status,
typically as an int or enum,
with success as a distinct value.
There are a few problems with this approach.
</p>

<ul>
<li>
Handling the status obscures normal logic,
which makes it hard to infer the normal logic.
</li>
<li>
Writing the handler takes work that programmers would rather not do,
which leads to ignoring any problems.
Ignoring a bad status can quickly lead to undefined behavior.
</li>
<li>
Returning a status displaces a natural return value to an 'out' parameter.
These parameters require the caller to declare a variable
that the callee will overwrite.
</li>
<li>
In practice, an overwritable type must have a default constructor,
or the functional equivalent.
This restriction excludes some uses.
</li>
<li>
For the caller to interpret and act on the status,
the status must be known when the program is written,
which means that status values not anticipated in the interface
are problematic.
</li>
</ul>
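<p>
As a rough illustration of this style,
the following sketch pops from a small bounded queue.
The names <code>queue_status</code>, <code>int_queue</code>,
<code>push</code> and <code>pop</code> are hypothetical
and are not taken from any library.
The status occupies the return slot,
so the popped value is displaced to an 'out' parameter
that the caller must declare in advance.
</p>

```cpp
#include <cstddef>

// Hypothetical status type; success is one distinct value.
enum class queue_status { success, empty, full };

// Hypothetical fixed-capacity ring buffer of ints (illustration only).
struct int_queue {
    static constexpr std::size_t capacity = 4;
    int items[capacity];
    std::size_t head = 0;
    std::size_t count = 0;
};

// The status takes the return slot, so the popped value
// is written through the 'out' parameter instead.
queue_status pop(int_queue& q, int& out) {
    if (q.count == 0)
        return queue_status::empty;  // 'out' is left untouched
    out = q.items[q.head];
    q.head = (q.head + 1) % int_queue::capacity;
    --q.count;
    return queue_status::success;
}

queue_status push(int_queue& q, int value) {
    if (q.count == int_queue::capacity)
        return queue_status::full;
    q.items[(q.head + q.count) % int_queue::capacity] = value;
    ++q.count;
    return queue_status::success;
}
```

<p>
Nothing forces a caller to test the returned status.
A caller that ignores <code>queue_status::empty</code>
goes on to read whatever the 'out' variable held before the call.
</p>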


<h3><a name="special">Intrusive Special Value</a></h3>

<p>
Instead of returning a status and displacing the natural return value,
some C functions impose special disappointment semantics
on one value of the return type.
Typically, that value is a null pointer, zero, or negative one.
There are problems.
</p>

<ul>
<li>
When that special value is returned,
the actual problem must be reported elsewhere.
</li>
<li>
This report may be a global object, like <code>errno</code>,
which inhibits concurrency.
</li>
<li>
When the problem is instead displaced to an 'out' parameter,
the problems above occur.
</li>
<li>
The known-in-advance problem occurs here as well.
</li>
</ul>
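<p>
The standard function <code>std::strtol</code> is one instance of this pattern.
On overflow it returns <code>LONG_MAX</code> or <code>LONG_MIN</code>,
which are also legitimate results,
and reports the actual problem through <code>errno</code>.
The helper name <code>parse_overflowed</code> below is hypothetical;
the sketch only shows the bookkeeping a caller needs.
</p>

```cpp
#include <cerrno>
#include <climits>
#include <cstdlib>

// Hypothetical helper: returns true when 'text' does not fit in a long.
// std::strtol signals overflow with the special values LONG_MAX or
// LONG_MIN and reports the reason separately through errno.
bool parse_overflowed(const char* text) {
    errno = 0;  // clear first, because LONG_MAX is also a valid result
    long value = std::strtol(text, nullptr, 10);
    return (value == LONG_MAX || value == LONG_MIN) && errno == ERANGE;
}
```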


<h3><a name="outstatus">Status via Out Parameter</a></h3>

<p>
Another common C-like approach
is to have an 'out' parameter for the status.
This approach has the benefit of not intruding on the natural return.
However, it too has problems.
</p>

<ul>
<li>
It prevents placing the call in the condition to a loop
or the head of a switch statement.
</li>
<li>
While it is not possible to completely ignore the status,
because a variable is required for the reference,
you can easily not test it.
</li>
<li>
In practice, the returned type must have a default constructor,
or the functional equivalent,
because there must be something to return when the function disappoints.
This restriction excludes some uses.
</li>
<li>
The known-in-advance problem occurs here as well.
</li>
</ul>
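<p>
A minimal sketch of this style follows.
The names <code>lookup_status</code> and <code>find_user_name</code>
are hypothetical.
The natural value keeps the return slot,
and the function must still return some object when it disappoints.
</p>

```cpp
#include <string>

// Hypothetical status type for the illustration.
enum class lookup_status { success, not_found };

// The natural value keeps the return slot;
// the status goes out through a reference parameter.
std::string find_user_name(int id, lookup_status& status) {
    if (id == 0) {
        status = lookup_status::success;
        return "root";
    }
    status = lookup_status::not_found;
    return std::string();  // something must be returned: default construct
}
```

<p>
The caller has to declare a <code>lookup_status</code> variable before the call,
but is free to use the returned string without ever testing that variable.
</p>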


<h3><a name="pair">Return a Pair</a></h3>

<p>
Another solution is to return a pair of status and value.
In practice, these would then be <code>tie</code>'d to separate variables.
While not yet common in C,
this approach appears in other languages with built-in multiple return values.
(See <a href="http://blog.golang.org/error-handling-and-go">
http://blog.golang.org/error-handling-and-go</a>
for the approach in the Go programming language.)
</p>

<p>
This approach has essentially the same attributes as the approaches above.
The primary difference is that one needs to declare a separate variable
to hold the 'other' return object.
</p>
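<p>
In C++ the pair approach might look like the following sketch,
which uses <code>std::pair</code> and <code>std::tie</code>.
The names <code>parse_status</code>, <code>parse_digit</code>
and <code>digit_or</code> are hypothetical.
</p>

```cpp
#include <tuple>
#include <utility>

// Hypothetical status type for the illustration.
enum class parse_status { success, not_a_digit };

// Returns the status and the value together.
std::pair<parse_status, int> parse_digit(char c) {
    if (c < '0' || c > '9')
        return {parse_status::not_a_digit, 0};  // a value is still required
    return {parse_status::success, c - '0'};
}

// Caller: both halves are tied to separately declared variables.
int digit_or(char c, int fallback) {
    parse_status status;
    int value;
    std::tie(status, value) = parse_digit(c);
    return status == parse_status::success ? value : fallback;
}
```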


<h3><a name="longjump">Long Jump</a></h3>

<p>
Some applications use long jump to handle disappointments.
The problem is that
long jump has no mechanism to clean up state in intermediate frames.
Consequently, it is usable only in very constrained situations
where either there is no state in intermediate frames
or the program can abandon that state.
Given this constraint, we do not consider it further.
</p>


<h3><a name="exception">Throw Exception</a></h3>

<p>
The C++ exception mechanism addresses the problems above.
</p>

<ul>
<li>
Exception handling code is separate from normal code,
so the normal logic is not obscured.
</li>
<li>
The return type need not have an effective default constructor.
</li>
<li>
Exceptions that are not explicitly handled
are propagated up the call stack.
</li>
<li>
The exception mechanism handles unknown disappointments
as well as known disappointments.
</li>
<li>
Exceptions propagating through a stack frame
will properly destroy local objects,
enabling programming-free handling for many local objects.
</li>
</ul>

<p>
Unfortunately, the exception mechanism introduced other problems.
</p>

<ul>
<li>
The exception mechanism is expensive.
Reporting common disappointments via exceptions
can consume a significant amount of processor time.
</li>
<li>
The exception mechanism induces a large variance
in the cost of a function call.
This variance could cause problems on systems with hard real-time constraints. 
</li>
<li>
Separation of the exception handling code from normal code
can obscure the logic when the action required for a disappointment
is to adjust and try again.
For example, reporting that a hash table does not have an entry
via an exception
would take substantially more code than a boolean return.
</li>
</ul>
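<p>
The hash table case can be sketched with <code>std::unordered_map</code>,
whose <code>at</code> member throws <code>std::out_of_range</code>
for a missing key
and whose <code>find</code> member reports absence through its return value.
The two wrapper function names are hypothetical.
</p>

```cpp
#include <stdexcept>
#include <string>
#include <unordered_map>

using table = std::unordered_map<std::string, int>;

// Exception-based: at() throws std::out_of_range for a missing key,
// so the "adjust and carry on" action lives in a handler.
int count_or_zero_throwing(const table& t, const std::string& key) {
    try {
        return t.at(key);
    } catch (const std::out_of_range&) {
        return 0;
    }
}

// Status-based: find() reports absence through its return value.
int count_or_zero_status(const table& t, const std::string& key) {
    auto it = t.find(key);
    return it != t.end() ? it->second : 0;
}
```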


<h2><a name="Analysis">Analysis</a></h2>

<p>
We can group traditional approaches into two broad categories
by examining their attributes.
</p>

<table>
<thead>
<tr>
<th>attribute</th>
<th>status-based</th>
<th>exception-based</th>
</tr>
</thead>
<tbody>
<tr>
<td>normal logic is clearer when disappointments are normally</td>
<td>addressed and redone</td>
<td>passed on to other code</td>
</tr>
<tr>
<td>the effect of ignoring disappointment is</td>
<td>often undefined behavior</td>
<td>local variable destruction and exception propagation</td>
</tr>
<tr>
<td>disappointments are applicable when</td>
<td>known in advance</td>
<td>not known in advance</td>
</tr>
<tr>
<td>some form of default construction is</td>
<td>required</td>
<td>not required</td>
</tr>
<tr>
<td>handling overhead is inefficient when disappointments are</td>
<td>rare</td>
<td>not rare</td>
</tr>
<tr>
<td>accommodating real-time constraints is</td>
<td>easier</td>
<td>harder</td>
</tr>
</tbody>
</table>

<p>
The first three attributes are variations on <dfn>actionable</dfn>.
A corrupt file system is rare and unlikely to be actionable in the caller.
On the other hand, an empty queue is likely common and likely actionable.
</p>

<p>
In summary,
the status-based approach is best when
disappointments are actionable and not rare
or when there are hard real-time constraints,
while the exception-based approach is best when
disappointments are not actionable and rare.
</p>


<h2><a name="Problem">Problem</a></h2>

<p>
The problem with traditional approaches
is that whether or not a disappointment is actionable or rare
may depend on the calling environment,
but the implementation of the function does not.
Whether the environment has real-time constraints
may not be known to the programmer of the function.
</p>

<p>
As an example, a function reading a system file
can expect to find it present,
while a function looking for a user's dot file
can expect to find it not present.
A more program-internal example
is an application that knows it will not fill a queue,
and so a full queue indicates a rare error.
On the other hand, another application may rely on a full queue
to provide flow control.
</p>

<p>
Programs will be clearer, more efficient and more robust
when we can leave at least some of the choice in mechanism to the caller.
</p>


<h2><a name="Recent">Recent Approaches</a></h2>

<p>
There have been several new approaches to handling disappointment
developed and deployed recently.
All these approaches address the primary problem
of the character of the disappointment
being known only in the calling environment.
</p>


<h3><a name="both">Provide Two Functions</a></h3>

<p>
The first solution is to provide two versions of each function,
one providing a status and one throwing an exception.
This approach enables choice of mechanism at each call site.
</p>

<ul>

<li><p>
In <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3533.html">N3533
C++ Concurrent Queues</a>,
the non-throwing function returns a status.
</p></li>

<li><p>
In <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4100.pdf">N4100
Programming Languages &mdash; C++ &mdash;
File System Technical Specification</a>,
the non-throwing function writes the status through a reference parameter.
<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2009/n2838.html">N2838
Library Support for Hybrid Error Handling</a>
proposed a change to standard library specifications so that
</p>
<pre class=example>
<code>void f(error_code&amp; ec=throws());</code>
</pre>
<p>
would stand for both overloads of the function,
the one without the reference parameter throwing and
the one with the reference parameter not throwing.
</p></li>
</ul>

<p>
The non-throwing function in this approach
shares the problem of effectively requiring a default constructor
with the traditional pure-status approach.
</p>

<p>
While less likely in practice,
with two functions it is still possible to request a status, ignore it
and access a missing result.
Whether or not this access is undefined behavior
depends on whether or not the default constructor
produces an object with defined behavior.
In any case, the default object is unlikely to produce what one wants.
</p>
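<p>
A minimal sketch of the two-function approach follows,
written in the style of the file system specification
with a <code>std::error_code</code> reference parameter.
The function <code>read_setting</code> and the <code>settings</code> type
are hypothetical, and the choice of
<code>std::errc::invalid_argument</code> for a missing key is arbitrary.
</p>

```cpp
#include <map>
#include <string>
#include <system_error>

// Hypothetical settings store, for illustration only.
using settings = std::map<std::string, int>;

// Non-throwing overload: the status is written through a reference parameter.
int read_setting(const settings& s, const std::string& key,
                 std::error_code& ec) {
    auto it = s.find(key);
    if (it == s.end()) {
        ec = std::make_error_code(std::errc::invalid_argument);
        return int();  // a default-constructed result must still be returned
    }
    ec.clear();
    return it->second;
}

// Throwing overload: a thin wrapper over the status-based one.
int read_setting(const settings& s, const std::string& key) {
    std::error_code ec;
    int value = read_setting(s, key, ec);
    if (ec)
        throw std::system_error(ec, "read_setting: " + key);
    return value;
}
```

<p>
A caller of the first overload who never tests <code>ec</code>
receives the default-constructed <code>0</code>,
which is indistinguishable from a setting that is really zero.
</p>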


<h3><a name="pair">Expected or Unexpected</a></h3>

<p>
<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4109.pdf">N4109
A proposal to add a utility class to represent expected monad</a>
proposes a class template <code>expected</code>
to contain either the normal return object or
an exceptional object, but not both.
</p>

<p>
A conversion from <code>expected</code> to <code>bool</code>
enables determining if the expected value is present.
The dereference operator returns an <em>unchecked</em> reference
to the expected value.
The <code>value</code> member function
returns a checked reference to the expected value.
</p>

<p>
Accessing a missing result is possible
by using the dereference operator.
The behavior is undefined if one does so.
</p>
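<p>
The three operations described above can be sketched as follows.
The class <code>expected_sketch</code>, its maker functions
<code>success</code> and <code>failure</code>,
and the example function <code>parse_digit</code>
are hypothetical stand-ins written for this illustration
(assuming a C++17 compiler for <code>std::variant</code>);
they are not the interface proposed in N4109.
</p>

```cpp
#include <stdexcept>
#include <utility>
#include <variant>

// Hypothetical stand-in for the proposed type: holds either the
// normal value or the exceptional object, never both.
template <typename T, typename E>
class expected_sketch {
    std::variant<T, E> state_;
    explicit expected_sketch(std::variant<T, E> s) : state_(std::move(s)) {}

public:
    static expected_sketch success(T v) {
        return expected_sketch(
            std::variant<T, E>(std::in_place_index<0>, std::move(v)));
    }
    static expected_sketch failure(E e) {
        return expected_sketch(
            std::variant<T, E>(std::in_place_index<1>, std::move(e)));
    }

    // True when the normal value is present.
    explicit operator bool() const { return state_.index() == 0; }

    // Unchecked access: undefined behavior when no value is present.
    const T& operator*() const { return *std::get_if<0>(&state_); }

    // Checked access: throws when no value is present.
    const T& value() const {
        if (!*this)
            throw std::logic_error("expected_sketch: no value");
        return *std::get_if<0>(&state_);
    }

    // Precondition for this sketch: no value is present.
    const E& error() const { return std::get<1>(state_); }
};

enum class parse_error { not_a_digit };

expected_sketch<int, parse_error> parse_digit(char c) {
    using result = expected_sketch<int, parse_error>;
    if (c < '0' || c > '9')
        return result::failure(parse_error::not_a_digit);
    return result::success(c - '0');
}
```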


<h3><a name="statusvalue">Status and Optional Value</a></h3>

<p>
<a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2014/n4233.html">N4233
A Class for Status and Optional Value</a>
proposes a class template <code>status_value</code>
to contain a status and possibly the normal return object.
</p>

<p>
A conversion from <code>status_value</code> to <code>bool</code>
enables determining if the value is present.
The dereference operator returns a <em>checked</em> reference
to the expected value.
The <code>value</code> member function
returns a checked reference to the expected value.
</p>

<p>
Accessing a missing result is not possible.
</p>
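<p>
A comparable sketch for this approach follows.
The class <code>status_value_sketch</code>,
the enumeration <code>pop_status</code>
and the function <code>pop_back</code>
are hypothetical and do not reproduce the N4233 interface.
The sketch assumes a C++17 compiler
and leans on <code>std::optional::value</code>,
which throws <code>std::bad_optional_access</code> when empty,
to make both accessors checked.
</p>

```cpp
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the proposed type: always a status,
// and possibly the normal value.
template <typename Status, typename Value>
class status_value_sketch {
    Status status_;
    std::optional<Value> value_;

public:
    explicit status_value_sketch(Status s) : status_(std::move(s)) {}
    status_value_sketch(Status s, Value v)
        : status_(std::move(s)), value_(std::move(v)) {}

    // The status is available whether or not a value is present.
    Status status() const { return status_; }

    // True when the value is present.
    explicit operator bool() const { return value_.has_value(); }

    // Both accessors are checked: they throw when the value is missing.
    const Value& operator*() const { return value_.value(); }
    const Value& value() const { return value_.value(); }
};

enum class pop_status { success, empty };

status_value_sketch<pop_status, int> pop_back(std::vector<int>& v) {
    if (v.empty())
        return status_value_sketch<pop_status, int>(pop_status::empty);
    int last = v.back();
    v.pop_back();
    return {pop_status::success, last};
}
```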


<h3><a name="comparison">Comparison</a></h3>

<p>
We compare the recent approaches on three points.
</p>

<h4><a name="advisory">Advisory Information</a></h4>

<p>
The <code>status_value</code> proposal
differs from the <code>expected</code> proposal
in that it always provides a status.
By always having a status
that status can provide advisory information
in addition to the normal return value.
It can say, in effect,
"I was able to satisfy your need this time,
but in the future you need to modify your behavior
to reduce the risk of future disappointment".
A couple of examples are in order.
</p>

<ul>
<li><p>
A hash table insertion returns a status
"success but the table is getting full".
This allows the calling environment
to choose an appropriate time for cleaning or growing the hash table.
</p></li>

<li><p>
A concurrent data structure returns a status
"success but under contention".
This allows the calling environment to
defer or ignore non-essential work to keep response time low.
</p></li>
</ul>
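<p>
The first example might be sketched as below,
showing only the status enumeration and not any particular wrapper type.
The class <code>bounded_set</code>,
the enumeration <code>insert_status</code>,
and the one-quarter threshold are all assumptions made for illustration.
</p>

```cpp
#include <cstddef>
#include <unordered_set>

// Hypothetical status with an advisory "success, but" value.
enum class insert_status { success, success_nearly_full, full };

// Hypothetical set of ints with a caller-managed size limit.
class bounded_set {
    std::unordered_set<int> items_;
    std::size_t limit_;

public:
    explicit bounded_set(std::size_t limit) : limit_(limit) {}

    insert_status insert(int x) {
        if (items_.count(x) == 0 && items_.size() == limit_)
            return insert_status::full;  // disappointment: nothing inserted
        items_.insert(x);
        // Advisory: the insertion succeeded, but at most a quarter
        // of the limit remains free.
        if ((limit_ - items_.size()) * 4 <= limit_)
            return insert_status::success_nearly_full;
        return insert_status::success;
    }

    // The caller chooses when to act on the advice.
    void grow(std::size_t extra) { limit_ += extra; }
};
```

<p>
A caller that treats both success values alike loses nothing but the advice;
a caller that watches for <code>success_nearly_full</code>
can call <code>grow</code> at a convenient time
rather than at the moment an insertion fails.
</p>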

<p>
The two-function approach can also provide advisory information
in the case where one chooses the status-based function.
</p>

<p>
With all three approaches,
it is possible to ignore advisory information.
In the two-function approach, simply use the exception-throwing version.
In the other two approaches, use the conversion-to-bool for decisions.
</p>


<h4><a name="efficiency_of_return">Efficiency of Return</a></h4>

<p>
The three approaches have roughly the same efficiency when
the exceptional object is cheap, on the order of an enum.
As exceptional objects become more expensive,
the status-based function and the <code>status_value</code> type
pay an increasing cost.
The exception-based function and the <code>expected</code> type
avoid paying that cost for the non-exceptional case.
</p>

<p>
This difference in efficiency
shows most clearly in the file system technical specification.
The specification uses the two-function approach.
The status-based functions
return an <code>error_code</code>,
while the exception-based functions
throw a <code>filesystem_error</code>
that contains an <code>error_code</code>
plus additional diagnostic information.
The <code>status_value</code> type, as is,
encourages simple enumerations for status.
Returning a full <code>filesystem_error</code>
on every call would be relatively expensive.
In contrast,
the <code>expected</code> type
would only construct a <code>filesystem_error</code> at need.
</p>
 

<h4><a name="ease_of_return">Ease of Return</a></h4>

<p>
While not a major concern,
construction of a <code>status_value</code> object,
in which the status is always required,
is simpler than construction of an <code>expected</code> object.
</p>

<p>
While use of the two function approach
would seem simpler than either of the above,
it would likely not prove so in practice.
To prevent redundant implementation,
the main work is likely to be done in the status-based function
and the other exception-based function acts as a wrapper.
Thus, the code is likely more complex for the two-function approach.
</p>


<h2><a name="Standard">Recommendation to the Standard</a></h2>

<p>
Both <code>expected</code> and <code>status_value</code> provide value
and should be considered for adoption.
Some determined effort to unify the two proposals
would likely result in a more consistent outcome.
</p>

<p>
There are some technical changes to the proposals
that would improve them.
</p>

<ul>

<li><p>
Change the <code>status_value::status</code>
operation to return a reference rather than a value.
This change enables more expensive status types.
</p></li>

<li><p>
Change the <code>expected</code> dereference operator
to check for a valid expected object.
While the lack of a check is present for efficiency,
that efficiency would likely be gained by an optimizing
compiler recognizing a redundant comparison when the
<code>expected</code> functions are inlined.
</p></li>

<li><p>
Consider changing the <code>expected</code> construction
to have a single constructor with a discriminator parameter
rather than two auxiliary maker functions.
</p></li>

</ul>


<h2><a name="Programmers">Recommendation to Programmers</a></h2>

<p>
Programmers of a function
should consider how to communicate disappointment
to their callers
with the following advice, taken in order.
</p>

<ul>

<li><p>
If the function will never disappoint,
use <code>noexcept</code> and do not have a status.
</p></li>

<li><p>
If there are significant real-time constraints on the program,
where the cost of an exception is prohibitive,
use a status-based function.
In such programs,
a strong review program will be necessary
to ensure that all disappointments are handled.
This path will prohibit some types and programs.
</p></li>

<li><p>
If a disappointment is not known in advance, throw an exception.
Programmers cannot handle what they cannot know.
The most common instance of this case
is a function invoking a callback function that throws.
The callback function may throw exceptions unknown to the outer function.
</p></li>

<li><p>
If a disappointment is rare and not actionable, throw an exception.
The concurrent queue returns a status from most functions,
but no status includes a mutex failure.
When a mutex fails, the queue operations will throw an exception.
</p></li>

<li><p>
If the function may provide advisory information,
use <code>status_value</code>.
Callers can decide whether a disappointment is rare or actionable.
</p></li>

<li><p>
If the exceptional object is expensive to construct,
use <code>expected</code>.
Callers can decide whether a disappointment is rare or actionable.
(The exceptional object need not, and should not, have a 'success' value,
as the programmer has access to the exceptional object only on disappointment.)
</p></li>

<li><p>
Otherwise,
use <code>status_value</code> or <code>expected</code>
based on which seems more natural for code explicitly handling disappointment.
</p></li>

</ul>

</body></html>
