<!DOCTYPE html>
<html lang="en">
	<head>
		<title>Safe integral comparisons</title>
	</head>
	<body>
		<address>
			Document number: P0586R1<br/>
			Date: 2018-08-17<br/>
			Project: Programming Language C++<br/>
			Reply-to: <a href="mailto:federico.kircheis@gmail.com">Federico Kircheis</a><br/>
			Audience: Library Evolution Working Group<br/>
		</address>

		<h1>Safe integral comparisons</h1>

		<h2 id="Table">I. Table of Contents</h2>
		<ul style="font-family:monospace">
			<li><a href="#Table"          >I).......Table of Contents</a></li>
			<li><a href="#Motivation"     >II)......Motivation</a></li>
			<li><a href="#Proposal"       >III).....Proposal</a></li>
			<li><a href="#Examples"       >IV)......Examples</a></li>
			<li><a href="#Implementation" >V).......Possible implementation</a></li>
			<li><a href="#Effects"        >VI)......Effects on Existing Code</a></li>
			<li><a href="#Design"         >VII).....Design Decisions</a></li>
			<li><a href="#Further"        >VIII)....Further Considerations</a></li>
			<li><a href="#Related"        >IX)......Related Works</a></li>
			<li><a href="#Wording"        >X).......Proposed Wording</a></li>
		</ul>


		<h2 id="Motivation">II. Motivation</h2>

		<p>
			Comparing integrals of different types may be a more complex task than expected. Most of the time we expect that a simple
		</p>
<pre><code>	if(a &lt; b){
		// ...
	} else {
		// ...
	}
</code></pre>
		<p>
			should work in all cases, but if <code>a</code> and <code>b</code> are of different types, things are more complicated.
		</p>
		<p>
			If <code>a</code> is a signed type, and <code>b</code> unsigned, then, supposing that no integral promotion is taking place, <code>a</code> is converted to the unsigned type.
			If <code>a</code> holds a number less than zero, then the result may be unexpected, since the expression <code>a &lt; b</code> would evaluate to false, even though a strictly negative number is always lower than a positive one.
			The reason for this behavior is that unsigned types use modular arithmetic, whereas most of the time, for example when working with containers and mixing signed and unsigned types, we want ordinary integer arithmetic.
			Also, converting integrals between different types can be challenging. For simplicity, most of the time we assume that values are in range, and write
		</p>
<pre><code>	a = static_cast&lt;decltype(a)&gt;(b);</code></pre>
		<p>
			If we want to write a safe conversion, we need to check if <code>b</code> has a value between <code>std::numeric_limits&lt;decltype(a)&gt;::min()</code> and <code>std::numeric_limits&lt;decltype(a)&gt;::max()</code>.
			We also need to pay attention that no implicit conversion (for example between unsigned and signed types) invalidates our comparison.
		</p>
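		<p>
			As a hedged illustration of the two pitfalls above, the following sketch compares an <code>int</code> with an <code>unsigned int</code> and checks whether an <code>unsigned int</code> fits into an <code>int</code>. The namespace <code>sketch</code> and all function names are made up for this example and are not part of the proposal; compilers will typically warn about the lines that mix signedness.
		</p>

```cpp
#include <limits>

namespace sketch {

// Built-in comparison: `a` is implicitly converted to unsigned int first,
// so a negative `a` turns into a very large value.
constexpr bool builtin_less(int a, unsigned int b) noexcept {
    return a < b;
}

// Hand-written value comparison for this one pair of types.
constexpr bool value_less(int a, unsigned int b) noexcept {
    return a < 0 || static_cast<unsigned int>(a) < b;
}

// Naive "does b fit into int?" check: numeric_limits<int>::min() is
// implicitly converted to unsigned int, so the lower bound becomes a
// large positive number and the check rejects small values.
constexpr bool naive_fits_in_int(unsigned int b) noexcept {
    return b >= std::numeric_limits<int>::min()
        && b <= std::numeric_limits<int>::max();
}

// Careful variant: an unsigned value has no lower bound to violate, and the
// upper bound is converted explicitly (a non-negative int keeps its value).
constexpr bool fits_in_int(unsigned int b) noexcept {
    return b <= static_cast<unsigned int>(std::numeric_limits<int>::max());
}

}  // namespace sketch
```

		<p>
			Under these assumptions, <code>sketch::builtin_less(-1, 1u)</code> is <code>false</code> while <code>sketch::value_less(-1, 1u)</code> is <code>true</code>, and on common platforms the naive range check rejects a small value such as <code>5u</code> that the careful variant accepts.
		</p>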
		<p>
			Comparing and converting numbers, even of different numeric types, should be a trivial task.
			Unfortunately it is not, and because of implicit conversions we may write unsafe code without noticing it.
		</p>
		<p>
			Most compilers are able to provide diagnostics and generate warnings when comparing values of different types, or when doing a narrowing conversion.
		</p>
		<p>
			Developers are tempted to assume that values will mostly be in range and either write a simple, but possibly wrong, cast in order to silence the warning, or leave the corresponding compiler warning turned off altogether.
		</p>

		<h2 id="Proposal">III. Proposal</h2>

		<p>
			This paper proposes to add a set of <code>constexpr</code> and <code>noexcept</code> functions for converting and comparing integrals of different signedness, except for <code>bool</code> and character types.
		</p>

		<ul>
			<li>
				Two functions to compare if two variables represent the same value or not
<pre><code>	template &lt;typename T, typename U&gt;
	constexpr bool std::cmp_equal(T t, U u) noexcept;

	template &lt;typename T, typename U&gt;
	constexpr bool std::cmp_not_equal(T t, U u) noexcept;
</code></pre>


			<li>
				A set of functions that can be used to determine the relative order of two values
<pre><code>	template &lt;typename T, typename U&gt;
	constexpr bool std::cmp_less(T t, U u) noexcept;

	template &lt;typename T, typename U&gt;
	constexpr bool std::cmp_greater(T t, U u) noexcept;

	template &lt;typename T, typename U&gt;
	constexpr bool std::cmp_less_equal(T t, U u) noexcept;

	template &lt;typename T, typename U&gt;
	constexpr bool std::cmp_greater_equal(T t, U u) noexcept;
</code></pre>

			<li>
				One function to determine if a specific value is inside the range of possible values of another type (i.e. if we can convert the value to the other type safely)
<pre><code>	template &lt;typename R, typename T&gt;
	constexpr bool std::in_range(T t) noexcept;
</code></pre>
		</ul>
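		<p>
			To make the intent of <code>in_range</code> more concrete, here is a minimal sketch of a range check together with a guarded conversion. It is only an illustration: the namespace <code>sketch</code> and the helper <code>convert_or</code> are hypothetical, and the check widens both sides to <code>std::intmax_t</code> or <code>std::uintmax_t</code>, which is a different technique from the one used in the proposed wording and assumes that every operand type fits into those types.
		</p>

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

namespace sketch {

// Hypothetical stand-in for the proposed in_range<R>(t).
template <typename R, typename T>
constexpr bool in_range(T t) noexcept {
    return t < 0
        // A negative value can only fit if R is signed and its minimum is
        // low enough; both sides are widened to the largest signed type.
        ? (std::is_signed<R>::value
           && static_cast<std::intmax_t>(t)
                  >= static_cast<std::intmax_t>(std::numeric_limits<R>::min()))
        // A non-negative value only has to respect the upper bound; both
        // sides are widened to the largest unsigned type.
        : (static_cast<std::uintmax_t>(t)
               <= static_cast<std::uintmax_t>(std::numeric_limits<R>::max()));
}

// Hypothetical helper: convert when the value fits, otherwise use `fallback`.
template <typename R, typename T>
constexpr R convert_or(T t, R fallback) noexcept {
    return in_range<R>(t) ? static_cast<R>(t) : fallback;
}

}  // namespace sketch
```

		<p>
			With this sketch, <code>sketch::in_range&lt;unsigned int&gt;(-1)</code> is <code>false</code>, and <code>sketch::convert_or&lt;unsigned char&gt;(300, 0)</code> yields the fallback instead of a silently truncated value.
		</p>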


		<h2 id="Examples">IV. Examples</h2>
			<h3>Examples without current proposal</h3>
				<p>Comparing an unsigned int with an int:</p>
<pre><code>	int a = ...
	unsigned int b = ...
	// add static_cast to avoid compiler warnings since we are doing a "safe" comparison
	if(a &lt; 0 || static_cast&lt;unsigned int&gt;(a) &lt; b){
		// do X
	} else {
		// do Y
	}
</code></pre>

				<p>Comparing an int32_t with a uint16_t:</p>
<pre><code>	int32_t a = ...
	uint16_t b = ...
	// add static_cast to avoid compiler warnings since we are doing a "safe" comparison
	if(a &lt; static_cast&lt;int32_t&gt;(b)){
		// do X
	} else {
		// do Y
	}
</code></pre>

				<p>Comparing an int with an intptr_t:</p>
<pre><code>	int a = ...
	intptr_t b = ...
	if(???){ // no idea how to do it in one readable line without some assumption about int and intptr_t
		// do X
	} else {
		// do Y
	}
</code></pre>


			<h3>Example with current proposal</h3>
				<p>
					Comparing a value of one integral type <code>A</code> with a value of another integral type <code>B</code> (neither being <code>bool</code> nor a character type):
				</p>
<pre><code>	A a = ...
	B b = ...
	// no need for any cast since std::cmp_less is taking care of everything
	if(std::cmp_less(a,b)){
		// do X
	} else {
		// do Y
	}
</code></pre>

		<h2 id="Implementation">V. Possible implementation</h2>
			<p>
				A possible implementation can be found on <a href="https://raw.githubusercontent.com/fekir/safeintegral/master/safeintegral/safeintegralop_cmp.hpp">github</a>.
				The only dependencies are the <code>std::numeric_limits</code> class template from the <code>limits</code> header, some traits from the <code>type_traits</code> header and a standard-conforming C++11 compiler.
			</p>
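			<p>
				The linked implementation is not reproduced here. As a rough, independent sketch of how such functions could be written with C++11 facilities only, the following code dispatches on the signedness of both types. The namespaces <code>sketch</code> and <code>detail</code> and the helper names are assumptions made for this example, and the check that rejects <code>bool</code> and character types is left out for brevity.
			</p>

```cpp
#include <type_traits>

namespace sketch {
namespace detail {

// Tag type encoding whether T is signed (std::true_type / std::false_type).
template <typename T>
using sign_tag = typename std::is_signed<T>::type;

template <typename T>
using unsigned_of = typename std::make_unsigned<T>::type;

// Same signedness: the built-in operators already compare the values.
template <typename T, typename U, bool S>
constexpr bool equal(T t, U u, std::integral_constant<bool, S>,
                     std::integral_constant<bool, S>) noexcept {
    return t == u;
}
// T signed, U unsigned: a negative t cannot equal any u.
template <typename T, typename U>
constexpr bool equal(T t, U u, std::true_type, std::false_type) noexcept {
    return t >= 0 && static_cast<unsigned_of<T>>(t) == u;
}
// T unsigned, U signed: swap the operands and reuse the case above.
template <typename T, typename U>
constexpr bool equal(T t, U u, std::false_type, std::true_type) noexcept {
    return equal(u, t, std::true_type{}, std::false_type{});
}

// Same signedness: the built-in operator is value-preserving.
template <typename T, typename U, bool S>
constexpr bool less(T t, U u, std::integral_constant<bool, S>,
                    std::integral_constant<bool, S>) noexcept {
    return t < u;
}
// T signed, U unsigned: a negative t is below every u.
template <typename T, typename U>
constexpr bool less(T t, U u, std::true_type, std::false_type) noexcept {
    return t < 0 || static_cast<unsigned_of<T>>(t) < u;
}
// T unsigned, U signed: nothing is below a negative u.
template <typename T, typename U>
constexpr bool less(T t, U u, std::false_type, std::true_type) noexcept {
    return u >= 0 && t < static_cast<unsigned_of<U>>(u);
}

}  // namespace detail

template <typename T, typename U>
constexpr bool cmp_equal(T t, U u) noexcept {
    return detail::equal(t, u, detail::sign_tag<T>{}, detail::sign_tag<U>{});
}
template <typename T, typename U>
constexpr bool cmp_not_equal(T t, U u) noexcept { return !cmp_equal(t, u); }

template <typename T, typename U>
constexpr bool cmp_less(T t, U u) noexcept {
    return detail::less(t, u, detail::sign_tag<T>{}, detail::sign_tag<U>{});
}
// The remaining orderings are expressed through cmp_less.
template <typename T, typename U>
constexpr bool cmp_greater(T t, U u) noexcept { return cmp_less(u, t); }
template <typename T, typename U>
constexpr bool cmp_less_equal(T t, U u) noexcept { return !cmp_less(u, t); }
template <typename T, typename U>
constexpr bool cmp_greater_equal(T t, U u) noexcept { return !cmp_less(t, u); }

}  // namespace sketch
```

			<p>
				Deriving four of the six functions from <code>cmp_less</code> keeps the sign handling in one place; a real implementation may well prefer to spell out each case.
			</p>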

		<h2 id="Effects">VI. Effects on Existing Code</h2>
			<p>
				Since the proposed functions are new and not yet defined in any standard header, no existing code changes its meaning.
			</p>

		<h2 id="Design">VII. Design Decisions</h2>
			<p>
				This proposal addresses how to compare <em>numerical values</em> of different types (aka standard integer types and extended integer types) in a safe and simple way.
				It makes little sense to compare <code>true</code>, <code>false</code>, <code>'a'</code> and other characters to numbers, since they represent different logical entities.
				The encoding of characters is also not specified, therefore the possible valid comparison <code>'a' == 97</code> might yield different results depending on the locale, compiler or platform.
			</p>
			<p>
				Providing an overload for <code>char</code> might not reduce confusion, for example:
			</p>
<pre><code>	char c = -1;
	cmp_less(c, 0); // true if char is signed, false if char is unsigned.
</code></pre>
			<p>
				If the user has to choose between <code>signed char</code> or <code>unsigned char</code>, the behaviour will always be consistent.
				Using <code>char</code> for storing a number is a valid use case (the language permits it), but the types <code>signed char</code> and <code>unsigned char</code> should be preferred since those are standard integer types and have the same size.
			</p>
			<p>
				I would also recommend not to provide overloads for <code>bool</code> and the character types because it is easier to add them later if needed, whereas removing them might be more difficult since it would be a breaking change.
			</p>
			<p>
				If the LEWG would like to include <code>char</code>, I think it would be better to provide an overload for every character type for consistency.
			</p>
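			<p>
				The restriction discussed in this section could be expressed as a type trait. The following is only a sketch under stated assumptions: the names <code>sketch::is_comparable_integer</code> and <code>sketch::both_comparable</code> are hypothetical, and the trait works by exclusion, so any character type not listed in it would have to be added to the list.
			</p>

```cpp
#include <type_traits>

namespace sketch {

// True for integer types other than bool and the character types.
// std::is_integral also covers extended integer types, which is why the
// unwanted types are excluded instead of listing the wanted ones.
template <typename T>
struct is_comparable_integer
    : std::integral_constant<
          bool,
          std::is_integral<T>::value
              && !std::is_same<typename std::remove_cv<T>::type, bool>::value
              && !std::is_same<typename std::remove_cv<T>::type, char>::value
              && !std::is_same<typename std::remove_cv<T>::type, wchar_t>::value
              && !std::is_same<typename std::remove_cv<T>::type, char16_t>::value
              && !std::is_same<typename std::remove_cv<T>::type, char32_t>::value> {};

// Convenience check for a pair of argument types.
template <typename T, typename U>
constexpr bool both_comparable() noexcept {
    return is_comparable_integer<T>::value && is_comparable_integer<U>::value;
}

}  // namespace sketch
```

			<p>
				A comparison function could then start with <code>static_assert(sketch::both_comparable&lt;T, U&gt;(), "...")</code>, so that a call with <code>bool</code> or <code>char</code> arguments is rejected at compile time.
			</p>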
		<h2 id="Further">VIII. Further Considerations</h2>
			<p>
				I've heard rumors that it might be possible that the current <code>operator&lt;</code> et al. could get deprecated and maybe changed someday to behave like the functions proposed in this proposal.<br>
				I would like to add some considerations:
			</p>
			<ul>
				<li>
					<p>Doing the right thing might be less efficient than doing the wrong thing.
					Changing how <code>operator&lt;</code> works on integral types might make it less efficient: it may require extra instructions, even an extra branch instruction.
					Performance is mostly irrelevant if we need to choose between the right result and a possibly wrong result.
					Compilers are able to detect comparisons between numbers of different types, and they will very probably still be able to do so in the future even if <code>operator&lt;</code> changes meaning.
					If a developer wants better efficiency, they should use the same type to avoid conversions.</p>
				</li>
				<li>
					<p>Even today, comparing numbers might require more instructions and branches than expected on some targets.</p>
				</li>
				<li>
					<p>Because of optimizations and branch prediction, the <code>cmp_less</code> function might be as efficient as the current <code>operator&lt;</code>.</p>
				</li>
				<li>
					<p>There are some use cases where, today, we have a warning as a side-effect that shows the user that the code might be wrong, but by changing <code>operator&lt;</code> it will still be wrong and we will not have the warning anymore:</p>
					<code>for(auto i = 0; i &lt; container.size(); ++i){/**/}</code>.<br>
					<p>The code is wrong with all standard containers because the condition may never be met and there is a possible overflow.
					Since we are comparing, we get a warning because of <code>operator&lt;</code>. The problem is that in this case it's not the comparison that is wrong, but the whole expression (it could also be that size returns a signed type but with a bigger range).
					As stated above, the warning caused by <code>operator&lt;</code> is just a fortunate side-effect. I do not know if compilers in the future will be able to warn about those and more complex expressions.</p>
				</li>
				<li>
					Changing how the comparison operators work, but not the assignment operator, will lead to inconsistencies and can create subtle bugs:
<pre><code>unsigned int u = std::numeric_limits&lt;int&gt;::max();
int s = -1;
assert(s!=u); // supposing that operator!= compares between signed and unsigned without modulo behaviour
u = s;
assert(s == u); // expected to pass, but will fail
</code></pre>
					whereas simply deprecating the comparison would enhance the possibilities to spot the error.
				</li>
			</ul>
		<h2 id="Related">IX. Related Works</h2>
			<p>
				In 2016, Robert Ramey made a much bigger proposal (see <a href="http://wg21.link/p0228r0">p0228r0</a>) regarding safe integer types.
				He also used functions similar to those proposed in this paper for implementing his classes and operators, so an alternative implementation can be found on his <a href="https://github.com/robertramey/safe_numerics/blob/master/include/safe_compare.hpp">github repository</a>.
				This proposal addresses a smaller problem, namely comparing integral values, and is therefore much smaller.<br/>
				The functions provided can also be used for creating safe integer types.
			</p>

			<p>
				Another work, by Herb Sutter (see <a href="https://wg21.link/p0515r3">p0515r3</a>), is about a new comparison operator (<code>&lt;=&gt;</code>).
				In its current state, <code>operator&lt;=&gt;</code> will not compare different integral types. In a previous revision, as far as I have understood, the proposal stated that <code>operator&lt;=&gt;</code> should compare different integral types without modulo behaviour, which would have made part of this proposal obsolete.
			</p>

		<h2 id="Wording">X. Proposed Wording</h2>
			<p>
				This section presents the wording changes for P0586R1.
				Any differences in semantics are unintentional.
				<a href="http://wg21.link/n4659">n4659</a> has been used as reference.
			</p>

			<p>
				During the meeting at Rapperswil the committee suggested using the function names of the spaceship operator (<code>is_eq, is_neq, is_lt, ...</code>, see <a href="https://wg21.link/p0515">p0515</a>) for this proposal, and giving the functions of the spaceship operator more verbose names.
				The functions used by the spaceship operator should not appear often, since they are used behind the scenes, whereas the functions in this proposal need to be called explicitly. Such a change would therefore provide short and concise names that can improve readability.
				I did not rename the functions of this proposal with the function names of the spaceship operator in order to avoid confusion.
			</p>

			<h3 id="Wording">X.a Proposed Wording with long names</h3>
			<p>
				In 23.2.1 Header &lt;utility&gt; synopsis, add declarations:
			</p>
<pre><code>	// 23.2.10, safe integral comparisons
	template &lt;typename R, typename T&gt;
	constexpr bool in_range(const T t) noexcept;

	template &lt;typename T, typename U&gt;
	constexpr bool cmp_equal(const T t, const U u) noexcept;
	template &lt;typename T, typename U&gt;
	constexpr bool cmp_not_equal(const T t, const U u) noexcept;

	template &lt;typename T, typename U&gt;
	constexpr bool cmp_less(const T t, const U u) noexcept;
	template &lt;typename T, typename U&gt;
	constexpr bool cmp_greater(const T t, const U u) noexcept;
	template &lt;typename T, typename U&gt;
	constexpr bool cmp_less_equal(const T t, const U u) noexcept;
	template &lt;typename T, typename U&gt;
	constexpr bool cmp_greater_equal(const T t, const U u) noexcept;
</code></pre>

			<p>
				Add a new Section <q>23.2.10, safe integral comparisons</q>, with the following content:
			</p>
				<pre>
  1. For each of the following functions, if either of `T` or `U` is not
  a standard integer type or an extended integer type, as specified in 6.9.1,
  the call is ill-formed.
  [Note: std::byte, char, char16_t, char32_t, wchar_t, and bool are not
         comparable with these functions. --end note]

  template &lt;typename T, typename U&gt;
  constexpr bool cmp_equal(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t == u`. Otherwise, if `t` or `u` is negative, returns
      `false`. Otherwise, if `T` is a signed type, constructs from `t`
      a value `tu` of the corresponding unsigned type and returns
      `tu == u`.  Otherwise, if `U` is a signed type, constructs from
      `u` a value `uu` of the corresponding unsigned type and returns
      `t == uu`.


  template &lt;typename T, typename U&gt;
  constexpr bool cmp_not_equal(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t != u`. Otherwise, if `t` or `u` is negative, returns
      `true`. Otherwise, if `T` is a signed type, constructs from `t`
      a value `tu` of the corresponding unsigned type and returns
      `tu != u`.  Otherwise, if `U` is a signed type, constructs from
      `u` a value `uu` of the corresponding unsigned type and returns
      `t != uu`.


  template &lt;typename T, typename U&gt;
  constexpr bool cmp_less(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t &lt; u`. Otherwise, if `t` is negative, returns `true`.
      Otherwise, if `u` is negative, returns `false`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu &lt; u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t &lt; uu`.


  template &lt;typename T, typename U&gt;
  constexpr bool cmp_greater(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t &gt; u`. Otherwise, if `t` is negative, returns `false`.
      Otherwise, if `u` is negative, returns `true`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu &gt; u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t &gt; uu`.


  template &lt;typename T, typename U&gt;
  constexpr bool cmp_less_equal(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t &lt;= u`. Otherwise, if `t` is negative, returns `true`.
      Otherwise, if `u` is negative, returns `false`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu &lt;= u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t &lt;= uu`.


  template &lt;typename T, typename U&gt;
  constexpr bool cmp_greater_equal(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t &gt;= u`. Otherwise, if `t` is negative, returns `false`.
      Otherwise, if `u` is negative, returns `true`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu &gt;= u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t &gt;= uu`.


  template &lt;typename R, typename T&gt;
  constexpr bool in_range(const T t) noexcept;

    Returns: The same value as
      `cmp_greater_equal(t, std::numeric_limits&lt;R&gt;::min()) &amp;&amp;
       cmp_less_equal(t, std::numeric_limits&lt;R&gt;::max())`.

				</pre>
				In case the LEWG would like to include <code>char</code> in the argument set, replace paragraph 1 with
				<pre>
  1. For each of the following functions, if either of `T` or `U` is not
  a standard integer type or extended integer type, as defined in
  6.9.1, and not char, the call is ill-formed.  If the implementation
  defines `char` to be a signed type, its corresponding unsigned type,
  in the following, is `unsigned char`.
  [Note: std::byte, char16_t, char32_t, wchar_t, and bool are not
         comparable using these functions. --end note]
				</pre>


			<h3 id="Wording">X.b Proposed Wording with short names</h3>
			<p>
				In 23.2.1 Header &lt;utility&gt; synopsis, add declarations:
			</p>
<pre><code>	// 23.2.10, safe integral comparisons
	template &lt;typename R, typename T&gt;
	constexpr bool in_range(const T t) noexcept;

	template &lt;typename T, typename U&gt;
	constexpr bool is_eq(const T t, const U u) noexcept;
	template &lt;typename T, typename U&gt;
	constexpr bool is_neq(const T t, const U u) noexcept;

	template &lt;typename T, typename U&gt;
	constexpr bool is_lt(const T t, const U u) noexcept;
	template &lt;typename T, typename U&gt;
	constexpr bool is_gt(const T t, const U u) noexcept;
	template &lt;typename T, typename U&gt;
	constexpr bool is_lteq(const T t, const U u) noexcept;
	template &lt;typename T, typename U&gt;
	constexpr bool is_gteq(const T t, const U u) noexcept;
</code></pre>

			<p>
				Add a new Section <q>23.2.10, safe integral comparisons</q>, with the following content:
			</p>
				<pre>
  1. For each of the following functions, if either of `T` or `U` is not
  a standard integer type or an extended integer type, as specified in 6.9.1,
  the call is ill-formed.
  [Note: std::byte, char, char16_t, char32_t, wchar_t, and bool are not
         comparable with these functions. --end note]

  template &lt;typename T, typename U&gt;
  constexpr bool is_eq(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t == u`. Otherwise, if `t` or `u` is negative, returns
      `false`. Otherwise, if `T` is a signed type, constructs from `t`
      a value `tu` of the corresponding unsigned type and returns
      `tu == u`.  Otherwise, if `U` is a signed type, constructs from
      `u` a value `uu` of the corresponding unsigned type and returns
      `t == uu`.


  template &lt;typename T, typename U&gt;
  constexpr bool is_neq(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t != u`. Otherwise, if `t` or `u` is negative, returns
      `true`. Otherwise, if `T` is a signed type, constructs from `t`
      a value `tu` of the corresponding unsigned type and returns
      `tu != u`.  Otherwise, if `U` is a signed type, constructs from
      `u` a value `uu` of the corresponding unsigned type and returns
      `t != uu`.


  template &lt;typename T, typename U&gt;
  constexpr bool is_lt(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t &lt; u`. Otherwise, if `t` is negative, returns `true`.
      Otherwise, if `u` is negative, returns `false`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu &lt; u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t &lt; uu`.


  template &lt;typename T, typename U&gt;
  constexpr bool is_gt(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t &gt; u`. Otherwise, if `t` is negative, returns `false`.
      Otherwise, if `u` is negative, returns `true`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu &gt; u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t &gt; uu`.


  template &lt;typename T, typename U&gt;
  constexpr bool is_lteq(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t &lt;= u`. Otherwise, if `t` is negative, returns `true`.
      Otherwise, if `u` is negative, returns `false`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu &lt;= u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t &lt;= uu`.


  template &lt;typename T, typename U&gt;
  constexpr bool is_gteq(const T t, const U u) noexcept;

    Returns: If `T` and `U` are both signed, or both unsigned types,
      returns `t &gt;= u`. Otherwise, if `t` is negative, returns `false`.
      Otherwise, if `u` is negative, returns `true`.  Otherwise, if `T`
      is a signed type, constructs from `t` a value `tu` of the
      corresponding unsigned type and returns `tu &gt;= u`.  Otherwise,
      if `U` is a signed type, constructs from `u` a value `uu` of the
      corresponding unsigned type and returns `t &gt;= uu`.


  template &lt;typename R, typename T&gt;
  constexpr bool in_range(const T t) noexcept;

    Returns: The same value as
      `is_gteq(t, std::numeric_limits&lt;R&gt;::min()) &amp;&amp;
       is_lteq(t, std::numeric_limits&lt;R&gt;::max())`.

				</pre>
				In case the LEWG would like to include <code>char</code> in the argument set, replace paragraph 1 with
				<pre>
  1. For each of the following functions, if either of `T` or `U` is not
  a standard integer type or extended integer type, as defined in
  6.9.1, and not char, the call is ill-formed.  If the implementation
  defines `char` to be a signed type, its corresponding unsigned type,
  in the following, is `unsigned char`.
  [Note: std::byte, char16_t, char32_t, wchar_t, and bool are not
         comparable using these functions. --end note]
				</pre>

	</body>
</html>
