<!DOCTYPE HTML>
<html>
<head>
	<title>Static analysis and 'safety' of Contracts, P2900 vs. P2680/P3285</title>

	<style>
	p {text-align:justify}
	li {text-align:justify}
	blockquote.note
	{
		background-color:#E0E0E0;
		padding-left: 15px;
		padding-right: 15px;
		padding-top: 1px;
		padding-bottom: 1px;
	}
	ins {color:#00A000}
	del {color:#A00000}
	</style>
</head>
<body>

<address align=right>
Document number: P3362R0
<br/>
Audience: EWG
<br/>
<br/>
<a href="mailto:ville.voutilainen@gmail.com">Ville Voutilainen</a><br/>
<a href="mailto:richard.corden@gmail.com">Richard Corden</a><br/>

2024-08-11<br/>
</address>
<hr/>
<h1 align=center>Static analysis and 'safety' of Contracts, P2900 vs. P2680/P3285</h1>

<h2>Abstract</h2>

<p>This paper discusses two things. First, the part that the authors
  find quite alarming: that various people seem to think that
  the proposed P2900 Contracts are an
  effective addition to C++ that makes the language, or rather the programs
  written in it, 'safer'.</p>

<p>Second, the part that the authors find equally alarming: the claim
  that the proposed P2900 Contracts provide good-enough support
  for static analysis based on contract annotations, which is
  highly related to the first part. It was indeed suggested
  in the St. Louis meeting that P2900 has enough support for
  static analysis, and the shortcomings that we attempted
  to discuss were met with a rather interesting statement,
  roughly along the lines of "static analyzers already do useful
  things with asserts, but if an analyzer <em>can't prove that
    an assert has no side effects, such an assert is simply <b>ignored</b></em>".
</p>

<p>That response is alarming to the hilt. It seemed to have been
  made as an argument that the support P2900 lends
  to static analysis is sufficient, and the emphasized part was
  intended to suggest that there's no problem with using the P2900
  Contracts for static analysis, and that static analysis tools can
  use those contracts just fine.</p>

<p>What is suggested there, in the emphasized part of the paraphrased
  quote, as a non-problem, is a huge problem.
  It is <b>the</b> problem.
</p>

<p>What EWG, and WG21, need to do is to make sure the Contracts approach
  proposed in <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2023/p2680r1.pdf">P2680</a> and <a href="https://www.open-std.org/jtc1/sc22/wg21/docs/papers/2024/p3285r0.pdf">P3285</a> is included in the first incarnation
  of the Contracts facility that we ship. We must change direction,
  and we need to do that before we ship anything in this space. We have
  to do that to have a language facility that is useful for the whole
  C++ community, and a language facility that effectively and meaningfully
  improves the safety of C++ programs.
</p>

<p>The proposed P2900 Contracts don't do that, and can't. The ones proposed
  in P2680/P3285 can.
</p>

<h2>Ignored until proven idempotent</h2>

<p>For illustrative purposes, let's consider the following
  declarations:
<blockquote><pre><code>bool pred(int x);
int func_a() post(r: pred(r));
void func_b(int x) pre(pred(x));
</code></pre></blockquote>
and the first example of their use is
<blockquote><pre><code>func_b(func_a());
</code></pre></blockquote>
Now, the question is, is that code snippet correct?
</p>
<p>Can you, the reader of the code, know whether that snippet is
  correct? Can a static analysis tool tell you that it is correct,
  or that it doesn't seem to be correct?
</p>
<p>The answer is "no" on all counts. Both you the reader and
  an analysis tool can <b>guess</b> that it might be correct,
  because a postcondition assertion with exactly the same
  condition as a subsequent precondition assertion has been
  encountered right before a call to a function that
  requires that precondition.
</p>
<p>What if <code>pred()</code> has a side effect? Well, you can't
  then tell whether the postcondition and the precondition connect,
  because you can't tell whether they end up having the same value,
  because a side effect might cause them not to.
</p>
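<p>To make that concrete, here is a minimal sketch (ours, not part of
  the example interface above) of a predicate with a planted side
  effect; two consecutive evaluations of the same condition disagree:</p>
<blockquote><pre><code>// Hypothetical predicate with a side effect: each call flips internal
// state, so consecutive evaluations of pred(x) can disagree even for
// the same argument value.
inline int pred_calls = 0;

inline bool pred(int) { return ++pred_calls % 2 == 1; }

// The check behind post(r: pred(r)) would see 'true'; the immediately
// following check behind pre(pred(x)), on the same value, would see
// 'false'. The annotations connect textually, but not in fact.
</code></pre></blockquote>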
<p>What if <code>pred()</code> encounters Undefined Behavior?
  Well, you can't  then tell whether the postcondition and the
  precondition connect,
  because you can't tell whether they end up having the same value,
  because UB might cause them not to.
</p>
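<p>A minimal sketch of the UB case (again ours, with a name of our
  choosing): a predicate whose condition harbors signed overflow for
  some arguments, so a compiler may fold it to a constant:</p>
<blockquote><pre><code>// Hypothetical predicate harboring UB: when x == INT_MAX, x + 1
// overflows a signed int, which is Undefined Behavior. Compilers
// commonly fold this comparison to 'true' unconditionally, so an
// analyzer cannot assume the post- and precondition evaluate to the
// same well-defined value for every argument.
inline bool overflow_pred(int x) { return x + 1 > x; }
</code></pre></blockquote>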
<p>What if <code>pred()</code> doesn't encounter Undefined Behavior,
  but something in it is optimized away, and produces a result
  that you didn't expect? Well, in case no UB is encountered, and
  no side effects occur, the postcondition and the precondition
  probably connect. But is the program correct? Are the postcondition
  and the precondition doing what you expect them to do? Maybe not.
</p>
<p>So, for our code reader, and our static analysis tool, those
  post/pre don't tell us much. For better or worse, they give us a hint
  that we can use to guess that the program might be correct. But
  that's all they give us. We can't <b>prove</b> that, we can't reason
  about that, without reading a whole lot of other code.
</p>
<p>So, if you tell a static analyzer to find <b>proofs</b> of correctness
  or incorrectness of your code, for this particular example it will,
  at most, say "I can't". If you ask it to find guesses where there
  are indications of code being correct, it might say that this code
  has some. But as suggested, an analyzer can't <b>prove</b> anything about
  this code, so it'll by and large just ignore those post/pre, at least
  for the analysis that attempts to <b>prove</b> something.
</p>
<p>At this point, let's take a short side-step to the world that
  P2680/P3285 give us. To do that, we need to change our predicate
  slightly, and I'm using a context-sensitive keyword instead of an
  attribute:
<blockquote><pre><code>bool pred(int x) conveyor;
int func_a() post(r: pred(r));
void func_b(int x) pre(pred(x));
</code></pre></blockquote>
and the example of their use is
<blockquote><pre><code>func_b(func_a());
</code></pre></blockquote>
</p>
<p>Can you, the reader of the code, tell whether that code is correct?
  Yes, you can. The postcondition connects to the precondition. Can
  a static analyzer prove that? Yes, it can. The predicate is
  <b>guaranteed</b> not to have side-effects or UB. Pay particular
  attention to the fact that we do <b>not</b> need to look at any definitions
  to come to this <b>proof</b>. It can be constructed based on declarations,
  and code calling functions so declared.
</p>

<h2>Burden of proof</h2>

<p>So, with P2900 Contracts, there are various cases where an attempt
  to construct a proof simply fails; it just can't be done.
</p>

<h3>No visible definition</h3>

<p>If the definition of <code>pred</code> is in another translation
  unit (and we are not talking about an inline function in a module
  interface here, but about a case where the definition
  just isn't visible), we get the following entertaining results:
  <ul>
    <li>For P2900, game over. Neither a human reader nor a static analyzer
      can construct any proof, either for or against correctness.</li>
    <li>For P2680/P3285, the proof can be constructed as before, no problem. The
      postcondition connects to the precondition. We know that. There is
      a language guarantee of a lack of side effects and UB. Nothing
      changed.
  </li></ul>
</p>
<p>Let's look at a slight twist, with slightly different
  building blocks:
<blockquote><pre><code>
class X
{
    virtual bool pred(int x);
public:
    int func_a() post(r: pred(r));
    void func_b(int x) pre(pred(x));
};
</code></pre></blockquote>
and the example of their use is
<blockquote><pre><code>void do_stuff(X& x) {
    x.func_b(x.func_a());
}      
</code></pre></blockquote>
</p>
<p>For P2900, it's still game over. No proofs can be constructed
  without finding the final overrider of <code>X::pred</code> and looking at
  its definition.
</p>
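<p>To illustrate why (a sketch of ours; the contract annotations are
  elided into comments so it compiles today, and the bodies are
  placeholders), consider a final overrider that does something the
  base declaration says nothing about:</p>
<blockquote><pre><code>// The class X from above, with placeholder bodies and the contract
// annotations shown as comments.
class X
{
    virtual bool pred(int) { return true; }
public:
    int func_a() /* post(r: pred(r)) */ { return 42; }
    void func_b(int x) /* pre(pred(x)) */ { (void)x; }
};

// A hypothetical final overrider with a side effect. Under P2900, any
// proof built from X's declarations alone is invalidated here; the
// analyzer must locate and inspect this definition. One would expect
// a conveyor-marked X::pred to instead reject such an override.
class Y : public X
{
    int calls = 0;
    bool pred(int) override { return ++calls % 2 == 1; }
};
</code></pre></blockquote>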
<p>For P2680/P3285, we do the same tweak as before,
<blockquote><pre><code>
class X
{
    virtual bool pred(int x) conveyor;
public:
    int func_a() post(r: pred(r));
    void func_b(int x) pre(pred(x));
};
</code></pre></blockquote>
and we have no problem constructing a proof. We can still construct
that proof without knowing what
the call resolves to, and without looking at the definition of that
final overrider.
</p>

<h3>Definition visible, the cost of proof</h3>

<p>First of all, static analyzers available for C++ are fantastic.
  If you know which one to procure, and have the budget for it.
  For example, analyzers like Perforce's Helix QAC and Polyspace advertise
  being able to prove that your code doesn't run into Undefined
  Behavior. In some cases. And with varying effort/cost, which
  is configurable; you can tell such an analyzer how hard it
  should try to construct such a proof.
</p>

<p>See, that proof asks the questions "does this code invoke a side-effect?"
  and "does this code invoke Undefined Behavior, perhaps as a result of
  such a side-effect?". To some of us, those questions seem familiar.
  They are variations of the same question: "does this code <em>halt</em>?".
</p>

<p>So, for all cases and in general, that proof can't be done. But
  it is, of course, not quite so bleak. The tools apply symbolic
  evaluation, do flow analysis, and when so configured, effectively
  symbolically fuzz your code. They employ various analyses, some simpler,
  some more complex, going all the way to a brute-force approach
  if so configured.
</p>

<p>So, depending on the complexity and the amount of code you have
  in your predicates, that proof-attempt may be quite heavy. And it
  will be quite different for different invocations of the same
  predicate, and for different predicates. Sometimes it'll be just
  impossible, and the analyzer will have to give up and tell you
  it can't compute the proof.
</p>

<p>Consider, for example,
<blockquote><pre><code>
class X
{
    virtual bool pred(int x);
public:
    int func_x() post(r: pred(r));
    void func_a(int x) pre(pred(x));
    void func_b(int x) pre(pred(x));
};
</code></pre></blockquote>
and a use
<blockquote><pre><code>void do_stuff(X& x) {
    int a = x.func_x();
    x.func_a(a);
    x.func_a(a + 10);
    x.func_b(a);   
    x.func_b(a + 10);   
}      
</code></pre></blockquote>
An attempt to prove that the postcondition connects to the preconditions
can't necessarily be done just once for the predicate. It may need to be redone
for calls with different arguments, taking into account the possible
effects of side effects in the predicate in between.
</p>

<p>And again, for P2680/P3285, all we need is a conveyor function as a
  predicate,
<blockquote><pre><code>
class X
{
    virtual bool pred(int x) conveyor;
public:
    int func_x() post(r: pred(r));
    void func_a(int x) pre(pred(x));
    void func_b(int x) pre(pred(x));
};
</code></pre></blockquote>
and we do not need to run or re-run any analysis of the predicate.
We know that it doesn't have side effects or UB, for
any argument values; without that knowledge in hand,
the analysis may need to be redone.
</p>
<p>And yes, I know that side effects in <code>func_x</code>, <code>func_a</code>, or <code>func_b</code>
  can wreck that analysis. But the difference is that for P2900,
  the <em>predicate itself</em> may need to be analyzed multiple times, whereas
  for P2680/P3285, it doesn't need to be analyzed at all.
</p>
<p>There is of course the difference that the compiler will need to
  do a semantic check of the implementation of <code>X::pred</code>. But the compiler
  will do that <b>once</b>. And that result can be reused by multiple
  analyses by a static analyzer, and by multiple runs. All that is required
  is the knowledge that the code compiled, i.e. followed the rules
  for 'strict' contracts and conveyor functions.
</p>

<h2>What the cost and feasibility difference means</h2>

<p>Considering the much lighter weight of static analysis
  that is based on P2680/P3285, that analysis is feasible
  to do during your build. Or on the fly in your IDE. The
  analysis never requires looking into predicate definitions,
  not once, let alone multiple times per run. It can be
  done looking at calls in a single TU, and looking at the declarations
  of what's being called.
</p>
<p>In practice, a static analysis based on P2900 can't be done
  nearly as often. It's much heavier, has false positives and false negatives,
  and sometimes doesn't produce a useful result. You're not going
  to enable such analyses in an IDE if you run into problematic
  cases, and neither will anyone else.
</p>

<h2>The elephant in the room: safety</h2>

<p>When talking about things like memory-safety, look at what modern
  memory-safe programming languages do: they make memory-unsafe constructs
  ill-formed, whenever they can. They employ run-time checks when
  they <em>have to</em>, due to halting-problem challenges for things
  like bounds-checking, but whenever they can, they just reject
  memory-unsafe constructs. Or make things so that memory-unsafe
  operations cannot occur (you can't have a dangling reference in a properly
  garbage-collected language).
</p>
<p>Many people in WG21 already think Contracts is a partial
  solution to safety problems, including memory-safety problems.
  Apparently the same expectation is held by some people outside
  WG21.</p>

<p>To avoid causing a huge disappointment, we should ship a Contracts
  facility that's amenable to finding problems when software is built,
  not just when it's run. And we should do so effectively, not with
  a facility whose viability for static analysis depends on far too many things
  that could go wrong.</p>

<h3>It's not just about static analysis, either</h3>

<p>The 'strict' contracts proposed in P2680/P3285 aren't just
  better for static analysis. They are better for runtime-checking
  as well. When side-effects and UB are not in the picture, it's going
  to be far easier to write checks that actually do what the programmer
  intended, without the caveats, pitfalls, and traps of side-effects
  and UB.
</p>
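<p>As a minimal sketch of that pitfall (names and bodies are ours), a
  check with a side effect makes enforced and ignored builds compute
  different results:</p>
<blockquote><pre><code>// Hypothetical check with a side effect: if the condition is
// evaluated, the counter changes; if the build ignores the check, it
// doesn't. The two builds then disagree on the program state, which is
// exactly what 'strict' contracts rule out by construction.
inline int items = 3;

inline bool take_one()
{
    if (items == 0) return false;
    --items;                  // the side effect
    return true;
}

// contract_assert(take_one()); // enforced build: items becomes 2
//                              // ignoring build:  items stays 3
</code></pre></blockquote>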

<h2>And again, for the millionth time: this doesn't mean that the facilities proposed in P2900 shouldn't exist, and shouldn't be standardized</h2>

<p>Since the false argument keeps getting repeated, I'm going to try to
  dispel it for the umpteenth time: this doesn't mean we shouldn't have
  'relaxed' contracts. They are fine for some uses, and there are
  cases where you may not be able to use a 'strict' contract.
</p>
<p><b>However</b>, if we wish to avoid a fiasco of cosmic proportions,
  utterly failing to deliver something actually effective to our users
  in a reasonable timeframe, we can't just ship the 'relaxed' contracts
  and pretend we can feasibly ship the 'strict' ones later. They will
  need to come in the same package, or, if that can't be done, the
  'strict' ones will need to come first. Otherwise we are looking
  at both a technical disaster and a non-technical one.
</p>

<h2>P2900 can't and won't meet its goals</h2>

<p>There's a particularly intriguing realization that occurred to me
  when working on this problem space. Given P2900, how do we actually
  increase safety and correctness of C++ programs? You know, considering
  that we don't have any language help and support for knowing that
  our <em>contracts</em> are correct. Which, one would think, would
  be a rather important step in knowing that the rest of our programs
  are correct.
</p>

<p>P2680/P3285 give language support for avoiding all sorts of problems
  in the contracts themselves. P2900 gives none.</p>

<p>So, presumably the answer must be what it is for C++ code in general,
  since P2900 contracts are C++ code; there's nothing new, the P2900 contracts
  have all the things normal C++ code has. And that's suggested as a benefit.
  So, that means we will run our usual static analyzers on the code
  in those contracts, and we'll run our usual test approaches on them.
</p>

<p>The first problem is that now there's more code to analyze and test,
  <b>and</b> there are new interactions between that additional code and the
  code it's going to runtime-check. So the overall complexity of the
  correctness-checking task increases, and it increases in new ways.
</p>

<p>The second problem is that our existing practices Do Not Work. They
  may work for some shops, and for some particular users, but for
  the user community at large, they Do Not Work. Thousands and thousands
  of CVEs prove as much.
</p>

<p>
  So, now we would be applying approaches
  known not to work to <em>more code</em>, while needing to take into account
  the increased complexity of that code, and <em>new interactions</em> in it:
  interactions of the runtime-checking code and the code that's being checked.
</p>

<p>That is not just infeasible; it is logically impossible. We need
  a different approach. And P2680/P3285 provide us with that. The approach
  proposed there is better for everybody, including the staunch proponents
  of P2900, and including those who oppose spending time at this point on
  P2680/P3285. Everybody, including those two somewhat overlapping
  audiences, gets a facility that is better for <b>both</b> run-time
  checking <b>and</b> static analysis.</p>

<p>And maybe, just maybe, we can talk about Contracts in C++ being
  an effective safety measure without failing miserably in the field
  of practice, and having our <b>users</b> suffer the various consequences
  thereof.
</p>

<h2>Brief rumination about the opposition to P2680/P3285</h2>

<p>
  It is, of course, not an exact science why SG21 and SG23 members
  have thought that P2680/P3285 shouldn't be worked on further
  during the C++26 cycle. But one of the major arguments
  against doing so seems to be the thought that it's unnecessary
  to provide special UB/side-effect protection for contract
  conditions, when a more general protection in the language
  would then half-automatically benefit contracts as well.
</p>

<p>There are two reasons why that argument is faulty:
  <ul>
    <li>It seems likely that the more general protection will be an opt-in, so it
      might or might not be enabled. Nevertheless, what is necessary
      for contracts is that they are UB-free and side-effect-free regardless
      of whether the rest of the language is. You can apply contracts
      to code that's built without any UB protection. You can apply contracts
      to code that's built with some sort of UB protection. In both
      of those cases, the contracts should be UB-free and side-effect-free.
    </li>
    <li>It doesn't seem likely that any language profile will provide for
      the side-effect-free part. If a precondition or a contract assert
      ends up accidentally modifying a function parameter, that is not
      a borrowing violation. If a contract assert ends up modifying
      local variables in a function definition, that is not a borrowing
      violation.
    </li>
  </ul>
  That is why contracts should have those special protections, regardless
  of whether the rest of the language does. And the other argument in
  favor of having those protections especially for contracts still applies:
  contracts are special, because they are correctness annotations. Correctness
  annotations are less likely to be correct themselves if all the caveats
  and bug farms of all the rest of the language are present completely
  unmitigated.
</p>
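<p>A minimal sketch of the second bullet (ours; the names are
  hypothetical): a precondition that accidentally assigns to its
  parameter violates no lifetime or borrowing rule, so a memory-safety
  profile has nothing to flag, whereas a side-effect-freedom rule for
  contract predicates would reject it outright:</p>
<blockquote><pre><code>// Imagine: int half(int x) pre((x = 1) > 0);  -- '=' where '==' was
// meant. Evaluating the buggy check silently rewrites the parameter.
inline int half(int x)
{
    bool pre_ok = ((x = 1) > 0);  // emulates the check being evaluated
    (void)pre_ok;
    return x / 2;                 // caller passed 8 and expected 4; gets 0
}
</code></pre></blockquote>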
</body>
</html>
