﻿<hr>
<p>title: “Implementation defined coroutine extensions”<br>
document: P3203<br>
date: 2024-03-22<br>
audience: Core<br>
author:</p>
<ul>
<li>name: Klemens David Morgenstern<br>
email: <a href="mailto:klemens.d.morgenstern@gmail.com">klemens.d.morgenstern@gmail.com</a></li>
</ul>
<hr>
<h1 id="proposed-changes">Proposed Changes</h1>
<p>This paper proposes two wording changes to the standard that would make it legal (i.e. implementation-defined)<br>
for users to provide their own coroutine implementations.</p>
<p><strong>coroutine.handle.general-2</strong></p>
<blockquote>
<p>If a program declares an explicit or partial specialization of coroutine_handle, the behavior is undefined.</p>
</blockquote>
<p>Changed to</p>
<blockquote>
<p>If a program declares an explicit or partial specialization of coroutine_handle, the behavior is <strong>implementation-defined</strong>.</p>
</blockquote>
<p><strong>coroutine.handle.export.import-2</strong></p>
<blockquote>
<p><em>Preconditions</em>: addr was obtained via a prior call to address on an object whose type is a specialization of coroutine_handle.</p>
</blockquote>
<p>Changed to</p>
<blockquote>
<p><em>Preconditions</em>: addr was obtained via a prior call to address on an object whose type is a specialization of coroutine_handle<br>
<strong>which is neither explicit nor partial, obtained by a call to address on noop_coroutine_handle<br>
or points to a section of memory that is ABI compatible with the implementation provided by the former</strong>.</p>
</blockquote>
<h1 id="technical-background">Technical background</h1>
<p>The coroutine frame layouts are the same on MSVC, GCC and Clang; for a given <code>promise_type</code> they look like this:</p>
<pre class=" language-cpp"><code class="prism  language-cpp"><span class="token keyword">struct</span> coroutine_frame
<span class="token punctuation">{</span>
  <span class="token keyword">void</span> <span class="token punctuation">(</span><span class="token operator">*</span>resume<span class="token punctuation">)</span><span class="token punctuation">(</span>coroutine_frame <span class="token operator">*</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  <span class="token keyword">void</span> <span class="token punctuation">(</span><span class="token operator">*</span>destroy<span class="token punctuation">)</span><span class="token punctuation">(</span>coroutine_frame <span class="token operator">*</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
  promise_type promise<span class="token punctuation">;</span>
  <span class="token comment">// auxiliary data goes here, like the function arguments</span>
<span class="token punctuation">}</span><span class="token punctuation">;</span>
</code></pre>
<p>The <code>resume</code> and <code>destroy</code> member functions of <code>std::coroutine_handle</code> call the corresponding function pointers,<br>
whereas <code>promise</code> returns a reference to the <code>promise</code> member and <code>done</code> checks whether <code>resume</code> is null.</p>
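<p>As a rough illustration of the mapping described above, the following sketch models the handle operations on top of a hand-written frame. The names <code>handle_sketch</code>, <code>step</code> and <code>cleanup</code>, as well as the trivial <code>promise_type</code>, are hypothetical and introduced only for this example. The layout is merely assumed to match the description above, and the sketch deliberately does not touch <code>std::coroutine_handle</code> itself.</p>

```cpp
#include <cassert>

// Hypothetical promise type, used only for illustration.
struct promise_type
{
  int value = 0;
};

// Mirrors the layout described in the text; not any vendor's actual definition.
struct coroutine_frame
{
  void (*resume)(coroutine_frame *);
  void (*destroy)(coroutine_frame *);
  promise_type promise;
};

// Hypothetical stand-in for std::coroutine_handle<promise_type>.
class handle_sketch
{
  coroutine_frame * frame_ = nullptr;

public:
  static handle_sketch from_address(void * addr)
  {
    handle_sketch h;
    h.frame_ = static_cast<coroutine_frame *>(addr);
    return h;
  }

  void * address() const { return frame_; }

  // A null resume pointer marks a finished coroutine.
  bool done() const { return frame_->resume == nullptr; }

  void resume() const
  {
    assert(!done()); // resuming a finished coroutine is not allowed
    frame_->resume(frame_);
  }

  void destroy() const { frame_->destroy(frame_); }

  promise_type & promise() const { return frame_->promise; }
};

// Hand-written "coroutine body": performs one step, then marks itself done.
inline void step(coroutine_frame * f)
{
  f->promise.value = 42;
  f->resume = nullptr;
}

// Hand-written destroy function: a real one would release the frame,
// this one only leaves a marker so the call can be observed.
inline void cleanup(coroutine_frame * f)
{
  f->promise.value = -1;
}
```

<p>With a frame initialized as <code>coroutine_frame f{&amp;step, &amp;cleanup, {}}</code>, calling <code>resume()</code> on a handle obtained via <code>handle_sketch::from_address(&amp;f)</code> runs <code>step</code>, after which <code>done()</code> reports true.</p>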
<h1 id="motivation">Motivation</h1>
<p>Allowing users to provide their own coroutine types is useful for public interfaces.</p>
<p>An example can be found in <a href="https://github.com/boostorg/cobalt/blob/develop/example/python.cpp">boost.cobalt</a>, where <code>python</code> awaits <code>C++</code> coroutines.<br>
Because this example must avoid undefined behaviour, it uses a superfluous coroutine <a href="https://github.com/boostorg/cobalt/blob/develop/example/python.cpp#L305"><code>py_coroutine</code></a> as glue,<br>
which causes an additional, unnecessary allocation and indirection.<br>
This superfluous coroutine could be eliminated with the proposed change, which is likely even more useful for bindings to faster languages like <code>rust</code>.</p>
<h2 id="stackful-coroutines">Stackful coroutines</h2>
<p>Boost.cobalt also has an <a href="https://github.com/boostorg/cobalt/blob/fiber/include/boost/cobalt/experimental/context.hpp">experimental implementation</a> that provides stackful coroutines<br>
as an alternative runner for C++20 coroutines.</p>
<p>That is, instead of</p>
<pre class=" language-cpp"><code class="prism  language-cpp">boost<span class="token operator">::</span>cobalt<span class="token operator">::</span>promise<span class="token operator">&lt;</span><span class="token keyword">void</span><span class="token operator">&gt;</span> <span class="token function">stackless</span><span class="token punctuation">(</span><span class="token punctuation">)</span> 
<span class="token punctuation">{</span>
  co_await boost<span class="token operator">::</span>asio<span class="token operator">::</span><span class="token function">post</span><span class="token punctuation">(</span>boost<span class="token operator">::</span>cobalt<span class="token operator">::</span>use_op<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// the simplest possible async operation</span>
<span class="token punctuation">}</span>

boost<span class="token operator">::</span>cobalt<span class="token operator">::</span>promise<span class="token operator">&lt;</span><span class="token keyword">void</span><span class="token operator">&gt;</span> cs <span class="token operator">=</span> <span class="token function">stackless</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
</code></pre>
<p>it can be run stackful (powered by boost.context) with the following code:</p>
<pre class=" language-cpp"><code class="prism  language-cpp">boost<span class="token operator">::</span>cobalt<span class="token operator">::</span>promise<span class="token operator">&lt;</span><span class="token keyword">void</span><span class="token operator">&gt;</span> <span class="token function">stackful</span><span class="token punctuation">(</span>
    boost<span class="token operator">::</span>cobalt<span class="token operator">::</span>experimental<span class="token operator">::</span>context<span class="token operator">&lt;</span>boost<span class="token operator">::</span>cobalt<span class="token operator">::</span>promise<span class="token operator">&lt;</span><span class="token keyword">void</span><span class="token operator">&gt;&gt;</span> ctx<span class="token punctuation">)</span>
<span class="token punctuation">{</span>
  ctx<span class="token punctuation">.</span><span class="token function">await</span><span class="token punctuation">(</span>boost<span class="token operator">::</span>asio<span class="token operator">::</span><span class="token function">post</span><span class="token punctuation">(</span>boost<span class="token operator">::</span>cobalt<span class="token operator">::</span>use_op<span class="token punctuation">)</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span>

boost<span class="token operator">::</span>cobalt<span class="token operator">::</span>promise<span class="token operator">&lt;</span><span class="token keyword">void</span><span class="token operator">&gt;</span> cs <span class="token operator">=</span> boost<span class="token operator">::</span>cobalt<span class="token operator">::</span>experimental<span class="token operator">::</span><span class="token function">make_context</span><span class="token punctuation">(</span><span class="token operator">&amp;</span>stackful<span class="token punctuation">)</span><span class="token punctuation">;</span>
</code></pre>
<p>The <code>coroutine_frame</code> is created in <code>make_context</code> and embedded in the coroutine stack, avoiding a second allocation.<br>
This gives a user the benefits of a stackful coroutine (like interacting with coroutine-unaware APIs) while being able<br>
to interact with any <code>co_await</code>-able API (such as boost.cobalt’s utilities) without any overhead.</p>
<p>It is worth noting that this also (already) works with <a href="https://nxmnpg.lemoda.net/3/ucontext"><code>ucontext</code></a> and <a href="https://learn.microsoft.com/en-us/windows/win32/procthread/fibers"><code>WinFiber</code></a>,<br>
since <code>boost.context</code> supports both.</p>
<h2 id="any-asynchronous-completion">Any asynchronous completion</h2>
<p>Asynchronous completion has been a hotly debated issue over the last few years, with many papers involved.<br>
By allowing user extensions here, any completion could be plugged into a coroutine_handle.<br>
If we are furthermore allowed to specialize these handles, the overhead can be minimized by templating the <code>await_suspend</code><br>
function on an awaitable.</p>
<pre class=" language-cpp"><code class="prism  language-cpp"><span class="token keyword">struct</span> my_awaitable
<span class="token punctuation">{</span>
    <span class="token keyword">bool</span> <span class="token function">await_ready</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
    <span class="token keyword">template</span><span class="token operator">&lt;</span><span class="token keyword">typename</span> Promise<span class="token operator">&gt;</span>
    <span class="token keyword">void</span> <span class="token function">await_suspend</span><span class="token punctuation">(</span>std<span class="token operator">::</span>coroutine_handle<span class="token operator">&lt;</span>Promise<span class="token operator">&gt;</span> h<span class="token punctuation">)</span><span class="token punctuation">;</span> <span class="token comment">// this makes it transparent to the compiler</span>
    <span class="token keyword">void</span> <span class="token function">await_resume</span><span class="token punctuation">(</span><span class="token punctuation">)</span><span class="token punctuation">;</span>
<span class="token punctuation">}</span><span class="token punctuation">;</span>
</code></pre>
<h1 id="conclusion">Conclusion</h1>
<p>This relatively minor change is purely a matter of wording, as it only turns currently undefined behaviour into implementation-defined behaviour.<br>
That is, no work by any compiler vendor is needed.</p>
<p>These changes will allow libraries like boost.cobalt, which shares the author with this paper,<br>
to experiment and provide more functionality and integration into existing code bases that do not run on C++20 coroutines yet.</p>
<p>It furthermore opens up the only model for any asynchronous completion. This might not be the most efficient model,<br>
but it will allow developers to provide public APIs that can be consumed by things other than coroutines.</p>
<p>The main feature, however, will be that other coroutine implementations, such as fibers or models from other languages, can be integrated with C++20 coroutines.</p>

