<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head>
  <meta charset="UTF-8"/>
  <meta name="viewport" content="width=device-width, initial-scale=1"/>
  <meta name="title" content="Merged Modules and Tooling"/>

  <title>Merged Modules and Tooling</title>

  <style type="text/css">

html
{
  font-family: "Helvetica Neue", Helvetica, "Segoe UI", Arial, freesans, sans-serif;
  font-weight: normal;
  font-size: 18px;
  line-height: 1.4em;
  letter-spacing: 0.01em;

  color: #292929;
}

body {margin: 0;} /* There is non-0 default margin for body. */

/* See notes on what's going on here. */
body {min-width: 17em;}
@media only screen and (min-width: 360px)
{
  body {min-width: 19em;}
}

/*
 * Header (optional).
 */

#header-bar
{
  width: 100%;

  background: rgba(0, 0, 0, 0.04);
  border-bottom: 1px solid rgba(0, 0, 0, 0.2);

  padding: .4em 0 .42em 0;
  margin: 0 0 1.4em 0;
}

#header
{
  /* Same as in #content. */
  max-width: 41em;
  margin: 0 auto 0 auto;
  padding: 0 .4em 0 .4em;

  -webkit-box-sizing: border-box;
  -moz-box-sizing: border-box;
  box-sizing: border-box;

  width: 100%;
  display: table;
  border: none;
  border-collapse: collapse;
}

#header-logo, #header-menu
{
  display: table-cell;
  border: none;
  padding: 0;
  vertical-align: middle;
}

#header-logo {text-align: left;}
#header-menu {text-align: right;}

/* These overlap with #header's margin because of border collapsing. */
#header-logo {padding-left: .4em;}
#header-menu {padding-right: .4em;}

#header-logo a
{
  color: #000;
  text-decoration: none;
  outline: none;
}
#header-logo a:visited {color: #000;}
#header-logo a:hover, #header-logo a:active {color: #000;}

#header-menu a
{
  font-size: 0.889em;
  line-height: 1.4em;
  text-align: right;
  margin-left: 1.2em;
  white-space: nowrap;
  letter-spacing: 0;
}

#header-menu a
{
  color: #000;
  outline: none;
}
#header-menu a:visited {color: #000;}
#header-menu a:hover, #header-menu a:active
{
  color: #3870c0;
  text-decoration: none;
}

/* Flexbox-based improvements though the above works reasonably well. */
#header-menu-body
{
  width: 100%;

  display: -webkit-inline-flex;
  display: inline-flex;

  -webkit-flex-flow: row wrap;
  flex-flow: row wrap;

  -webkit-justify-content: flex-end;
  justify-content: flex-end;
}

/* Whether we want it (and at which point) depends on the size of the menu. */
/*
@media only screen and (max-width: 567px)
{
  #header-menu-body
  {
    -webkit-flex-direction: column;
    flex-direction: column;
  }
}
*/

/*
 * Content.
 */

#content
{
  max-width: 41em;
  margin: 0 auto 0 auto;
  padding: 0 .4em 0 .4em; /* Space between text and browser frame. */

  -webkit-box-sizing: border-box;
  -moz-box-sizing: border-box;
  box-sizing: border-box;
}

/*
 * Footer (optional).
 */

#footer
{
  color: #767676;
  font-size: 0.7223em;
  line-height: 1.3em;
  margin: 2.2em 0 1em 0;
  text-align: center;
}

#footer a
{
  color: #767676;
  text-decoration: underline;
}
#footer a:visited {color: #767676;}
#footer a:hover, #footer a:active {color: #3870c0;}

/* Screen size indicator in the footer. The before/after content is in case
   we don't have any content in the footer. Margin is to actually see the
   border separate from the browser frame. */

/*
#footer:before {content: "\A0";}
#footer:after {content: "\A0";}

#footer
{
  border-left: 1px solid;
  border-right: 1px solid;
  margin-left: 1px;
  margin-right: 1px;
}

@media only screen and (max-width: 359px)
{
  #footer {border-color: red;}
}

@media only screen and (min-width: 360px) and (max-width: 567px)
{
  #footer {border-color: orange;}
}

@media only screen and (min-width: 568px) and (max-width: 1023px)
{
  #footer {border-color: blue;}
}

@media only screen and (min-width: 1024px)
{
  #footer {border-color: green;}
}
*/

/*
 * Common elements.
 */

p, li, dd {text-align: justify;}
.code {text-align: left;} /* Manually aligned. */
pre {text-align: left;}   /* If it is inside li/dd. */

/* Notes. */

.note
{
  color: #606060;
}

div.note
{
  margin: 2em 0 2em 0; /* The same top/bottom margins as pre box. */

  padding-left: 0.5em;
  border: 0.25em;
  border-left-style: solid;
  border-color: #808080;

  page-break-inside: avoid;
}

div.note :first-child {margin-top:    0;}
div.note :last-child  {margin-bottom: 0;}

span.note::before {content: "[Note: "}
span.note::after  {content: "]"}

/* Links. */
a
{
  color: #3870c0;
  /*color: #4078c0;*/
  text-decoration: none;
}

a:hover, a:active
{
/*color: #006fbf;*/
/*color: #0087e7;*/
  text-decoration: underline;
}

a:visited
{
/*color: #003388;*/
  color: #00409c;
}

/* Standard lists. */
ul, ol, dl {margin: 1em 0 1em 0;}
ul li, ol li {margin: 0 0 .4em 0;}
ul li {list-style-type: circle;}
dl dt {margin: 0 0 0 0;}
dl dd {margin: 0 0 .6em 1.8em;}

code, pre
{
  font-family: Consolas, "Liberation Mono", Menlo, Courier, monospace;
  font-size: 0.92em;
  letter-spacing: 0;
}

pre {white-space: pre-wrap;}

@media only screen and (max-width: 567px)
{
  pre {word-break: break-all;}
}

/* Use page rather than system font settings. */
input
{
  font-family: inherit;
  font-weight: inherit;
  font-size:   inherit;
  line-height: inherit;
}

pre
{
  background-color: rgba(0, 0, 0, 0.05);
  border-radius: 0.2em;
  padding: .8em .4em .8em .4em;
  margin: 2em -.4em 2em -.4em; /* Use margins of #content. */
}

code
{
  background-color: rgba(0, 0, 0, 0.05);
  border-radius: 0.2em;
  padding: .2em .32em .18em .32em;
}

/*
code::before
{
  letter-spacing: -0.2em;
  content: "\00a0";
}

code::after
{
  letter-spacing: -0.2em;
  content: "\00a0";
}
*/

/* Bases:
 *
 * common.css
 * pre-box.css
 * code-box.css
 *
 */

#content
{
  max-width: 43.6em;
  padding-left: 3em; /* Reserve for headings. */
}

h1
{
  font-weight: normal;
  font-size: 2em;
  line-height: 1.4em;
  margin: 1.6em 0 .6em -1.4em;
}

h1.preface
{
  margin-left: -.56em;
}

h2
{
  font-weight: normal;
  font-size: 1.556em;
  line-height: 1.4em;
  margin: 1.6em 0 .6em -.8em;
}

h3
{
  font-weight: normal;
  font-size: 1.3em;
  line-height: 1.4em;
  margin: 1.6em 0 .6em -.2em;
}

/* Title page */

#titlepage {
  margin: 0 0 4em 0;
  border-bottom: 1px solid black;
}

#titlepage .title {
  font-weight: normal;
  font-size: 2.333em;
  line-height: 1.4em;
  letter-spacing: 0;
  text-align: center;
  margin: 2em 0 2em 0;
}

#titlepage p {
  font-size: 0.889em;
  line-height: 1.4em;
  margin: 2em 0 .6em 0;
}

table.toc
{
  border-style      : none;
  border-collapse   : separate;
  border-spacing    : 0;

  margin            : 0.2em 0 0.2em 0;
  padding           : 0 0 0 0;
}

table.toc tr
{
  padding           : 0 0 0 0;
  margin            : 0 0 0 0;
}

table.toc * td, table.toc * th {
  border-style      : none;
  margin            : 0 0 0 0;
  vertical-align    : top;
}

table.toc * th
{
  font-weight       : normal;
  padding           : 0 0.8em 0 0;
  text-align        : left;
  white-space       : nowrap;
}

table.toc * table.toc th
{
  padding-left      : 1em;
}

table.toc * td
{
  padding           : 0 0 0 0;
  text-align        : left;
}

table.toc * td.preface
{
  padding-left      : 1.35em;
}


#titlepage .title
{
  font-size: 2em;
}

/*
 * Property list table.
 */
.proplist
{
  width: calc(100%); /* Fill the page. */

  table-layout: fixed;

  border: none;
  border-spacing: 0 0;
}

.proplist th, .proplist td {padding: .1em 0 .1em 0;}

.proplist th
{
  font-weight: normal;
  text-align: left;

  width: 7em;
}
.proplist th:after {content: ":";}
  </style>

</head>
<body>
<div id="content">

  <div id="titlepage">
    <div class="title">Merged Modules and Tooling</div>

    <table class="proplist">
      <tbody>
	<tr><th>Document</th><td><a href="https://wg21.link/P1156R0">P1156R0</a></td></tr>
	<tr><th>Audience</th><td>EWG</td></tr>
	<tr><th>Authors</th><td>Boris Kolpackov</td></tr>
	<tr><th>Reply-To</th><td>boris@codesynthesis.com</td></tr>
	<tr><th>Date</th><td>2018-10-04</td></tr>
      </tbody>
    </table>
  </div>
  <h2 id="note">Note</h2>

  <div class="note">
  <p>A draft of this paper was discussed at the Bellevue (ad hoc) meeting (<a
  href="https://wg21.link/p1136r0">P1136R0</a>) as well as referenced by
  another paper (<a href="https://wg21.link/p1180r0">P1180R0</a>). The
  contents of that draft are therefore preserved unchanged in this final paper
  with post-meeting notes added at the end of each section and marked with the
  BELLEVUE keyword.</p>
  </div>

  <h2 id="toc">Contents</h2>

  <table class="toc">
    <tr><th>1</th><td><a href="#abstract">Abstract</a></td></tr>
    <tr><th>2</th><td><a href="#global">Global Fragment
Restrictions</a></td></tr>
    <tr><th>3</th><td><a href="#non-modular">Non-Modular Code</a></td></tr>
    <tr><th>4</th><td><a href="#preamble">Preamble End</a></td></tr>
    <tr><th>5</th><td><a href="#partitions">Module Partitions</a></td></tr>
    <tr><th>6</th><td><a href="#ack">Acknowledgments</a></td></tr>
  </table>

  <h2 id="abstract">1 Abstract</h2>

  <p><i>This paper describes a number of tooling-related issues that we have
  identified with the merged modules proposal (<a
  href="https://wg21.link/p1103r0">P1103R0</a>). Summary of the proposed
  changes:</i></p>

  <ul>
  <li><i>Remove <code>#include</code>s-only restriction on the global module
  fragment.</i></li>

  <li><i>Add import preamble requirement to non-module translation
  units.</i></li>

  <li><i>Add explicit preamble end marker.</i></li>
  </ul>

  <h2 id="global">2 Global Fragment Restrictions</h2>

  <p><a href="https://wg21.link/p1103r0#subsection.2.3.1">P1103R0 Section
  2.3.1</a> Clause 2 states:</p>

  <p><i>"Only <code>#include</code>s are permitted to appear in the global
  module fragment, but there are no special restrictions on the contents of
  the <code>#include</code>d file."</i></p>

  <p>The rationale for this restriction given at the Rapperswil meeting is to
  allow simple module-aware tools without the need for preprocessing or
  elaborate parsing. It was also acknowledged that such tools won't be able to
  handle all valid module translation units since both module declarations and
  import declarations can be <code>#include</code>d.</p>

  <p>While not handling "exotic" translation units like this may be an
  acceptable trade-off, such simple tools will also fail to handle (or, more
  likely, mishandle) translation units that use <code>#if</code> for
  conditional importation, which we believe will be common in real-world
  code. For example:</p>

  <pre>module foo;

#ifdef EXTRA
import bar;
#endif

...</pre>
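
  <p>To make the failure mode concrete, here is a minimal sketch of such a
  simple tool. The function <code>naive_imports()</code> is hypothetical (it
  is not taken from any real build system) and assumes one declaration per
  line. It collects import declarations without running the preprocessor:</p>

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical "simple tool": collects every line of the form `import X;`
// without running the preprocessor, so #if/#ifdef blocks are not evaluated.
std::vector<std::string> naive_imports(const std::string& source)
{
  std::vector<std::string> result;
  std::istringstream in(source);
  std::string line;
  const std::string keyword = "import ";

  while (std::getline(in, line))
  {
    // Only recognize the keyword at the very start of the line.
    if (line.compare(0, keyword.size(), keyword) != 0)
      continue;

    std::string::size_type end = line.find(';');
    if (end == std::string::npos)
      continue;

    // The line starts with the keyword, so ';' can only come after it.
    assert(end >= keyword.size());

    result.push_back(line.substr(keyword.size(), end - keyword.size()));
  }

  return result;
}
```

  <p>For the translation unit above, this sketch reports <code>bar</code> as
  imported whether or not <code>EXTRA</code> is defined, so the extracted
  dependency information is wrong whenever the macro is absent.</p>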

  <p>Furthermore, this restriction prevents useful practices, most notably,
  the ability to forward-declare in the global module fragment, for
  example:</p>

  <pre>module;

//#include "heavy.h"  // Expensive.
class heavy;          // Illegal.

module foo;
...</pre>

  <p>Additionally, the complexity of deciding when to stop scanning for
  module-related declarations (see <a href="#preamble">Preamble End</a>) will
  most likely result in compiler implementations providing support for
  extracting information from the module preamble (see, for example, GCC's
  <code>-fmodule-preamble</code>). With this support, simple tools should be
  able to achieve greater reliability without significant extra
  complexity.</p>

  <p>As a result, because this restriction only offers false hope of
  simplicity while preventing established and useful practices, we propose
  that it be removed.</p>

  <div class="note">
  <p>BELLEVUE: It was suggested during the meeting that instead of relaxing
  the restrictions on the global module fragment, the fragment should be
  removed entirely in favor of using legacy header modules. This, however,
  would make it impossible to include, in modular code, headers that are not
  sufficiently well-behaved to be represented as legacy header modules.</p>
  </div>

  <h2 id="non-modular">3 Non-Modular Code</h2>

  <p><a href="https://wg21.link/p1103r0#subsection.2.3.3">P1103R0 Section
  2.3.3</a> Clause 1 states:</p>

  <p><i>"Modules and legacy header units can be imported into non-modular
  code. Such imports can appear anywhere, and are not restricted to a
  preamble."</i></p>

  <p>Furthermore, from <a
  href="https://wg21.link/p1103r0#section.19.3">P1103R0 Section 19.3</a>
  Clause 1 it follows that in such non-module units macros exported by a
  legacy header module are visible immediately after the import declaration
  (as opposed to at the end of the preamble) and therefore can affect
  subsequent importations.</p>

  <p>As discussed in detail in <a
  href="https://wg21.link/p1052r0#macros">P1052R0 Section 3</a>, this
  "relaxed" model for non-module translation units will significantly
  complicate module dependency extraction by build systems and other tools.
  Briefly, the build system will no longer be able to determine the module
  dependency information at the outset, before starting the compilation, while
  the compiler may not have access to all the (up-to-date) BMIs (binary module
  interfaces) needed to perform the compilation. As a result, the compiler will have
  to query (i.e., <i>call back</i> into) the build system on encountering
  every import declaration in order to obtain an (up-to-date) BMI that it can
  use (and which the build system might still have to compile, potentially
  triggering a recursive chain of callbacks).</p>
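
  <p>The callback chain can be illustrated with a toy model. Everything in
  this sketch is hypothetical: <code>build_system</code>,
  <code>need_bmi()</code>, and <code>compile()</code> are illustrative names,
  and a translation unit is reduced to the list of modules it imports. It is
  meant only to show how one import can trigger nested compilations:</p>

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy model; all names are hypothetical. A translation unit is represented
// only by the modules it imports, in the order the compiler encounters them.
struct build_system
{
  std::map<std::string, std::vector<std::string>> units;
  std::set<std::string> bmis;         // Modules with an up-to-date BMI.
  std::set<std::string> in_progress;  // Units currently being compiled.
  std::vector<std::string> callbacks; // Every callback, in arrival order.

  // Callback: the compiler has encountered `import name` and needs a BMI.
  void need_bmi(const std::string& name)
  {
    callbacks.push_back(name);

    if (bmis.count(name) == 0)
    {
      compile(name); // May trigger further callbacks before returning.
      bmis.insert(name);
    }
  }

  // The compiler: discovers imports one at a time while parsing the unit.
  void compile(const std::string& name)
  {
    // Import cycles are not allowed; guard against endless recursion.
    bool inserted = in_progress.insert(name).second;
    assert(inserted && "cyclic import");
    (void)inserted;

    auto i = units.find(name);
    if (i != units.end())
      for (const std::string& imp : i->second)
        need_bmi(imp);

    in_progress.erase(name);
  }
};
```

  <p>If a unit <code>main</code> imports <code>a</code> and <code>b</code>,
  and both of those import <code>c</code>, then compiling <code>main</code>
  in this model produces the callbacks <code>a</code>, <code>c</code>,
  <code>b</code>, <code>c</code>: the build system learns about
  <code>c</code> only in the middle of compiling <code>a</code>.</p>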

  <p>Note also that this does not appear to be a transition-only issue since
  it is not evident that the end state of a modularization process should be a
  codebase without any non-module translation units. For example, it is not
  clear why the translation unit that defines <code>main()</code> would ever
  need to be a module. <span class="note">One such reason could be unit
  testing: <code>main()</code> may need to belong to a module in order to gain
  access to non-exported entities.</span></p>

  <p>As a result, we propose that rules similar to the module preamble be
  applied to the non-module translation units, with an exception for the
  automatic mapping of <code>#include</code> directives to legacy header
  module imports. <span class="note">Such a mapping is specified as both
  optional and implementation-defined and so such a relaxation seems harmless;
  see <a href="https://wg21.link/p1103r0#subsection.2.3.3">P1103R0 Section
  2.3.3</a> Clause 2 for details</span>.</p>

  <div class="note">
  <p>BELLEVUE: Per the discussion at the meeting, there appears to be
  agreement that modularizing a codebase should eventually result in the
  replacement of all non-modular translation units with modules and that
  enforcing the preamble restrictions in such units would make the gradual
  modularization process difficult. <a
  href="https://wg21.link/p1180r0">P1180R0</a> proposed an alternative
  approach which would have also resolved this issue. However, it was not
  adopted. As a result, we believe build system vendors may end up imposing
  additional ad hoc restrictions on non-modular translation units, such as
  that proposed in P1180R0 or explicit specification of legacy module
  dependencies.</p>
  </div>

  <div class="note">
  <p>BELLEVUE: The issue identified by <a
  href="https://wg21.link/p1180r0">P1180R0</a> with this proposal (header
  inclusions containing <code>import</code>s) also affects the global module
  fragment in modular translation units.</p>
  </div>

  <h2 id="preamble">4 Preamble End</h2>

  <p><a href="https://wg21.link/p1103r0#section.2.1">P1103R0 Section 2.1</a>
  Clause 1 states:</p>

  <p><i>"A module unit begins with a preamble, comprising a module declaration
  and a sequence of imports: [...] Within a module unit, imports may only
  appear within the preamble."</i></p>

  <p>Furthermore, from <a
  href="https://wg21.link/p1103r0#section.19.3">P1103R0 Section 19.3</a>
  Clause 1 it follows that the importation of modules within the preamble
  cannot depend on macros exported from legacy header units.</p>

  <p>The motivation for these restrictions is to allow tools (such as build
  systems) that wish to extract the module-related information from a module
  translation unit to parse the preamble without supplying any of the BMIs
  (binary module interfaces) for imported modules (legacy or not). It is
  expected that compiler implementations will provide support for
  preamble-only preprocessing that such tools will use (see, for example,
  GCC's <code>-fmodule-preamble</code>).</p>

  <p><a href="https://wg21.link/p1103r0#section.19.3">P1103R0 Section 19.3</a>
  Clause 4 defines the preamble as a sequence of <i>preprocessing-tokens</i>
  that match a production pattern (<i>pp-preamble</i>). However, in practice,
  detecting where the preamble ends appears to be challenging since peeking at
  the next <i>preprocessing-token</i> may involve processing directives (such
  as <code>#include</code> or <code>#error</code>) that are difficult to do
  partially or undo. Consider this example:</p>

  <pre>module M;

import foo;
import "fox.h";
                        // (a)
#ifndef EXTRA
#  include "bar.h"
#endif
                        // (b)
void f ();</pre>

  <p>Where exactly the preamble ends in this example depends on whether
  <code>EXTRA</code> is a module-exported macro and what is inside header
  <code>bar.h</code>. Some possible scenarios:</p>

  <ol>
  <li>If <code>EXTRA</code> is a module-exported macro, then the preamble ends
  at (a) unless <code>bar.h</code> contains import declarations in which case
  the translation unit is invalid.</li>

  <li>If <code>EXTRA</code> is not a module-exported macro, then the preamble ends
  at (a) unless <code>bar.h</code> contains import declarations in which case
  it ends at (b) (or somewhere inside <code>bar.h</code>, if it also contains
  other declarations).</li>
  </ol>
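
  <p>The two scenarios above can be summarized as a function of two inputs.
  This is only a sketch of the case analysis, not of any compiler's
  algorithm; the names <code>preamble_end</code>, <code>classify()</code>,
  and <code>describe()</code> are made up for illustration:</p>

```cpp
#include <cassert>

// Possible outcomes for the example above.
enum class preamble_end { at_a, at_b_or_inside_bar_h, invalid_unit };

// The inputs are the two facts the scenarios depend on: whether EXTRA is a
// module-exported macro and whether bar.h contains import declarations.
preamble_end classify(bool extra_is_module_exported, bool bar_h_has_imports)
{
  if (!bar_h_has_imports)
    return preamble_end::at_a; // Both scenarios agree in this case.

  return extra_is_module_exported
    ? preamble_end::invalid_unit          // Scenario 1.
    : preamble_end::at_b_or_inside_bar_h; // Scenario 2.
}

// Human-readable form of the outcome.
const char* describe(preamble_end e)
{
  switch (e)
  {
  case preamble_end::at_a:
    return "preamble ends at (a)";
  case preamble_end::at_b_or_inside_bar_h:
    return "preamble ends at (b) or inside bar.h";
  case preamble_end::invalid_unit:
    return "translation unit is invalid";
  }

  assert(false && "unreachable");
  return "";
}
```

  <p>Neither input can be read off the translation unit itself: the first
  depends on what <code>fox.h</code> exports and the second on the contents
  of <code>bar.h</code>.</p>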

  <p>However, without loading the BMI for legacy header module
  <code>fox.h</code> the compiler cannot know what kind of macro
  <code>EXTRA</code> is (or whether it is actually defined) and without
  preprocessing header <code>bar.h</code> it doesn't know what it contains.
  And speculatively preprocessing header <code>bar.h</code> may have various
  side effects. For example, it may contain <code>#error</code> or not even
  exist if <code>fox.h</code> does in fact define <code>EXTRA</code>.</p>

  <p>To overcome this, the current (admittedly experimental) implementation in
  GCC suggests explicitly marking the preamble end with a stray semicolon. For
  example, if <code>bar.h</code> does exist, the user sees the following
  diagnostics:</p>

  <pre>m.cxx:6:1: warning: module preamble ended immediately before
                    preprocessor directive
m.cxx:6:1: note: explicitly mark the end with an earlier ‘;’</pre>

  <p>However, if <code>bar.h</code> does not exist, GCC terminates with a
  fatal error before having a chance to issue the above suggestion.</p>

  <p>Based on this, we believe the current semantics of determining the
  preamble end will lead to brittle tooling with confusing diagnostics. As a
  result, we propose adding an explicit <i>preamble end marker</i> (or
  <i>preamble concluder</i>) similar to the leading module marker proposed in
  <a href="https://wg21.link/p0713r1">P0713R1</a> and adopted by <a
  href="https://wg21.link/p1103r0">P1103R0</a> (where it is called <i>module
  introducer</i>). For example:</p>

  <pre>module M;

import foo;
import "fox.h";

import;  // Preamble end.

#ifndef EXTRA
#  include "bar.h"
#endif

void f ();</pre>

  <div class="note">
  <p>Admittedly, this syntax is inelegant, but one way or another there
  appears to be a cost for supporting exportation of macros from modules.</p>
  </div>
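
  <p>With an explicit marker, a tool can stop scanning at a well-defined
  point without examining any directive that follows. The sketch below
  assumes the proposed <code>import;</code> spelling; the function
  <code>scan_preamble()</code> is hypothetical and deliberately simplified
  (one declaration per line, no comments or preprocessor directives inside
  the preamble):</p>

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical line-based scanner: collects import names up to the proposed
// `import;` marker and never looks at anything after it. Returns true if the
// marker was found and false if the input ended without one.
bool scan_preamble(const std::string& source,
                   std::vector<std::string>& imports)
{
  std::istringstream in(source);
  std::string line;
  const std::string keyword = "import";

  while (std::getline(in, line))
  {
    // Skip lines that do not start with the keyword (e.g., `module M;`).
    if (line.compare(0, keyword.size(), keyword) != 0)
      continue;

    std::string::size_type end = line.find(';');
    if (end == std::string::npos)
      continue;

    // The line starts with the keyword, so ';' can only come after it.
    assert(end >= keyword.size());

    std::string name = line.substr(keyword.size(), end - keyword.size());
    name.erase(0, name.find_first_not_of(' ')); // Trim leading spaces.

    if (name.empty())
      return true; // Bare `import;`: the preamble end marker.

    imports.push_back(name);
  }

  return false;
}
```

  <p>For the example above this sketch collects <code>foo</code> and
  <code>"fox.h"</code> and then stops at the marker, so the
  <code>#ifndef</code> block and <code>bar.h</code> are never examined.</p>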

  <div class="note">
  <p>BELLEVUE: There was no consensus at the meeting on whether to add the
  explicit preamble end marker. However, the following alternative syntax was
  generally viewed as a better option than either the stray semicolon or the
  empty <code>import</code>:</p>

  <pre>module M
{
  import foo;
  export import bar;
  import "fox.h";

} // Preamble end.

...</pre>

  <p>Or even (inspired by Go's import declaration syntax):</p>

  <pre>module M;

import (
  foo,
  export bar,
  "fox.h"
); // Preamble end.

...</pre>
  </div>

  <h2 id="partitions">5 Module Partitions</h2>

  <p>This section contains a collection of notes on module partitions and
  their implications for tooling. At this stage it does not propose any
  changes to P1103R0.</p>

  <p>Partition names are a sequence of identifiers, the same as the module
  names themselves (<a
  href="https://wg21.link/p1103r0#subsection.10.7.1">P1103R0 Section
  10.7.1</a>). It is not clear why this support for hierarchical partitions is
  desirable. On the other hand it will surely complicate the mapping of the
  combined module/partition names to filesystem entities (for example, BMI
  files).</p>
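
  <p>As one illustration of the mapping problem, consider a naive scheme that
  joins the module and partition names with a dot. The function
  <code>naive_bmi_file()</code> and the <code>.bmi</code> extension are
  assumptions made for this sketch and do not describe any existing
  implementation:</p>

```cpp
#include <cassert>
#include <string>

// Hypothetical naming scheme: join the module and partition names with '.'
// and append an assumed ".bmi" extension.
std::string naive_bmi_file(const std::string& module_name,
                           const std::string& partition_name)
{
  assert(!module_name.empty());

  std::string r = module_name;

  if (!partition_name.empty())
  {
    r += '.';
    r += partition_name;
  }

  return r + ".bmi";
}
```

  <p>Under this scheme, partition <code>c</code> of module <code>a.b</code>
  and partition <code>b.c</code> of module <code>a</code> both map to
  <code>a.b.c.bmi</code>, so a workable mapping has to keep the boundary
  between the module name and the partition name distinguishable.</p>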

  <p>Implementation partitions can be imported by other translation units
  belonging to the same module (<a
  href="https://wg21.link/p1103r0#subsection.10.7.1">P1103R0 Section
  10.7.1</a> Clauses 4 and 10). In other words, we now have <i>importation of
  implementations</i> which means there will have to be BMIs for them. <s>The
  fact that there can (presumably) be both interface and implementation
  partition units for the same partition further complicates things (do they
  end up with separate BMIs or is it merged, etc). What happens during
  importation of such a dual-unit partition does not appear to be specified
  (presumably entities from both become visible).</s></p>

  <div class="note">
  <p>At the Rapperswil meeting it was mentioned that various strategies are
  available to implementations when it comes to module partitions: they can be
  represented as separate partition BMIs or they can contribute to a combined
  module BMI. However, it seems that a combined BMI approach will reduce the
  build system's ability to parallelize compilation.</p>
  </div>

  <div class="note">
  <p>BELLEVUE: As discussed in <a
  href="https://wg21.link/p1180r0">P1180R0</a>, the above crossed-out
  understanding is incorrect: a module partition is a single (interface or
  implementation) translation unit. In other words, a partition is always an
  "interface", but depending on what kind of partition it is, it can be
  "public" (with its exported declarations visible outside of the module) or
  "private" (with all its declarations visible but only inside the
  module).</p>
  </div>

  <h2 id="ack">6 Acknowledgments</h2>

  <p>Thanks to Nathan Sidwell for clarifications on the preamble end detection
  algorithm as implemented in GCC. Thanks to Richard Smith for the response
  paper (<a href="https://wg21.link/p1180r0">P1180R0</a>) as well as further
  clarifications on the module partition semantics.</p>

</div>

</body>
</html>
