<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html>
  <head>
    <meta content="text/html;charset=UTF-8" http-equiv="Content-Type">
    <title>Comma omission and comma deletion</title>
    <style type="text/css">
      html { margin: 0; padding: 0; color: black; background-color: white; }
      body { padding: 2em; font-size: medium; font-family: "DejaVu Serif", serif; line-height: 150%; }
      code { font-family: "DejaVu Sans Mono", monospace; color: #006; }

      h1, h2, h3 { margin: 1.5em 0 .75em 0; line-height: 125%; }

      div.code { white-space: pre-line; font-family: "DejaVu Sans Mono", monospace;
                 border: thin solid #E0E0E0; background-color: #F8F8F8; padding: 1em;
                 border-radius: 4px; }

      div.strictpre { white-space: pre; }

      sup, sub { line-height: 0; }

      .docinfo { float: right; }
      .docinfo p { margin: 0; text-align:right; }
      .docinfo address { font-style: normal; }

      .quote { display: inline-block; clear: both; margin-left: 1ex;
                 border: thin solid #E0E0E0; background-color: #F8F8F8; padding: 1ex; }

      .modify { border-left: thick solid #999; border-right: thick solid #999; padding: 0 1em; }
      .insert { border-left: thick solid #0A0; border-right: thick solid #0A0; padding: 0 1em; }
      .comment { color: #456; }

      p.skip { margin-top: 1em; }

      table { border-top: 2px solid black; border-bottom: 2px solid black; border-collapse: collapse; margin: 3em auto; }
      thead th { border-bottom: 2px solid black; }
      th, td { text-align: left; vertical-align: top; padding: 1ex 1ex 1ex 5em; }
      th:first-child, td:first-child { padding-left: 1ex; }

      ins { color: #090; }
      del { color: #A00; }
      ins code, del code { color: inherit; }
    </style>
  </head>
  <body>
    <div class="docinfo">
      <p>Date: 2016-05-08</p>
      <address>Thomas K&ouml;ppe &lt;<a href="mailto:tkoeppe@google.com">tkoeppe@google.com</a>&gt;</address>
      <p class="skip">ISO/IEC JTC1 SC22 WG21 P00306r1</p>
      <p>To: EWG, CWG, WG14 liaison</p>
      <p class="skip">ISO/IEC JTC1 SC22 WG14 N2044</p>
      <p>To: WG21 liaison</p>
    </div>

    <h1>Comma omission and comma deletion</h1>

    <!-- fgrep -e "<h2 id=" nocomma.html | sed -e 's/.*id="\(.*\)">\(.*\)<\/h2>/<li><a href="#\1">\2<\/a><\/li>/g' -->
    <h2>Contents</h2>
    <ol>
      <li><a href="#history">Revision history</a></li>
      <li><a href="#summary">Summary</a></li>
      <li><a href="#problem">The problem</a></li>
      <li><a href="#defsol">Defining the goal</a></li>
      <li><a href="#alt">Alternative designs</a></li>
      <li><a href="#proposal">Discussion and proposal</a></li>
      <li><a href="#impact">Impact</a></li>
      <li><a href="#experience">Implementation experience</a></li>
      <li><a href="#wording">Proposed wording</a></li>
      <li><a href="#wg14">Compatibility with C</a></li>
      <li><a href="#ack">Acknowledgements</a></li>
    </ol>

    <h2 id="history">Revision history</h2>
    <ul>
    <li><a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0306r0.html">WG21 P00306r0</a>: Initial proposal.</li>
    <li>WG14 <a href="http://www.open-std.org/jtc1/sc22/wg14/www/docs/n2034.htm">N2034</a>: Same as P00306r0, but addressed to WG14.</li>
    <li>WG21 P00306r1/WG14 N2044: This version. Changed proposed wording for <code>__VA_OPT__</code> from P00306r0/N2034 to be clear about
      balanced parentheses and resulting tokens; added alternative &ldquo;<code>#define F(X...)</code>&rdquo;.</li>
    </ul>

    <p><em>Note to WG14</em>: This paper is being proposed to WG21, and the section numbering
      and some of the examples make superficial reference to C++. The wording and the problem
      are identical in C, though, with the section mapping described at the end.</p>

    <h2 id="summary">Summary</h2>

    <p>This is a proposal to make variadic macros easier to use with no arguments by adding a new
      special functional macro <code>__VA_OPT__</code>.</p>

    <h2 id="problem">The problem</h2>

    <p>Function-style macros that can have variable arguments suffer from a number of
      ill-specified corner cases. Consider the following macro definitions:</p>

    <div class="code">#define F(X, ...) f(10, X, __VA_ARGS__)
      #define G(X) &nbsp; &nbsp; &nbsp;f(10, X)
      #define H(...) &nbsp;&nbsp; f(10, __VA_ARGS__)</div>

    <p>Invocations of these macros are surprising:</p>

    <table>
      <thead>
        <tr><th>Invocation</th><th>Effect</th><th>Notes</th></tr>
      </thead>
      <tbody>
        <tr><td><code>F(a, b, c)</code></td><td><code>f(10, a, b, c)</code></td><td>variable arguments are &ldquo;<code>b, c</code>&rdquo;</td></tr>
        <tr><td><code>F(a, )</code></td><td><code>f(10, a, )</code></td><td>variable arguments contain zero tokens<br>syntax error</td></tr>
        <tr><td><code>F(a)</code></td><td>ill-formed</td><td>violates 16.3p12 (no variable arguments)</td></tr>
        <tr><td><code>G(a)</code></td><td><code>f(10, a)</code></td><td></td></tr>
        <tr><td><code>H(a, b, c)</code></td><td><code>f(10, a, b, c)</code></td><td>variable arguments are &ldquo;<code>a, b, c</code>&rdquo;</td></tr>
        <tr><td><code>H(a)</code></td><td><code>f(10, a)</code></td><td>variable arguments are &ldquo;<code>a</code>&rdquo;</td></tr>
        <tr><td><code>H()</code></td><td><code>f(10, )</code></td><td>variable arguments are &ldquo;&rdquo;<br>syntax error</td></tr>
      </tbody>
    </table>

    <p>There are two problems:</p>
    <ol>
      <li>When the macro definition ends in <code>...)</code>, the invocation must contain at least
        as many commas as the macro has mandatory parameters. This makes the invocation <code>F(a)</code>
        invalid.</li>
      <li>In the case of zero mandatory parameters, it is not possible to distinguish between zero
        arguments in the invocation and one (variable) argument that is empty. The root of this problem
        is the unfortunate human convention by which a list of <em>n</em> elements is presented with
        <em>n</em>&nbsp;&minus;&nbsp;1 infix separators. This convention does not degenerate to the case
        <em>n</em>&nbsp;=&nbsp;0, and empty lists are always a special case.</li>
    </ol>

    <p>However, it is quite natural for a macro invocation with variable arguments
      to degenerate to the case where there are no arguments. In the example, we would
      like <code>F(a)</code> to be replaced with <code>f(10, a)</code>. A more realistic
      example is a custom diagnostic facility such as the following:</p>

    <div class="code">#define ERROR(msg, ...) std::printf("[" __FILE__ ":%d] " msg, __LINE__, __VA_ARGS__)

      ERROR("%d errors.\n", 0); &nbsp;// OK, std::printf("[" "file.cpp" ":%d] " "%d errors.\n", 7, 0);
      ERROR("No errors.\n"); &nbsp; &nbsp; // Error
    </div>

    <p>The complication arises when we consider <code>H()</code>. We may perhaps wish it to
      be replaced with <code>f(10)</code>. However, we may also wish to have a macro such as</p>
    <div class="code">#define ADD_COMMA(...) , __VA_ARGS__</div>
    <p>which always produces a comma, even when invoked with no arguments. The difference is
      that we consider <code>H()</code> to have zero arguments, whereas we consider
      <code>ADD_COMMA()</code> to have one, empty argument.</p>

    <h2 id="defsol">Defining the goals</h2>
    
    <p>We would like to make the preprocessor more expressive to allow users to write macros
      for all of the situations described above. This requires two distinct changes, one simple
      and the other complex.</p>

    <p><strong>Goal 1.</strong> Allow the omission of the comma before the variable arguments
      in the invocation (i.e. allow <code>F(a)</code> rather than requiring <code>F(a, )</code>).</p>
      
    <p><strong>Goal 2.</strong> Provide a mechanism to express a replacement text that contains
      the variable arguments but which contains a separating comma only if the variable
      arguments are non-empty (i.e.  allow both <code>f(10, a)</code> and <code>f(10, a,
      b)</code> as possible replacements of <code>F</code>). At the same time, continue to
      provide a mechanism that unconditionally contains comma before the (possibly empty)
      variable arguments, like <code>ADD_COMMA</code> above.</p>

    <p>This behaviour of Goal 1 is already supported by many popular compilers as a
      non-conforming extension. It is a non-breaking change, since the current syntax
      <code>F(a)</code> is ill-formed. Goal 2 is much harder to solve, since there is
      no single simple enhancement of the existing semantics that satisfies all possible
      use cases.</p>

    <p>We will step through a series of possible solutions (inspired by existing vendor
      extensions) and analyse their shortcomings, before presenting the proposed solution.</p>

    <h2 id="alt">Alternative designs</h2>

    <h3>1. Delete any commas if there are no variable arguments</h3>

    <p>This approach does not add any new syntax. It merely solves Goal 1 above by allowing
      a variadic macro invocation to not contain any variable arguments. However, under this
      approach, the absence of variable arguments is taken as a request to <em>delete</em> an
      existing comma immediately preceding the <code>__VA_ARGS__</code> token:
    </p>
    <div class="code">F(a, ) &nbsp;  => &nbsp; f(10, a, ) &nbsp; // has variable argument, replaced as-is
      F(a) &nbsp; &nbsp; => &nbsp; f(10, a) &nbsp; &nbsp; // no variable arguments, final comma deleted
      H() &nbsp; &nbsp;&nbsp; => &nbsp; f(10, ) &nbsp; &nbsp;&nbsp; // has (empty) variable argument, replaced as-is
    </div>

    <p>This is a minimal, unsurprising extension. However, it suffers from the major draw-back
      that it offers no mechanism to delete a trailing comma from a variadic macro with zero
      mandatory parameters.</p>

    <p>A variant of this extension is currently provided by MSVC++ and Embarcadero compilers,
      which always delete the comma, even in the case of zero mandatory arguments. Another
      possible extension is to provide those semantics under a new name (e.g.
      <code>__VA_ARGS_FOO__</code>).</p>

    <h3>2. Hijack existing syntax to opt in to comma deletion</h3>

    <p>This approach also allows the omission of the variable arguments,
      and in addition it reuses the concatenation operator <code>##</code>
      to control comma deletion explicitly:</p>

    <div class="code">#define F1(X, ...) f(10, X, __VA_ARGS__)
      #define F2(X, ...) f(10, X, ## __VA_ARGS__)

      #define H1(...) f(10, __VA_ARGS__)
      #define H2(...) f(10, ## __VA_ARGS__)

      F1(a, b) &nbsp; => &nbsp; f(10, a, b) &nbsp; // standard
      F1(a, ) &nbsp;&nbsp;  => &nbsp; f(10, a, ) &nbsp;&nbsp; // standard (empty variable arguments)
      F1(a) &nbsp; &nbsp;&nbsp; => &nbsp; f(10, a, ) &nbsp;&nbsp; // extension (no variable arguments), final comma unaffected

      F2(a, b) &nbsp; => &nbsp; f(10, a, b) &nbsp; // variable arguments not empty
      F2(a, ) &nbsp;&nbsp;  => &nbsp; f(10, a, ) &nbsp;&nbsp; // variable arguments present (though empty), final comma unaffected
      F2(a) &nbsp; &nbsp;&nbsp; => &nbsp; f(10, a) &nbsp; &nbsp;&nbsp; // no variable arguments, final comma deleted

      H1(a) &nbsp; &nbsp;&nbsp; => &nbsp; f(10, a) &nbsp; &nbsp;&nbsp; // standard
      H1() &nbsp; &nbsp; &nbsp; => &nbsp; f(10, ) &nbsp; &nbsp; &nbsp; // standard (empty variable arguments)

      H2(a) &nbsp; &nbsp;&nbsp; => &nbsp; f(10, a) &nbsp; &nbsp;&nbsp; // variable arguments not empty
      H2() &nbsp; &nbsp; &nbsp; => &nbsp; f(10) &nbsp; &nbsp; &nbsp; &nbsp; // variable arguments present (though empty), final comma deleted
    </div>

    <p>This extension is somewhat difficult to explain, but it generally Does What You Want. The
      complete omission of variable arguments is required for comma deletion (compare <code>F2(a,
      )</code> and <code>F2(a)</code>), though omission of the variable arguments alone is not
      enough to delete the comma (compare <code>F1(a, )</code> and <code>F1(a)</code>), but the
      case of zero mandatory parameters is special, and in that case it is mere absence of tokens
      from the variable arguments that enables the comma deletion when the <code>##</code> operator
      is used.</p>

    <p>The downside of this extension is three-fold: 1) Parsing this syntax requires look-ahead,
      adding cost to the translation. 2) The extension reuses an unrelated piece of syntax, muddling
      the language. 3) The extension hides its dependency on the presence or absence of the variable
      arguments and whether the variable arguments contain tokens in subtle and non-explicit ways.</p>

    <h3>3. &ldquo;Named pack&rdquo; style</h3>

    <p>A rather more different approach abandons the use of C99&rsquo;s <code>__VA_ARGS__</code>
      token in favour of something like <code>#define F(X, Args...)</code> or <code>#define F(X,
      ...Args)</code>. GCC has long provided the former (where the replacement text would use
      <code>Args</code> for the variable arguments, and <code>, ##Args</code> (with mandatory
      whitespace after the comma!) requests comma deletion). The template-pack-like syntax
      <code>...Args</code> does not appear to be used by any preprocessor and may provide a
      less obstructed extension route (e.g. one could say that <code>x, y, ...Args</code>
      always has comma deletion semantics).</p>

    <p>However, all these approaches seem undesirable. First off, they are a departure, and
      perhaps even a regression, from the direction taken by C99 and its <code>__VA_ARGS__</code>
      token. Second, this design would only satisfy those needs that require comma deletion,
      leaving use cases like the above <code>ADD_COMMA</code> to use the existing syntax. Thus
      there would be two parallel but dissimilar constructions living side by side, which seems
      inelegant and wasteful.</p>

    <h3>4. Omit comma from the macro definition</h3>

    <p><em>Note</em>: This idea came up during the discussion of N2034 in WG14.</p>

    <p>We could use syntax like <code>#define F(X ...) f(10, __VA_ARGS__)</code> to request
      comma deletion (note the absence of a comma before the ellipsis in the definition).
      This approach does not degenerate to the case of macros with no named parameters,
      though. Moreover, WG14 felt that this was too clever and too subtle, whereas the
      proposed solution below is highly visible and explicit. Also, an unrelated difference
      between C++ and C is that C++ allows omitting the final comma from the parameter list
      of a variable function declaration. While this has nothing to do with the preprocessor,
      the semantics of the optional comma of that feature are the opposite of this present
      consideration, which is unnecessarily confusing.</p>

    <h3>5. A new token</h3>

    <p>All of the considered extensions so far have in common that they end up creating
      a parallel set of constructions which are identical to the existing macro facilities
      except when the macro is invoked with no variable arguments, and they all provide
      some automatic mechanism to determine when to delete a comma. However, none of them
      are quite explicit about what they are doing.</p>

    <p>For the next idea, we consider adding a new token. Let us call it <code>__VA_ARGS_OPT__</code>,
      with the semantics that wherever it appears in the replacement text, it is replaced
      with the variable arguments (just like <code>__VA_ARGS__</code>), but additionally,
      whenever the variable arguments do contain tokens, a comma is prepended:</p>
    <div class="code">#define F(X, ...) f(10, X __VA_ARGS_OPT__)

      F(a, b)&nbsp; => &nbsp; f(10, a, b) &nbsp;// __VA_ARGS_OPT__ => ', b'
      F(a, ) &nbsp;  => &nbsp; f(10, a) &nbsp; &nbsp; // empty variable arguments, __VA_ARGS_OPT__ => ''
      F(a) &nbsp; &nbsp; => &nbsp; f(10, a) &nbsp; &nbsp; // empty variable arguments, __VA_ARGS_OPT__ => ''
    </div>

    <p>In this approach, we have separated Goals 1 and 2 entirely; whether a leading (!)
      comma is inserted now only depends on whether the variable arguments contain tokens,
      not on whether they are present at all.</p>

    <h2 id="proposal">Discussion and proposal</h2>

    <p>We already said that solutions 1, 2 and 3 are ultimately inelegant, since they create a
      redundant structure that replicates existing facilities and only differs in subtle
      details. Solution 4 (a new token) feels cleaner and more orthogonal. In the words of
      Richard Smith:</p>
    
    <p>&ldquo;I remain unconvinced that implicitly adding or removing a comma is a good idea.
      We need the user to tell us which behavior they want.&rdquo;</p>

    <p>We can do a little better than solution 4. Our proposal is to add a new, special kind
      of <em>functional</em> macro <code>__VA_OPT__</code>. This macro may only be used in
      the replacement text of a variadic macro:</p>

    <div class="code">#define F(X, Y, ...) a b c __VA_OPT__(<em>content</em>)</div>

    <p>The semantics are as follows: If the variable arguments contain no tokens, then
      <code>__VA_OPT__(<em>content</em>)</code> is replaced by no tokens (more precisely,
      by a placemarker). Otherwise, it is replaced by <em>content</em>, which can contain
      any admissible replacement text, including <code>__VA_ARGS__</code>.</p>

    <p>The canonical use case of <code>__VA_OPT__</code> is for an optional separator:</p>

    <div class="code">#define LOG(msg, ...) printf(msg __VA_OPT__(,) __VA_ARGS__)

      LOG("hello world") &nbsp; // => printf("hello world")
      LOG("hello world", ) // => printf("hello world")
      LOG("hello %d", n) &nbsp; // => printf("hello %d", n)</div>

    <p>However, this mechanism allows other constructions, too:</p>

    <div class="code">#define SDEF(sname, S, ...) S sname __VA_OPT__(= { __VA_ARGS__ })

      SDEF(foo); &nbsp; &nbsp; &nbsp; &nbsp; &nbsp; // => S foo;
      SDEF(bar, 1, 2, 3); &nbsp;// => S foo = { 1, 2, 3 };</div>

    <p></p>

    <div class="code strictpre">#define LOG(...)                                      \
    printf("at line=%d" __VA_OPT__(": "), __LINE__);  \
    __VA_OPT__(printf(__VA_ARGS__);)                  \
    printf("\n")

LOG();                          // => printf("at line=%d", 123); printf("\n");
LOG("All well in zone %n", n);  // => printf("at line=%d: ", 123); printf("All well in zone %n", n); printf("\n");</div>

    <h2 id="impact">Impact</h2>

    <p>The proposal is a pure extension of the preprocessor. Syntax that was previously not
      allowed becomes admissible under the proposed changes.</p>

    <h2 id="experience">Implementation experience</h2>

    <p>The proposed extension to allow omission of the variable arguments has been implemented
      by many compilers. We do not know of any experience with anything resembling the
      <code>__VA_OPT__</code> extension.</p>

    <h2 id="wording">Proposed wording</h2>

    <p>Change paragraph 16.3p4 as follows.</p>

    <div class="modify">
      <p>If the <em>identifier-list</em> in the macro definition does not end with an ellipsis,
        the number of arguments (including those arguments consisting of no preprocessing
        tokens) in an invocation of a function-like macro shall equal the number of parameters
        in the macro definition. Otherwise, there shall be <del>more</del><ins>at least as
        many</ins> arguments in the invocation <del>than</del><ins>as</ins> there are parameters
        in the macro definition (excluding the <code>...</code>). There shall exist a
        <code>)</code> preprocessing token that terminates the invocation.</p>
    </div>

    <p>Change paragraph 16.3p5 as follows.</p>
    <div class="modify">
      <p>The <del>identifier</del><ins>identifiers</ins> <code>__VA_ARGS__</code>
      <ins>and <code>__VA_OPT__</code></ins> shall occur only in the replacement-list
      of a function-like macro that uses the ellipsis notation in the parameters.</p>
    </div>

    <p>Change paragraph 16.3p12 as follows.</p>

    <div class="modify">
      <p>If there is a <code>...</code> immediately preceding the <code>)</code> in the
        function-like macro definition, then the trailing arguments<ins> (if any)</ins>,
        including any separating comma preprocessing tokens, are merged to form a single item:
        the variable arguments. The number of arguments so combined is such that, following
        merger, the number of arguments is <ins>either equal to or</ins> one more than the
        number of parameters in the macro definition (excluding the <code>...</code>).</p>
    </div>

    <p>Append a new paragraph to subsection 16.3.1 as follows.</p>
    <div class="insert">
      <p><ins>The identifier <code>__VA_OPT__</code> shall always occur as part of the token
        sequence <code>__VA_OPT__(<em>content</em>)</code>, where <code><em>content</em></code>
        is an arbitrary sequence of <em>preprocessing-token</em>s other than
        <code>__VA_OPT__</code>, which is terminated by the closing <code>)</code> and skips
        intervening pairs of matching left and right parentheses.
        The token sequence <code>__VA_OPT__(<em>content</em>)</code> that occurs
        in the replacement list is replaced by a placemarker preprocessing token (16.3.3,
        16.3.4) if the variable arguments consist of no tokens, otherwise it is replaced
        by <code><em>content</em></code>. [Example:</ins></p>
      <div class="code strictpre"><ins>#define F(...)           f(0 __VA_OPT__(,) __VA_ARGS__)
#define G(X, ...)        f(0, X __VA_OPT__(,) __VA_ARGS__)
#define SDEF(sname, ...) S sname __VA_OPT__(= { __VA_ARGS__ })

F(a, b, c)        // replaced by f(0, a, b, c)
F()               // replaced by f(0)

G(a, b, c)        // replaced by f(0, a, b, c)
G(a, )            // replaced by f(0, a)
G(a)              // replaced by f(0, a)

SDEF(foo);        // replaced by S foo;
SDEF(bar, 1, 2);  // replaced by S bar = { 1, 2 };</ins></div>

      <p><ins>&ndash; end example]</ins></p>
    </div>

    <h2 id="wg14">Compatibility with C</h2>

    <p>The entire proposal (rationale, implementation experience and wording) applies almost
      verbatim to the C language as well. (For the wording changes, the C++ section 16.3
      corresponds to the C section 6.10.3.) We would like to ask the WG14 liaison to discuss
      this proposal with WG14 and provide feedback, and we would like to encourage WG14 to adopt
      the same extension for C in the interest of future compatibility.</p>

    <h2 id="ack">Acknowledgements</h2>

    <p>Many thanks to Dawn Perchik, David Krauss and Richard Smith for valuable discussion,
      guidance, suggestions, examples and review! Thanks also go to the members of WG14 for
      their hospitality and a very productive discussion.</p>

  </body>
</html>
