
<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
  
  <meta http-equiv="x-ua-compatible" content="ie=edge">
  <meta name="viewport" content="width=device-width, initial-scale=1">
  <style type="text/css">
    body {
      color: #000000; background-color: #FFFFFF; max-width: 7in;
    }
    ul.toc {
      list-style-type: none;
    }
    code {
      padding: 2px;
      padding-left: 10px;
      display:table;
      white-space:pre;
      margin:2px;
      margin-bottom:10px;
    }
    blockquote {
      border-left: 2px solid #d0d0d0;
      padding-left: 10px;
      margin: 0;
    }
    table { border-collapse:collapse; }
    table.tbl, table.tbl th, table.tbl td {
      border: 1px solid black;
      padding: 5px
    }
    ins {
      text-decoration:none;
      font-weight:bold;
      background-color:#A0FFA0
    }
    del {
      text-decoration:line-through;
      background-color:#FFA0A0
    }
    .todo {
      color: red;
    }
    .todo::before {
      font-weight: bold;
      content: "TODO: ";
    }
  </style>
  <title>Vector Length Agnostic SIMD</title>
</head>
<body>

<table>
  <tbody><tr>
    <td width="172" align="left" valign="top">Document number:</td>
    <td width="435">P1101R0</td>
  </tr>
  <tr>
    <td width="172" align="left" valign="top">Date:</td>
    <td width="435">2018-05-22</td>
  </tr>
  <tr>
    <td width="172" align="left" valign="top">Project:</td>
    <td width="435">SG1, EWG</td>
  </tr>
  <tr>
    <td width="172" align="left" valign="top">Reply-to:</td>
    <td width="435">
      Mikhail Maltsev
      &lt;<a href="mailto:mikhail.maltsev@arm.com">mikhail.maltsev@arm.com</a>&gt;<br>
      Richard Sandiford
      &lt;<a href="mailto:richard.sandiford@arm.com">richard.sandiford@arm.com</a>&gt;
    </td>
  </tr>
</tbody></table>

<h1>Vector Length Agnostic SIMD</h1>

<p></p><ul class="toc">
  <li><a href="#intro">1. Introduction</a></li>
  <li><a href="#sizeless">2. Sizeless types</a></li>
  <li><a href="#core">3. Core language design considerations</a>
    <ul class="toc">
      <li><a href="#sizeof">3.1. Interaction with <tt>sizeof</tt></a></li>
      <li><a href="#builtin">3.2. Built-in sizeless types</a></li>
      <li><a href="#datamem">3.3. Sizeless data members</a></li>
    </ul>
  </li>
  <li><a href="#ts">4. Changes to Parallelism TS</a>
    <ul class="toc">
      <li><a href="#abi">4.1. ABI tags</a></li>
      <li><a href="#size">4.2. Changes to the <tt>size</tt> member functions</a></li>
      <li><a href="#ctors">4.3. SIMD type constructors</a></li>
      <li><a href="#nonmemb">4.4. Non-member functions</a></li>
    </ul>
  </li>
  <li><a href="#impl">5. Implementation experience</a></li>
  <li><a href="#questions">6. Questions proposed for discussions and polls</a></li>
  <li><a href="#wording-is">7. C++ Standard Wording</a></li>
  <li><a href="#wording-ts">8. Parallelism TS v2 Wording</a></li>
  <li><a href="#references">9. References</a></li>
</ul><p></p>

<h2><a name="intro">1. Introduction</a></h2>

<p>This paper proposes extensions to the C++ Standard and the Parallelism TS v2
  <a href="#ref-ts">[1]</a> to enable support for vector-length-agnostic
  SIMD extensions such as the Arm Scalable Vector Extension (SVE) and the
  RISC-V Vector Extension. The paper is based mostly on SVE.</p>

<p>The Scalable Vector Extension (SVE) <a href="#ref-sve">[2]</a> is an extension
  of the ARMv8-A A64 instruction set developed to target HPC workloads. Unlike
  traditional SIMD architectures, which define a fixed size for their vector
  registers, SVE specifies only a maximum size. This freedom lets different
  Arm-based CPU vendors develop their own implementations, targeting specific
  workloads and technologies that could benefit from a particular vector
  length.</p>

<p>A key goal of SVE is to allow the same program image to be run on any
  implementation of the architecture (which might implement different vector
  lengths), so it includes instructions which permit vector code to adapt
  automatically to the current vector length at runtime.</p>
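<p>The idea of code that adapts to the vector length at run time can be
  sketched in plain C++. The example below is only a scalar emulation, not SVE
  code: the function name <tt>axpy_vla</tt> and the <tt>lanes</tt> parameter
  are assumptions made for illustration, with <tt>lanes</tt> standing in for a
  hardware vector length that is not known until run time.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Computes y[i] += a * x[i] in chunks of `lanes` elements.
// `lanes` is a hypothetical stand-in for the hardware vector length.
inline void axpy_vla(float a, const std::vector<float>& x,
                     std::vector<float>& y, std::size_t lanes) {
    assert(x.size() == y.size() && lanes > 0);
    const std::size_t n = x.size();
    for (std::size_t base = 0; base < n; base += lanes) {
        // The last chunk may be partial. Real vector hardware would express
        // this with a predicate (mask) rather than a shorter inner loop.
        const std::size_t active = (n - base < lanes) ? (n - base) : lanes;
        for (std::size_t lane = 0; lane < active; ++lane) {
            y[base + lane] += a * x[base + lane];
        }
    }
}
```

<p>The same function produces the same result for any value of
  <tt>lanes</tt>, which mirrors the goal of running one program image on
  implementations with different vector lengths.</p>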

<h2><a name="sizeless">2. Sizeless types</a></h2>

<p>This proposal introduces a notion of <em>sizeless</em> types into the C++ type
  system. The size of an object of a sizeless type is not known at compile time,
  but becomes known at run time and does not change throughout the object
  lifetime.</p>

<p>Objects of sizeless types can be allocated only on the stack or in certain CPU
  registers, so such objects are allowed to have only automatic storage
  duration. Function parameter types and return types can be sizeless.</p>

<p>Pointers and references to sizeless types are allowed.</p>

<p>Sizeless types retain some of the restrictions of the standard-defined
  incomplete types:</p>
<p></p><ul>
  <li>The argument to <tt>sizeof</tt> and <tt>alignof</tt> cannot be a sizeless
    type, or an object of sizeless type.</li>
  <li>It is not possible to perform arithmetic on pointers to sizeless
    types.</li>
  <li>A sizeless type cannot be used as an operand of a new-expression.</li>
  <li>Members of unions, structures and classes cannot have sizeless type.</li>
  <li>It is not possible to throw or catch objects of sizeless type.</li>
  <li>Standard library containers like <tt>std::vector</tt> cannot have a
    sizeless <tt>value_type</tt>.</li>
</ul><p></p>
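<p>Standard C++ has no sizeless types, so the restrictions listed above cannot
  be shown directly. As a rough analogy, the sketch below uses an ordinary
  incomplete class type, which shares the <tt>sizeof</tt> restriction while
  still permitting pointers. The names <tt>opaque</tt>,
  <tt>has_constant_size</tt> and <tt>pass_through</tt> are hypothetical, and
  the assumption that a sizeless type would be rejected by such a trait in the
  same way is ours.</p>

```cpp
#include <cassert>
#include <type_traits>

struct opaque;  // declared but never defined: an incomplete type

// Detection trait: true only when sizeof(T) is a valid expression.
template <class T, class = void>
struct has_constant_size : std::false_type {};

template <class T>
struct has_constant_size<T, decltype(void(sizeof(T)))> : std::true_type {};

static_assert(has_constant_size<int>::value, "int has a known size");
static_assert(!has_constant_size<opaque>::value,
              "sizeof is rejected for the incomplete type");

// Pointers to the incomplete type remain usable.
inline const opaque* pass_through(const opaque* p) { return p; }
```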

<h2><a name="core">3. Core language design considerations</a></h2>

<p>We realize that sizeless types are a feature implemented on only a limited set
  of architectures and used only in SIMD computations. We would therefore
  prefer to minimize the impact of this proposal on the C++ standard, its users
  and its implementors.</p>

<h3><a name="sizeof">3.1. Interaction with <tt>sizeof</tt></a></h3>

<p>Since it is not possible to evaluate the size of an object of a sizeless type
  at compile time, we see two possible behaviors for <tt>sizeof</tt> in this
  case:</p>
<p></p><ol>
  <li>Evaluate the size at run time</li>
  <li>Make <tt>sizeof</tt> ill-formed for sizeless types</li>
</ol><p></p>

<p>In this paper we propose the latter, since a <tt>sizeof</tt> expression is
  currently always a core constant expression and changing this would make this
  proposal very invasive.</p>
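<p>To illustrate how much existing code depends on <tt>sizeof</tt> being a
  constant expression, here is a minimal sketch of three common uses: an array
  bound, a template argument and a <tt>static_assert</tt>. The names
  <tt>raw_storage</tt> and <tt>byte_count</tt> are hypothetical. Each use would
  stop compiling if <tt>sizeof</tt> were evaluated at run time.</p>

```cpp
#include <cassert>
#include <cstddef>

// Raw, suitably aligned storage for one T. An array bound must be a
// constant expression, which sizeof(T) is today.
template <class T>
struct raw_storage {
    alignas(T) unsigned char bytes[sizeof(T)];
};

// sizeof is also usable as a non-type template argument.
template <std::size_t N>
struct byte_count {
    static constexpr std::size_t value = N;
};

static_assert(byte_count<sizeof(double)>::value == sizeof(double),
              "sizeof works as a template argument");
static_assert(sizeof(raw_storage<double>) >= sizeof(double),
              "storage is large enough for one double");
```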

<h3><a name="builtin">3.2. Built-in sizeless types</a></h3>

<p>The set of built-in sizeless types is implementation-defined (and
  implementations that do not support any sizeless types will still be considered
  conforming). We do not propose to standardize the names of built-in sizeless
  types. Users are encouraged to use the SIMD library types instead.</p>

<h3><a name="datamem">3.3. Sizeless data members</a></h3>

<p>There are multiple possible options for sizeless data members, such as:</p>
<p></p><ol>
  <li>Do not allow sizeless objects to be class members</li>
  <li>Allow a sizeless object to be the last member of a class</li>
  <li>Allow embedding sizeless objects at arbitrary locations</li>
  <li>Make sizeless classes follow an entirely new set of rules (for example,
    allow the implementation to reorder data members of sizeless classes)</li>
</ol><p></p>

<p>Some support of sizeless members is required to implement this proposal at
  least as an implementation detail since <tt>simd</tt> and <tt>simd_mask</tt> are
  standard library templates rather than built-in types, and thus need to have an
  implementation.</p>

<p>This proposal takes the most conservative approach, option 1, which can be extended
  later. Specifically, we do not specify any mechanism that would allow users to
  define classes with sizeless data members. SIMD library implementors will need
  to use compiler-specific features not explicitly defined in the standard in
  order to implement templates such as <tt>simd</tt>, <tt>simd_mask</tt> and
  <tt>where_expression</tt> for sizeless SIMD types. This resembles the
  C++ atomics: while the <tt>std::atomic</tt> template is merely a library type,
  it cannot be implemented without proper support in the core language
  and compiler-specific built-ins.</p>

<p>The Arm HPC Compiler team is experimenting with option 4 mentioned above.
  Specifically, the HPC Compiler supports a new <em>class-key</em>,
  <tt>__sizeless_struct</tt>. Classes defined as <tt>__sizeless_struct</tt> can
  include sizeless data members, and their size and layout (offsets of data
  members) become known only at run time. The use of such classes is more
  restricted than that of normal classes (for example, querying member offsets
  using the <tt>offsetof</tt> macro is not allowed).</p>

<h2><a name="ts">4. Changes to Parallelism TS</a></h2>

<p>The general direction proposed in this paper is to allow sizeless SIMD
  types to be used in the same way as sized types, without restricting
  users who do not care about architectures with vector-length-agnostic
  SIMD.</p>

<p>Specifically, we propose to add a new kind of ABI tag and to avoid changing
  the requirements for instantiations of SIMD library templates that do not
  use sizeless ABI tags.</p>

<p>An alternative (described in <a href="#ref-size">[4]</a>) would be to define
  a separate set of SIMD library types for vector-length-agnostic SIMD, such
  as <tt>sizeless_simd</tt>, <tt>sizeless_simd_mask</tt>, etc. The TS would then
  also need to define a common set of operations supported by both the sizeless
  types and the sized ones (for example, in the form of concepts).</p>

<h3><a name="abi">4.1. ABI tags</a></h3>

<p>We need to provide a new kind of ABI tag for sizeless types. Since the TS
  currently allows implementations to define extended ABI tags, we cannot assume
  that there will be only a single tag for sizeless SIMD types.</p>

<p>It is implementation-defined whether sizeless SIMD types are supported.</p>

<p>The user needs a way to distinguish the ABI tags of sizeless types from the
  sized ones. The simplest way to do this is to add a static boolean data member
  to the ABI tag classes, for example:</p>

<code>struct scalar {
  static constexpr bool is_sized = true;
};</code>
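<p>A slightly fuller sketch shows how generic code could branch on the member.
  The tag name <tt>scalable</tt> and the helper
  <tt>width_is_compile_time_constant</tt> are assumptions made for this
  example. The proposal deliberately does not name any sizeless tag, so real
  implementations may choose differently.</p>

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

// Sized tags, mirroring the two standard ABI tags of the TS.
struct scalar {
    static constexpr bool is_sized = true;
};
template <int N>
struct fixed_size {
    static constexpr bool is_sized = true;
};

// Hypothetical extended tag for a vector-length-agnostic target.
struct scalable {
    static constexpr bool is_sized = false;
};

// Generic code can query the tag without naming any particular one.
template <class Abi>
constexpr bool width_is_compile_time_constant() {
    return Abi::is_sized;
}

}  // namespace sketch

static_assert(sketch::width_is_compile_time_constant<sketch::scalar>(),
              "scalar is sized");
static_assert(!sketch::width_is_compile_time_constant<sketch::scalable>(),
              "the hypothetical scalable tag is sizeless");
```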

<h3><a name="size">4.2. Changes to the <tt>size</tt> member functions</a></h3>

<p>Since sizeless types can only be used in restricted contexts (for example,
  under the current proposal, sizeless objects are not allowed to have static
  storage duration), we need to explicitly define the set of SIMD library
  types that are allowed to be defined as sizeless and the conditions under which
  they are defined as sizeless. Specifically, the types</p>

<p></p><ul>
  <li><tt>simd&lt;T, Abi&gt;</tt></li>
  <li><tt>simd_mask&lt;T, Abi&gt;</tt></li>
  <li><tt>const_where_expression&lt;simd_mask&lt;T,&nbsp;Abi&gt;,
    simd&lt;T,&nbsp;Abi&gt;&gt;</tt></li>
  <li><tt>where_expression&lt;simd_mask&lt;T,&nbsp;Abi&gt;,
    simd&lt;T,&nbsp;Abi&gt;&gt;</tt></li>
</ul><p></p>

<p>are sizeless iff <tt>Abi::is_sized</tt> is <tt>false</tt>.</p>
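<p>The effect on the <tt>size</tt> member function can be sketched as follows.
  Everything in the example is hypothetical: <tt>toy_simd</tt> is an ordinary
  sized class used only to show the two signatures, <tt>scalable</tt> is an
  assumed sizeless ABI tag, and <tt>runtime_vector_bytes</tt> stands in for a
  hardware query that a real implementation would provide.</p>

```cpp
#include <cassert>
#include <cstddef>

namespace sketch {

template <int N>
struct fixed_size {
    static constexpr bool is_sized = true;
};
struct scalable {  // hypothetical sizeless ABI tag
    static constexpr bool is_sized = false;
};

// Hypothetical stand-in for querying the vector length in bytes at run time.
// Here it simply pretends the hardware has 256-bit vectors.
inline std::size_t runtime_vector_bytes() { return 32; }

template <class T, class Abi>
struct toy_simd;

// Sized ABI tag: the width is a constant expression.
template <class T, int N>
struct toy_simd<T, fixed_size<N>> {
    static constexpr std::size_t size() noexcept { return N; }
};

// Sizeless ABI tag: the width is only available at run time,
// so size() is not declared constexpr.
template <class T>
struct toy_simd<T, scalable> {
    static std::size_t size() noexcept {
        return runtime_vector_bytes() / sizeof(T);
    }
};

}  // namespace sketch

static_assert(sketch::toy_simd<float, sketch::fixed_size<8>>::size() == 8,
              "sized width is usable at compile time");
```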

<h3><a name="ctors">4.3. SIMD type constructors</a></h3>

<p>Class template <tt>simd</tt> supports 4 kinds of constructors: conversion
  constructor, broadcast constructor, generator constructor and load constructor.
  Only the conversion and generator constructors are affected by sizeless
  types.</p>

<p>The conversion constructor only allows conversions from and to 
 <tt>fixed_size&lt;N&gt;</tt> ABI tags. We propose to add a new overload for
 sizeless types.</p>

<p>The generator constructor is defined in such a way that it is
  implementable only when the size is a constant expression (the type of the
  argument passed to <tt>gen</tt> is <tt>std::integral_constant</tt>). We
  propose to change the argument type to <tt>size_t</tt>.</p>
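<p>The difference between the two generator styles can be sketched with
  free functions that fill a <tt>std::vector</tt> instead of a real
  <tt>simd</tt> object. The names <tt>generate_fixed</tt> and
  <tt>generate_runtime</tt> are hypothetical. The first needs the width as a
  template argument in order to form each <tt>std::integral_constant</tt>,
  while the second accepts a width that is only known at run time.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <utility>
#include <vector>

namespace sketch {

template <class T, class G, std::size_t... I>
std::vector<T> generate_fixed_impl(G& gen, std::index_sequence<I...>) {
    // Each call receives a distinct integral_constant type.
    return {static_cast<T>(gen(std::integral_constant<std::size_t, I>{}))...};
}

// Current TS style: the width N must be a constant expression.
template <class T, std::size_t N, class G>
std::vector<T> generate_fixed(G gen) {
    return generate_fixed_impl<T>(gen, std::make_index_sequence<N>{});
}

// Proposed style: the index is a plain size_t, so the width may be a
// run-time value.
template <class T, class G>
std::vector<T> generate_runtime(std::size_t width, G gen) {
    std::vector<T> lanes;
    lanes.reserve(width);
    for (std::size_t i = 0; i < width; ++i) {
        lanes.push_back(static_cast<T>(gen(i)));
    }
    return lanes;
}

}  // namespace sketch
```

<p>A generator that takes <tt>std::size_t</tt> works with both styles, because
  <tt>std::integral_constant</tt> converts implicitly to its value type. A
  generator that uses its argument as a constant expression works only with
  the first.</p>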

<h3><a name="nonmemb">4.4. Non-member functions</a></h3>

<p><tt>split</tt> and <tt>concat</tt> inherently rely on width being a
  compile-time constant. We propose to disable these functions for sizeless types.
  Alternatively, we could allow using them with sizeless types by providing
  additional overloads that do not perform return type deduction, and require the
  user to provide objects of correct widths (otherwise the behavior is
  undefined).</p>
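<p>The dependence on a compile-time width shows up in the return type. In the
  sketch below, <tt>toy_simd</tt> and <tt>split_halves</tt> are hypothetical
  stand-ins rather than the TS interfaces: the return type names
  <tt>N / 2</tt>, so it cannot be formed unless <tt>N</tt> is a constant
  expression.</p>

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <utility>

namespace sketch {

// Toy fixed-width vector: the width N is part of the type.
template <class T, std::size_t N>
struct toy_simd {
    std::array<T, N> lanes;
};

// Splits x into its lower and upper halves. The result type depends on
// N / 2, which is why this style of interface needs a compile-time width.
template <class T, std::size_t N>
std::pair<toy_simd<T, N / 2>, toy_simd<T, N / 2>>
split_halves(const toy_simd<T, N>& x) {
    static_assert(N % 2 == 0, "width must be even");
    std::pair<toy_simd<T, N / 2>, toy_simd<T, N / 2>> out{};
    for (std::size_t i = 0; i < N / 2; ++i) {
        out.first.lanes[i] = x.lanes[i];
        out.second.lanes[i] = x.lanes[i + N / 2];
    }
    return out;
}

}  // namespace sketch
```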

<p><tt>simd_cast</tt>, <tt>static_simd_cast</tt> and <tt>to_fixed_size</tt>
  require changes in wording to support sizeless types. Casting between
  fixed-sized and sizeless SIMD types can be made possible by saying that the
  behavior is undefined if the actual size of <tt>From</tt> is less than the
  size of <tt>To</tt> at run time. For now we propose to disable casting between
  sized and sizeless types.</p>

<h2><a name="impl">5. Implementation experience</a></h2>

<p>The specification defining the SVE extension <a href="#ref-sve">[2]</a>
  has been published. Although it is currently at a “beta” stage,
  it is highly unlikely to change in a way that would affect
  this proposal.</p>

<p>The “ARM C Language Extensions for SVE” specification
  <a href="#ref-acle">[3]</a>, which defines the extensions to the C and C++
  standards required for SVE intrinsics support, is also available.</p>

<p>Initial SVE support has been upstreamed to both GCC and LLVM. These toolchains
  support the use of SVE instructions in assembly language. GCC can generate SVE
  instructions in auto-vectorized code. Support for SVE types and intrinsics in C
  and C++ is currently available in an LLVM-based toolchain provided by Arm.</p>

<p>Upstreaming the intrinsics support to GCC and LLVM is a work in progress.</p>

<p>Hardware implementations of SVE-enabled cores are not yet commercially
  available, but are expected.</p>

<h2><a name="questions">6. Questions proposed for discussions and polls</a></h2>

<p>Core language:</p>
<p></p><ol>
  <li>Should we proceed with this proposal and sizeless types?</li>
  <li><tt>sizeof(sizeless_t)</tt>: ill-formed vs. non-constant?</li>
  <li>Sizeless class members (in user code)?</li>
</ol><p></p>

<p>Parallelism TS:</p>
<p></p><ol>
  <li>Sizeless specializations vs separate set of primary templates?</li>
  <li><tt>simd</tt> generator constructor: disable for sizeless types or change
    the parameter type of <tt>gen</tt>?</li>
  <li>Allow conversion between sizeless and fixed-size SIMD types?</li>
  <li>Allow concatenation and splitting of sizeless SIMD types (by requiring the
    user to provide objects of correct widths)?</li>
</ol><p></p>

<h2><a name="wording-is">7. C++ Standard Wording</a></h2>
<p>Change in [basic.def]/5:</p>
<blockquote>
  <p>A program is ill-formed if the definition of any object gives the object an
    <del>incomplete</del><ins>indefinite</ins> type.</p>
</blockquote>

<p>Add new paragraph after [basic.def]/5:</p>
<blockquote>
  <p><ins>A program is ill-formed if any declaration of an object gives it both
    a sizeless type and either static or thread-local storage
    duration.</ins></p>
</blockquote>

<p>Change in [basic.types]/5:</p>
<blockquote>
  <p>A class that has been declared but not defined, an enumeration type in
    certain contexts ([dcl.enum]), or an array of unknown bound or of
    <del>incomplete</del><ins>indefinite</ins> element type, is an
    <em><del>incompletely-defined</del><ins>indefinite</ins> object type</em>.
    <del>Incompletely-defined</del><ins>Indefinite</ins> object types and cv void
    are <em><del>incomplete</del><ins>indefinite</ins> types</em>
    ([basic.fundamental]). Objects shall not be defined to have an
    <del>incomplete</del><ins>indefinite</ins> type.</p>
  <p><ins>Object and void types are further partitioned into <em>sized</em> and
    <em>sizeless</em>; all basic and derived types defined in this standard are
    sized, but an implementation may provide additional sizeless types.</ins></p>
  <p><ins>An object or void type is said to be <em>complete</em> if it is both
    sized and definite; all other object and void types are said to be
    <em>incomplete</em>. The term <em>completely-defined object type</em> is
    synonymous with <em>complete object type</em>.</ins></p>
  <p><ins>Arrays and enumeration types are always sized, so for them the term
    <em>incomplete</em> is equivalent to (and used interchangeably with) the
    term <em>indefinite</em>.</ins></p>
</blockquote>

<p>Change in [basic.types]/7:</p>
<blockquote>
  <p>[ Note: The rules for declarations and expressions describe in which
  contexts <del>incomplete</del><ins>indefinite</ins> types are prohibited.
  — end note ]</p>
</blockquote>

<p>Change in [basic.fundamental]/9:</p>
<blockquote>
  <p>A type cv <tt>void</tt> is <del>an incomplete</del><ins>sized
    indefinite</ins> type that cannot be completed <ins>(made
    definite)</ins>; …</p>
</blockquote>

<p>Change in [basic.compound]/3:</p>
<blockquote>
  <p>… Pointers to incomplete types <ins>(including indefinite
    types)</ins> are allowed although there are restrictions on what can be done
    with them. …</p>
</blockquote>

<p>Change in [basic.lval]/9:</p>
<blockquote>
  <p>Unless otherwise indicated ([expr.call]), a prvalue shall always have
    <del>complete</del><ins>definite</ins> type or the void type.  A glvalue
    shall not have type cv void.  [ Note: A glvalue may have
    <del>complete</del><ins>definite</ins> or
    <del>incomplete</del><ins>indefinite</ins> non-void type. Class and array
    prvalues can have cv-qualified types; other prvalues always have
    cv-unqualified types. See [expr.prop]. — end note ]</p>
</blockquote>

<p>Change in [conv.lval]/1:</p>
<blockquote>
  <p>A glvalue of a non-function, non-array type <tt>T</tt> can be converted to
    a prvalue.  If <tt>T</tt> is an <del>incomplete</del><ins>indefinite</ins>
    type, a program that necessitates this conversion is ill-formed.  If
    <tt>T</tt> is a non-class type, the type of the prvalue is the
    cv-unqualified version of <tt>T</tt>.  Otherwise, the type of the prvalue is
    <tt>T</tt>.</p>
</blockquote>

<p>Change in [expr.call]/7:</p>
<blockquote>
  <p>… When a function is called, the parameters that have object type
  shall have <del>completely-defined</del><ins>definite</ins> object type.
  [ Note: this still allows a parameter to be a pointer or reference to an
  <del>incomplete</del><ins>indefinite</ins> class type. However, it prevents a
  passed-by-value parameter to have an
  <del>incomplete</del><ins>indefinite</ins> class type. — end note ]
  …</p>
</blockquote>

<p>Change in [expr.unary.op]/1:</p>
<blockquote>
  <p>… [ Note: Indirection through a pointer to an
  <del>incomplete</del><ins>indefinite</ins> type (other than cv void) is valid.
  The lvalue thus obtained can be used in limited ways (to initialize a
  reference, for example); this lvalue must not be converted to a prvalue, see
  [conv.lval]. — end note ]</p>
</blockquote>

<p>Change in [expr.delete]/2:</p>
<blockquote>
  <p>If the operand has a class type, the operand is converted to a pointer type
  by calling the above-mentioned conversion function, and the converted operand
  is used in place of the original operand for the remainder of this
  subclause. <ins>The type of the operand must now be a pointer to a sized type,
  otherwise the program is ill-formed.</ins> …</p>
</blockquote>

<p>Change in [dcl.array]/1:</p>
<blockquote>
  <p>… <tt>T</tt> is called the array element type; this type shall not
  be a reference type, cv void, <ins>a sizeless type,</ins> a function type or
  an abstract class type. …</p>
</blockquote>

<p>Change in [class.static.data]/2:</p>
<blockquote>
  <p>The declaration of a non-inline static data member in its class definition
  is not a definition and may be of an <del>incomplete</del><ins>sized
  indefinite</ins> type other than cv void.</p>
</blockquote>

<p>Add new paragraph after [class.static.data]/7:</p>
<blockquote>
   <p><ins>A static data member shall not have sizeless type.</ins></p>
</blockquote>

<p>Change in [temp.arg.type]/2:</p>
<blockquote>
  <p>… [ Note: A template type argument may be an
  <del>incomplete</del><ins>indefinite</ins> type. — end note ]</p>
</blockquote>

<p>Change in [meta.unary.prop]/3:</p>
<blockquote>
  <p>For all of the class templates X declared in this subclause, instantiating
  that template with a template-argument that is a class template specialization
  may result in the implicit instantiation of the template argument if and only
  if the semantics of X require that the argument is a
  <del>complete</del><ins>definite</ins> type.</p>
</blockquote>

<p>Replace all occurrences of “complete” with “definite”
in the table in [meta.unary.prop]/4.</p>

<p>Replace all occurrences of “complete” with “definite”
in the table in [meta.rel]/2.</p>

<h2><a name="wording-ts">8. Parallelism TS v2 Wording</a></h2>

<p>Change in [parallel.simd.general]/1:</p>
<blockquote>
  The data-parallel library consists of data-parallel types and operations on
  these types. A data-parallel type consists of elements of an underlying
  arithmetic type, called the element type. <del>The number of elements is a
  constant for each data-parallel type and called the width of that
  type.</del>
</blockquote>

<p>Add paragraph after [parallel.simd.general]/2:</p>
<blockquote>
  <ins>The number of elements of a data-parallel object does not change
    during object lifetime and is called the width of the corresponding
    data-parallel type.</ins>
</blockquote>


<p>Change in [parallel.simd.syn]:</p>
<blockquote>
  <p><tt>
    struct scalar <del>{}</del>;<br>
    template&lt;int N&gt; struct fixed_size <del>{}</del>;
  </tt></p>
</blockquote>

<p>Change in [parallel.simd.abi]:</p>
<blockquote>
  <p><tt>struct scalar {<del>};</del><br>
    <ins>&nbsp;&nbsp;static constexpr bool is_sized = true;<br>
    };</ins></tt></p>
  <p><tt>template&lt;int N&gt; struct fixed_size {<del>};</del><br>
    <ins>&nbsp;&nbsp;static constexpr bool is_sized = true;<br>
    };</ins></tt></p>
</blockquote>

<p>Add paragraph after [parallel.simd.abi]/8:</p>
<blockquote>
  <ins>An implementation shall define the static constexpr boolean data member
  <tt>is_sized</tt> in each extended ABI tag. The width of the
  <tt>simd&lt;T,&nbsp;Abi&gt;</tt> specializations for which
  <tt>Abi::is_sized</tt> is <tt>false</tt> is not known at compile time.</ins>
</blockquote>

<p>Change in [parallel.simd.overview]:</p>
<blockquote>
  <tt>static <del>constexpr</del><ins><em>see below</em></ins>
    size_t size() noexcept;</tt>
</blockquote>

<p>Change in [parallel.simd.overview]/1-2:</p>
<blockquote>
  <p>The class template simd is a data-parallel type.
    The width of a given <tt>simd</tt> specialization is <del>a constant
    expression,</del> determined by the template parameters <ins>and the
    platform</ins>.</p>
  <p>Every specialization of <tt>simd</tt> shall be a
    <del>complete</del><ins>definite</ins> type. The specialization
    <tt>simd&lt;T,&nbsp;Abi&gt;</tt> is supported if <tt>T</tt> is a vectorizable
    type and</p>
  <p></p><ul>
    <li>Abi is <tt>simd_abi::scalar</tt>, or</li>
    <li>Abi is <tt>simd_abi::fixed_size&lt;N&gt;</tt>, with <tt>N</tt>
      constrained as defined in [simd.abi].</li>
  </ul><p></p>
  <p>If Abi is an extended ABI tag, it is implementation-defined whether
    simd&lt;T,&nbsp;Abi&gt; is supported. [ Note: The intent is for implementations
    to decide on the basis of the currently targeted system. — end note ]</p>
  <p><ins>If <tt>Abi::is_sized</tt> is <tt>false</tt> and the specialization
    <tt>simd&lt;T,&nbsp;Abi&gt;</tt> is supported, the specialization shall be a
    sizeless type, otherwise it shall be a sized (complete) type.</ins></p>
</blockquote>

<p>Change in [parallel.simd.overview]/4:</p>
<blockquote><dl>
  <dt><tt>static constexpr size_t size() noexcept;</tt><br>
     <ins><tt>static size_t size() noexcept;</tt></ins></dt>
  <dd>
    <p><em>Returns:</em> The width of <tt>simd&lt;T, Abi&gt;</tt>.</p>
    <p><ins><em>Remarks:</em> This function is declared <tt>constexpr</tt> iff
      <tt>abi_type::is_sized</tt> is <tt>true</tt>.</ins></p>
  </dd>
</dl></blockquote>

<p>Add paragraphs after [parallel.simd.ctor]/4:</p>
<blockquote><dl>
  <dt><ins>
    <tt>template &lt;class U&gt; simd(const simd&lt;U, abi_type&gt;&amp; x);</tt>
  </ins></dt>
  <dd>
    <p><ins>
      <em>Effects:</em> Constructs an object where the i-th element equals
      <tt>static_cast&lt;T&gt;(x[i])</tt> for all <tt>i ∊ [0, size())</tt>.
    </ins></p>
    <p><ins><em>Remarks:</em> This constructor shall not participate in
      overload resolution unless</ins>
      </p><ul>
        <li><ins><tt>abi_type::is_sized</tt> is <tt>false</tt>, and</ins></li>
        <li><ins>every possible value of U can be represented with type
            value_type, and</ins></li>
        <li><ins>if both <tt>U</tt> and <tt>value_type</tt> are integral, the
            integer conversion rank [conv.rank] of <tt>value_type</tt> is
            greater than the integer conversion rank of <tt>U</tt>.</ins></li>
      </ul>
    <p></p>
  </dd>
</dl></blockquote>

<p>Change in [parallel.simd.ctor]/8-11:</p>
<blockquote><dl>
  <dt><tt>template &lt;class G&gt; simd(G&amp;&amp; gen);</tt></dt>
  <dd>
    <p>Effects: Constructs an object where the i-th element is initialized to
      <tt>gen(<del>integral_constant&lt;size_t,&nbsp;i&gt;()</del><ins>i</ins>).</tt></p>
    <p>Remarks: This constructor shall not participate in overload
      resolution unless
      <tt>simd(gen(<del>integral_constant&lt;size_t,&nbsp;i&gt;()</del><ins>size_t{}</ins>))</tt>
      is well-formed <del>for all <tt>i ∊ [0, size())</tt></del>. The calls to
      <tt>gen</tt> are unsequenced with respect to each other.
      Vectorization-unsafe standard library functions may not be
      invoked by gen ([algorithms.parallel.exec]).</p>
  </dd>
</dl></blockquote>

<p>Change in [parallel.simd.casts]/1-6:</p>
<blockquote><dl>
  <dt><tt>template&lt;class T, class U, class Abi&gt;
    <em>see below</em> simd_cast(const simd&lt;U, Abi&gt;&amp; x)</tt></dt>
  <dd>
    <p>Let <tt>To</tt> identify <tt>T::value_type</tt> if
      <tt>is_simd_v&lt;T&gt;</tt> is <tt>true</tt>, or <tt>T</tt> otherwise.</p>
    <p><em>Returns:</em> A simd object with the i-th element initialized to
      <tt>static_cast&lt;To&gt;(x[i])</tt> for all <tt>i ∊ [0, size())</tt>.</p>
    <p><em>Throws:</em> Nothing.</p>
    <p><em>Remarks:</em> The function shall not participate in overload resolution unless
      every possible value of type <tt>U</tt> can be represented with type
      <tt>To</tt>, and either</p>
    <ul>
      <li><tt>is_simd_v&lt;T&gt;</tt> is <tt>false</tt>, or</li>
      <li><ins><tt>T::abi_type</tt> is <tt>Abi</tt>, and <tt>U</tt> is <tt>T</tt>
        or</ins></li>
      <li><ins><tt>Abi::is_sized</tt> is <tt>true</tt>, and
        <tt>T::abi_type::is_sized</tt> is <tt>true</tt>, and</ins>
        <tt>T::size() == simd&lt;U,&nbsp;Abi&gt;::size()</tt> is <tt>true</tt>.</li>
    </ul>
    <p>The return type is</p>
    <ul>
      <li><tt>T</tt> if <tt>is_simd_v&lt;T&gt;</tt> is <tt>true</tt>,
        otherwise</li>
      <li><tt>simd&lt;T,&nbsp;Abi&gt;</tt> if <tt>U</tt> is <tt>T</tt>,
        otherwise</li>
      <li><tt>simd&lt;T,&nbsp;simd_abi::fixed_size&lt;simd&lt;U,Abi&gt;::size()&gt;&gt;</tt></li>
    </ul>
  </dd>
</dl></blockquote>

<p>Change in [parallel.simd.casts]/7-12:</p>
<blockquote><dl>
  <dt>
    <tt>template&lt;class T, class U, class Abi&gt;
    <em>see below</em> static_simd_cast(const simd&lt;U, Abi&gt;&amp; x)</tt>
  </dt>
  <dd>
    <p>Let <tt>To</tt> identify <tt>T::value_type</tt> if
      <tt>is_simd_v&lt;T&gt;</tt> is <tt>true</tt>, or <tt>T</tt> otherwise.</p>
    <p><em>Returns:</em> A simd object with the i-th element initialized to
      <tt>static_cast&lt;To&gt;(x[i])</tt> for all <tt>i ∊ [0, size())</tt>.</p>
    <p><em>Throws:</em> Nothing.</p>
    <p><em>Remarks:</em> The function shall not participate in overload
      resolution unless either</p>
    <ul>
      <li><tt>is_simd_v&lt;T&gt;</tt> is <tt>false</tt>, or</li>
      <li><ins><tt>T::abi_type</tt> is <tt>Abi</tt>, and <tt>U</tt> is <tt>T</tt>
        or <tt>U</tt> and <tt>T</tt> are integral types that only differ in
        signedness, or</ins></li>
      <li><ins><tt>Abi::is_sized</tt> is <tt>true</tt>, and
        <tt>T::abi_type::is_sized</tt> is <tt>true</tt>, and</ins>
        <tt>T::size() == simd&lt;U,&nbsp;Abi&gt;::size()</tt> is <tt>true</tt>.</li>
    </ul>
    <p>The return type is</p>
    <ul>
      <li><tt>T</tt> if <tt>is_simd_v&lt;T&gt;</tt> is <tt>true</tt>,
        otherwise</li>
      <li><tt>simd&lt;T,&nbsp;Abi&gt;</tt> if <tt>U</tt> is <tt>T</tt> or
        <tt>U</tt> and <tt>T</tt> are integral types that only differ in
        signedness, otherwise</li>
      <li><tt>simd&lt;T,&nbsp;simd_abi::fixed_size&lt;simd&lt;U, Abi&gt;::size()&gt;&gt;</tt></li>
    </ul>
  </dd>
</dl></blockquote>

<p>Add paragraph after [parallel.simd.casts]/14:</p>
<blockquote><dl>
  <dt><tt>template&lt;class T, class Abi&gt;<br>
    fixed_size_simd&lt;T, simd_size_v&lt;T, Abi&gt;&gt; to_fixed_size(const
    simd&lt;T, Abi&gt;&amp; x) noexcept;<br>
    template&lt;class T, class Abi&gt;<br>
    fixed_size_simd_mask&lt;T, simd_size_v&lt;T, Abi&gt;&gt;
    to_fixed_size(const simd_mask&lt;T, Abi&gt;&amp; x) noexcept;
    </tt></dt>
  <dd>
    <p>
      <em>Returns:</em> A data-parallel object with the i-th element
      initialized to <tt>x[i]</tt> for all <tt>i ∊ [0, size())</tt>.</p>
    <p><ins><em>Remarks:</em> These functions shall not participate in overload
      resolution unless <tt>Abi::is_sized</tt> is <tt>true</tt>.</ins></p>
  </dd>
</dl></blockquote>

<p>Change in [parallel.simd.mask.overview]:</p>
<blockquote>
  <tt>static <del>constexpr</del><ins><em>see below</em></ins>
    size_t size() noexcept;</tt>
</blockquote>

<p>Change in [parallel.simd.mask.overview]/1-2:</p>
<blockquote>
  <p>The class template <tt>simd_mask</tt> is a data-parallel type with the
    element type <tt>bool</tt>. The width of a given <tt>simd_mask</tt>
    specialization is <del>a constant expression,</del> determined by the
    template parameters <ins>and the platform</ins>. Specifically,
    <tt>simd_mask&lt;T,&nbsp;Abi&gt;::size() == simd&lt;T,&nbsp;Abi&gt;::size()</tt>.
  </p><p>Every specialization of <tt>simd_mask</tt> shall be a
    <del>complete</del><ins>definite</ins> type. The specialization
    <tt>simd_mask&lt;T, Abi&gt;</tt> is supported if <tt>T</tt> is a vectorizable
    type and</p>
  <p></p><ul>
    <li>Abi is <tt>simd_abi::scalar</tt>, or</li>
    <li>Abi is <tt>simd_abi::fixed_size&lt;N&gt;</tt>, with <tt>N</tt>
      constrained as defined in [simd.abi].</li>
  </ul><p></p>
  <p>If Abi is an extended ABI tag, it is implementation-defined whether
    simd_mask&lt;T, Abi&gt; is supported. [ Note: The intent is for implementations
    to decide on the basis of the currently targeted system. — end note ]
    <ins>If <tt>Abi::is_sized</tt> is <tt>false</tt> and the specialization
    <tt>simd_mask&lt;T,&nbsp;Abi&gt;</tt> is supported, the specialization shall be a
    sizeless type, otherwise it shall be a sized (complete) type.</ins>
    If <tt>simd_mask&lt;T,&nbsp;Abi&gt;</tt> is not supported, the specialization
    shall have a deleted default constructor, deleted destructor,
    deleted copy constructor, and deleted copy assignment.</p>
</blockquote>

<p>Change in [parallel.simd.mask.overview]/4:</p>
<blockquote><dl>
  <dt>
    <tt>static constexpr size_t size() noexcept;</tt><br>
    <ins><tt>static size_t size() noexcept;</tt></ins>
  </dt>
  <dd>
    <p><em>Returns:</em> The width of <tt>simd&lt;T, Abi&gt;</tt>.</p>
    <p><ins><em>Remarks:</em> This function is declared <tt>constexpr</tt> iff
      <tt>abi_type::is_sized</tt> is <tt>true</tt>.</ins></p>
  </dd>
</dl></blockquote>

<h2><a name="references">9. References</a></h2>
<p></p><ol>
  <li>
    <a name="ref-ts"></a>
    <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/n4742.html">
      N4742</a> Working Draft, Technical Specification for C++ Extensions for Parallelism Version 2
  </li>
  <li>
    <a name="ref-sve"></a>
    <a href="https://developer.arm.com/docs/ddi0584/latest/arm-architecture-reference-manual-supplement-the-scalable-vector-extension-sve-for-armv8-a">ARM
    Architecture Reference Manual Supplement — The Scalable Vector Extension
    (SVE), for ARMv8-A</a>
  </li>
  <li>
    <a name="ref-acle"></a>
    <a href="https://developer.arm.com/docs/100987/latest/arm-c-language-extensions-for-sve">ARM C Language Extensions for SVE</a>
  </li>
  <li>
    <a name="ref-size"></a>
    <a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2016/p0349r0.pdf">P0349R0</a> Assumptions about the size of datapar
  </li>
</ol><p></p>



</body></html>