<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<html>

<head>
<meta http-equiv=Content-Type content="text/html; charset=us-ascii">
<meta name=Generator content="Microsoft Word 10 (filtered)">
<title>Errata to the Regular Expression Proposal</title>

<style>
<!--
 /* Font Definitions */
 @font-face
	{font-family:Tahoma;
	panose-1:2 11 6 4 3 5 4 4 2 4;}
@font-face
	{font-family:Courier-Bold;}
 /* Style Definitions */
 p.MsoNormal, li.MsoNormal, div.MsoNormal
	{margin:0cm;
	margin-bottom:.0001pt;
	font-size:12.0pt;
	font-family:"Times New Roman";}
h1
	{margin-right:0cm;
	margin-left:0cm;
	font-size:24.0pt;
	font-family:"Times New Roman";
	font-weight:bold;}
h2
	{margin-right:0cm;
	margin-left:0cm;
	font-size:18.0pt;
	font-family:"Times New Roman";
	font-weight:bold;}
h3
	{margin-top:12.0pt;
	margin-right:0cm;
	margin-bottom:3.0pt;
	margin-left:0cm;
	page-break-after:avoid;
	font-size:13.0pt;
	font-family:Arial;
	font-weight:bold;}
h6
	{margin-top:12.0pt;
	margin-right:0cm;
	margin-bottom:3.0pt;
	margin-left:0cm;
	font-size:11.0pt;
	font-family:"Times New Roman";
	font-weight:bold;}
a:link, span.MsoHyperlink
	{color:blue;
	text-decoration:underline;}
a:visited, span.MsoHyperlinkFollowed
	{color:purple;
	text-decoration:underline;}
p
	{margin-right:0cm;
	margin-left:0cm;
	font-size:12.0pt;
	font-family:"Times New Roman";}
code
	{font-family:"Courier New";}
pre
	{margin:0cm;
	margin-bottom:.0001pt;
	font-size:10.0pt;
	font-family:"Courier New";}
ins
	{text-decoration:none;}
span.msoIns
	{text-decoration:underline;}
span.msoDel
	{text-decoration:line-through;
	color:red;}
@page Section1
	{size:612.0pt 792.0pt;
	margin:72.0pt 90.0pt 72.0pt 90.0pt;}
div.Section1
	{page:Section1;}
 /* List Definitions */
 ol
	{margin-bottom:0cm;}
ul
	{margin-bottom:0cm;}
-->
</style>

<meta name="vs_targetSchema"
content="http://schemas.microsoft.com/intellisense/ie5">
</head>

<body lang=EN-US link=blue vlink=purple>

<div class=Section1>

<div>

<div>

<h1>Errata to the Regular Expression Proposal </h1>

<p>Doc. no.: J16/03-0090=WG21/N1507<br>
Date: 16 Sept 2003<br>
Project: Programming Language C++ <br>
Subgroup: Library<br>
<span class=spelle>Wiki</span>: <a
href="http://www.research.att.com/~ark/cgi-bin/lwg-wiki/wiki?RegularExpressions">http://www.research.att.com/~ark/cgi-bin/lwg-wiki/wiki?RegularExpressions</a><br>
Reply to: John Maddock &lt;john@johnmaddock.co.uk&gt;</p>

<p>This document contains a list of corrections to <a
href="http://anubis.dkuug.dk/jtc1/sc22/wg21/docs/papers/2002/n1429.htm">N1429</a>:
&#8220;A Proposal to add Regular Expressions to the Standard Library&#8221;.</p>

<p>The document is divided up into two sections: the first deals with errors in
the original proposal (mostly minor issues), the second with more substantive
issues which may require further debate.</p>

<h2>Section 1: Errata to the original proposal</h2>

<h2>Bad rationale for regex_ prefixes</h2>

<p><em>Pete Becker writes: </em>The proposal has the following statement:</p>

<blockquote style='margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'>

<p>Many people may see the latter as an evil, but experience suggests that
statements such as:</p>

<pre><span class=grame>using</span> namespace whatever;</pre>

<p><span class=grame>are</span> remarkably common in practice. For this reason
the &quot;regex_&quot; prefixed names that Boost regex uses are retained in
this proposal.</p>

</blockquote>

<p>I'm not strongly for or against the regex_ prefixes. They may well be
helpful in understanding code. But I'm strongly against the notion that the
standard library should use prefixes because users abuse using declarations.
One of the key reasons for the existence of namespaces is to provide a
language-level way of distinguishing otherwise identical names in disparate
libraries. If the standard library has to use prefixes for this purpose then
namespaces have failed. We should either encourage users who have written bad
code to clean it up, or we should remove namespaces from the language.</p>

<p><em>John Maddock notes:</em> there was strong agreement on the library
reflector that the rationale here is just plain wrong.&nbsp; There was some
feeling that the regex_ prefixes did help improve code readability though.</p>

<p><em>Proposed changes:</em> </p>

<p>Replace all of:</p>

<blockquote style='margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'>

<p>The names of the algorithms in this proposal are based upon the names Boost
uses and all have a &quot;regex_&quot; prefix, which may cause some controversy
(as may the naming used in general in this proposal). This proposal does not
use overly generic names like match/compare/search/find here, as the
possibility for ambiguous overloads occurring should be obvious. An alternative
is to use a generic name, but placed in a nested namespace (let's say <code><span
style='font-size:10.0pt'>std::regex</span></code>). This may be an appropriate
option, but ambiguities can still occur either as a result of argument
dependent lookup, or a using directive. Many people may see the latter as an
evil, but experience suggests that statements such as:</p>

<pre><span class=grame>using</span> namespace whatever;</pre>

<p><span class=grame>are</span> remarkably common in practice. For this reason
the &quot;regex_&quot; prefixed names that Boost regex uses are retained in
this proposal. Here are the twelve forms for <code><span style='font-size:10.0pt'>regex_match
</span></code>and <code><span style='font-size:10.0pt'>regex_search</span></code>:</p>

</blockquote>

<p>With: </p>

<blockquote style='margin-top:5.0pt;margin-right:0cm;margin-bottom:5.0pt'>

<p>The names of the algorithms in this proposal are based upon the names Boost
uses and all have a &quot;regex_&quot; prefix, which may cause some controversy
(as may the naming used in general in this proposal). These names are retained
largely on the grounds of code legibility. Here are the twelve forms for <code><span
style='font-size:10.0pt'>regex_match </span></code>and <code><span
style='font-size:10.0pt'>regex_search</span></code>:</p>

</blockquote>

<h2>Unintended occurrence of <span class=spelle>reg_expression</span>:</h2>

<p>There is&nbsp;a systematic error in the &quot;proposed text&quot; section:
the various algorithms have been defined to accept a type &quot;<span
class=spelle>reg_expression</span>&quot; which does not in fact exist in the
proposal, and which should of course be called &quot;<span class=spelle>basic_regex</span>&quot;.&nbsp;
This is an editing error that crept in when the name of that class was changed
from <span class=spelle>reg_expression</span> to <span class=spelle>basic_regex</span>.</p>

<p>The fix is to just replace all occurrences of &quot;<span class=spelle>reg_expression</span>&quot;
with &quot;<span class=spelle>basic_regex</span>&quot; throughout that section.</p>

<h2>Iterators have incorrect definitions of the types &quot;reference&quot; and
&quot;pointer&quot;.</h2>

<p>In <span class=spelle>regex_iterator</span> and <span class=spelle>regex_token_iterator</span>
the definitions given for the types &quot;pointer&quot; and
&quot;reference&quot; are wrong: as given, these types point/refer to the <span
class=spelle>value_type</span> of the underlying iterator type, but they should
point/refer to the actual <span class=spelle>value_type</span> being
enumerated (the two are not the same type).</p>

<p>The fix is to change:</p>

<pre><span class=grame>typedef</span> <span class=spelle>typename</span> <span
class=spelle>iterator_traits</span>&lt;<span class=spelle>BidirectionalIterator</span>&gt;::pointer <span
class=spelle>pointer</span>;</pre><pre><span class=grame>typedef</span> <span
class=spelle>typename</span> <span class=spelle>iterator_traits</span>&lt;<span
class=spelle>BidirectionalIterator</span>&gt;::reference <span class=spelle>reference</span>; </pre>

<p>To:</p>

<pre><span class=grame>typedef</span> const <span class=spelle>value_type</span>* pointer;</pre><pre><span
class=grame>typedef</span> const <span class=spelle>value_type</span>&amp; reference; </pre>

<p><span class=grame>In both the </span><span class=spelle>regex_iterator</span><span
class=grame> and </span><span class=spelle>regex_token_iterator</span><span
class=grame> definitions.</span></p>
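
<p>As an illustration of the corrected typedefs, the following sketch checks
them at compile time. It assumes the <code>&lt;regex&gt;</code> header of C++11
and later as a stand-in for the components of this proposal; the aliases
<code>Iter</code>, <code>MatchIter</code> and <code>TokenIter</code> are names
introduced only for this example.</p>

```cpp
#include <iterator>
#include <regex>
#include <string>
#include <type_traits>

// Assumption: std::regex_iterator and std::regex_token_iterator (C++11) stand
// in for the proposal's iterators. The alias names are illustrative only.
using Iter = std::string::const_iterator;
using MatchIter = std::regex_iterator<Iter>;
using TokenIter = std::regex_token_iterator<Iter>;

// regex_iterator enumerates match_results objects, so pointer and reference
// are expressed in terms of that value_type.
static_assert(std::is_same<MatchIter::value_type, std::match_results<Iter>>::value,
              "regex_iterator enumerates match_results");
static_assert(std::is_same<MatchIter::pointer, const MatchIter::value_type*>::value,
              "pointer is const value_type*");
static_assert(std::is_same<MatchIter::reference, const MatchIter::value_type&>::value,
              "reference is const value_type&");

// regex_token_iterator enumerates sub_match objects.
static_assert(std::is_same<TokenIter::value_type, std::sub_match<Iter>>::value,
              "regex_token_iterator enumerates sub_match");
static_assert(std::is_same<TokenIter::pointer, const TokenIter::value_type*>::value,
              "pointer is const value_type*");
static_assert(std::is_same<TokenIter::reference, const TokenIter::value_type&>::value,
              "reference is const value_type&");

// The underlying iterator refers to characters, which is a different type.
static_assert(!std::is_same<MatchIter::reference,
                            std::iterator_traits<Iter>::reference>::value,
              "not the underlying iterator's reference type");
```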

<h2><span class=grame>regex_iterator</span> does not handle zero-length matches
correctly.</h2>

<p>There is a subtle bug in <span class=spelle>regex_iterator:</span><span
class=grame>:operator</span>++. When the previous match found matched a
zero-length string, the iterator needs to take special action to avoid
going into an infinite loop. The current wording does this but gets it wrong,
because it does not allow two consecutive zero-length matches: for example,
iterating occurrences of &#8220;^&#8221; in the text &#8220;\n\n&#8221; yields
just one match rather than three as it should.&nbsp; The actual behavior should
be as follows:</p>

<p>When the previous match was of zero length, first check whether there is a
non-zero-length match starting at the same position. If there is none, move one
position to the right of the last match (if such a position exists) and
continue searching as normal for a (possibly zero-length) match.</p>

<p>The reason for checking for a non-zero length match at the same position as
the last match is shown by the following example:&nbsp; enumerating occurrences
of &#8220;^|<span class=spelle>abc</span>|$&#8221; in the text &#8220;<span
class=spelle>abc</span>&#8221; should find three matches:</p>

<p><span class=grame>A zero length string at the start of the text.</span></p>

<p><span class=grame>The string &#8220;</span><span class=spelle>abc</span><span
class=grame>&#8221;.</span></p>

<p><span class=grame>A zero length string at the end of the text.</span></p>

<p>This behavior is then consistent with Perl. <span class=spelle>ECMAScript</span>, however,
appears to fall into the infinite loop trap in its description of <span
class=grame>RegExp.prototype.exec(</span>string) (<span class=spelle>ECMAScript</span>
15.10.6.2), while <span class=spelle>String.prototype.match</span>(<span
class=spelle>regexp</span>) (<span class=spelle>ECMAScript</span> 15.5.4.10)
does get this right, albeit in a different manner: enumerating &#8220;^|<span
class=spelle>abc</span>|$&#8221; in the text &#8220;<span class=spelle>abc</span>&#8221;
would find only two matches (the zero-length strings at the start and end of
the text).&nbsp; </p>
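
<p>The rule described above can be sketched as a hand-written loop over
<code>regex_search</code>. This is only an illustration, not proposed wording:
it assumes the <code>&lt;regex&gt;</code> header of C++11 and later, and the
function name <code>enumerate_matches</code> is hypothetical. It uses
<code>match_not_null | match_continuous</code> as one possible way to ask for a
non-zero-length match starting at the same position.</p>

```cpp
#include <cstddef>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: returns (offset, length) for every match of `re` in
// `text`, using the zero-length-match rule described in the text.
std::vector<std::pair<std::ptrdiff_t, std::ptrdiff_t>>
enumerate_matches(const std::string& text, const std::regex& re)
{
    namespace rc = std::regex_constants;
    std::vector<std::pair<std::ptrdiff_t, std::ptrdiff_t>> found;
    const auto begin = text.cbegin();
    const auto end = text.cend();
    std::smatch what;
    rc::match_flag_type flags = rc::match_default;

    bool ok = std::regex_search(begin, end, what, re, flags);
    while (ok) {
        found.emplace_back(what[0].first - begin, what.length(0));

        auto start = what[0].second;
        const bool was_empty = (what[0].first == what[0].second);
        if (start != begin)
            flags |= rc::match_prev_avail;  // a character precedes `start`

        ok = false;
        if (was_empty) {
            if (start == end)
                break;  // no position to the right exists
            // First look for a non-zero-length match at the same position.
            ok = std::regex_search(start, end, what, re,
                                   flags | rc::match_not_null | rc::match_continuous);
            if (!ok)
                ++start;  // otherwise move one position to the right
        }
        if (!ok) {
            // `start` is now past `begin`, so the previous character is valid.
            flags |= rc::match_prev_avail;
            ok = std::regex_search(start, end, what, re, flags);
        }
    }
    return found;
}
```

<p>Calling <code>enumerate_matches("abc", std::regex("^|abc|$"))</code> is
intended to report the three matches listed above: an empty match at offset 0,
the three-character match at offset 0, and an empty match at offset 3.</p>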

<p>This issue also affects <span class=spelle>regex_token_iterator</span>, which
uses similar <span class=spelle>standardese</span> to <span class=spelle>regex_iterator</span>;
it might be better rewritten in terms of <span class=spelle>regex_iterator</span>
to avoid the duplication.</p>

<p><i>Proposed changes: </i>are incorporated into the following issue.</p>

<h2><span class=spelle>regex_iterator</span> does not set <span class=spelle>match_results::position</span>
correctly.</h2>

<p>As currently specified, given:</p>

<p><span class=spelle>regex_iterator</span>&lt;something&gt; <span
class=spelle>i</span>;</p>

<p><span class=grame>then</span> <code><span style='font-size:10.0pt'>i-&gt;position()
== </span></code><span class=spelle><span style='font-size:10.0pt;font-family:
"Courier New"'>i</span></span><code><span style='font-size:10.0pt'>-&gt;prefix().length()</span></code>
for all matches found.</p>

<p>This is correct for the first match found, but makes little sense for
subsequent matches, where the result of <span class=spelle>i</span>-&gt;position()
is only useful if it is the distance from the start of the string being
searched to the start of the match found. </p>

<p>[ footnote: recall that <span class=spelle>i</span>-&gt;prefix() contains
everything from the end of the last match found to the start of the current
match. This allows search and replace operations to be constructed by copying <span
class=spelle>i</span>-&gt;prefix() unchanged to output, and then outputting a
modified version of whatever matched. ]</p>
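
<p>A minimal sketch of the search and replace idiom mentioned in the footnote,
assuming the <code>&lt;regex&gt;</code> header of C++11 and later. The function
name <code>uppercase_matches</code> and the choice of upper-casing as the
&quot;modified version&quot; are illustrative assumptions.</p>

```cpp
#include <cctype>
#include <regex>
#include <string>

// Hypothetical helper: copies `text` to the result, upper-casing every match
// of `re` and leaving everything between matches unchanged.
std::string uppercase_matches(const std::string& text, const std::regex& re)
{
    std::string out;
    auto tail = text.cbegin();  // end of the last match seen so far
    for (std::sregex_iterator i(text.cbegin(), text.cend(), re), last; i != last; ++i) {
        const std::smatch& m = *i;
        // prefix() spans from the end of the previous match to this match.
        out.append(m.prefix().first, m.prefix().second);
        for (char c : m.str())
            out.push_back(static_cast<char>(std::toupper(static_cast<unsigned char>(c))));
        tail = m[0].second;
    }
    // Text after the final match is not covered by any prefix().
    out.append(tail, text.cend());
    return out;
}
```

<p>The trailing <code>append</code> is needed because the text following the
final match belongs to that match's suffix rather than to any prefix.</p>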

<p>For example this problem showed up when converting a <span class=spelle>boost.regex</span>
example program from the <span class=spelle>regex_grep</span> algorithm (not
part of the proposal) to use <span class=spelle>regex_iterator</span>: the
example takes the contents of a C++ source file as a string, and creates an
index that maps C++ class names to file positions in the form of a <span
class=spelle>std::map</span>&lt;<span class=spelle>std::string</span>, <span
class=spelle>int</span>&gt;.&nbsp; In order for the program to take a <span
class=spelle>regex_iterator</span> and from that add an item to the index, it
needs to know how far it is from the start of the text being searched to the
start of the current match: that was what <span class=spelle>match_results::</span><span
class=grame>position(</span>) was intended for, but as the proposal stands it
instead returns the distance from the end of the last match to the start of the
current match.&nbsp; </p>
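
<p>The indexing program described above might look roughly like the following
sketch. It assumes the <code>&lt;regex&gt;</code> header of C++11 and later, in
which <code>position()</code> is measured from the start of the searched text;
the function name <code>index_classes</code> and the deliberately simplified
pattern are assumptions made for this example, not the actual Boost example
program.</p>

```cpp
#include <map>
#include <regex>
#include <string>

// Hypothetical helper: maps each class name found in `source` to the offset,
// from the start of `source`, at which its declaration begins.
std::map<std::string, int> index_classes(const std::string& source)
{
    // Simplified pattern (assumption): "class" or "struct", whitespace, then
    // an identifier. Real C++ declarations need a far more careful expression.
    static const std::regex class_re("\\b(?:class|struct)\\s+(\\w+)");

    std::map<std::string, int> index;
    for (std::sregex_iterator i(source.cbegin(), source.cend(), class_re), last;
         i != last; ++i) {
        // position() must be relative to the start of `source`, not to the end
        // of the previous match, for the stored offsets to be usable.
        index[(*i)[1].str()] = static_cast<int>(i->position());
    }
    return index;
}
```

<p>If <code>position()</code> instead returned the length of
<code>prefix()</code>, every entry after the first would hold an offset
relative to the preceding match, which is of no use as a file position.</p>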

<p>The proposed text for <span class=spelle>regex_iterator:</span><span
class=grame>:operator</span>++ needs to be modified to handle this; likewise
the text for <span class=spelle>match_results::position</span> needs
updating accordingly (see also the issue &#8220;What does <span class=spelle>match_results::position</span>
return when passed an out of range index?&#8221;).</p>

<p><i>Proposed changes: </i></p>

<p>Replace:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>difference_type</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>
position(unsigned <span class=spelle>int</span> sub = 0)const;</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><b><span style='color:black'>Effects:</span></b><span
style='color:black'> Returns </span><span class=spelle><span style='font-size:
10.0pt;font-family:"Courier New";color:black'>std::</span></span><span
class=grame><span style='font-size:10.0pt;font-family:"Courier New";color:black'>distance(</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>prefix().first,
(*this)[sub].first).</span></p>

<p>With:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>difference_type</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>
position(unsigned <span class=spelle>int</span> sub = 0)const;</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><b><span style='color:black'>Effects:</span></b><span
style='color:black'> <span class=grame>If </span></span><code><span
style='font-size:10.0pt'>!(*this)[sub].matched</span></code><span
style='color:black'> then returns -1.&nbsp; Otherwise returns </span><code><span
style='font-size:10.0pt'>distance(base, (*this)[sub].first)</span></code><span
class=normalwebchar>, where <i>base</i> is the start iterator of the sequence
that was searched.&nbsp; [Note &#8211; unless this is part of a repeated search
with a </span><code><span style='font-size:10.0pt'>regex_iterator</span></code><span
class=normalwebchar> then <i>base</i> is the same as </span><code><span
style='font-size:10.0pt'>prefix().first </span></code><span
class=normalwebchar>&#8211; end note]</span></p>

<p>Replace:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>regex_iterator</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>&amp; operator+<span
class=grame>+(</span>);</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><b><span style='color:black'>Effects:
</span></b><span style='color:black'>if </span><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>what.prefix(</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>).first !=
what[0].second</span><span style='color:black'> and if the element </span><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>match_prev_avail</span></span><span style='color:black'> is not
set in </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>flags</span><span style='color:black'> then sets it. Then calls </span><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>regex_</span></span><span class=grame><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>search(</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>what[0].second,
end, what, *pre, ((what[0].first == what[0].second) ? flags | <span
class=spelle>match_not_null</span> : flags))</span><span style='color:black'>,
and if this returns false then sets </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>*this</span><span style='color:black'>
equal to the end of sequence iterator.</span></p>

<p>With:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>regex_iterator</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>&amp; operator+<span
class=grame>+(</span>);</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><b><span style='color:black'>Effects:
</span></b><span style='color:black'>if </span><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>what.prefix(</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>).first !=
what[0].second</span><span style='color:black'> and if the element </span><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>match_prev_avail</span></span><span style='color:black'> is not
set in </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>flags</span><span style='color:black'> then sets it. Then behaves
as if by calling </span><span class=spelle><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>regex_search</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>(what[0].second,
end, what, *pre, flags)</span>, with the following variation: in the event that
the previous match found was of zero length (<code><span style='font-size:10.0pt'>what[0].length()
== 0</span></code>) then attempts to find a non-zero length match starting at <code><span
style='font-size:10.0pt'>what[0].second</span></code>, only if that fails and
provided <code><span style='font-size:10.0pt'>what[0].second != suffix().second</span></code>
does it look for a (possibly zero length) match starting from <code><span
style='font-size:10.0pt'>what[0].second + 1</span></code>.<span
style='color:black'>&nbsp; <span class=grame>If no further match is found then
sets </span></span><span class=grame><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>*this</span><span style='color:black'> equal to the
end of sequence iterator.</span></span></p>

<p><b><span style='color:black'>Postconditions:</span></b><span
style='color:black'> provided *this is not set to the end-of-sequence iterator
then </span>the effects on <i>this</i> are given in table RE20:</p>

<h6 align=center style='text-align:center'><i>Table RE20--</i> <span
class=spelle>regex_iterator</span>&amp; operator+<span class=grame>+(</span>) <i>effects</i></h6>

<div align=center>

<table class=MsoNormalTable border=1 cellspacing=1 cellpadding=0 width=624
 style='width:468.0pt'>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p><b>Element</b></p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p><b>Value</b> </p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(*this)-&gt;size()</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>pre-&gt;<span class=spelle>mark_count</span>()</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(*this)-&gt;empty()</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>false</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(*this)-&gt;prefix().first</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>An iterator denoting the end point of the previous match found</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(*this)-&gt;prefix().second</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(**this)[0].first</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(*this)-&gt;prefix().matched</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(*this)-&gt;prefix().first != (*this)-&gt;prefix().second</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(*this)-&gt;suffix().first</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(**this)[0].second</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(*this)-&gt;suffix().second</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>end</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(*this)-&gt;suffix().matched</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(*this)-&gt;suffix().first != (*this)-&gt;suffix().second</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(**this)[0].first</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>The starting iterator for this match.</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(**this)[0].second</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>The ending iterator for this match.</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(**this)[0].matched</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p><span class=grame><span style='font-size:10.0pt;font-family:"Courier New"'>true</span></span>
  if a full match was found, and <code><span style='font-size:10.0pt'>false</span></code>
  if it was a partial match (found as a result of the <code><span
  style='font-size:10.0pt'>match_partial</span></code> flag being set).</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(**this)[n].first</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>For all integers n &lt; (*this<span class=grame>)-</span>&gt;size(), the
  start of the sequence that matched sub-expression <i>n</i>. Alternatively, if
  sub-expression n did not participate in the match, then <i>end</i>.</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(**this)[n].second</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>For all integers n &lt; (*this<span class=grame>)-</span>&gt;size(), the
  end of the sequence that matched sub-expression <i>n</i>. Alternatively, if sub-expression
  n did not participate in the match, then <i>end</i>.</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(**this)[n].matched</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>For all integers n &lt; (*this<span class=grame>)-</span>&gt;size(), true
  if sub-expression <i>n</i> participated in the match, false otherwise.</p>
  </td>
 </tr>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>(*this)-&gt;position()</p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p>The distance from the start of the original sequence being iterated, to
  the start of this match.</p>
  </td>
 </tr>
</table>

</div>
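
<p>Several of the postconditions in table RE20 can be spot-checked with a
sketch such as the following. It assumes the <code>&lt;regex&gt;</code> header
of C++11 and later as a stand-in for the proposal, and the function name
<code>check_increment_postconditions</code> is hypothetical.</p>

```cpp
#include <iterator>
#include <regex>
#include <string>

// Hypothetical helper: walks every match of `re` in `text` and returns true
// if the prefix, suffix and position values agree with table RE20.
bool check_increment_postconditions(const std::string& text, const std::regex& re)
{
    const auto begin = text.cbegin();
    const auto end = text.cend();
    auto prev_end = begin;  // end point of the previous match found
    for (std::sregex_iterator i(begin, end, re), last; i != last; ++i) {
        const std::smatch& m = *i;
        if (m.empty()) return false;
        if (m.prefix().first != prev_end) return false;
        if (m.prefix().second != m[0].first) return false;
        if (m.suffix().first != m[0].second) return false;
        if (m.suffix().second != end) return false;
        // position() is the distance from the start of the whole sequence.
        if (m.position() != std::distance(begin, m[0].first)) return false;
        prev_end = m[0].second;
    }
    return true;
}
```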

<p>Similar rewording is needed for <span class=spelle>regex_token_iterator</span>.
Replace:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>regex_token_iterator</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>&amp; operator+<span
class=grame>+(</span>);</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><b><span style='color:black'>Effects:
</span></b><span style='color:black'>if </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>N == -1</span><span style='color:black'>
then sets </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>*this</span><span style='color:black'> equal to the end of
sequence iterator. Otherwise if </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>N+1 &lt; <span class=spelle>subs.size</span>()</span><span
style='color:black'>, then increments </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>N</span><span style='color:black'> and
sets result equal to </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>((subs[N] == -1) ? <span class=spelle>value_type</span>(<span
class=spelle>what.prefix</span>().<span class=spelle>str</span>()) : <span
class=spelle>value_type</span>(what[subs[N]].<span class=spelle>str</span>()))</span><span
style='color:black'>. </span></p>

<p class=MsoNormal style='margin-left:36.0pt'><span style='color:black'>Otherwise
if </span><span class=grame><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>what.prefix(</span></span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>).first != what[0].second</span><span
style='color:black'> and if the element </span><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>match_prev_avail</span></span><span
style='color:black'> is not set in </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>flags</span><span style='color:black'>
then sets it. Then if </span><span class=spelle><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>regex_search</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>(what[0].second,
end, what, *pre, ((what[0].first == what[0].second) ? flags | <span
class=spelle>match_not_null</span> : flags)) == true</span><span
style='color:black'> sets </span><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>N</span><span style='color:black'> equal to zero,
and sets </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>result</span><span style='color:black'> equal to </span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>((subs[N] == -1)
? <span class=spelle>value_type</span>(<span class=spelle>what.prefix</span>().<span
class=spelle>str</span>()) : <span class=spelle>value_type</span>(what[subs[N]].<span
class=spelle>str</span>()))</span><span style='color:black'>. </span></p>

<p class=MsoNormal style='margin-left:36.0pt'><span style='color:black'>Otherwise
if the call to </span><span class=spelle><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>regex_search</span></span><span
style='color:black'> returns false, then let </span><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>last_end</span></span><span
style='color:black'> be the value of </span><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>what[</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>0].second</span><span
style='color:black'> prior to the call to </span><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>regex_search</span></span><span
style='color:black'>. Then if </span><span class=spelle><span style='font-size:
10.0pt;font-family:"Courier New";color:black'>last_end</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'> != end</span><span
style='color:black'> and </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>subs[0] == -1</span><span style='color:black'> sets </span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>N</span><span
style='color:black'> equal to -1 and sets </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>result</span><span style='color:black'>
equal to </span><span class=spelle><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>value_type</span></span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>(<span class=spelle>last_end</span>,
end)</span><span style='color:black'>. <span class=grame>Otherwise sets </span></span><span
class=grame><span style='font-size:10.0pt;font-family:"Courier New";color:black'>*this</span><span
style='color:black'> equal to the end of sequence iterator.</span></span></p>

<p>With:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>regex_token_iterator</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>&amp; operator+<span
class=grame>+(</span>);</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><b><span style='color:black'>Effects:
</span></b><span style='color:black'>if </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>N == -1</span><span style='color:black'>
then sets </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>*this</span><span style='color:black'> equal to the end of
sequence iterator. Otherwise if </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>N+1 &lt; <span class=spelle>subs.size</span>()</span><span
style='color:black'>, then increments </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>N</span><span style='color:black'> and
sets result equal to </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>((subs[N] == -1) ? <span class=spelle>value_type</span>(<span
class=spelle>what.prefix</span>().<span class=spelle>str</span>()) : <span
class=spelle>value_type</span>(what[subs[N]].<span class=spelle>str</span>()))</span><span
style='color:black'>. </span></p>

<p class=MsoNormal style='margin-left:36.0pt'><span style='color:black'>Otherwise
if </span><span class=grame><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>what.prefix(</span></span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>).first != what[0].second</span><span
style='color:black'> and if the element </span><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>match_prev_avail</span></span><span
style='color:black'> is not set in </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>flags</span><span style='color:black'>
then sets it. &nbsp;Then locates the next match as if by calling </span><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>regex_search</span></span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>(what[0].second, end, what, *pre, flags)</span>,
with the following variation: in the event that the previous match found was of
zero length (<code><span style='font-size:10.0pt'>what[0].length() == 0</span></code>)
then attempts to find a non-zero length match starting at <code><span
style='font-size:10.0pt'>what[0].second</span></code>, only if that fails and
provided <code><span style='font-size:10.0pt'>what[0].second != suffix().second</span></code>
does it look for a (possibly zero length) match starting from <code><span
style='font-size:10.0pt'>what[0].second + 1</span></code>.&nbsp; If such a
match is found then <span style='color:black'>sets </span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>N</span><span
style='color:black'> equal to zero, and sets </span><span style='font-size:
10.0pt;font-family:"Courier New";color:black'>result</span><span
style='color:black'> equal to </span><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>((subs[N] == -1) ? <span class=spelle>value_type</span>(<span
class=spelle>what.prefix</span>().<span class=spelle>str</span>()) : <span
class=spelle>value_type</span>(what[subs[N]].<span class=spelle>str</span>()))</span><span
style='color:black'>.</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><span style='color:black'>Otherwise
if no further matches were found, then let </span><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>last_end</span></span><span
style='color:black'> be the endpoint of the last match that was found. Then if </span><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>last_end</span></span><span style='font-size:10.0pt;font-family:
"Courier New";color:black'> != end</span><span style='color:black'> and </span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>subs[0] == -1</span><span
style='color:black'> sets </span><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>N</span><span style='color:black'> equal to -1 and
sets </span><span style='font-size:10.0pt;font-family:"Courier New";color:black'>result</span><span
style='color:black'> equal to </span><span class=spelle><span style='font-size:
10.0pt;font-family:"Courier New";color:black'>value_type</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>(<span
class=spelle>last_end</span>, end)</span><span style='color:black'>. <span
class=grame>Otherwise sets </span></span><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>*this</span><span
style='color:black'> equal to the end of sequence iterator.</span></span></p>
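<p>The increment rules above can be seen in a small field-splitting sketch. It assumes the C++11 <code>std::regex_token_iterator</code> from &lt;regex&gt; follows this wording; the helper name <code>split</code> is hypothetical and not part of the proposal. Submatch index -1 selects the text between matches, and the last token comes from the unmatched suffix.</p>

```cpp
#include <regex>
#include <string>
#include <vector>

// Hypothetical helper: splits `text` on every match of `separator`.
// Submatch index -1 selects the text between matches; the final token is
// the unmatched suffix, produced only when that suffix is not empty.
std::vector<std::string> split(const std::string& text, const std::string& separator)
{
    const std::regex re(separator);
    std::vector<std::string> fields;
    std::sregex_token_iterator it(text.begin(), text.end(), re, -1);
    const std::sregex_token_iterator end;  // default-constructed: end of sequence
    for (; it != end; ++it)
        fields.push_back(it->str());
    return fields;
}
```

<p>When the input ends with a separator, the last match ends at <code>end</code>, so no empty trailing token is produced.</p>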

<h2>Naming of <span class=spelle>basic_regex::getflags</span></h2>

<p><i>Pete Becker writes:</i> <span class=spelle>basic_regex</span> has member
functions named <span class=spelle>getflags</span> and <span class=spelle>get_allocator</span>.
The latter is consistent with the use of the same name in STL containers. In
general, it seems to me, the library tries to use an underscore to separate a
verb from its object for names of this nature. That convention would mean that
we should call the other one <span class=spelle>get_flags</span>. On the other
hand, we do have <span class=spelle>getline</span>, but that's arguably
different because it's not a state query. Do we have a general policy here? If
so, what is it, and what should the name of <span class=spelle>getflags</span>
be?</p>

<p><i>Nathan <span class=spelle>Myers</span>:</i> what does &quot;<span
class=spelle>get_</span><span class=grame>flags(</span>)&quot; convey that
&quot;flags()&quot; does not?&nbsp; Generally, names should be tested in
context to see if they make the code more readable. Thus, a verb-noun pair name
may make code like &quot;if (<span class=spelle>thing.is_</span><span
class=grame>open(</span>))&quot; read more like English.&nbsp; The verb in <span
class=spelle>get_</span><span class=grame>flags(</span>) doesn't serve, there:
&quot;if (<span class=spelle>re.get_flags</span>() &amp; <span class=spelle>re.greedy</span>)&quot;
is less clear than &quot;if (<span class=spelle>re.flags</span>() &amp; <span
class=spelle>re.greedy</span>)&quot; (and even less clear than &quot;if (<span
class=spelle>re.is_greedy</span>())&quot;).&nbsp; I would prefer to reserve
imperative verbs to names of functions that change the state of the object.</p>

<p><i>Suggested change:</i> Replace all occurrences of &#8220;<span
class=spelle>getflags</span>&#8221; in the document with &#8220;flags&#8221;
(this is consistent with <span class=spelle>ios_base</span> as well).</p>
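<p>As an illustration of how the shorter name reads in context, the sketch below assumes the C++11 <code>std::basic_regex::flags()</code> member and the <code>icase</code> syntax option; the helper name <code>is_case_insensitive</code> is hypothetical.</p>

```cpp
#include <regex>

// Hypothetical helper: true when `re` was compiled with the icase option.
// The query reads as a noun ("flags"), with no imperative verb.
bool is_case_insensitive(const std::regex& re)
{
    return (re.flags() & std::regex::icase) == std::regex::icase;
}
```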

<h2>Missing namespace prefix in <span class=spelle>regex_iterator</span>
description</h2>

<p><i>Pete Becker writes:</i> The definition of <span class=spelle>regex_iterator</span>
in RE.8.1 mentions</p>

<pre><span class=spelle>regex_</span><span class=grame>iterator(</span><span
class=spelle>BidirectionalIterator</span> a, <span class=spelle>BidirectionalIterator</span> b, const <span
class=spelle>regex_type</span>&amp; re, <span class=spelle>match_flag_type</span> m = <span
class=spelle>match_default</span>);</pre>

<p>And</p>

<pre><span class=grame>match_flag_type</span> flags; // for exposition only</pre>

<p><code><span style='font-size:10.0pt'>match_flag_type</span></code> and <code><span
style='font-size:10.0pt'>match_default</span></code> are defined in the nested
namespace <code><span style='font-size:10.0pt'>regex_constants</span></code>,
so these two names need to be qualified with <code><span style='font-size:10.0pt'>regex_constants::</span></code>.&nbsp;
<span class=grame>Same thing in the first RE.8.1.1.</span></p>

<p><i>John Maddock <span class=grame>adds</span>:</i> These are used almost
without exception without the needed namespace qualifier throughout the text,
likewise the other symbols listed below.</p>

<p><i>Suggested changes:</i> Go through the text and replace all occurrences
of:</p>

<p><code><span style='font-size:10.0pt'>match_flag_type</span></code> with <code><span
style='font-size:10.0pt'>regex_constants::match_flag_type</span></code>, </p>

<p><code><span style='font-size:10.0pt'>match_default</span></code> with<code><span
style='font-size:10.0pt'> </span></code><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>regex_constants::match_default</span></span>,</p>

<p><code><span style='font-size:10.0pt'>match_partial</span></code> with <code><span
style='font-size:10.0pt'>regex_constants::match_partial</span></code>,</p>

<p><code><span style='font-size:10.0pt'>match_prev_avail </span></code>with<code><span
style='font-size:10.0pt'> </span></code><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>regex_constants::match_prev_avail</span></span>,</p>

<p><code><span style='font-size:10.0pt'>match_not_null </span></code>with<code><span
style='font-size:10.0pt'> </span></code><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>regex_constants::match_not_null</span></span><code><span
style='font-size:10.0pt'>,</span></code></p>

<p><code><span style='font-size:10.0pt'>format_default</span></code> with <code><span
style='font-size:10.0pt'>regex_constants::format_default</span></code>,</p>

<p><code><span style='font-size:10.0pt'>format_no_copy </span></code>with<code><span
style='font-size:10.0pt'> </span></code><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>regex_constants::format_no_copy</span></span>,</p>

<p><code><span style='font-size:10.0pt'>format_first_only </span></code>with<code><span
style='font-size:10.0pt'> </span></code><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>regex_constants::format_first_only</span></span>,</p>

<p>&nbsp;<span class=grame>except</span> in the section which defines these
(RE.3.1).</p>
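<p>For illustration, user code that names these symbols needs the same qualification. The sketch assumes the C++11 &lt;regex&gt; header, in which the constants live in <code>std::regex_constants</code>; the helper name <code>has_nonempty_match</code> is hypothetical.</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true when `re` matches a non-empty part of `text`.
// Both the flag type and the flag values are qualified with regex_constants.
bool has_nonempty_match(const std::string& text, const std::regex& re)
{
    const std::regex_constants::match_flag_type flags =
        std::regex_constants::match_default | std::regex_constants::match_not_null;
    return std::regex_search(text, re, flags);
}
```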

<h2>Unnecessary sub-section headers in <span class=spelle>regex_iterator</span></h2>

<p><i>Pete Becker writes:</i> The first clause labeled RE.8.1.1 has the title
&quot;<span class=spelle>regex_iterator</span> constructors&quot;. It contains
descriptions of the constructors, plus several operators. The second clause
labeled RE.8.1.1 has the title &quot;<span class=spelle>regex_iterator</span>
dereference&quot;. It contains operator*, operator-&gt;, and the two versions
of operator++. <span class=grame>Seems like both of these labels should be
removed.</span></p>

<p><i>John Maddock replies:</i> Agreed, they don&#8217;t seem to add anything,
it&#8217;s easier to just remove those sub-section headers (they are misnamed
and <span class=spelle>misnumbered</span> in places as well):</p>

<p>Rename the section &#8220;RE.8.1.1 <span class=spelle>regex_iterator</span>
constructors&#8221; as &#8220;<span class=spelle>regex_iterator</span>
members&#8221;.</p>

<p>Remove the section &#8220;RE.8.1.1 <span class=spelle>regex_iterator</span>
dereference&#8221;.</p>

<p>Rename the section &#8220;RE.8.2.1 <span class=spelle>regex_iterator</span>
constructors&#8221; as &#8220;<span class=spelle>regex_token_iterator</span>
members&#8221;.</p>

<p>Remove the section: &#8220;RE.8.2.1 <span class=spelle>regex_token_iterator</span>
dereference&#8221;.</p>

<h2>Names of symbolic constants</h2>

<p><i>Pete Becker writes:</i> <span class=spelle>ECMAScript</span> has five
control escapes: t, n, v, f, <span class=grame>r</span>. The regex proposal has
named constants for four of them: <span class=spelle>escape_type_control_f</span>,
_n, _r, and _t. <span class=spelle>escape_type_control_v</span> seems to be
missing. (Okay, that's not about names, but the next two are).</p>

<p>This is minor, but in C and C++ those five things are escape sequences, and
using names that include 'control' is a bit confusing. Granted, it fits with
the terminology in <span class=spelle>ECMAScript</span>, but I'd lean toward
more C-like names, along the lines of <span class=spelle>escape_type_f</span>.</p>

<p>And finally, there's <span class=spelle>escape_type_ascii_control</span>.
(For those not familiar with the details of the proposal, this refers to things
that we might write in ordinary text as &lt;ctrl&gt;-X, for example.) We've
pretty much avoided the term &quot;<span class=spelle>ascii</span>&quot; in the
standard (it's only used twice, in footnotes, apologetically), and I'm a bit
uncomfortable with its use here. I'd prefer <span class=spelle>escape_type_control_letter</span>,
which picks up the name of the production in the <span class=spelle>ECMAScript</span>
grammar for the letter that follows the escape. I think it's pretty clear what
it means, and it avoids &quot;<span class=spelle>ascii</span>&quot;.</p>

<p><i>Proposed changes:</i> replace all occurrences of:</p>

<p><span class=grame>escape_type_control_f</span> with <span class=spelle>escape_type_f</span></p>

<p><span class=grame>escape_type_control_n</span> with <span class=spelle>escape_type_n</span></p>

<p><span class=grame>escape_type_control_r</span> with <span class=spelle>escape_type_r</span></p>

<p><span class=grame>escape_type_control_t</span> with <span class=spelle>escape_type_t</span></p>

<p><span class=grame>escape_type_ascii_control</span> with <span class=spelle>escape_type_control</span></p>

<p>Then immediately after the line:</p>

<p><span class=grame><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>static</span></span><span style='font-size:10.0pt;font-family:
"Courier New";color:black'> const <span class=spelle>escape_syntax_type</span> <span
class=spelle>escape_type_t</span>;</span></p>

<p><span class=grame>add</span> the line:</p>

<p><span class=grame><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>static</span></span><span style='font-size:10.0pt;font-family:
"Courier New";color:black'> const <span class=spelle>escape_syntax_type</span> <span
class=spelle>escape_type_v</span>;</span></p>

<p>Then immediately after the table entry:</p>

<table class=MsoNormalTable border=1 cellspacing=1 cellpadding=0 width=624
 style='width:468.0pt'>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p class=MsoNormal><span class=spelle><span style='color:black'>escape_type_t</span></span></p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p class=MsoNormal><span style='color:black'>t</span></p>
  </td>
 </tr>
</table>

<p>Add the new table entry:</p>

<table class=MsoNormalTable border=1 cellspacing=1 cellpadding=0 width=624
 style='width:468.0pt'>
 <tr>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p class=MsoNormal><span class=spelle><span style='color:black'>escape_type_v</span></span></p>
  </td>
  <td width="50%" valign=top style='width:50.0%;padding:5.25pt 5.25pt 5.25pt 5.25pt'>
  <p class=MsoNormal><span style='color:black'>v</span></p>
  </td>
 </tr>
</table>

<h2>Traits class versioning incompletely edited in.</h2>

<p><i>Pete Becker writes: </i>The paper talks about versioning of <span
class=spelle>regex_traits</span> classes, and RE.1.1 (in table RE2) says that a
traits class shall have a member <span class=spelle>X::version_tag</span> whose
type is regex_traits_version_1_tag or a class that publicly inherits from that.
Neither the &lt;regex&gt; synopsis (RE.2) nor the description of <span
class=spelle>regex_traits</span> (RE.3.3) mentions either of these types. I
can't tell whether this was partially edited in or partially edited out.
&lt;g&gt; <span class=grame>So</span>, is <span class=spelle>regex_traits</span>
versioning part of the proposal?</p>

<p><i>John Maddock replies:</i> It&#8217;s an editing mix-up on my part, the
versioning mechanism is part of the rationale, but is only incompletely edited
into the main text.&nbsp; <i>Footnote</i>: for those unfamiliar with the
proposal, it is almost impossible to provide compatible extensions to the regex
library proposal without extending or altering the traits class design.&nbsp;
Specific optimizations may also require modification of the design (one of the
problems here is that this is a competitive field with new algorithms or
heuristics for regular expression searches being published at frequent
intervals, it is clearly not possible to build in support for as yet
un-thought-of advances in the field).&nbsp; By providing a mechanism that
supports a hierarchy of traits class versions the <span class=spelle>basic_regex</span>
class template can adapt to the supplied traits class, taking advantage of
whatever features it offers, and ignoring those it does not.</p>

<p><i>Pete Becker replies:</i> I don't have strong views one way or the other
on versioning the <span class=spelle>regex_traits</span> class per se. It's an
example, though, of a library-wide issue that we need to give some thought to
(the earlier effort to &quot;solve the versioning problem&quot; having
foundered for lack of a well defined goal). Although there are currently no
concrete proposals, there have also been suggestions of changes to allocators,
which would have similar problems. I think those two between them can give us a
tight enough focus to understand what's at stake and come up with a workable
mechanism to support backward compatibility and extensions mandated by the
standard as well as vendor- or user-supplied extensions. (I don't mean to
suggest that I see problems with regex_traits_version_1_tag, just that I don't
want to prejudge what the solution should look like).</p>

<p><i>Proposed resolution:</i> one of the problems with <span class=spelle>regex_traits</span>
is that, as one can predict right from the outset, vendors will want to add
their own refinements to the class to support new advances in the field.&nbsp;
The versioning mechanism in <span class=spelle>regex_traits</span> is therefore
needed but is not intended as a general library-wide solution (note that the
mechanism versions a specific interface within the <span class=grame>library,</span>
and not the whole standard library in general).&nbsp; However the general
technique used may be applicable to Allocators which suffer from a similar
problem.&nbsp; This warrants further discussion (and perhaps a proposal in its
own right), but to fix the regular expression proposal as it stands:</p>

<p>In the &#8220;Header &lt;regex&gt; synopsis&#8221; section, right after:</p>

<p class=MsoNormal><span class=grame><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>class</span></span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'> <span class=spelle>bad_expression</span>;</span></p>

<p>Add:</p>

<pre><span class=grame>struct</span> regex_traits_version_1_tag;</pre>

<p>Immediately before the section: &#8220;RE.3.3 Template class <span
class=spelle>regex_traits</span>&#8221; add:</p>

<h3 style='margin-left:36.0pt'>RE.3.3 <span class=spelle>struct</span>
regex_traits_version_1_tag</h3>

<pre style='margin-left:36.0pt'><span class=grame><span style='color:black'>struct</span></span><span
style='color:black'> reg</span>ex_traits_version_1_tag<span style='color:black'>{};</span></pre>

<p class=MsoNormal style='margin-left:36.0pt'><span style='color:black'>The <span
class=spelle>struct</span> </span><code><span style='font-size:10.0pt'>regex_traits_version_1_tag</span></code><span
style='color:black'> defines a type that is used to identify the interface to
which a regular expression traits class conforms.&nbsp; Traits classes that
offer an implementation defined superset of the functionality offered
by </span><code><span style='font-size:10.0pt'>regex_traits</span></code><span
style='color:black'> should define a class that </span><span
class=normalwebchar>is derived from</span><code><span style='font-size:10.0pt'>
regex_traits_version_1_tag </span></code>and use that type as their <code><span
style='font-size:10.0pt'>version_tag</span></code> member<span
style='color:black'>. </span></p>

<p>The numbering of the remaining sections will then need adjusting.</p>

<p>Finally add the following <span class=spelle>typedef</span> member to class
template <span class=spelle>regex_traits</span>:</p>

<pre><span class=grame>typedef</span> regex_traits_version_1_tag <span
class=spelle>version_tag</span>;</pre>
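<p>One way a library could adapt to the supplied traits class is to dispatch on the tag type. The following is only a sketch of that technique: everything is declared in a private <code>sketch</code> namespace, and <code>vendor_traits_version_2_tag</code>, <code>plain_traits</code>, <code>vendor_traits</code> and <code>interface_version_of</code> are hypothetical names that do not appear in the proposal.</p>

```cpp
namespace sketch {

// Stand-in for the tag type added by this resolution.
struct regex_traits_version_1_tag {};

// Hypothetical vendor extension: a superset of the version 1 interface.
struct vendor_traits_version_2_tag : regex_traits_version_1_tag {};

// Two traits classes that advertise their interface through version_tag.
struct plain_traits  { typedef regex_traits_version_1_tag  version_tag; };
struct vendor_traits { typedef vendor_traits_version_2_tag version_tag; };

// Overloads selected by the tag type. A tag derived from the version 2 tag
// would also select the second overload, the nearest base being preferred.
inline int interface_version(regex_traits_version_1_tag)  { return 1; }
inline int interface_version(vendor_traits_version_2_tag) { return 2; }

// What a basic_regex-like template could call to pick an implementation.
template <class Traits>
int interface_version_of()
{
    return interface_version(typename Traits::version_tag());
}

} // namespace sketch
```

<p>Because the tags form an inheritance hierarchy, a library that only knows about version 1 still accepts a traits class carrying a derived tag.</p>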

<h2>Specification of <span class=spelle>sub_match::length</span> incorrect</h2>

<p>The specification for <span class=spelle>sub_match::length</span> has
acquired a couple of typos (a misplaced static, and the logic in the effects
clause is back-to-front); it should read:</p>

<p class=MsoNormal><span class=grame><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>difference_type</span></span><span style='font-size:
10.0pt;font-family:"Courier New";color:black'> length();</span></p>

<p class=MsoNormal><b><span style='color:black'>Effects: </span></b><span
style='color:black'>returns </span><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>(<span class=grame>matched ?</span> distance(first,
second) : 0)</span><span style='color:black'>.</span></p>
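<p>A short sketch of the corrected behaviour, assuming the C++11 <code>std::sub_match::length()</code> matches this wording; the helper name <code>group_length</code> is hypothetical. A sub-expression that took no part in the match reports a length of zero.</p>

```cpp
#include <cstddef>
#include <regex>
#include <string>

// Hypothetical helper: length of sub-expression `group` after matching all of
// `text` against `re`; 0 when the match fails or the group did not participate.
std::ptrdiff_t group_length(const std::string& text, const std::regex& re, std::size_t group)
{
    std::smatch m;
    if (!std::regex_match(text, m, re))
        return 0;
    return m[group].length();  // (matched ? distance(first, second) : 0)
}
```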

<h2>Traits class sentry language</h2>

<p><i>Pete Becker writes: </i>the proposal says:</p>

<p style='margin-left:36.0pt'>&#8220;An object of type <span class=spelle>regex_traits</span>&lt;<span
class=spelle>charT</span>&gt;::sentry shall be constructed from a <span
class=spelle>regex_traits</span> object, and tested to be not equal to null,
before any of the member functions of that object other than <span
style='font-size:10.0pt;font-family:"Courier New"'>length</span>, <span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>getloc</span></span>,
and <span style='font-size:10.0pt;font-family:"Courier New"'>imbue</span> shall
be called. Type sentry performs implementation defined initialization of the
traits class object, and represents an opportunity for the traits class to
cache data obtained from the locale object.&#8221;</p>

<p>The first sentence is in passive voice, and begs the question of who shall
do it: the user of the regex instance that holds the <span class=spelle>regex_traits</span>
object, or the regex instance itself. Unless the user is hacking around with a
standalone instance of <span class=spelle>regex_traits</span>, it probably
ought to be the regex object that &quot;shall&quot; do this.</p>

<p>Second, sentry &quot;performs implementation defined initialization.&quot; I
think this ought to be implementation specific, not implementation defined. I
don't want to have to document the details of the initialization that sentry
performs.</p>

<p>Proposed changes: replace with the wording:</p>

<p style='margin-left:36.0pt'>&#8220;An object of type <code><span
style='font-size:10.0pt'>regex_traits&lt;</span></code><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>charT</span></span><code><span
style='font-size:10.0pt'>&gt;::sentry</span></code> shall be constructed from
the <code><span style='font-size:10.0pt'>regex_traits</span></code> object held
by objects of type <code><span style='font-size:10.0pt'>basic_regex</span></code>,
and tested to be not equal to null, before any of the member functions of that
object other than <span style='font-size:10.0pt;font-family:"Courier New"'>length</span>,
<span class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>getloc</span></span>,
and <span style='font-size:10.0pt;font-family:"Courier New"'>imbue</span> shall
be called. Type <code><span style='font-size:10.0pt'>sentry</span></code>
performs implementation specific initialization of the traits class object, and
represents an opportunity for the traits class to cache data obtained from the
locale object.&#8221;</p>

<h2>Imprecise specification of <span class=spelle>regex_traits::char_class_type</span></h2>

<p><i>Pete Becker writes: </i>Roughly speaking, there are three categories of
character class: the ones that are supported by C and C++ locales (<span
class=spelle>alnum</span>, etc.), the additional ones for the regex proposal (d
s w) and user-supplied character classes (through extensions to <span
class=spelle>regex_traits</span>).&nbsp; </p>

<p>Is the intent of the proposal to require that for the first category, the
value returned by, for example, <span class=spelle>lookup_</span><span
class=grame>classname(</span>&quot;<span class=spelle>alnum</span>&quot;) be
the value <span class=spelle>alnum</span> as defined by <span class=spelle>ctype_base::mask</span>?
(I don't care one way or the other, but we have to be clear about what's
required).</p>

<p><i>John Maddock replies: </i>I don&#8217;t think that the proposal should
specify that there is any specific relationship between <span class=spelle>ctype_base::mask</span>
(and its associated values) and <span class=spelle>regex_traits::char_class_type</span>
(and its values).&nbsp; </p>

<p><i>Proposed changes: </i></p>

<p>Replace:</p>

<p style='margin-left:36.0pt'>&#8220;The type <span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>char_class_type</span></span>
is used to represent a character classification and is capable of holding an
implementation defined superset of the values held by <span class=spelle>ctype_base::mask</span>
(22.2.1).&#8221;</p>

<p><span class=grame>with</span>:</p>

<p style='margin-left:36.0pt'>&#8220;The type <span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>char_class_type</span></span>
is used to represent a character classification and is capable of holding the
implementation specific set of values returned by <code><span style='font-size:
10.0pt'>lookup_classname</span></code>.&#8221;</p>
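<p>Under the revised wording, a <code>char_class_type</code> value is only meaningful as something obtained from <code>lookup_classname</code> and handed back to the traits class. The sketch below assumes the C++11 <code>std::regex_traits&lt;char&gt;</code> members <code>lookup_classname</code> and <code>isctype</code>; the helper name <code>in_class</code> is hypothetical.</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: tests `c` against a named character class such as
// "alpha" or "d". The class value is treated as opaque; no relationship to
// ctype_base::mask is assumed.
bool in_class(char c, const std::string& class_name)
{
    const std::regex_traits<char> traits;
    const std::regex_traits<char>::char_class_type cls =
        traits.lookup_classname(class_name.begin(), class_name.end());
    return traits.isctype(c, cls);
}
```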

<h2>Can anything other than <span class=spelle>basic_regex</span> throw <span
class=spelle>bad_expression</span> objects?</h2>

<p><i>Pete Becker writes: </i>The text describing the class <span class=spelle>bad_expression</span>
says it is the type of the object thrown to report errors &quot;during the
conversion from a string ... to a finite state machine.&quot; This suggests
that it is not thrown by the functions that try to match a string to a <span
class=spelle>basic_regex</span> object, and this is borne out by the throws
clauses for the constructors and assignment operators for <span class=spelle>basic_regex</span>,
which say that they throw <span class=spelle>bad_expression</span> if the
string isn't a valid regular expression, and by the lack of throws clauses for <span
class=spelle>regex_match</span>, etc.&nbsp; </p>

<p>On the other hand, <span class=spelle>error_type</span> has two values, <span
class=spelle>error_complexity</span> and <span class=spelle>error_stack</span>,
that only occur during matching. There's no other mention of these values, so
the only thing that can be done with them is for the implementation to pass
them to <span class=spelle>regex_traits::error_string</span>, and the only way
the user can see the resulting string is by catching an exception. This
suggests that <span class=spelle>bad_expression</span> can also be thrown by
the match functions. And the text says, in the last paragraph of RE.4, that
&quot;the functions described in this clause can report errors by throwing
exceptions of type <span class=spelle>bad_expression</span>.&quot;</p>

<p>So: can the various match functions throw <span class=spelle>bad_expression</span>,
and, if so, is <span class=spelle>bad_expression</span> the appropriate name?</p>

<p><i>John Maddock replies:</i> It is correct that those values (<span
class=spelle>error_complexity</span> and <span class=spelle>error_stack</span>)
are for the use of the regex-matching algorithms and not for the <span
class=spelle>basic_regex</span> construct and assign members.&nbsp; It is also
true that it should be documented that these throw, and that <span
class=spelle>bad_expression</span> is an inappropriate name for this, but how
about &#8220;throws exceptions derived from <span class=spelle>runtime_error</span>&#8221;
(which is the base class for <span class=spelle>bad_expression</span>).</p>

<p>Proposed changes: Add the following throw clause:</p>

<p style='margin-left:36.0pt'><b>Throws:</b> exceptions derived from <code><span
style='font-size:10.0pt'>std::runtime_error</span></code> if either the runtime
complexity or memory space required to obtain a match exceeds an implementation
defined threshold.</p>

<p>To the following functions:</p>

<pre><span class=grame>template</span> &lt;class <span class=spelle>BidirectionalIterator</span>, class Allocator, class <span
class=spelle>charT</span>,</pre><pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>class</span> traits, class Allocator2&gt;</pre><pre><span
class=grame>bool</span> <span class=spelle>regex_match</span>(<span
class=spelle>BidirectionalIterator</span> first, <span class=spelle>BidirectionalIterator</span> last,</pre><pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=spelle>match_results</span>&lt;<span class=spelle>BidirectionalIterator</span>, Allocator&gt;&amp; m,</pre><pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>const</span> <span class=spelle>reg_expression</span>&lt;<span
class=spelle>charT</span>, traits, Allocator2&gt;&amp; e,</pre><pre><span
style='font-size:12.0pt;font-family:"Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>match_flag_type</span> flags = <span class=spelle>match_default</span>);</span></pre>

<p><span class=grame>and</span>:</p>

<pre><span class=grame>template</span> &lt;class <span class=spelle>BidirectionalIterator</span>, class Allocator, class <span
class=spelle>charT</span>,</pre><pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>class</span> traits, class Allocator2&gt;</pre><pre><span
class=grame>bool</span> <span class=spelle>regex_search</span>(<span
class=spelle>BidirectionalIterator</span> first, <span class=spelle>BidirectionalIterator</span> last,</pre><pre> &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span
class=spelle>match_results</span>&lt;<span class=spelle>BidirectionalIterator</span>, Allocator&gt;&amp; m,</pre><pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>const</span> <span class=spelle>reg_expression</span>&lt;<span
class=spelle>charT</span>, traits, Allocator2&gt;&amp; e,</pre><pre><span
style='font-size:12.0pt;font-family:"Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>match_flag_type</span> flags = <span class=spelle>match_default</span>);</span></pre>

<p><span class=grame>and</span>:</p>

<pre><span class=grame>template</span> &lt;class <span class=spelle>OutputIterator</span>, class <span
class=spelle>BidirectionalIterator</span>, class traits,</pre><pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>class</span> Allocator, class <span class=spelle>charT</span>&gt;</pre><pre><span
class=spelle>OutputIterator</span> <span class=spelle>regex_</span><span
class=grame>replace(</span><span class=spelle>OutputIterator</span> out,</pre><pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=spelle>BidirectionalIterator</span> first,</pre><pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=spelle>BidirectionalIterator</span> last,</pre><pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>const</span> <span class=spelle>reg_expression</span>&lt;<span
class=spelle>charT</span>, traits, Allocator&gt;&amp; e,</pre><pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>const</span> <span class=spelle>basic_string</span>&lt;<span
class=spelle>charT</span>&gt;&amp; <span class=spelle>fmt</span>,</pre><pre><span
style='font-size:12.0pt;font-family:"Times New Roman"'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>match_flag_type</span> flags = <span class=spelle>match_default</span>);</span></pre>

<p>Note that the other algorithm overloads are described in terms of these
three main forms, so their throw clauses are assumed to be implicit here.</p>
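<p>From the caller's side, the new throws clause means a matching call may need a handler. The following sketch assumes the C++11 <code>std::regex_match</code> and catches <code>std::runtime_error</code> as the wording above specifies; the names <code>match_outcome</code> and <code>try_match</code> are hypothetical.</p>

```cpp
#include <regex>
#include <stdexcept>
#include <string>

// Hypothetical result type that separates "no match" from "gave up".
enum class match_outcome { matched, not_matched, resource_limit };

// Hypothetical helper: matches all of `text` against `re`, reporting an
// exceeded complexity or memory threshold instead of propagating it.
match_outcome try_match(const std::string& text, const std::regex& re)
{
    try {
        return std::regex_match(text, re) ? match_outcome::matched
                                          : match_outcome::not_matched;
    } catch (const std::runtime_error&) {
        // Thrown when the implementation defined threshold is exceeded.
        return match_outcome::resource_limit;
    }
}
```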

<h2>Unneeded <span class=spelle>basic_regex</span> members</h2>

<p>The following <span class=spelle>basic_regex</span> members are redundant and
should be removed:</p>

<p class=MsoNormal><span class=spelle><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>basic_</span></span><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>regex(</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>const <span
class=spelle>charT</span>* p1, const <span class=spelle>charT</span>* p2, <span
class=spelle>flag_type</span> f = <span class=spelle>regex_constants::normal</span>,</span></p>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span class=grame>const</span> Allocator&amp; a =
Allocator());</span></p>

<p class=MsoNormal><span class=spelle><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>basic_regex</span></span><span style='font-size:
10.0pt;font-family:"Courier New";color:black'>&amp; <span class=grame>assign(</span>const
<span class=spelle>charT</span>* first, const <span class=spelle>charT</span>*
last,</span></p>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<span class=grame>flag_type</span> f = <span class=spelle>regex_constants::normal</span>);</span></p>

<p><i>Historical note:</i> these were present in the original Boost
implementation as a workaround for non-conforming compilers and didn&#8217;t
get edited out of the proposal as they should have been.</p>

<h2>Missing <span class=spelle>basic_regex</span> members</h2>

<p><i>Pete Becker writes: </i>The proposal has member functions named 'assign'
that take argument lists that correspond to the argument lists for
constructors, with two exceptions: there's <code><span style='font-size:10.0pt'>basic_regex(const
</span></code><span class=spelle><span style='font-size:10.0pt;font-family:
"Courier New"'>charT</span></span><code><span style='font-size:10.0pt'> *, </span></code><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>size_type</span></span><code><span
style='font-size:10.0pt'> </span></code><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>len</span></span><code><span
style='font-size:10.0pt'>, </span></code><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>flag_type</span></span><code><span
style='font-size:10.0pt'>)</span></code>, but no <code><span style='font-size:
10.0pt'>assign(const </span></code><span class=spelle><span style='font-size:
10.0pt;font-family:"Courier New"'>charT</span></span><code><span
style='font-size:10.0pt'> *, </span></code><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>size_type</span></span><code><span
style='font-size:10.0pt'>, </span></code><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>flag_type</span></span><code><span
style='font-size:10.0pt'>)</span></code>; and there's <code><span
style='font-size:10.0pt'>basic_regex()</span></code>, but no <code><span
style='font-size:10.0pt'>assign()</span></code>. Are these omissions
intentional?</p>

<p><i>John Maddock replies</i>: I don&#8217;t believe that there should be an <code><span
style='font-size:10.0pt'>assign()</span></code> member (any more than <span
class=spelle>std::string</span> has one), and I don&#8217;t see any great need for
a <code><span style='font-size:10.0pt'>clear()</span></code> member
either.&nbsp; The <code><span style='font-size:10.0pt'>assign(const charT*, </span></code><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>size_type</span></span><code><span
style='font-size:10.0pt'>, </span></code><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>flag_type</span></span><code><span
style='font-size:10.0pt'>)</span></code> should be present, however.</p>

<p>Proposed changes: add the following member to the <span class=spelle>basic_regex</span>
class synopsis:</p>

<p class=MsoNormal><span class=spelle><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>basic_regex</span></span><span style='font-size:
10.0pt;font-family:"Courier New";color:black'>&amp; <span class=grame>assign(</span>const
<span class=spelle>charT</span>* <span class=spelle>ptr</span>, <span
class=spelle>size_type</span> <span class=spelle>len</span>, <span
class=spelle>flag_type</span> f = <span class=spelle>regex_constants::normal</span>);</span></p>

<p><span class=grame>Then add</span> the following description in the RE4.5
section:</p>

<p class=MsoNormal><span class=spelle><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>basic_regex</span></span><span style='font-size:
10.0pt;font-family:"Courier New";color:black'>&amp; <span class=grame>assign(</span>const
<span class=spelle>charT</span>* <span class=spelle>ptr</span>, <span
class=spelle>size_type</span> <span class=spelle>len</span>, <span
class=spelle>flag_type</span> f = <span class=spelle>regex_constants::normal</span>);</span></p>

<p class=MsoNormal><b><span style='color:black'>Effects:</span></b><span
style='color:black'> Returns </span><span class=grame><span style='font-size:
10.0pt;font-family:"Courier New";color:black'>assign(</span></span><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>string_type</span></span><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>(<span class=spelle>ptr</span>, <span class=spelle>len</span>),
f)</span><span style='color:black'>.</span></p>
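
<p>As a rough illustration of the pointer-plus-length overload, the sketch
below uses <code>std::regex</code> from C++11 as a stand-in for the proposal's
<code>basic_regex</code>. The helper name <code>make_from_buffer</code> is
hypothetical, and note that the standardized default flag is
<code>ECMAScript</code> rather than <code>regex_constants::normal</code>. Only
the first <code>len</code> characters of the buffer are treated as the
expression, so trailing data that would not parse on its own is ignored.</p>

<pre>#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;regex&gt;
#include &lt;string&gt;

// Hypothetical helper: build a regex from the first len characters of ptr.
inline std::regex make_from_buffer(const char* ptr, std::size_t len)
{
    std::regex re;
    // Same effect as re.assign(std::string(ptr, len)) with default flags.
    re.assign(ptr, len);
    return re;
}
</pre>

<p>For example, <code>make_from_buffer("ab+c[unterminated", 4)</code> compiles
only <code>ab+c</code>; the unterminated bracket beyond the given length is
never seen by the parser.</p>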

<h2>Types of <span class=spelle>match_results</span> typedef members</h2>

<p><i>Pete Becker writes: </i>The proposal says that <span class=spelle>match_results</span>
has a nested <span class=spelle>typedef</span></p>

<pre><span class=grame>typedef</span> const <span class=spelle>value_type</span>&amp; <span
class=spelle>const_reference</span></pre>

<p>Since <span class=spelle>match_results</span> has an allocator, this should
be</p>

<pre><span class=grame>typedef</span> <span class=spelle>typename</span> <span
class=spelle>allocator::const_reference</span> <span class=spelle>const_reference</span></pre>

<h2>What does <span class=spelle>match_results::</span><span class=grame>size(</span>)
return?</h2>

<p><i>Pete Becker writes: </i>The member <span class=spelle>function</span> <span
class=grame>size(</span>) returns &quot;the number of <span class=spelle>sub_match</span>
elements stored in *this&quot;. Aside from the suggested implementation above,
there are the <span class=grame>prefix(</span>) and suffix() <span
class=spelle>sub_match</span> elements. Is the intention that <span
class=grame>size(</span>) should return the number of capture groups in the
original expression, and not include those two extra <span class=spelle>sub_matches</span>?
(I think the answer is probably yes).</p>

<p><i>John Maddock adds:</i> yes, this is correct: it&#8217;s the number of
capture groups plus one (because index 0 holds the whole match).</p>

<p><i>Proposed changes: </i></p>

<p>Replace:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>size_type</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'> size()const;</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><b><span style='color:black'>Effects:</span></b><span
style='color:black'> Returns the number of <span class=spelle>sub_match</span>
elements stored in *this.</span></p>

<p>With:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>size_type</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'> size()const;</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><b><span style='color:black'>Effects:</span></b><span
style='color:black'> Returns one plus the number of marked sub-expressions in the
regular expression that was matched.</span></p>
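
<p>The revised wording can be checked with a small sketch. It assumes
<code>std::regex</code> and <code>std::smatch</code> from C++11 as stand-ins
for the proposal's types; the helper name <code>match_size</code> is
hypothetical. The count includes groups that did not participate in the match
and excludes <code>prefix()</code> and <code>suffix()</code>.</p>

<pre>#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;regex&gt;
#include &lt;string&gt;

// Hypothetical helper: size() of the match_results after a successful
// regex_match of text against pattern, or 0 if the match fails.
inline std::size_t match_size(const std::string&amp; pattern, const std::string&amp; text)
{
    const std::regex re(pattern);
    std::smatch m;
    const bool ok = std::regex_match(text, m, re);
    return ok ? m.size() : 0;
}
</pre>

<p>With the pattern <code>(a)(b)?(c)</code> and the text <code>ac</code>, the
second group does not participate, yet the size is still one plus three marked
sub-expressions.</p>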

<h2>What does <span class=spelle>match_results::position</span> return when
passed an out of range index?</h2>

<p><i>Pete Becker writes: </i><span class=spelle>match_results::</span><span
class=grame>position(</span>) doesn't say what happens when someone asks for
the position of a non-matched group. The specification says that it's <span
class=grame>distance(</span>first1, first2), where first1 is the beginning of
the target text and first2 is the beginning of the nth match. The specification
for <span class=spelle>sub_match</span> says that for a failed match the iterators
have unspecified contents. Do we want this to be unspecified or undefined, or
is there some meaningful value we can return?</p>

<p>Having looked ahead &lt;g&gt;, the match and search algorithms specify that
non-matched groups hold iterators that point to the end of the target
text.&nbsp; This conflicts with the specification for <span class=spelle>sub_match</span>,
which says they're undefined. Is that text in <span class=spelle>sub_match</span>
incorrect?</p>

<p><i>John Maddock replies:</i> I think it is reasonable to require specified values
in non-matched sub-expressions, so yes, the text for <span class=spelle>sub_match</span>
is wrong.&nbsp; The return value from position() should be defined in all cases
as well. Since the returned type is signed and negative values can never be
produced by matched sub-expressions, how about -1?</p>

<p>Proposed changes: </p>

<p>Changes to:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>difference_type</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>
position(unsigned <span class=spelle>int</span> sub = 0)const;</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><b><span style='color:black'>Effects:</span></b><span
style='color:black'> Returns </span><span class=spelle><span style='font-size:
10.0pt;font-family:"Courier New";color:black'>std::</span></span><span
class=grame><span style='font-size:10.0pt;font-family:"Courier New";color:black'>distance(</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>prefix().first,
(*this)[sub].first).</span></p>

<p><span class=grame>Are covered in &#8220;</span><span class=spelle>Regex_iterator</span><span
class=grame> does not set </span><span class=spelle>match_results::position</span><span
class=grame> correctly&#8221;.</span></p>
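
<p>For a sub-expression that did participate in the match, the
<code>distance</code> formulation quoted above can be sketched as follows. It
assumes <code>std::regex_search</code> and <code>std::smatch</code> from C++11
as stand-ins for the proposal's types; the helper name
<code>position_of</code> is hypothetical and requires that the search
succeeds. It does not illustrate the suggested -1 result for non-matched
groups.</p>

<pre>#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;iterator&gt;
#include &lt;regex&gt;
#include &lt;string&gt;

// Hypothetical helper: position of group sub within text.
// Precondition: pattern is found somewhere in text.
inline std::ptrdiff_t position_of(const std::string&amp; text,
                                  const std::string&amp; pattern,
                                  std::size_t sub = 0)
{
    const std::regex re(pattern);
    std::smatch m;
    const bool found = std::regex_search(text, m, re);
    assert(found);
    (void)found;
    // position(sub) equals the distance from the start of the searched
    // text (prefix().first) to the start of the requested sub-match.
    assert(m.position(sub) == std::distance(m.prefix().first, m[sub].first));
    return m.position(sub);
}
</pre>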

<p>Delete the following paragraphs from the <span class=spelle>sub_match</span>
specification:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span style='color:black'>When
the marked sub-expression denoted by an object of type <span class=spelle>sub_match</span>&lt;&gt;
participated in a regular expression match then member </span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>matched</span><span
style='color:black'> evaluates to true, and members </span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>first</span><span
style='color:black'> and </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>second</span><span style='color:black'> denote the range of
characters </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>[<span class=spelle>first</span><span class=grame>,second</span>)</span><span
style='color:black'> which formed that match. Otherwise </span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>matched</span><span
style='color:black'> is false, and members </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>first</span><span style='color:black'>
and </span><span style='font-size:10.0pt;font-family:"Courier New";color:black'>second</span><span
style='color:black'> contained undefined values.</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><span style='color:black'>If an
object of type </span><span class=spelle><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>sub_match</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>&lt;&gt;</span><span
style='color:black'> represents sub-expression 0 - that is to say the whole
match - then member </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>matched</span><span style='color:black'> is always true, unless a
partial match was obtained as a result of the flag </span><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>match_partial</span></span><span
style='color:black'> being passed to a regular expression algorithm, in which
case member </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>matched</span><span style='color:black'> is false, and members </span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>first</span><span
style='color:black'> and </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>second</span><span style='color:black'> represent the character
range that formed the partial match.</span></p>

<p>Then add the following to the <span class=spelle>match_results</span>
specification, immediately after the sentence ending <i>&#8220;except that only
operations defined for const-qualified Sequences are supported.<span
class=grame>&#8221;</span></i><span class=grame>:</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><span style='color:black'>The </span><code><span
style='font-size:10.0pt'>sub_match&lt;&gt;</span></code><span style='color:
black'> object stored at index zero represents sub-expression 0; that is to say
the whole match.&nbsp; In this case the </span><code><span style='font-size:
10.0pt'>sub_match&lt;&gt;</span></code><span style='color:black'> member </span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>matched</span><span
style='color:black'> is always true, unless a partial match was obtained as a
result of the flag </span><code><span style='font-size:10.0pt'>regex_constants::</span></code><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>match_partial</span></span><span style='color:black'> being passed
to a regular expression algorithm, in which case member </span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>matched</span><span
style='color:black'> is false, and members </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>first</span><span style='color:black'>
and </span><span style='font-size:10.0pt;font-family:"Courier New";color:black'>second</span><span
style='color:black'> represent the character range that formed the partial
match.</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><span style='color:black'>The </span><code><span
style='font-size:10.0pt'>sub_match&lt;&gt;</span></code><span style='color:
black'> object stored at index <i>n</i> denotes what matched the marked
sub-expression <i>n</i> within the matched expression.&nbsp; If the
sub-expression <i>n</i> participated in a regular expression match then the <span
class=spelle>sub_match</span>&lt;&gt; member </span><span style='font-size:
10.0pt;font-family:"Courier New";color:black'>matched</span><span
style='color:black'> evaluates to true, and members </span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>first</span><span
style='color:black'> and </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>second</span><span style='color:black'> denote the range of
characters </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>[<span class=spelle>first</span><span class=grame>,second</span>)</span><span
style='color:black'> which formed that match. Otherwise </span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>matched</span><span
style='color:black'> is false, and members </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>first</span><span style='color:black'>
and </span><span style='font-size:10.0pt;font-family:"Courier New";color:black'>second</span><span
style='color:black'> point to the end of the sequence that was searched.</span></p>
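
<p>A minimal sketch of the state described for a non-participating
sub-expression, assuming <code>std::regex</code> and <code>std::smatch</code>
from C++11 as stand-ins for the proposal's types. The names
<code>GroupState</code> and <code>group_state</code> are hypothetical.</p>

<pre>#include &lt;cassert&gt;
#include &lt;cstddef&gt;
#include &lt;regex&gt;
#include &lt;string&gt;

// Hypothetical summary of one sub_match: whether it matched, and whether
// both of its iterators point to the end of the searched text.
struct GroupState {
    bool matched;
    bool at_end;
};

inline GroupState group_state(const std::string&amp; text,
                              const std::string&amp; pattern,
                              std::size_t n)
{
    const std::regex re(pattern);
    std::smatch m;
    std::regex_search(text, m, re);
    const std::ssub_match&amp; sub = m[n];
    return { sub.matched,
             sub.first == text.end() &amp;&amp; sub.second == text.end() };
}
</pre>

<p>Searching <code>xa</code> with <code>(a)|(b)</code> leaves group 2
unmatched, so its <code>matched</code> member is false and both iterators sit
at the end of the text, while group 1 reports a real range.</p>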

<h2>What happens if <span class=spelle>match_results::</span><span class=grame>operator[</span>]
is out of range?</h2>

<p><i>Pete Becker writes:</i> with respect to <span class=spelle>match_results::</span><span
class=grame>operator[</span>]: We need to say what happens for an index out of
range. Seems to me there are two reasonable possibilities: undefined behavior,
or returns a no-match object.</p>

<p>While I strongly favor undefined behavior over artificially well-defined
results, I also favor well-defined behavior when it is not too artificial.
Thus, the behavior of <span class=grame>sqrt(</span>-2.0) is undefined; free(0)
does nothing. While undefined behavior provides a convenient hook for debugging
implementations, that's not its purpose, and if we can specify reasonable
(which includes inexpensive) behavior we ought to do it, rather than provide
another place where users can go astray.</p>

<p>In this case, I think I prefer to view <span class=grame>operator[</span>]
as indexing into an unbounded array of <span class=spelle>sub_match</span>
objects. The objects at <span class=spelle>match_</span><span class=grame>results.size(</span>)
and above would look like failed sub-matches: their <span class=spelle>boolean</span>
flag would be false, and both their iterators would point to the end of the
target string. Since we've agreed that <span class=spelle>sub_match</span>
objects for failed sub-matches need not have distinct addresses, this can be
implemented by simply adding one <span class=spelle>sub_match</span> element
beyond those needed for the actual results, and returning it for an index
that's otherwise out of bounds.</p>

<p>Proposed changes: replace:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>const_reference</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'> operator[](<span
class=spelle>int</span> n) const;</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><b><span style='color:black'>Effects:</span></b><span
style='color:black'> Returns a reference to the </span><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>sub_match</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'> </span><span
style='color:black'>object representing the character sequence that matched
marked sub-expression <i>n</i>. If </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>n == 0 </span><span style='color:black'>then
returns a reference to a </span><span class=spelle><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>sub_match</span></span><span
style='color:black'> object representing the character sequence that matched
the whole regular expression.</span></p>

<p>With:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>const_reference</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'> operator[](<span
class=spelle>int</span> n) const;</span></p>

<p class=MsoNormal style='margin-left:36.0pt'><b><span style='color:black'>Effects:</span></b><span
style='color:black'> Returns a reference to the </span><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>sub_match</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'> </span><span
style='color:black'>object representing the character sequence that matched
marked sub-expression <i>n</i>. If </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>n == 0 </span><span style='color:black'>then
returns a reference to a </span><span class=spelle><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>sub_match</span></span><span
style='color:black'> object representing the character sequence that matched
the whole regular expression.&nbsp; If n &gt;= <span class=grame>size(</span>)
then returns a <span class=spelle>sub_match</span> object representing an
unmatched sub-expression.</span></p>
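
<p>The out-of-range behavior can be sketched in the same way, again assuming
<code>std::regex</code> and <code>std::smatch</code> from C++11 as stand-ins
for the proposal's types. The helper name
<code>out_of_range_is_unmatched</code> and the offset of 5 past
<code>size()</code> are arbitrary choices for illustration.</p>

<pre>#include &lt;cassert&gt;
#include &lt;regex&gt;
#include &lt;string&gt;

// Hypothetical check: after a successful search, indexing past size()
// yields a sub_match that looks like a failed sub-expression.
inline bool out_of_range_is_unmatched(const std::string&amp; text,
                                      const std::string&amp; pattern)
{
    const std::regex re(pattern);
    std::smatch m;
    if (!std::regex_search(text, m, re))
        return false;
    const std::ssub_match&amp; beyond = m[m.size() + 5];
    return !beyond.matched &amp;&amp; beyond.length() == 0 &amp;&amp; beyond.str().empty();
}
</pre>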

<h2>Incorrect case insensitive match specification</h2>

<p>The following wording:</p>

<p style='margin-left:36.0pt'>&quot;During matching of a regular expression
finite state machine against a sequence of characters, comparison of a
collating element range c1-c2 against a character c is conducted as follows: if
<span class=spelle>getflags</span>() &amp; <span class=spelle>regex_constants::collate</span>
is true, then the character c is matched if <span class=spelle>traits_inst.transform</span>(<span
class=spelle>string_type</span>(1,c1)) &lt;= <span class=spelle>traits_inst.transform</span>(<span
class=spelle>string_type</span>(1,c)) &amp;&amp; <span class=spelle>traits_inst.transform</span>(<span
class=spelle>string_type</span>(1,c)) &lt;= <span class=spelle>traits_inst.transform</span>(<span
class=spelle>string_type</span>(1,c2)), otherwise c is matched if c1 &lt;= c
&amp;&amp; c &lt;= c2.&nbsp; During matching of a regular expression finite
state machine against a sequence of characters, testing whether a collating
element is a member of a primary equivalence class is conducted by first
converting the collating element and the equivalence class to a sort keys using
<span class=spelle>traits::transform_primary</span>, and then comparing the
sort keys for equality.&quot;</p>

<p>This wording is defective in that it does not take account of case-insensitive
matches: all input characters, and all collating elements in the finite state
machine, should be passed through <span class=spelle>traits_inst.translate</span> before
being converted into a sort key.&nbsp; The changes are trivial to make, but
rather verbose...</p>

<p><i>Proposed changes:</i> are incorporated into the following issue.</p>

<h2>Character class extensions to <span class=spelle>ECMAScript</span> grammar
need a formal grammar</h2>

<p><i>Pete Becker writes:</i> The regex proposal adds to <span class=spelle>ECMAScript</span>
the ability to use named character classes through &quot;expressions of the
form&quot;<span class=grame>:</span><br>
<br>
[[:class-name:]]<br>
[[.collating-name.]]<br>
[[=collating-name=]]<br>
<br>
This isn't sufficient. In <span class=spelle>ECMAScript</span> the expression
[[] is valid, and names a character set containing the character '['.
Similarly, [[:] is also valid, and names a character set containing the
characters '[' and ':'. We need to say whether these two expressions (and their
analogs for collating names) are still valid. I suspect the answer is that
they're not -- a '[' as the first character in a character class is a special
character, which must be <span class=spelle>followed</span> by one of ':', '.',
or '=', then a name that does not contain any of ']', ':', '.', or '='
(technically we could allow ']', but that seems unnecessarily baroque), then
the appropriate close marker.</p>

<p><i>John Maddock replies:</i> After some discussion on the c++-<span
class=grame>std-</span>lib list it was agreed that the character class
extensions were worth having (in fact they were requested after a previous
committee meeting), but that they required a change to the <span class=spelle>ECMAScript</span>
grammar.</p>

<p>Proposed changes: In section RE.4, remove everything starting from:</p>

<p style='margin-left:36.0pt'>&#8220;The regular expression grammar recognized
by type <i>specialization of <span class=spelle>basic_regex</span></i> is
described in&#8221; </p>

<p><span class=grame>and</span> ending with:</p>

<p style='margin-left:36.0pt'>&#8220;During matching of a regular expression
finite state machine against a sequence of characters, a character <span
style='font-size:10.0pt;font-family:"Courier New"'>c</span> is a member of
character class <span class=spelle><span style='font-size:10.0pt;font-family:
"Courier New"'>some_name</span></span>, if <span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>traits_inst.is_class</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(c, <span class=spelle>traits_inst.lookup_</span><span
class=grame>classname(</span>&quot;<span class=spelle>some_name</span>&quot;))</span>.&#8221;
</p>

<p><span class=grame>and</span> replace with the paragraph:</p>

<p style='margin-left:36.0pt'>&#8220;The regular expression grammar recognized
by class template <code><span style='font-size:10.0pt'>basic_regex</span></code> is
that specified by Appendix XXXX&#8221;</p>

<p>Then add the following as an appendix to the proposal:</p>

<h3 style='margin-left:36.0pt'>Appendix: Modified <span class=spelle>ECMAScript</span>
regular expression grammar</h3>

<p style='margin-left:36.0pt'>The regular expression grammar recognized by
class template <code><span style='font-size:10.0pt'>basic_regex</span></code> is
that specified by ECMA-262, <span class=spelle>ECMAScript</span> Language
Specification, <span class=grame>Chapter</span> 15 part 10, <span class=spelle>RegExp</span>
(Regular Expression) Objects (FWD.1) except as specified below.</p>

<p style='margin-left:36.0pt'>Objects of type <i>specialization of <span
class=spelle>basic_regex</span></i> store within themselves a
default-constructed instance of their <span style='font-size:10.0pt;font-family:
"Courier New"'>traits</span> template parameter, henceforth referred to as <span
class=spelle><i>traits_inst</i></span>. This <span class=spelle><i>traits_inst</i></span><i>
</i>object is used to support localization of the regular expression; no <span
class=spelle>basic_regex</span> object shall call any locale dependent C or C++
API, including the formatted string input functions; instead it shall call the
appropriate traits member function to achieve the required effect.</p>

<p style='margin-left:36.0pt'>The transformation from a sequence of characters
to a finite state machine is accomplished by first transforming the sequence
of characters to the sequence of tokens obtained by calling <span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>traits_inst.syntax_type</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(c)</span> for each input
character c. The regular expression grammar is then applied to the sequence of
tokens in order to construct the finite state machine (this is to allow the
regular expression syntax to be localized to a specific character set; for
example to use Far Eastern ideographs, rather than Latin characters). Where <span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>traits_inst.syntax_type</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(c)</span> returns <span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>syntax_escape</span></span>,
then the implementation shall call <span class=spelle><span style='font-size:
10.0pt;font-family:"Courier New"'>traits_inst.escape_syntax_type</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(c)</span> for the character
following the escape, in order to determine the grammatical meaning of the
escape sequence. When <span class=spelle><span style='font-size:10.0pt;
font-family:"Courier New"'>traits_inst.escape_syntax_type</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(c)</span> returns <span class=spelle><span style='font-size:10.0pt;
font-family:"Courier New"'>escape_type_class</span></span> or <span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>escape_type_not_class</span></span>,
then the single character following the escape is converted to a string <span
style='font-size:10.0pt;font-family:"Courier New"'>n</span>, and passed to <span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>traits_inst.lookup_classname</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(n)</span>, to determine the
character classification to be matched against. If <span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>traits_inst.lookup_</span></span><span
class=grame><span style='font-size:10.0pt;font-family:"Courier New"'>classname(</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>n)</span> returns null, then
the escape is treated as an identity escape.&nbsp; The mapping of values
returned by <code><span style='font-size:10.0pt'>syntax_type</span></code> and <code><span
style='font-size:10.0pt'>escape_syntax_type</span></code> to characters within
the <span class=spelle>ECMAScript</span> grammar is specified in tables RE5 and
RE6.</p>

<p style='margin-left:36.0pt'>The following productions within the <span
class=spelle>ECMAScript</span> grammar are modified as follows:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=grame><i><span
style='font-size:10.0pt'>CharacterClass </span></i></span><span class=grame><b><span
style='font-size:10.0pt'>:</span></b></span><b><span style='font-size:10.0pt'>:</span></b></p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=grame><b><span
style='font-size:10.0pt;font-family:Courier-Bold'>[ </span></b></span><span
class=grame><span style='font-size:8.0pt'>[</span></span><span class=spelle><span
style='font-size:8.0pt'>lookahead</span></span><span style='font-size:8.0pt'> </span><span
style='font-size:8.0pt;font-family:Symbol'>. [</span><span style='font-size:
8.0pt'>{</span><b><span style='font-size:8.0pt;font-family:Courier-Bold'>^</span></b><span
style='font-size:8.0pt'>}] </span><span class=spelle><i><span style='font-size:
10.0pt'>ClassRanges</span></i></span><i><span style='font-size:10.0pt'> </span></i><b><span
style='font-size:10.0pt;font-family:Courier-Bold'>]</span></b></p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=grame><b><span
style='font-size:10.0pt;font-family:Courier-Bold'>[ ^</span></b></span><b><span
style='font-size:10.0pt;font-family:Courier-Bold'> </span></b><span
class=spelle><i><span style='font-size:10.0pt'>ClassRanges</span></i></span><i><span
style='font-size:10.0pt'> </span></i><b><span style='font-size:10.0pt;
font-family:Courier-Bold'>]</span></b></p>

<p class=MsoNormal style='margin-left:36.0pt'><i><span style='font-size:10.0pt'><br>
<span class=grame>ClassAtom </span></span></i><span class=grame><b><span
style='font-size:10.0pt'>:</span></b></span><b><span style='font-size:10.0pt'>:</span></b></p>

<p class=MsoNormal style='margin-left:36.0pt'>-<br>
<span class=spelle><i><span style='font-size:10.0pt'>ClassAtomNoDash</span></i></span><i><span
style='font-size:10.0pt'><br>
<span class=spelle>ClassAtomExClass</span><br>
<span class=spelle>ClassAtomCollatingElement</span><br>
<span class=spelle>ClassAtomEquivalence</span></span></i></p>

<p style='margin-left:36.0pt'>The following new productions are then added:</p>

<p class=MsoNormal style='margin-left:36.0pt'><span class=spelle><i><span
style='font-size:10.0pt'>ClassAtomExClass</span></i></span><span class=grame><i><span
style='font-size:10.0pt'>::</span></i></span><i><span style='font-size:10.0pt'><br>
&nbsp; [: <span class=spelle>ClassName</span> :]<br>
<br>
<span class=spelle>ClassAtomCollatingElement</span>::<br>
&nbsp; [. <span class=grame>ClassName .</span>]<br>
<br>
<span class=spelle>ClassAtomEquivalence</span><span class=grame>::</span><br>
&nbsp; [= <span class=spelle>ClassName</span> =]<br>
<br>
<span class=spelle>ClassName</span>::<br>
&nbsp; <span class=spelle>ClassNameCharacter</span><br>
&nbsp; <span class=spelle>ClassNameCharacter</span> <span class=spelle>ClassName</span><br>
<br>
<span class=spelle>ClassNameCharacter</span>::<br>
&nbsp; <span class=spelle>SourceCharacter</span> but not one of &quot;.&quot;
&quot;=&quot; &quot;:&quot;</span></i></p>

<p style='margin-left:36.0pt'>The productions <span class=spelle>ClassAtomExClass</span>,
<span class=spelle>ClassAtomCollatingElement</span> and <span class=spelle>ClassAtomEquivalence</span>
provide functionality equivalent to the same features in IEEE Std
1003.1-2001, Portable Operating System Interface (<span class=grame>POSIX</span>),
Base Definitions and Headers, Section 9, Regular Expressions.</p>
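
<p>As an informal illustration of the <code>[: ClassName :]</code> form (not
part of the proposed appendix text), the sketch below uses
<code>std::regex</code> from C++11, whose default grammar is the modified
ECMAScript grammar that grew out of this proposal. The helper name
<code>is_identifier_like</code> is hypothetical.</p>

<pre>#include &lt;cassert&gt;
#include &lt;regex&gt;
#include &lt;string&gt;

// Hypothetical helper: true if text is a letter or underscore followed by
// letters, digits, or underscores, using named character classes inside
// bracket expressions.
inline bool is_identifier_like(const std::string&amp; text)
{
    static const std::regex re("[[:alpha:]_][[:alnum:]_]*");
    return std::regex_match(text, re);
}
</pre>

<p>Here each named class appears as one <i>ClassAtom</i> alongside the
ordinary character <code>_</code> within the same bracket expression.</p>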

<p style='margin-left:36.0pt'>The regular expression grammar may be modified by
any <code><span style='font-size:10.0pt'>regex_constants::</span></code><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>syntax_option_type</span></span>
flags specified when constructing an object whose type is a specialization of <code><span
style='font-size:10.0pt'>basic_regex </span></code>according to the rules in
table RE.3.1.1.</p>

<p style='margin-left:36.0pt'>A <span class=spelle>ClassName</span> production
when used in ClassAtomExClass is <span class=msoIns><ins
cite="mailto:Comparison" datetime="2003-09-16T13:57">not </ins></span>valid if<span
class=msoIns><ins cite="mailto:Comparison" datetime="2003-09-16T13:57"> </ins></span>the
<span class=msoIns><ins cite="mailto:Comparison" datetime="2003-09-16T13:57">value
returned by </ins></span><code><span
style='font-size:10.0pt'>traits_inst.lookup_classname </span></code><span
class=msoIns><ins cite="mailto:Comparison" datetime="2003-09-16T13:57">for that
name is </ins></span>zero. &nbsp;In that
case a <code><span style='font-size:10.0pt'>bad_expression</span></code>
exception shall be thrown.&nbsp; The range of values recognized as a valid <span
class=spelle>ClassName</span> is determined by the type of the traits class,
but at least the following names shall be recognized: <span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>alnum</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>, alpha, blank, <span
class=spelle>cntrl</span>, digit, graph, lower, print, <span class=spelle>punct</span>,
space, upper, <span class=spelle>xdigit</span>, d, s, w.&nbsp; </span>In
addition the following expressions shall be equivalent:</p>

<p style='margin-left:36.0pt'>\d and [[<span class=grame>:digit</span>:]]</p>

<p style='margin-left:36.0pt'>\D and [<span class=grame>^[</span>:digit:]]</p>

<p style='margin-left:36.0pt'>\s and [[<span class=grame>:space</span>:]]</p>

<p style='margin-left:36.0pt'>\S and [<span class=grame>^[</span>:space:]]</p>

<p style='margin-left:36.0pt'>\w and [<span class=grame>_[</span>:<span
class=spelle>alnum</span>:]]</p>

<p style='margin-left:36.0pt'>\W and [^<span class=grame>_[</span>:<span
class=spelle>alnum</span>:]]</p>
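
<p style='margin-left:36.0pt'>The equivalences listed above can be spot-checked with a short sketch. It assumes the interface as later standardized in C++11 (<code>std::regex</code>, <code>std::regex_match</code>) rather than the exact names used in this proposal, and the helper name <code>same_verdict</code> is hypothetical. It only checks that both spellings agree on a given input, which illustrates the equivalence but does not prove it.</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: true if both patterns give the same
// regex_match verdict for the whole of `text`.
bool same_verdict(const std::string& shorthand,
                  const std::string& bracket_form,
                  const std::string& text)
{
    const std::regex a(shorthand);      // ECMAScript grammar by default
    const std::regex b(bracket_form);
    return std::regex_match(text, a) == std::regex_match(text, b);
}
```

<p style='margin-left:36.0pt'>For instance, <code>same_verdict("\\d+", "[[:digit:]]+", "2003")</code> compares the escape form with its bracket-expression form on one input.</p>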

<p style='margin-left:36.0pt'>The results from multiple calls to <code><span
style='font-size:10.0pt'>traits_inst.lookup_classname</span></code> can be
bitwise <span class=spelle>OR'ed</span> together and subsequently passed to <span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>traits_inst.is_class</span></span>.</p>
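
<p style='margin-left:36.0pt'>A minimal sketch of combining class masks follows. It assumes <code>std::regex_traits&lt;char&gt;</code> from C++11, where the member corresponding to <code>is_class</code> is named <code>isctype</code> and <code>lookup_classname</code> takes an iterator range; the function name <code>in_either_class</code> is hypothetical.</p>

```cpp
#include <regex>
#include <string>

// True if `c` belongs to the union of the two named character classes.
// Assumes std::regex_traits<char>; isctype stands in for is_class.
bool in_either_class(char c, const std::string& name1, const std::string& name2)
{
    const std::regex_traits<char> traits_inst;
    // Each lookup yields a bitmask value; OR them to form the union.
    const auto mask = traits_inst.lookup_classname(name1.begin(), name1.end())
                    | traits_inst.lookup_classname(name2.begin(), name2.end());
    return traits_inst.isctype(c, mask);
}
```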

<p style='margin-left:36.0pt'>A ClassName production when used in a ClassAtomCollatingElement
production is <span class=msoIns><ins cite="mailto:Comparison"
datetime="2003-09-16T13:57">not </ins></span>valid if<span class=msoIns><ins
cite="mailto:Comparison" datetime="2003-09-16T13:57"> </ins></span>the
<span class=msoIns><ins cite="mailto:Comparison" datetime="2003-09-16T13:57">value
returned by </ins></span><code><span style='font-size:10.0pt'><span
class=msoIns><ins cite="mailto:Comparison" datetime="2003-09-16T13:57">traits_inst.lookup_collatename
</ins></span></span></code><span class=msoIns><ins cite="mailto:Comparison"
datetime="2003-09-16T13:57">for that </ins></span>name is <span class=msoIns><ins
cite="mailto:Comparison" datetime="2003-09-16T13:57">an </ins></span>empty
string.</p>

<p style='margin-left:36.0pt'>A ClassName production when used in a ClassAtomEquivalence
production is <span class=msoIns><ins cite="mailto:Comparison"
datetime="2003-09-16T13:57">not </ins></span>valid if<span class=msoIns><ins
cite="mailto:Comparison" datetime="2003-09-16T13:57"> </ins></span>the <span
class=msoIns><ins cite="mailto:Comparison" datetime="2003-09-16T13:57">value
returned by </ins></span><code><span style='font-size:10.0pt'><span
class=msoIns><ins cite="mailto:Comparison" datetime="2003-09-16T13:57">traits_inst.lookup_collatename
</ins></span></span></code><span class=msoIns><ins cite="mailto:Comparison"
datetime="2003-09-16T13:57">for that </ins></span>name is <span class=msoIns><ins
cite="mailto:Comparison" datetime="2003-09-16T13:57">an </ins></span>empty
string <span class=msoIns><ins cite="mailto:Comparison"
datetime="2003-09-16T13:57">or if the value</ins></span> returned <span class=msoIns><ins
cite="mailto:Comparison" datetime="2003-09-16T13:57">by </ins></span><code><span
style='font-size:10.0pt'>traits_inst.transform_primary</span></code><span
class=msoIns><ins cite="mailto:Comparison" datetime="2003-09-16T13:57"> for the
result of the call to </ins></span><code><span style='font-size:10.0pt'><span
class=msoIns><ins cite="mailto:Comparison" datetime="2003-09-16T13:57">traits_inst.lookup_collatename
</ins></span></span></code><span class=msoIns><ins cite="mailto:Comparison"
datetime="2003-09-16T13:57">is an empty string</ins></span>.</p>

<p style='margin-left:36.0pt'><span class=msoIns><ins cite="mailto:Comparison"
datetime="2003-09-16T13:57">When the sequence of characters being transformed
to a finite state machine contains an invalid class name the translator shall
throw an exception object of type </ins></span><code><span style='font-size:
10.0pt'><span class=msoIns><ins cite="mailto:Comparison"
datetime="2003-09-16T13:57">bad_expression</ins></span></span></code><span
class=msoIns><ins cite="mailto:Comparison" datetime="2003-09-16T13:57">.</ins></span></p>
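
<p style='margin-left:36.0pt'>The rejection of an invalid class name can be observed with the following sketch. It assumes the C++11 library, in which the exception type is <code>std::regex_error</code> rather than the <code>bad_expression</code> named here; the helper <code>is_rejected</code> and the class name <code>nosuchclass</code> are made up for illustration.</p>

```cpp
#include <regex>
#include <string>

// Returns true if compiling `pattern` is rejected with std::regex_error.
bool is_rejected(const std::string& pattern)
{
    try {
        std::regex e(pattern);   // translation to a state machine happens here
        (void)e;
    } catch (const std::regex_error&) {
        return true;             // the translator refused the pattern
    }
    return false;
}
```

<p style='margin-left:36.0pt'>Calling <code>is_rejected("[[:nosuchclass:]]")</code> exercises the invalid-name path, while <code>is_rejected("[[:digit:]]")</code> exercises a name that must be recognized.</p>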

<p style='margin-left:36.0pt'>Where the regular expression grammar requires the
conversion of a sequence of characters to an integral value, this is
accomplished by calling <span class=spelle><span style='font-size:10.0pt;
font-family:"Courier New"'>traits_inst.toi</span></span>.</p>
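
<p style='margin-left:36.0pt'>As a rough illustration of such a conversion, the sketch below builds an integer digit by digit. It does not use <code>toi</code> itself; it assumes <code>std::regex_traits&lt;char&gt;::value</code> from C++11 as the closest standardized counterpart, and the function name <code>to_int</code> and its -1 error convention are assumptions made for this example. Overflow is not handled.</p>

```cpp
#include <regex>
#include <string>

// Converts a digit sequence to an int in the given radix (8, 10 or 16).
// Returns -1 if the sequence is empty or contains a non-digit.
int to_int(const std::string& digits, int radix = 10)
{
    const std::regex_traits<char> traits_inst;
    if (digits.empty())
        return -1;
    int result = 0;
    for (char ch : digits) {
        const int v = traits_inst.value(ch, radix);  // -1 if ch is not a digit
        if (v < 0)
            return -1;
        result = result * radix + v;
    }
    return result;
}
```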

<p style='margin-left:36.0pt'>The behavior of the internal finite state machine
representation, when used to match a sequence of characters is as described in <span
class=spelle><span style='color:black'>ECMAScript</span></span> Language
Specification, Chapter 15 part 10, <span class=spelle><span style='color:black'>RegExp</span></span>
(Regular Expression) Objects (FWD.1). The behavior is modified according to any
<span class=spelle><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>match_flag_type</span></span> flags specified (RE.3.1.2) when
using the regular expression object in one of the regular expression algorithms
(RE.7). The behavior is also localized by interaction with the traits class
template parameter as follows:</p>

<p style='margin-left:36.0pt'><span style='color:black'>During matching of a
regular expression finite state machine against a sequence of characters, two
characters <i>c</i> and <i>d </i>are compared using </span><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>traits_inst.translate</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>(c, <span
class=spelle>getflags</span>() &amp; <span class=spelle>regex_constants::icase</span>)
== <span class=spelle>traits_inst.translate</span>(d, <span class=spelle>getflags</span>()
&amp; <span class=spelle>regex_constants::icase</span>)</span><span
style='color:black'>.</span></p>
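
<p style='margin-left:36.0pt'>The comparison rule can be sketched as follows. The two-argument <code>translate</code> described here is emulated on top of <code>std::regex_traits</code> from C++11, which splits it into <code>translate</code> and <code>translate_nocase</code>; the helper names <code>translate_char</code> and <code>chars_equal</code> are hypothetical.</p>

```cpp
#include <regex>

// Two-argument translate in the style of the proposal, emulated with the
// translate / translate_nocase pair that std::regex_traits provides.
template <class Traits>
typename Traits::char_type translate_char(const Traits& traits_inst,
                                          typename Traits::char_type c,
                                          bool icase)
{
    return icase ? traits_inst.translate_nocase(c) : traits_inst.translate(c);
}

// Character equality as described: compare the translated forms.
bool chars_equal(char c, char d, std::regex_constants::syntax_option_type flags)
{
    const std::regex_traits<char> traits_inst;
    const bool icase = (flags & std::regex_constants::icase) != 0;
    return translate_char(traits_inst, c, icase)
        == translate_char(traits_inst, d, icase);
}
```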

<p style='margin-left:36.0pt'><span style='color:black'>During matching of a regular
expression finite state machine against a sequence of characters, comparison of
a collating element range </span><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>c1-c2 </span><span style='color:black'>against a
character </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>c</span><span style='color:black'> is conducted as follows: if </span><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>getflags</span></span><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>() &amp; <span class=spelle>regex_constants::collate</span></span><span
style='color:black'> is true, then the character </span><span style='font-size:
10.0pt;font-family:"Courier New";color:black'>c</span><span style='color:black'>
is matched if </span><span class=spelle><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>traits_inst.transform</span></span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>(<span
class=spelle>string_type</span>(1,traits_inst.translate(c1, <span class=spelle>getflags</span>() &amp; <span
class=spelle>regex_constants::icase</span>))) &lt;= <span class=spelle>traits_inst.transform</span>(<span
class=spelle>string_type</span>(1,traits_inst.translate(c, <span class=spelle>getflags</span>() &amp; <span
class=spelle>regex_constants::icase</span>))) &amp;&amp; <span class=spelle>traits_inst.transform</span>(<span
class=spelle>string_type</span>(1,traits_inst.translate(c, <span class=spelle>getflags</span>() &amp; <span
class=spelle>regex_constants::icase</span>))) &lt;= <span class=spelle>traits_inst.transform</span>(<span
class=spelle>string_type</span>(1,traits_inst.translate(c2, <span class=spelle>getflags</span>() &amp; <span
class=spelle>regex_constants::icase</span>)))</span><span style='color:black'>, otherwise </span><span
style='font-size:10.0pt;font-family:"Courier New";color:black'>c</span><span
style='color:black'> is matched if </span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>c1 &lt;= c &amp;&amp; c &lt;= c2</span><span
style='color:black'>. </span></p>
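
<p style='margin-left:36.0pt'>A sketch of the range test, under stated assumptions: it uses <code>std::regex_traits&lt;char&gt;</code> from C++11 (whose <code>transform</code> takes an iterator range), passes the <code>icase</code> and <code>collate</code> settings as plain booleans instead of reading them from <code>getflags()</code>, and uses the hypothetical name <code>in_range</code>. The outcome of the collating branch depends on the locale the traits object is imbued with.</p>

```cpp
#include <regex>
#include <string>

// Hypothetical helper: does `c` fall within the range c1-c2?
bool in_range(char c1, char c2, char c, bool icase, bool collate)
{
    const std::regex_traits<char> traits_inst;
    if (!collate)
        return c1 <= c && c <= c2;          // plain character comparison

    // Sort key for a single, optionally case-folded, character.
    auto key = [&](char ch) {
        const std::string s(1, icase ? traits_inst.translate_nocase(ch)
                                     : traits_inst.translate(ch));
        return traits_inst.transform(s.begin(), s.end());
    };
    const std::string k = key(c);
    return key(c1) <= k && k <= key(c2);    // compare sort keys
}
```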

<p style='margin-left:36.0pt'><span style='color:black'>During matching of a
regular expression finite state machine against a sequence of characters,
testing whether a collating element is a member of a primary equivalence class
is conducted by first converting the collating element and the equivalence
class to sort keys using </span><span class=spelle><span style='font-size:
10.0pt;font-family:"Courier New";color:black'>traits::transform_primary</span></span><span
style='color:black'>, and then comparing the sort keys for equality.</span></p>

<p style='margin-left:36.0pt'><span style='color:black'>During matching of a
regular expression finite state machine against a sequence of characters, a
character </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>c</span><span style='color:black'> is a member of character class </span><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>some_name</span></span><span style='color:black'>, if </span><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>traits_inst.is_class</span></span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'>(c, <span class=spelle>traits_inst.lookup_</span><span
class=grame>classname(</span>&quot;<span class=spelle>some_name</span>&quot;))</span><span
style='color:black'>.</span></p>

<h2>Imprecise Specification of <span class=spelle>regex_replace</span> </h2>

<p><i>Pete Becker writes:</i> </p>

<p>RE.7.3 (<span class=spelle>regex_replace</span>) says:</p>

<p>Finds all the non-overlapping matches 'm' of type <span class=spelle>match_results</span>&lt;<span
class=spelle>BidirectionalIterator</span>&gt; that occur in the sequence
[first, last).</p>

<p>Having found them or not, it then writes stuff depending on its
arguments.&nbsp; It's not clear, though, what &quot;non-overlapping
matches&quot; are. It took me about five minutes to convince myself that these
are matches of the complete expression, and not matches of internal capture
groups (which would always overlap the full match). I think a footnote is
sufficient for this. More important, though, is what happens when matches
overlap. Suppose we're searching for &quot;<span class=spelle>aba</span>&quot;
in the text &quot;<span class=spelle>ababa</span>&quot;. There are two matches:
the first three characters match, and the last three match. These two matches
overlap. Do we discard them both? Keep the first? Keep the second? My guess is
that the intention is to keep the first one, but we need to say so.</p>

<p><i>John Maddock replies:</i> Thinking about this, I think the text that
specifies how multiple matches are enumerated should be in one place only (and
obviously be completely precise). Currently <span class=spelle>regex_iterator</span>
has the text for this, so it makes sense to specify the behavior of <span
class=spelle>regex_replace</span> in terms of that type, even though <span
class=spelle>regex_iterator</span> hasn&#8217;t actually been described yet
when <span class=spelle>regex_replace</span> is encountered.</p>

<p><i>Proposed changes:</i></p>

<p>Replace the following clause:</p>

<p style='margin-left:36.0pt'><b>Effects:</b> Finds all the non-overlapping
matches <i>m</i> of type <span class=spelle><span style='font-size:10.0pt;
font-family:"Courier New"'>match_results</span></span><span style='font-size:
10.0pt;font-family:"Courier New"'>&lt;<span class=spelle>BidirectionalIterator</span>&gt;
</span>that occur within the sequence [first, last). If no such matches are
found <span class=grame>and </span><span class=grame><span style='font-size:
10.0pt;font-family:"Courier New"'>!</span></span><span style='font-size:10.0pt;
font-family:"Courier New"'>(flags &amp; <span class=spelle>format_no_copy</span>)</span>
then calls <span class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>std::copy</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(first, last, out)</span>.
Otherwise, for each match found, <span class=grame>if </span><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New"'>!</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(flags &amp; <span
class=spelle>format_no_copy</span>)</span> calls <span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>std::copy</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(<span class=spelle>m.prefix</span>().first,
<span class=spelle>m.prefix</span>().last, out)</span>, and then calls <span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>m.format</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(out, <span class=spelle>fmt</span>,
flags)</span>. Finally <span class=grame>if </span><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New"'>!</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(flags &amp; <span
class=spelle>format_no_copy</span>)</span> calls <span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>std::copy</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(<span class=spelle>last_m.suffix</span>().first,
<span class=spelle>last_m,suffix</span>().last, out) </span>where <span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>last_m</span></span>
is a copy of the last match found. If <span style='font-size:10.0pt;font-family:
"Courier New"'>flags &amp; <span class=spelle>format_first_only</span></span>
is non-zero then only the first match found is replaced.</p>

<p>With:</p>

<p style='margin-left:36.0pt'><b>Effects:</b> Constructs a <span class=spelle>regex_iterator</span>
object: <code><span style='font-size:10.0pt'>regex_iterator&lt;</span></code><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>BidirectionalIterator</span></span><code><span
style='font-size:10.0pt'>, </span></code><span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>charT</span></span><code><span
style='font-size:10.0pt'>, traits, Allocator&gt; </span></code><span
class=grame><span style='font-size:10.0pt;font-family:"Courier New"'>i(</span></span><code><span
style='font-size:10.0pt'>first, last, e, flags)</span></code>, and uses <span
class=spelle><i>i</i></span> to enumerate through all of the matches <i>m</i>
of type <span class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>match_results</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>&lt;<span class=spelle>BidirectionalIterator</span>&gt;
</span>that occur within the sequence [first, last). If no such matches are
found <span class=grame>and </span><span class=grame><span style='font-size:
10.0pt;font-family:"Courier New"'>!</span></span><span style='font-size:10.0pt;
font-family:"Courier New"'>(flags &amp; <span class=spelle>format_no_copy</span>)</span>
then calls <span class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>std::copy</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(first, last, out)</span>.
Otherwise, for each match found, <span class=grame>if </span><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New"'>!</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(flags &amp; <span
class=spelle>format_no_copy</span>)</span> calls <span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>std::copy</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(<span class=spelle>m.prefix</span>().first,
<span class=spelle>m.prefix</span>().last, out)</span>, and then calls <span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>m.format</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(out, <span class=spelle>fmt</span>,
flags)</span>. Finally <span class=grame>if </span><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New"'>!</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(flags &amp; <span
class=spelle>format_no_copy</span>)</span> calls <span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>std::copy</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>(<span class=spelle>last_m.suffix</span>().first,
<span class=spelle>last_m.suffix</span>().last, out) </span>where <span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>last_m</span></span>
is a copy of the last match found. If <span style='font-size:10.0pt;font-family:
"Courier New"'>flags &amp; <span class=spelle>format_first_only</span></span>
is non-zero then only the first match found is replaced.</p>
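
<p style='margin-left:36.0pt'>The effects clause above can be read as the following loop. This is a sketch under stated assumptions, not the normative algorithm: it is written against the C++11 library (<code>std::sregex_iterator</code>, <code>std::smatch</code>), where a sub-match exposes <code>first</code> and <code>second</code> rather than <code>first</code> and <code>last</code>, and the function name <code>replace_via_iterator</code> is hypothetical.</p>

```cpp
#include <algorithm>
#include <iterator>
#include <regex>
#include <string>

namespace rc = std::regex_constants;

std::string replace_via_iterator(const std::string& text, const std::regex& e,
                                 const std::string& fmt,
                                 rc::match_flag_type flags = rc::match_default)
{
    std::string result;
    auto out = std::back_inserter(result);
    const bool copy_unmatched = (flags & rc::format_no_copy) == 0;

    std::sregex_iterator i(text.begin(), text.end(), e, flags), end;
    if (i == end) {                          // no matches at all
        if (copy_unmatched)
            std::copy(text.begin(), text.end(), out);
        return result;
    }
    std::smatch last_m;
    for (; i != end; ++i) {
        last_m = *i;                         // keep a copy of the latest match
        if (copy_unmatched)                  // text between the matches
            std::copy(last_m.prefix().first, last_m.prefix().second, out);
        last_m.format(out, fmt, flags);      // the replacement itself
        if ((flags & rc::format_first_only) != 0)
            break;
    }
    if (copy_unmatched)                      // text after the last match
        std::copy(last_m.suffix().first, last_m.suffix().second, out);
    return result;
}
```

<p style='margin-left:36.0pt'>With the pattern <code>aba</code> and the text <code>ababa</code> from the discussion above, the iterator reports the first three characters as a match and resumes searching after them, so the overlapping second candidate is never enumerated.</p>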

<h2>Section 2: Issue list</h2>

<h2>What is an invalid/empty regular expression?</h2>

<p><i><span style='color:black'>Pete Becker writes:</span></i><span
style='color:black'> </span>The default constructor for <span class=spelle>basic_regex</span>
constructs an object for which <span class=grame>empty(</span>) returns true,
i.e. the object does not contain a valid regular expression. The consequences
of this aren't clear. For example, <span class=spelle>regex_match</span> (RE.7.1)
doesn't say what happens if the <span class=spelle>basic_regex</span> object
passed to it does not contain a valid regular expression.&nbsp; </p>

<p>Would it make sense for the default constructor to construct an object as if
the user had used <span class=spelle>basic_</span><span class=grame>regex(</span>&quot;&quot;)?
That way the object holds a valid (although trivial) expression, and the need
to check for invalid regular expressions goes away, because the rest of the <span
class=spelle>ctors</span> throw exceptions for invalid regular
expressions.&nbsp; I generally prefer objects that do reasonable things, even
if those things are useless. It's not hard for an implementation to handle
&quot;&quot; efficiently, and that eliminates an oddball case that otherwise
requires explicit wording in several places -- if the default constructor
creates an object with an invalid expression then copy constructor, assign, and
operator= all need to say what happens when they're passed one of these.
Currently they say that the <span class=spelle>postcondition</span> is <span
class=grame>empty(</span>) == false, and they'd have to add a qualification for
default-constructed objects. Same for anything that uses a <span class=spelle>basic_regex</span>
object. It's simpler to not have to say anything about invalid regular
expressions.</p>

<p><i>John Maddock replies:</i> I accept it's simpler, but I'm still not sure
that it's a good idea. Consider:&nbsp; </p>

<p>The user default constructs a <span class=spelle>std::regex</span> object,
and the code then goes off on one of several possible execution paths. Later
the user tries to use the expression, and gets very strange behavior (a match
is found at every single position in the string). How would that user track
down the bug in their program that leaves the expression <span class=spelle>uninitialized</span>?&nbsp;
It would be much easier if they could just <code><span style='font-size:10.0pt'>assert(!</span></code><span
class=spelle><span style='font-size:10.0pt;font-family:"Courier New"'>my_expression.empty</span></span><code><span
style='font-size:10.0pt'>())</span></code>, or better still if the
implementation of <span class=spelle>regex_match/regex_search</span> did just
that.&nbsp; </p>
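
<p>The &quot;match at every position&quot; behavior of the expression &quot;&quot; can be made concrete with a small sketch. It assumes the C++11 library and constructs the expression explicitly from an empty string, since the behavior of a default-constructed object is exactly what is under discussion here; the helper name <code>count_matches</code> is hypothetical.</p>

```cpp
#include <cstddef>
#include <iterator>
#include <regex>
#include <string>

// Counts the matches that regex_iterator enumerates for `e` in `text`.
std::size_t count_matches(const std::string& text, const std::regex& e)
{
    std::sregex_iterator first(text.begin(), text.end(), e), last;
    return static_cast<std::size_t>(std::distance(first, last));
}
```

<p>For a three-character text such as <code>abc</code>, an expression built from &quot;&quot; yields an empty match before each character and one at the end, whereas the expression <code>b</code> yields a single match.</p>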

<p>Note also that invalid regular expression objects can be produced by <code><span
style='font-size:10.0pt'>basic_regex&lt;&gt;:</span></code><span class=grame><span
style='font-size:10.0pt;font-family:"Courier New"'>:assign</span></span> (if
the assign encounters an invalid expression it throws, but what kind of object
is left behind is unspecified - unless we insist on the strong exception
guarantee it will be empty()).&nbsp; </p>

<p>Finally consider the following sequence of actions:&nbsp; </p>

<pre><span class=grame>std::regex</span> e; //1</pre><pre><span class=grame>e.imbue(</span><span
class=spelle>my_locale</span>); //2</pre><pre><span class=grame>e.assign(</span><span
class=spelle>my_expression</span>);</pre>

<p>If statement 1 creates a valid regex, then its traits class will presumably
be called upon to initialize itself (which may involve caching lots of locale-specific
data); statement 2 then wipes out all that work.&nbsp; </p>

<p>I do agree though that as the text stands, it needs an audit to fix
unspecified behavior when the expression is invalid.</p>

<p><i>Proposed changes:</i> At the very least we must indicate what happens to
the existing object in the event that <span class=spelle>basic_regex::assign</span>
throws.</p>

<p>To the effects clause for:</p>

<pre><span class=grame>template</span> &lt;class <span class=spelle>string_traits</span>, class A&gt;</pre><pre><span
class=spelle>basic_regex</span>&amp; <span class=grame>assign(</span>const <span
class=spelle>basic_string</span>&lt;<span class=spelle>charT</span>, <span
class=spelle>string_traits</span>, A&gt;&amp; s,</pre><pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=spelle>flag_type</span> f = <span class=spelle>regex_constants::normal</span>);</pre>

<p>Add the sentence:</p>

<p>&#8220;In the event that an exception is thrown, then the <span
class=spelle>basic_regex</span> object shall remain unchanged.&#8221;</p>
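
<p>The proposed guarantee can be exercised with the following sketch. It assumes the C++11 library, where <code>assign</code> reports an invalid expression by throwing <code>std::regex_error</code>; the helper name <code>survives_failed_assign</code> is hypothetical.</p>

```cpp
#include <regex>
#include <string>

// Assigns `bad_pattern` to a regex that currently holds `good_pattern`,
// then reports whether the regex still matches `probe`.
bool survives_failed_assign(const std::string& good_pattern,
                            const std::string& bad_pattern,
                            const std::string& probe)
{
    std::regex e(good_pattern);
    try {
        e.assign(bad_pattern);
    } catch (const std::regex_error&) {
        // Expected for an invalid pattern; e should be left unchanged.
    }
    return std::regex_match(probe, e);
}
```

<p>If the object remains unchanged as proposed, <code>survives_failed_assign("a+", "(", "aaa")</code> still matches with the original expression.</p>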

<p>Then we must either indicate that the <span class=spelle>basic_regex</span>
default constructor is equivalent to the regular expression &#8220;&#8221;, or
we must add the following precondition:</p>

<p><b>Precondition</b><span class=grame>: </span><code><span style='font-size:
10.0pt'>!e.empty()</span></code> </p>

<p><span class=grame>to</span> the following algorithms (other overloads
inherit this precondition implicitly since they are described in terms of the
ones below):</p>

<p class=MsoNormal><span class=grame><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>template</span></span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'> &lt;class <span class=spelle>BidirectionalIterator</span>,
class Allocator, class <span class=spelle>charT</span>,</span></p>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span
class=grame>class</span> traits, class Allocator2&gt;</span></p>

<p class=MsoNormal><span class=grame><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>bool</span></span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'> <span class=spelle>regex_match</span>(<span
class=spelle>BidirectionalIterator</span> first, <span class=spelle>BidirectionalIterator</span>
last,</span></p>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<span class=spelle>match_results</span>&lt;<span class=spelle>BidirectionalIterator</span>,
Allocator&gt;&amp; m,</span></p>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<span class=grame>const</span> <span class=spelle>basic_regex</span>&lt;<span
class=spelle>charT</span>, traits, Allocator2&gt;&amp; e,</span></p>

<pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;<span
class=grame>match_flag_type</span> flags = <span class=spelle>match_default</span>);</pre>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;</span></p>

<p class=MsoNormal><span class=grame><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>template</span></span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'> &lt;class <span class=spelle>BidirectionalIterator</span>,
class Allocator, class <span class=spelle>charT</span>,</span></p>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>class</span> traits, class Allocator2&gt;</span></p>

<p class=MsoNormal><span class=grame><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>bool</span></span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'> <span class=spelle>regex_search</span>(<span
class=spelle>BidirectionalIterator</span> first, <span class=spelle>BidirectionalIterator</span>
last,</span></p>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<span class=spelle>match_results</span>&lt;<span class=spelle>BidirectionalIterator</span>,
Allocator&gt;&amp; m,</span></p>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<span class=grame>const</span> <span class=spelle>basic_regex</span>&lt;<span
class=spelle>charT</span>, traits, Allocator2&gt;&amp; e,</span></p>

<pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>match_flag_type</span> flags = <span class=spelle>match_default</span>);</pre>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;</span></p>

<p class=MsoNormal><span class=grame><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>template</span></span><span style='font-size:10.0pt;
font-family:"Courier New";color:black'> &lt;class <span class=spelle>OutputIterator</span>,
class <span class=spelle>BidirectionalIterator</span>, class traits,</span></p>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>class</span> Allocator, class <span class=spelle>charT</span>&gt;</span></p>

<p class=MsoNormal><span class=spelle><span style='font-size:10.0pt;font-family:
"Courier New";color:black'>OutputIterator</span></span><span style='font-size:
10.0pt;font-family:"Courier New";color:black'> <span class=spelle>regex_</span><span
class=grame>replace(</span><span class=spelle>OutputIterator</span> out,</span></p>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<span class=spelle>BidirectionalIterator</span> first,</span></p>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<span class=spelle>BidirectionalIterator</span> last,</span></p>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<span class=grame>const</span> <span class=spelle>basic_regex</span>&lt;<span
class=spelle>charT</span>, traits, Allocator&gt;&amp; e,</span></p>

<p class=MsoNormal><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
<span class=grame>const</span> <span class=spelle>basic_string</span>&lt;<span
class=spelle>charT</span>&gt;&amp; <span class=spelle>fmt</span>,</span></p>

<pre>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; <span
class=grame>match_flag_type</span> flags = <span class=spelle>match_default</span>);</pre>

<h2>Regular expression constructor language</h2>

<p><i>Pete Becker writes:</i> Probably editorial. For the <span class=spelle>basic_regex</span>
<span class=spelle>ctor</span> that takes a const <span class=spelle>charT</span>
*p, the proposal says:&nbsp; </p>

<p style='margin-left:36.0pt'>Effects: Constructs an object of class <span
class=spelle>basic_regex</span>; the object's internal finite state machine is
constructed from the regular expression contained in the null-terminated string
p...</p>

<p><span class=grame>p</span> is not a null-terminated string. It is a pointer.
The analogous phrasing for <span class=spelle>basic_string</span> is:</p>

<p style='margin-left:36.0pt'>Effects: Constructs an object of class <span
class=spelle>basic_string</span> and determines its initial string value from
the array of <span class=spelle>charT</span> of length <span class=spelle>traits::length</span>(s)
whose first element is designated by <span class=grame>s ...</span></p>

<p>We need to maintain a similar level of formalism.</p>

<h2>Incorrect usage of &#8220;undefined&#8221;</h2>

<p>In several places in the document the term &#8220;undefined&#8221; should be
replaced by &#8220;unspecified&#8221;:</p>

<p>&#8220;Otherwise <span style='font-size:10.0pt;font-family:"Courier New"'>matched</span>
is false, and members <span style='font-size:10.0pt;font-family:"Courier New"'>first</span>
and <span style='font-size:10.0pt;font-family:"Courier New"'>second</span> contained
<b><i>undefined</i></b> values.&#8221;</p>

<p>&#8220;If the function returns false, then the effect on parameter <i>m</i>
is <b><i>undefined</i></b>, otherwise the effects on parameter <i>m</i> are
given in table RE18&#8221;</p>

<p>&#8220;If the function returns false, then the effect on parameter <i>m</i>
is <b><i>undefined</i></b>, otherwise the effects on parameter <i>m</i> are
given in table RE19&#8221;</p>

<h2>Incorrect usage of &#8220;implementation defined&#8221;</h2>

<p>In several places in the document the term &#8220;implementation
defined&#8221; should be replaced by either &#8220;implementation
specific&#8221; or &#8220;unspecified&#8221;:</p>

<p>&#8220;Type sentry performs <b><i>implementation defined</i></b>
initialization of the traits class object, and represents an opportunity for
the traits class to cache data obtained from the locale object.&#8221;</p>

<pre>&#8220;<span class=spelle><span style='color:black'>char_class_type</span></span><span
style='color:black'> <span class=spelle>lookup_</span><span class=grame>classname(</span>const <span
class=spelle>string_type</span>&amp; name) const;</span></pre>

<p class=MsoNormal><b><span style='color:black'>Effects: </span></b><span
style='color:black'>returns</span> an<b><i><span style='color:black'> implementation
defined</span></i></b><span style='color:black'> value that represents the
character classification </span><span style='font-size:10.0pt;font-family:"Courier New";
color:black'>name&#8221;</span></p>

<p>&#8220;<b>Returns: </b>converts f into a value <span style='font-size:10.0pt;
font-family:"Courier New"'>m</span> of type <span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>ctype_base::mask</span></span>
in an <b><i>implementation defined</i></b> manner&#8221;</p>

<p>&#8220;<b>Effects: </b>constructs an object <span style='font-size:10.0pt;
font-family:"Courier New"'>result</span> of type <span style='font-size:10.0pt;
font-family:"Courier New"'>int</span>. If <span style='font-size:10.0pt;
font-family:"Courier New"'>first == last</span> or if <span class=spelle><span
style='font-size:10.0pt;font-family:"Courier New"'>is_</span></span><span
class=grame><span style='font-size:10.0pt;font-family:"Courier New"'>class(</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>*first, <span class=spelle>lookup_classname</span>(&quot;d&quot;))
== false</span> then sets <span style='font-size:10.0pt;font-family:"Courier New"'>result</span>
equal to -1. Otherwise constructs a <span class=spelle><span style='font-size:
10.0pt;font-family:"Courier New"'>basic_istream</span></span><span
style='font-size:10.0pt;font-family:"Courier New"'>&lt;<span class=spelle>charT</span>&gt;</span>
object which uses an <b><i>implementation defined</i></b> stream buffer type
which represents the character sequence [<span class=spelle>first,last</span>),
and sets the format flags on that object as appropriate for argument <span
style='font-size:10.0pt;font-family:"Courier New"'>radix</span>.&#8221;</p>

<h2>Are <span class=spelle>sub_match</span> objects all unique?</h2>

<p><i>Pete Becker writes: </i>Are <span class=spelle>sub_match</span> objects
for non-matched capture groups required to be distinct? I can picture a <span
class=spelle>match_type</span> implementation that holds <span class=spelle>sub_match</span>
objects only for the capture groups that matched, and returns a generic
no-match object for others. Is this intended to be legal? (My inclination is
that it ought to be allowed, because I don't see any good reason not to allow
it).</p>

<p><i>John Maddock replies</i>: there is nothing to explicitly disallow this in
the proposal, so yes, it is allowed, and no, I don&#8217;t see any reason to
disallow it either.</p>

<p><i>Proposed changes:</i> Either none, or add a non-normative note.</p>
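<p>As an illustration of the implementation strategy described above, here is a
minimal sketch in which entries are stored only for capture groups that
matched, and every other index refers to one shared no-match object. The names
<code>sub_match_sketch</code> and <code>match_sketch</code> are hypothetical
stand-ins, not types from the proposal.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in for sub_match: two iterators plus a flag.
struct sub_match_sketch {
    std::string::const_iterator first{};
    std::string::const_iterator second{};
    bool matched = false;
};

// Hypothetical results container that stores only the groups that matched.
class match_sketch {
public:
    explicit match_sketch(std::size_t group_count) : group_count_(group_count) {}

    // Record a match for capture group n.
    void set(std::size_t n, std::string::const_iterator f,
             std::string::const_iterator l) {
        assert(n < group_count_);
        sub_match_sketch& s = matched_[n];
        s.first = f;
        s.second = l;
        s.matched = true;
    }

    // Unmatched groups all refer to the same shared no-match object.
    const sub_match_sketch& operator[](std::size_t n) const {
        static const sub_match_sketch no_match{};
        auto it = matched_.find(n);
        return it == matched_.end() ? no_match : it->second;
    }

    std::size_t size() const { return group_count_; }

private:
    std::size_t group_count_;
    std::map<std::size_t, sub_match_sketch> matched_;
};
```

<p>With this design, code that compares the addresses of two unmatched
sub-matches would see the same object, which is exactly the behaviour the
question asks about.</p>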

<h2>How are Unicode escape sequences handled?</h2>

<p><i><span style='color:black'>Pete Becker writes: </span></i>ECMA-Script
supports character escapes of the form &quot;\<span class=spelle>uxxxx</span>&quot;,
where each 'x' is a hex digit. Each such escape sequence represents the
character whose code point is the value of '<span class=spelle>xxxx</span>'
translated to a number in the usual way.&nbsp; What do such character escapes
mean when the character type for <span class=spelle>basic_regex</span> is too
small to hold that value? Do we intend to require multi-byte support here (I
hope not)? Or is such a value invalid when the target character type is too
small?</p>

<p><i>John Maddock adds:</i> this sparked off a furious discussion about
Unicode support in C++ generally. It was clear that:</p>

<ol style='margin-top:0cm' start=1 type=1>
 <li class=MsoNormal>Unicode support in C++ is highly desirable. </li>
 <li class=MsoNormal><span class=grame>basic_regex</span> probably isn&#8217;t
     the place to fix this &#8211; we need to review <span class=spelle>std::string</span>,
     <span class=spelle>std::locale</span> etc first.</li>
</ol>

<p>For now, we should assume that if the underlying character type isn&#8217;t
wide enough to hold a specified Unicode code point, then the escape sequence
should be treated as a syntax error.</p>
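<p>A minimal sketch of that rule, assuming the parser has already converted the
four hex digits to a number. The helper name <code>code_point_to_char</code> is
hypothetical, and <code>std::invalid_argument</code> merely stands in for
whatever exception type the proposal uses to report a syntax error.</p>

```cpp
#include <cassert>
#include <limits>
#include <stdexcept>

// Hypothetical helper: turn the value of a \uXXXX escape into a charT,
// or report a syntax error when charT is too narrow to hold it.
template <class charT>
charT code_point_to_char(unsigned long code_point) {
    // Four hex digits cannot encode anything above 0xFFFF.
    assert(code_point <= 0xFFFFul);
    const unsigned long widest =
        static_cast<unsigned long>(std::numeric_limits<charT>::max());
    if (code_point > widest) {
        // Stand-in for the proposal's own error reporting.
        throw std::invalid_argument("\\u escape does not fit in the character type");
    }
    return static_cast<charT>(code_point);
}
```

<p>Under these assumptions, <code>\u0041</code> is accepted for a
<code>char</code> pattern, while <code>\u20AC</code> is rejected for
<code>char</code> but accepted for a 16-bit character type.</p>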

<h2>Meaning of the <span class=spelle>match_partial</span> flag</h2>

<p><i>From Pete Becker:</i></p>

<p>RE.3.1.2 says that the <span class=spelle>match_partial</span> flag</p>

<p style='margin-left:36.0pt'>Specifies that if no match can be found, then it
is acceptable to return a match [from, last) where from<span class=grame>!=</span>last,
if there exists some sequence of&nbsp; characters [<span class=spelle>from,to</span>)
of which [<span class=spelle>from,last</span>) is a prefix, and which would
result in a full match.</p>

<p>Taking this literally, if I have the expression &quot;<span class=grame>a(</span>?=b)(?!<span
class=grame>b</span>)&quot; and try to match it against &quot;a&quot;, the
partial match must fail, because the two assertions are contradictory. Is the
matcher really required to do this sort of analysis of the expression, and
determine that there is no possible continuation that could succeed?</p>

<p>From the name, I would think that <span class=spelle>match_partial</span>
would mean, roughly, that if you reach the end of the search text but are only
partway through the regular expression, that's okay.&nbsp; So in the example
above, the partial match would succeed. Is that what's intended here?</p>

<p><i>John Maddock replies:</i> Yes, that is the intent; the text may be a little too
strict.</p>

<p><i>Proposed change:</i> Add the following text to the end of the above clause:</p>

<p>[Note &#8211; implementations are not required to go to heroic efforts to
determine whether a partial match is truly possible; there may be combinations
of text and regular expression where ruling out an impossible partial match
would require excessive effort &#8211; end note]</p>
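<p>The relaxed reading discussed above (running out of text while only partway
through the expression counts as a partial match) can be sketched for the
simplest possible case, a literal pattern. This is an illustration only; the
names <code>match_kind</code> and <code>match_literal</code> are hypothetical,
and a real matcher would of course handle the full grammar.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

enum class match_kind { none, partial, full };

// Hypothetical matcher for a literal pattern, anchored at the start of text.
match_kind match_literal(const std::string& pattern, const std::string& text) {
    std::size_t i = 0;
    while (i < pattern.size() && i < text.size()) {
        if (pattern[i] != text[i]) {
            return match_kind::none;      // definite mismatch
        }
        ++i;
    }
    if (i == pattern.size()) {
        return match_kind::full;          // the whole pattern was consumed
    }
    // The text ran out partway through the pattern. Report a partial match
    // only if at least one character matched, mirroring from != last.
    return i > 0 ? match_kind::partial : match_kind::none;
}
```

<p>No analysis of possible continuations is attempted here: the function only
reports that the text ended before the pattern did.</p>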

<h2>Name of <span class=spelle>regex_traits::is_class</span></h2>

<p><i>Pete Becker writes: </i>That name is confusing. I'd prefer <span
class=spelle>inclass</span>, or some variant. The function takes two arguments:
a character and a character class, and tells you whether the character belongs
to the class. <span class=grame>is_class</span> sounds too much like querying
whether some object represents a character class.</p>

<p><i>John Maddock replies:</i> <span class=spelle>ctype</span> uses
&#8220;is&#8221; for the equivalent function but that is too short, maybe
&#8220;<span class=spelle>is_class_member</span>&#8221; or something, but
finding a good name is hard (as ever).</p>

<p>No proposed change at this time; more suggestions or comments are needed.</p>

<h2>Can <span class=spelle>traits::error_string</span> be simplified?</h2>

<p><i>Pete Becker writes: </i>In the proposal, the template <span class=spelle>regex_traits</span>
has a member function <span class=spelle>error_string</span> that takes an
error code that indicates what error occurred and returns a string
corresponding to that error, which is then used as the argument to the
constructor for an exception object. Seems to me it would be simpler to have <span
class=spelle>regex_traits</span> simply provide a function that throws the
exception, called with the error code. Is this string needed for anything else?</p>

<p><i>John Maddock replies:</i> The only use I can think of is if you implement
C APIs on top of the C++ library (as <span class=spelle>boost.regex</span>
does with the POSIX regex APIs).&nbsp; These have a separate API that returns
the error string from a numeric ID.&nbsp; However, from the point of view of the
proposal I think you are correct: it could be changed to just throw.</p>

<p><i>Proposed changes:</i> none are presented here, since the proposal is not
actually defective. It would be easy enough to change if there are any strong
feelings one way or the other.</p>

<h2>Can <span class=spelle>traits::translate</span> be improved?</h2>

<p><i>Pete Becker writes:</i> The <span class=spelle>regex_traits</span> member
function 'translate' is used when comparing a character in the pattern string
with a character in the target string. It takes two arguments: the character to
translate, and a <span class=grame>boolean</span> flag that indicates whether
the translation should be case sensitive. So two characters are equal if</p>

<pre><span class=grame>translate(</span><span class=spelle>pch</span>, <span
class=spelle>icase</span>) == translate(<span class=spelle>tch</span>, <span
class=spelle>icase</span>)</pre>

<p>So with pattern text of &quot;<span class=spelle>abcde</span>&quot;,
checking for a match would look something like this:</p>

<pre>for (int i = 0; i &lt; 5; ++i)
   if (translate(pch[i], icase) != translate(tch[i], icase))
      return false;   // mismatch at position i
return true;</pre>

<p>The implementation of <span class=spelle>regex_traits::translate</span> in
the library-supplied traits class is:</p>

<pre><span class=grame>return</span> (<span class=spelle>icase</span> ? <span
class=spelle>use_facet</span>&lt;<span class=spelle>ctype</span>&lt;<span
class=spelle>charT</span>&gt; &gt;(<span class=spelle>getloc</span>()).<span
class=spelle>tolower</span>(<span class=spelle>ch</span>) : <span class=spelle>ch</span>);</pre>

<p>There's potential for a significant speedup, though, if case sensitive and
case insensitive comparisons go through two different functions. The obvious
transformation of the preceding loop would be:</p>

<pre>if (icase) {
   for (int i = 0; i &lt; 5; ++i)
      if (translate_ic(pch[i]) != translate_ic(tch[i]))
         return false;   // case-insensitive mismatch
} else {
   for (int i = 0; i &lt; 5; ++i)
      if (translate(pch[i]) != translate(tch[i]))
         return false;   // case-sensitive mismatch
}
return true;</pre>

<p>For the default <span class=spelle>regex_traits</span> class, the calls to
translate in the second branch of <span class=grame>the if</span> statement
would be inline calls to a translate function that simply returns its argument,
so the loop turns into a sequence of direct comparisons, with no distractions
from the possibility of case insensitivity. Further, since case sensitivity is
determined by a flag that's set at the time the regular expression is compiled,
one of the two branches of the outer if statement will always be unnecessary.</p>

<p>I made up the names '<span class=spelle>translate_ic</span>' and 'translate'
for this e-mail. I'm not suggesting that we use them.</p>

<p><i>John Maddock replies:</i> I&#8217;ve no objection to this, but the speedup
may not be as big as you think: an earlier <span class=spelle>Boost.regex</span>
implementation (actually before it was part of Boost) used something similar,
and the difference between the two is pretty minimal compared to other costs
involved &#8211; such as the number of states matched.&nbsp; The change also
involves quite a bit of redrafting.</p>
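<p>For reference, a self-contained sketch of the split interface discussed in
this section, restricted to <code>char</code>. The type
<code>split_traits_sketch</code> and the function <code>literal_equal</code>
are hypothetical, and the function names follow the made-up
<code>translate</code> and <code>translate_ic</code> from the e-mail rather
than anything in the proposal.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <locale>
#include <string>

// Hypothetical traits class with separate case-sensitive and
// case-insensitive translation functions.
struct split_traits_sketch {
    std::locale loc;  // copy of the global locale by default

    // Case-sensitive translation: the identity, trivially inlinable.
    char translate(char ch) const { return ch; }

    // Case-insensitive translation: fold to lower case via the ctype facet.
    char translate_ic(char ch) const {
        return std::use_facet<std::ctype<char> >(loc).tolower(ch);
    }
};

// Compare a literal pattern with a target of the same length. The icase
// test is made once, outside the loops, rather than once per character.
bool literal_equal(const split_traits_sketch& t, const std::string& pattern,
                   const std::string& target, bool icase) {
    if (pattern.size() != target.size()) {
        return false;
    }
    if (icase) {
        for (std::size_t i = 0; i < pattern.size(); ++i)
            if (t.translate_ic(pattern[i]) != t.translate_ic(target[i]))
                return false;
    } else {
        for (std::size_t i = 0; i < pattern.size(); ++i)
            if (t.translate(pattern[i]) != t.translate(target[i]))
                return false;
    }
    return true;
}
```

<p>Whether this is measurably faster than a single two-argument
<code>translate</code> depends on the implementation, as the reply above
notes.</p>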

<h2>Improving on <span class=spelle>traits::toi</span></h2>

<p><i>Pete Becker writes:</i> It says, in part:</p>

<p>If first == last or if <span class=spelle>is_</span><span class=grame>class(</span>*first,
<span class=spelle>lookup_classname</span>(&quot;d&quot;)) == false then sets
result equal to -1.</p>

<p>And &quot;d&quot; by default is the digits 0-9. Since the radix for the conversion
can be 8, 10, or 16, the condition involving &quot;d&quot; isn't right. For a
hex value it precludes the value 'a0'. For an octal value it allows '90', but
the ensuing conversion will fail. We need to find a different way to express
this. The idea is to return -1 on a failed conversion, and the appropriate
unsigned value on success.</p>

<p><i>And further:</i> I'm starting to think that <span class=spelle>toi</span>
is too high level an interface. Regular expression grammars go character by
character. For example, the value of a <span class=spelle>HexEscapeSequence</span>
(\<span class=spelle>xhh</span>) is &quot;(16 times the MV of the first hex
digit) plus the MV of the second <span class=spelle>HexDigit</span>&quot;. <span
class=grame>toi</span> (<span class=spelle>hypertechnically</span>) doesn't
require that. In order to implement the specification literally, the regex
parser needs to translate individual characters, not groups of characters, into
values, and accumulate those values as appropriate. Thus, <span class=spelle>regex_traits</span>
ought to provide <span class=spelle>int</span> <span class=grame>value(</span><span
class=spelle>charT</span> <span class=spelle>ch</span>), which returns -1 if <span
class=spelle>isxdigit</span>(<span class=spelle>ch</span>) is false, otherwise
the numeric value represented by the character.</p>

<p><i>And:</i> I've just implemented it. Here are the changes I made:</p>

<p style='margin-left:36.0pt;text-indent:-18.0pt'>1.<span style='font-size:
7.0pt'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>I removed <span class=spelle>escape_type_backref</span>
and <span class=spelle>escape_type_decimal</span></p>

<p style='margin-left:36.0pt;text-indent:-18.0pt'>2.<span style='font-size:
7.0pt'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>I added <span class=spelle>escape_type_numeric</span>
(0-9)</p>

<p style='margin-left:36.0pt;text-indent:-18.0pt'>3.<span style='font-size:
7.0pt'>&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; </span>I added <span class=spelle>int</span>
<span class=spelle>regex_traits::</span><span class=grame>value(</span><span
class=spelle>charT</span> <span class=spelle>ch</span>, <span class=spelle>int</span>
base)</p>

<p>The first two aren't technically necessary for this change, but <span
class=spelle>escape_type_backref</span> is a bit misleading. <span
class=spelle>ECMAScript</span> doesn't restrict the number of capture groups,
so \10 can be a valid back reference. This means that <span class=spelle>escape_type_backref</span>
alone isn't sufficient. So I figured it's enough to know that you're starting a
numeric constant (i.e. <span class=spelle>escape_type_numeric</span>), and then
you can use <span class=grame>value(</span>) == -1 to determine when you've
reached the end of a constant.</p>

<p>The second argument to value is needed in order to decide whether the
character is a valid digit for the base. <span class=grame>value</span> returns
-1 for an invalid digit, and the (unsigned) numeric value for a valid digit.</p>

<p><i>John Maddock adds:</i> I&#8217;ve no objections to the change, but it
involves a fair bit of redrafting; it should be dealt with in an overall
re-drafting of the traits class section when we have a complete set of changes.</p>
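<p>To make the character-by-character approach concrete, here is a minimal
sketch for <code>char</code> only. The free functions
<code>digit_value</code> and <code>parse_number_sketch</code> are hypothetical
stand-ins for the suggested <code>regex_traits::value</code> member and for the
parser loop that would call it. The sketch assumes an ASCII-compatible
character set and does not guard against overflow.</p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for value(ch, base): returns the numeric value of
// ch if it is a valid digit in the given base, and -1 otherwise.
// Assumes 'a'..'f' and 'A'..'F' are contiguous, as in ASCII.
int digit_value(char ch, int base) {
    assert(base == 8 || base == 10 || base == 16);
    int v = -1;
    if (ch >= '0' && ch <= '9') {
        v = ch - '0';
    } else if (ch >= 'a' && ch <= 'f') {
        v = ch - 'a' + 10;
    } else if (ch >= 'A' && ch <= 'F') {
        v = ch - 'A' + 10;
    }
    return (v >= 0 && v < base) ? v : -1;
}

// Accumulate digits starting at s[pos] until digit_value reports -1.
// On return, pos is just past the last digit consumed. Returns -1 if no
// digit was found. Overflow is not handled in this sketch.
int parse_number_sketch(const std::string& s, std::size_t& pos, int base) {
    int result = -1;
    while (pos < s.size()) {
        const int d = digit_value(s[pos], base);
        if (d == -1) {
            break;                        // end of the numeric constant
        }
        result = (result == -1 ? 0 : result) * base + d;
        ++pos;
    }
    return result;
}
```

<p>With this shape, a back reference such as <code>\10</code> is read as the
number 10, the hex value <code>a0</code> is accepted in base 16, and
<code>90</code> is rejected in base 8 at its first character.</p>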

<h2>Improving on <span class=spelle>traits::lookup_classname</span></h2>

<p><i>Pete Becker writes: </i>I think this needs a change in specification. It
returns a value that identifies the named character class identified by its
string argument. The cases I'm concerned about are the ones with names like [<span
class=grame>:</span><span class=spelle>alnum</span>:]. When the code encounters
the opening [: it has to scan ahead for the matching :],
pick up the characters in between, stuff them into a string, and call <span
class=spelle>lookup_classname</span>. This is a lot of wheel spinning. In
particular, creating the string is expensive. If <span class=spelle>lookup_classname</span>
took two iterators instead of a string it could simply look at the characters
without the intervening string object.</p>

<p><i>John Maddock replies:</i> Funnily enough <span class=spelle>Boost.regex</span>
currently does it that way; I changed it to simplify
the interface.</p>

<p>I'm not against the change - but if we change that one then <span
class=spelle>lookup_collatename</span> should also change - and there are
numerous places in the text that use: <span class=spelle>lookup_</span><span
class=grame>classname(</span>&quot;string literal&quot;), which would actually
be quite tricky to reword correctly.</p>

<p>This should be dealt with in an overall re-drafting of the traits class
section when we have a complete set of changes.</p>
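<p>The iterator-based lookup suggested in this section might look roughly like
the following sketch. The enumeration <code>char_class_sketch</code>, the
function names and the small name table are all hypothetical; the proposal's
<code>char_class_type</code> and the set of supported names are not specified
here.</p>

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>
#include <string>

// Hypothetical class identifiers standing in for char_class_type.
enum char_class_sketch {
    class_none = 0, class_alnum, class_alpha, class_digit, class_space
};

// Look up a class name given as an iterator range; no string is built.
template <class Iter>
char_class_sketch lookup_classname_sketch(Iter first, Iter last) {
    struct entry { const char* name; char_class_sketch id; };
    static const entry table[] = {
        {"alnum", class_alnum}, {"alpha", class_alpha},
        {"d", class_digit}, {"digit", class_digit}, {"space", class_space},
    };
    for (const entry& e : table) {
        const std::ptrdiff_t len =
            static_cast<std::ptrdiff_t>(std::strlen(e.name));
        if (std::distance(first, last) == len &&
            std::equal(first, last, e.name)) {
            return e.id;
        }
    }
    return class_none;
}

// Given an iterator just past "[:", find the closing ":]" and look up the
// name in place. On success, it is advanced past the ":]".
char_class_sketch parse_bracket_class(std::string::const_iterator& it,
                                      std::string::const_iterator end) {
    assert(it <= end);
    static const char closer[] = ":]";
    const std::string::const_iterator stop =
        std::search(it, end, closer, closer + 2);
    if (stop == end) {
        return class_none;                // no closing ":]" found
    }
    const char_class_sketch id = lookup_classname_sketch(it, stop);
    it = stop + 2;
    return id;
}
```

<p>The places in the text that call
<code>lookup_classname("string literal")</code> would still need rewording,
as noted above; the sketch only shows that the parser side needs no temporary
string.</p>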


</div>

</div>

</div>

</body>

</html>

