<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN">
<HTML>
<HEAD>
    <TITLE>A Proposal to Add split/join of string/string_view to the Standard Library</TITLE>
    <META http-equiv=Content-Type content="text/html; charset=windows-1252">
<style>
.disable_modified {background-color: #f2ff6c}
.disable_modified::after { content: "  [Modified/added recently]"; font-size: small;}
</style>
</HEAD>
<BODY>
    <p style="font-size:small ">
Document number: P0540R0 <br>
Project: Programming Language C++<br>
Audience: Library Evolution Working Group<br>
<br>
Laurent NAVARRO &lt;ln@altidev.com&gt;<BR>
Date: 2017-01-21

        </p>

        <H1>A Proposal to Add split/join of string/string_view to the Standard Library </H1>


        <H2>I. Motivation</H2>

    <p>
        Splitting a string into multiple strings based on a separator, and the reverse operation of joining a collection of strings with a separator, are quite common operations,
        but there is no standardized, easy-to-use solution in the existing <code>std::basic_string</code> class or the proposed <code>std::basic_string_view</code> class.
    </p>
    <pre>
      split("C++/is/fun","/") => ["C++","is","fun"]
      join(["C++","is","fun"],"/") => "C++/is/fun"      </pre>

    <p>The purpose of this simple proposal is to fill this gap.</p>
    <p>We also propose solutions to easily handle case conversion.</p>
    <p>
        These features are available in the standard string class of the following languages: D, Python, Java, C#, Go, Rust.
    </p>

    <H2>II. Impact On the Standard</H2>

    <p>
        This proposal is a pure library extension. It does not require changes to the core language.<br>
        It requires adding new methods to the <code>std::basic_string</code> class (or not, if they are implemented only in <code>std::basic_string_view</code>).<br>
        Alternatively, it only requires adding functions to the algorithms library if that option is preferred.
    </p>
    <p>It has been implemented in standard C++.</p>


    <H2>III. Design Decisions</H2>

    <P>Several options were discussed in the thread referenced in <a href="#Ref1">[1]</a>; below is a summary of them. As several alternatives were discussed, we let the committee choose which option is preferred.</P>

    <H3>3.1 Discussion on split</H3>
    <H4>3.1.1 splits &amp; splitsv method</H4>
    <p>
        Probably the simplest option is to add methods to <code>std::basic_string</code> and <code>std::basic_string_view</code>.<br>
        Example on <code>std::basic_string_view</code> (<code>std::basic_string</code> is much the same):
        <pre>vector&lt;basic_string&lt;CharT, Traits> > splits(const basic_string_view&lt;CharT, Traits> &Separator) const
vector&lt;basic_string_view&lt;CharT, Traits> > splitsv(const basic_string_view&lt;CharT, Traits> &Separator) const </pre>
        The purpose of these methods is to return a vector of string or of string_view.<br>
        The vector return type is hardcoded to keep the call as simple as possible.
        <pre>auto MyResult= "my,csv,line"s.splits(",");</pre>
        The <code>s</code> and <code>sv</code> suffixes are derived from the standard literal suffixes.<br>
        <code>splitsv</code> has the advantage of being efficient in terms of CPU (no copy is made) and RAM (no memory is allocated for the substrings, only for the vector).<br>
        <code>splits</code> is useful when <code>splitsv</code> can't be used. For instance, it is needed when splitting a temporary object.<br>
        Move semantics are used to return the vector efficiently.<br>
        Example of implementation in <a href="#Ref2">[2]</a> (could be optimized with an initial pass to reserve the vector).
    </p>
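    <p>
        As a rough illustration of the <code>splitsv</code> behaviour described above, here is a minimal sketch written as a free function. The name <code>splitsv_sketch</code> is hypothetical and this is not the reference implementation from <a href="#Ref2">[2]</a>; it assumes a non-empty separator and C++17 <code>std::string_view</code>.
    </p>

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical free-function stand-in for the proposed splitsv member.
// The returned views point into `input`, so the underlying characters
// must outlive the returned vector.
std::vector<std::string_view> splitsv_sketch(std::string_view input,
                                             std::string_view separator) {
    assert(!separator.empty());  // assumption: empty separators are not handled here
    std::vector<std::string_view> parts;
    if (input.empty()) return parts;  // empty input gives an empty vector
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = input.find(separator, start);
        if (pos == std::string_view::npos) {
            parts.push_back(input.substr(start));  // last piece
            return parts;
        }
        parts.push_back(input.substr(start, pos - start));
        start = pos + separator.size();
    }
}
```

    <p>
        No character is copied: only the vector itself allocates memory, which is the efficiency argument made for <code>splitsv</code>.
    </p>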

    <H4>3.1.2 basic_string_view implementation only</H4>
    <p>
        Several options presented here are methods of both <code>std::basic_string</code> and <code>std::basic_string_view</code>.
        It could make sense to implement them only in <code>std::basic_string_view</code>, for several reasons:
        <ul>
            <li>The implementation is similar for both.</li>
            <li>A string can easily (and at very small cost) be converted to a string_view.</li>
            <li><code>basic_string_view</code> is a new class, so it is probably simpler to amend it to integrate these features in C++ 17. They could be back-ported later to <code>std::basic_string</code> if needed.</li>
        </ul>
    </p>

    <H4>3.1.3 splitf method</H4>
    <p>The <code>splitf</code> method splits the input string and calls a unary functor with a <code>std::basic_string_view</code> as its parameter for each substring.</p>
    <pre>template &lt;class F>
void splitf(const basic_string_view&lt;CharT, Traits> &Separator,F functor) const</pre>
    <p>
        Some people do not wish to have a container returned, in order to avoid its memory allocation; <code>splitf</code> is one possible way to address this concern.<br>
        Some people wish to process each value; <code>splitf</code> is a more direct way to do so than iterating over the <code>splitsv</code> result with for_each.<br>
        The string view passed to the functor makes it possible to compute the position of the substring in the initial string, which was highlighted as a potential need.<br>
        Example of usage displaying each substring with its initial position and length:<pre>
strsv.splitf(" ", [&](const string_view &s) {
 cout &lt;&lt; s &lt;&lt;"   ,Pos="&lt;&lt;(s.data() -strsv.data())&lt;&lt;" ,Len="&lt;&lt;s.length()&lt;&lt; endl;
});</pre>
        Example of implementation in <a href="#Ref2">[2]</a> 
    </p>
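    <p>
        The callback variant can be sketched the same way. The free function <code>splitf_sketch</code> below is a hypothetical stand-in for the proposed member (not the code from <a href="#Ref2">[2]</a>) and assumes a non-empty separator. It allocates nothing, and each view handed to the functor points into the original input, so its offset can be recovered by pointer subtraction.
    </p>

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>

// Hypothetical free-function stand-in for the proposed splitf member.
template <class F>
void splitf_sketch(std::string_view input, std::string_view separator, F functor) {
    assert(!separator.empty());  // assumption: empty separators are not handled here
    if (input.empty()) return;   // empty input: the functor is never called
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = input.find(separator, start);
        if (pos == std::string_view::npos) {
            functor(input.substr(start));  // last piece
            return;
        }
        functor(input.substr(start, pos - start));
        start = pos + separator.size();
    }
}
```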

    <H4>3.1.4 splitc method</H4>
    <p>The <code>splitc</code> method splits the input string and pushes the substrings into the container passed as an output parameter.</p>
    <pre>template &lt;class T>
void splitc(const basic_string_view&lt;CharT, Traits> &Separator,T &Result) const</pre>
    <p>
        Some people do not wish to use a vector container; this option accepts a wide range of containers to address this concern.<br>
        This option also makes it possible to fill a container holding another string type, provided that type can be built from a string_view.<br>
        Example of usage:<pre>
vector&lt;string_view> vector5;
strsv.splitc(" ", vector5); 
vector&lt;string> vector6;
strsv.splitc(" ", vector6);     </pre>
        Example of implementation in <a href="#Ref2">[2]</a> 
    </p>
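    <p>
        A possible shape for the container variant is sketched below. The name <code>splitc_sketch</code> is hypothetical, a non-empty separator is assumed, and the only requirement placed on the container is an <code>emplace_back</code> that can build an element from a <code>std::string_view</code>; the real proposal may choose a different insertion mechanism.
    </p>

```cpp
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical free-function stand-in for the proposed splitc member.
template <class Container>
void splitc_sketch(std::string_view input, std::string_view separator, Container& result) {
    assert(!separator.empty());  // assumption: empty separators are not handled here
    if (input.empty()) return;
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = input.find(separator, start);
        const std::size_t count = (pos == std::string_view::npos) ? pos : pos - start;
        // emplace_back performs direct initialization, so element types with an
        // explicit constructor from string_view (such as std::string) also work.
        result.emplace_back(input.substr(start, count));
        if (pos == std::string_view::npos) return;
        start = pos + separator.size();
    }
}

// Usage: the same call fills containers of different kinds and element types.
void splitc_example() {
    std::vector<std::string> owned;       // copies each piece
    splitc_sketch("a b c", " ", owned);
    std::list<std::string_view> views;    // refers to the literal's storage
    splitc_sketch("x,y", ",", views);
    assert(owned.size() == 3 && views.size() == 2);
}
```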



    <H4>3.1.5 Unify name for split methods</H4>
    <p>Instead of using suffixes to select the version, it would be nice to have automatic selection of the right <code>split</code> version.<br>
      <code>splits</code> &amp; <code>splitsv</code> have only one parameter and can easily be differentiated from <code>splitf</code> &amp; <code>splitc</code>.<br>
      <code>splits</code> &amp; <code>splitsv</code> are selected by parameterizing the return type.</p>
      <pre>str.split&lt;string>(" ");  ==> Returns vector&lt;string>
str.split&lt;string_view>(" ");  ==> Returns vector&lt;string_view></pre>
    <p>This is not very user-friendly, but setting a default type improves it.
      I selected <code>basic_string_view</code> as the default type because it is the most useful in most cases and the longest to type (string_view = 11 characters).<br>
    When another type is needed (e.g. string), it can be specified; it can also be another string type if it is convertible from string_view.</p>
      <pre>template&lt;class StringType=<b>basic_string_view&lt;CharT, Traits></b>,class SeparatorType >
vector&lt;StringType > split(const SeparatorType &Separator) const

str.split&lt;string>(" ");  ==> Returns vector&lt;string>
str.split(" ");  ==> Returns vector&lt;string_view></pre>
    <p class="modified">An alternative solution was proposed by Jakob Riedle, with an overload selected according to whether the invoking object is an lvalue or an rvalue.<br>
      This solution has the advantage that it automatically rules out the use of <code>string_view</code> on temporary values,
      but it makes things a bit more complex if you really want to split into copies in a <code>vector&lt;string></code>; this can, however, be done using <code>splitc(MyStringVector)</code>. I have tested it successfully with this prototype on GCC &amp; VS2017.</p>
    <pre>template&lt;class SeparatorType >
vector&lt;basic_string_view&lt;CharT, Traits> > split(const SeparatorType &Separator)  const &;
template&lt;class SeparatorType >
vector&lt;basic_string&lt;CharT, Traits> > split(const SeparatorType &Separator)  const &&;</pre>
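    <p>
        To make the lvalue/rvalue selection concrete, here is a small self-contained sketch. The wrapper class <code>text_sketch</code> and its helper <code>split_as</code> are hypothetical stand-ins for <code>basic_string</code>; only the overload selection through ref-qualifiers is of interest, and a non-empty separator is assumed.
    </p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical wrapper standing in for basic_string.
class text_sketch {
public:
    explicit text_sketch(std::string value) : value_(std::move(value)) {}

    // Selected for lvalues: the views stay valid as long as *this lives.
    std::vector<std::string_view> split(std::string_view separator) const& {
        return split_as<std::string_view>(separator);
    }

    // Selected for rvalues: the pieces are copied because *this is a temporary.
    std::vector<std::string> split(std::string_view separator) const&& {
        return split_as<std::string>(separator);
    }

private:
    template <class Piece>
    std::vector<Piece> split_as(std::string_view separator) const {
        assert(!separator.empty());  // assumption: empty separators are not handled here
        std::vector<Piece> parts;
        const std::string_view input = value_;
        if (input.empty()) return parts;
        std::size_t start = 0;
        for (;;) {
            const std::size_t pos = input.find(separator, start);
            if (pos == std::string_view::npos) {
                parts.emplace_back(input.substr(start));
                return parts;
            }
            parts.emplace_back(input.substr(start, pos - start));
            start = pos + separator.size();
        }
    }

    std::string value_;
};
```

    <p>
        With this sketch, <code>named.split(",")</code> on a named object yields a <code>vector&lt;string_view></code>, while <code>text_sketch("c,d").split(",")</code> on a temporary yields a <code>vector&lt;string></code>.
    </p>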
    <p> To distinguish between the <code>splitf</code> &amp; <code>splitc</code> versions, <code>enable_if</code> &amp; <code>is_callable</code> can probably be used.
      I was unable to implement it (I left splitc as is).<br>
      Example of implementation in <a href="#Ref6">[6]</a>
    </p>
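    <p>
        One way this dispatch might be written is sketched below. It is an assumption on my part rather than the prototype from <a href="#Ref6">[6]</a>: it relies on <code>std::is_invocable_v</code> (the name under which <code>is_callable</code> was standardized in C++17) together with <code>if constexpr</code> instead of <code>enable_if</code>. The function name <code>split_sketch</code> is hypothetical and a non-empty separator is assumed.
    </p>

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <type_traits>
#include <vector>

// Hypothetical unified split: the second argument is either a functor
// (splitf behaviour) or a container (splitc behaviour).
template <class Target>
void split_sketch(std::string_view input, std::string_view separator, Target&& target) {
    assert(!separator.empty());  // assumption: empty separators are not handled here
    if (input.empty()) return;
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = input.find(separator, start);
        const std::size_t count = (pos == std::string_view::npos) ? pos : pos - start;
        const std::string_view piece = input.substr(start, count);
        if constexpr (std::is_invocable_v<Target&, std::string_view>) {
            target(piece);               // callable: invoke it on each piece
        } else {
            target.emplace_back(piece);  // otherwise: treat it as a container
        }
        if (pos == std::string_view::npos) return;
        start = pos + separator.size();
    }
}

// Usage: the same name accepts a container or a lambda.
void split_dispatch_example() {
    std::vector<std::string> pieces;
    split_sketch("a b", " ", pieces);
    std::size_t calls = 0;
    split_sketch("a b c", " ", [&](std::string_view) { ++calls; });
    assert(pieces.size() == 2 && calls == 3);
}
```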



    <H4>3.1.6 Special values</H4>
    <p>If the separator is an empty string "" or the separator char is 0, the split methods split on every char: <code>"abc"=>["a","b","c"]</code><br>
    If the input string is an empty string "", the split methods return an empty container or never call the callback function.
    </p>
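    <p>
        These special values can be illustrated with the following sketch. The function name <code>split_special_sketch</code> is hypothetical; the code only shows one way the rules above could be honoured for the string and char separators.
    </p>

```cpp
#include <cassert>
#include <cstddef>
#include <string_view>
#include <vector>

// Hypothetical helper: split by a string, applying the special-value rules.
std::vector<std::string_view> split_special_sketch(std::string_view input,
                                                   std::string_view separator) {
    std::vector<std::string_view> parts;
    if (input.empty()) return parts;  // empty input => empty container
    if (separator.empty()) {          // empty separator => one piece per char
        for (std::size_t i = 0; i < input.size(); ++i)
            parts.push_back(input.substr(i, 1));
        return parts;
    }
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = input.find(separator, start);
        if (pos == std::string_view::npos) {
            parts.push_back(input.substr(start));
            return parts;
        }
        parts.push_back(input.substr(start, pos - start));
        start = pos + separator.size();
    }
}

// Char overload: a separator char of 0 is treated like the empty string.
std::vector<std::string_view> split_special_sketch(std::string_view input, char separator) {
    if (separator == '\0') return split_special_sketch(input, std::string_view{});
    return split_special_sketch(input, std::string_view(&separator, 1));
}
```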



    <H4>3.1.7 split by single char</H4>
    <p>It was pointed out that splitting by a single char can be optimized compared with splitting by a string, so an overload for a single char separator was suggested.<br>
    A counter-argument was that this optimization can (and should) be detected at runtime, or provided by a non-standardized overload added by the implementer.
    But why spend time checking at runtime what can be determined at compile time?<br>
    For information, the <code>string::find</code> method has a standardized char overload too.</p>
    <p>
      It was decided to add the char overload, as it also avoids the creation of a temporary string_view object.<br>
      Implementers can forward it to the string_view version if they do not want to optimize this case.<br>
      Specifying it as part of the standard is an incentive to optimize this case, as it is probably the most common one.
    </p>
    <p>
        Example of implementation in <a href="#Ref2">[2]</a>
    </p>


    <H4>3.1.8 split by regexp</H4>
    <p>
        It was pointed out that splitting by a regexp can be useful; it can also be a way to implement splitting by a set of separators, so an overload of <code>split</code> taking a <code>regexp</code> could make sense.
</p><p>
    It may introduce a dependency on regexp, which is perhaps not a good idea.<br>
    But it could be implemented by a regexp_split function.<br>
    Perhaps having a dependency on another standard class is not an issue. In that case, an overload of the member function may make sense.<br>
      This option is part of the proposed text.
</p>
<p>       Example of implementation in <a href="#Ref2">[2]</a>    </p>
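    <p>
        For illustration, a regex-based split can be built on the existing <code>std::regex_token_iterator</code> with submatch index -1, which yields the text between matches. The sketch below is an assumption about how such an overload might look, not the implementation from <a href="#Ref2">[2]</a>; the name <code>split_regex_sketch</code> is hypothetical.
    </p>

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string_view>
#include <vector>

// Hypothetical regex split returning views into `input`.
std::vector<std::string_view> split_regex_sketch(std::string_view input,
                                                 const std::regex& separator) {
    std::vector<std::string_view> parts;
    if (input.empty()) return parts;
    const char* first = input.data();
    const char* last = first + input.size();
    // Index -1 selects the text between matches instead of the matches themselves.
    for (std::cregex_token_iterator it(first, last, separator, -1), end; it != end; ++it) {
        parts.emplace_back(it->first, static_cast<std::size_t>(it->length()));
    }
    return parts;
}
```

    <p>
        A character class such as <code>[,;]</code> then covers the "set of separators" use case. The regex is taken by reference and must be a named object, since <code>std::regex_token_iterator</code> rejects temporary regexes.
    </p>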


    <H4>3.1.9 Alternate option : split by splitter object</H4>
    <p>
    In the N3593 proposal <a href="#Ref5">[5]</a> there is the idea of using a template delimiter object which must implement a find method.<br>
    That proposal suggests providing 4 built-in classes for char, string, any_of_string and regex.<br>
    It is a smart idea, as it reduces the number of overloads and makes the concept extensible.<br>
    However, I think that the benefit of having fewer overloads is lost through the introduction of 4 new classes, and the value of being able to add a new kind of splitter is limited.<br>
    I therefore consider this solution a bit more complex for a reduced benefit.
    </p>



<H4>3.1.10 Alternate option : split as non-member function : string_split</H4>
    <p>
        A <code>string_split</code> function algorithm may replace the member functions, as it doesn't require special access to the class.<br>
      Pros:<br>
      * It avoids implementing it in both classes.<br>
      It is true that the methods have to be declared in both classes; however, in <code>basic_string</code> they can be generic forwarders (the examples below are for the unified split).
      <pre>	// SPLIT Version replacing splits & splitsv for any separator
	template&lt;class StringType = basic_string_view&lt;CharT, Traits>, class SeparatorType >
	vector&lt;StringType > split(const SeparatorType &Separator) const
	{
		basic_string_view17&lt;CharT, Traits> sv(this->c_str(),this->size());
		return sv.split&lt;StringType>(Separator);
	}

	// SPLIT Version replacing splitf & splitc for any separator
	template&lt;class SeparatorType , class TargetType>
	void split(const SeparatorType &Separator, TargetType Target) const
	{
		basic_string_view17&lt;CharT, Traits> sv(this->c_str(), this->size());
		return sv.split(Separator, Target);
	}
</pre>
    * It can be used with alternate string classes; however, it will probably require that they have a standardized way to extract a substring through a substr method
        (which is not the case for qstring, Cstring, AnsiString), making this argument less valuable.<br>
    <span class="modified"> * It may be simpler to use with char*</span><br>
   <code> const char *MyCharStr="My char* to split";<br>
     vector4 = string_split(MyCharStr," ");</code><br>
    could be written instead of <br>
    <code>vector4 = string_view(MyCharStr).split(" ");</code><br>
    However, during my tests <a href="#Ref6">[6]</a> I wasn't able to make <code>char*</code> work with this definition (CharT deduction is impossible):<br>
    <pre>template &lt;class CharT,class Traits = std::char_traits&lt;CharT>,
	  class StringType = basic_string_view&lt;CharT, Traits> >
vector&lt;StringType > string_split(const basic_string_view17&lt;CharT, Traits> &InputStr, const basic_string_view&lt;CharT, Traits> &Separator)</pre>
    But it was possible when I dropped the CharT parameter, with this definition instead (although this requires defining a function for each possible CharT, which contradicts the first pro above):<br>
    <code>template &lt;class StringType = string_view ><br>
vector&lt;StringType > string_split(const string_view17 &InputStr, const string_view &Separator)
</code><br>


Cons:<br>
* Why can 'find*', 'substr' and 'copy' be members while 'split' should not? Making it a member looks more homogeneous to me.<br>
* I consider that using a member function is more in line with object-oriented programming.<br>
<pre>mystring.split(',')  //seems to me more OOP
string_split(mystring,',')  //seems to me more C/Procedural </pre>
* It introduces a global name, <code>string_split</code>, which can collide with a user function.
    It is supposed to be protected by the namespace, but the use of <code>using namespace std;</code> is quite common.

    </p>




    <H4>3.1.11 Alternate option : string_split function returning a range</H4>
    <p>
        A <code>string_split</code> function algorithm returns a range and can potentially replace all the previously discussed methods.<br>
        Example of usage:<pre>
string MyStr("my,csv,line");
vector&lt;string>  MyResult(split(MyStr, ","));
// Efficient conversion from string to string_view may require an explicit initial cast
vector&lt;string_view> MyResultSv(split(string_view(MyStr), ","));
// splitf replacement
std::for_each(split(string_view(MyStr), ","), callback);    </pre>
Example of implementation in <a href="#Ref3">[3]&amp;[4]</a><br>
It looks like a smart option; however, it has some drawbacks:<br>
* Temporary object issues (see below)<br>
* Some notations may be heavier than with the method options

    </p>
    <p><a name="TempObject"><span style="text-decoration: underline;">The Temporary object issue</span></a><br>
        During the evaluation of the prototype, it clearly appeared that there is a problem with temporary objects.<br>
<pre>string string_toLower(string Str)
{
    std::transform(Str.begin(), Str.end(), Str.begin(), ::tolower);
    return Str;
}
int main()
{
  for(auto x : split_string(vstr1,regex("\\s")))      <== (1)
    . . . . .
  for(auto x : split_string(string_toLower("C++ Is Very Fun!!!")," "))    <== (2)
    . . . . .
}    </pre>
Code (1) will <b>not</b> work, because <code>split_string</code> keeps a reference to the <code>regex</code>, and as this temporary no longer exists when
    <code>split_string</code> returns, the returned iterator refers to a destroyed regex object.<br>
    It can easily be fixed by copying the regex into the iterator, but that is not free. The problem is potentially the same with the string separator, but as it is usually a literal it should be fine most of the time.
    </p>
    <p> Code (2) will <b>not</b> work either, and it is a bigger problem. The reason is the same: when we use the returned iterator, the lowered string no longer exists.
      A workaround could be to systematically make a copy of the input string, but that does not make sense from a memory and performance perspective.
      It would also invalidate the returned string_view in several other usages, as the returned string view would refer not to the initial object but to the saved copy.
    </p><p>
      A complementary test done using the split action of the range-v3 library seems to confirm this:
<pre>vector&lt;string>vect1= string_toLower(str8) | view::split(',');</pre>
    This statement triggers the following static assertion:<br>
<pre>error: static assertion failed: You can't pipe an rvalue container into a view. First, save the container into a named variable, and then pipe it to the view.</pre>
    This suggests that, by design, ranges cannot handle temporary values.

    </p><p>
    <span style="text-decoration: underline;">Solutions Comparison</span><br>
      Since the first answer in the proposal thread on Google Groups, there has been a debate between methods and functions. Below is a factual comparison on some use cases.<br>
      Note: other options, such as the unified-name split or the non-member function, can be considered, but they are pretty close to the method option.
<style>
#table1 td:nth-child(2),#table1 td:nth-child(3) {font-family: Menlo, Monaco, Consolas, "Courier New", monospace;font-size: small;min-width: 300px;}
</style>
    <table border="1" id="table1">
      <tr>
        <th>Usage</th>
        <th>Range function Option</th>
        <th>Method Option</th>
        <th>Remarks</th>
      </tr>
      <tr>
        <td>Split in vector of string_view</td>
        <td>vector&lt;string_view> vec =string_split(str,",");</td>
        <td>auto vec=str.splitsv(",");</td>
        <td>The function solution allocates 1 extra split_range_string and 3 extra iterators (first_, past_last_, and the internal one of the for loop)</td>
      </tr>
      <tr>
        <td>Split in vector of string</td>
        <td>vector&lt;string> vec=string_split(str,",");</td>
        <td>auto vec=str.splits(",");</td>
        <td>The function solution allocates 1 extra split_range_string and 3 extra iterators (first_, past_last_, and the internal one of the for loop), plus the intermediate string_view objects</td>
      </tr>
      <tr>
        <td>Split in list of string_view</td>
        <td>list&lt;string_view> lst =string_split(str,",");</td>
        <td>list&lt;string_view> lst;<br>
          str.splitc(",",lst);</td>
        <td>The function solution uses the same notation for every container.<br>
          The method solution requires different code for other containers but is more efficient, as it doesn't require extra object creation.
        </td>
      </tr>
      <tr>
        <td>Split over a function</td>
        <td>for_each(string_split(str,","),MyFunction);</td>
        <td>str.splitf(",",MyFunction);</td>
        <td>The function solution allocates extra objects</td>
      </tr>
      <tr>
        <td>Split over a loop (a)</td>
        <td>for(auto s : string_split(str,","))</td>
        <td>for(auto s : str.splitsv(","))</td>
        <td>The method solution allocates an extra vector of string_view</td>
      </tr>
      <tr>
        <td>Split over a loop (b)</td>
        <td>for(auto s : string_split(str,","))</td>
        <td>	strsv.splitf(" ", [&](const string_view &s)	{<br>
		cout &lt;&lt; s &lt;&lt; endl; <br>
	});
</td>
        <td>The method solution allocates neither the extra vector nor the extra objects of the function solution, but the notation is somewhat less conventional</td>
      </tr>
      <tr>
        <td>Temporary object</td>
        <td>vec=string_split(GetTmpObject(),",");</td>
        <td>vec=GetTmpObject().splits(",");</td>
        <td>
          Function solution : Doesn't work (see Temporary object issue paragraph)<br>
          Methods solution : works fine
        </td>
      </tr>
      <tr>
        <td>From alternate string class (Must be convertible from string_view)</td>
        <td>vec=string_split(MyOtherString,",");</td>
        <td>vec=string_view(MyOtherString).splitsv(",");</td>
        <td>
          Both options in fact use the string_view implementation. string_split is not instantiated for the alternate string class.
        </td>
      </tr>
    </table>
    </p>



    <H4>3.1.12 comparison with range::split</H4>
    <p>C++ 17 will bring a new feature named range that will simplify operations on many kinds of data, thanks to Eric Niebler for his strong commitment to it.<br>
      Range-V3 (the ancestor of the proposal) includes a split action that splits containers by an element, a range or a predicate.<br>
      This solution would fit the need, but I think with some restrictions.
      Ranges are designed to handle general containers with general-purpose content; by contrast, this proposal is string oriented and integrates easy-to-use features.<br>
      Splitting by a single char with ranges is easy to do and can be a reasonable alternative: <code>vector&lt;string>vect1= str8 | view::split(',');</code><br>
      Using ranges to split by a substring or a regex will be significantly more complex and require significant pieces of technical code.<br>
      Taking advantage of string_view is possible with ranges, but it requires an explicit cast of the input string, whereas I consider it should be the default case.<br>
      As described in the previous paragraph (range function), range usage causes the creation of several intermediate objects (the range, iterators).
      Consider the case of parsing a 100 000 line CSV file of 10 columns, where each line is handled by the application using the random access operator:
      I have the feeling that using the <code>splitc</code> variant on a vector will be simpler to write and cause significantly fewer allocations of internal objects.    </p>


    <H3>3.2 Discussion on join</H3>
    <H4>3.2.1 join static method</H4>
    <p>The <code>join</code> static method joins a list of input strings passed in an iterable container, adding a delimiter between each value.</p>
    <pre>template&lt;class T,class U>
static basic_string&lt;CharT, Traits, Allocator> join(T &InputStringList, U Separator)</pre>
    <p>

    Example of usage (simple but so useful)<pre>
cout << "Join of string vector=" << string::join(vector6, "_") << endl;
cout << "Join of string_view vector=" << string::join(vector5, "_") << endl;    </pre>
        Example of implementation in <a href="#Ref2">[2]</a> 
</p>
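    <p>
        The joining logic itself is short. The free function <code>join_sketch</code> below is a hypothetical stand-in for the proposed static method (not the code from <a href="#Ref2">[2]</a>); it is limited to <code>char</code> strings and assumes the elements can be appended to a <code>std::string</code> with <code>+=</code>.
    </p>

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <vector>

// Hypothetical free-function stand-in for the proposed static join.
template <class Container>
std::string join_sketch(const Container& pieces, std::string_view separator) {
    std::string result;
    bool first = true;
    for (const auto& piece : pieces) {
        if (!first) result += separator;  // separator only between elements
        result += piece;                  // std::string and std::string_view both append
        first = false;
    }
    return result;
}
```

    <p>
        An empty container yields an empty string, and a single element is returned without any separator.
    </p>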

    <H4>3.2.2 join method</H4>
    <p>The <code>join</code> method uses the current string as the separator to join the list; this is how join is used in Python.</p>
    <pre>template&lt;class T>
basic_string&lt;CharT, Traits, Allocator> join(T &InputStringList)</pre>
    <p>
        Example of usage <pre>
cout << "pythonic Join of string =" << "-"s.join(vector6) << endl;
    </pre>
This option could also be part of the <code>std::basic_string_view</code> class; the static option makes less sense there.
<br>Example of implementation in <a href="#Ref2">[2]</a> 
    </p>

    <H4>3.2.3 string_join function</H4>
    <p>The <code>string_join</code> function acts exactly like the static method, but is more appropriate if the <code>string_split</code> function is the option selected for split.</p>
    <pre>template&lt;class T,class U>
basic_string&lt;typename T::value_type::value_type, typename T::value_type::traits_type> string_join(T &InputStringList, U Separator)</pre>
    <p> Example of usage (simple but so useful)<pre>
cout << "Join of string_view vector with string_join=" << string_join(vector5, "_") << endl;</pre>
        Example of implementation in <a href="#Ref2">[2]</a> 
    </p>

    <H4>3.2.4 Processing optimization</H4>
    <p>It is possible to optimize the processing by first iterating over the container to compute the size of the final string and reserve it. Below is an example based on <code>string_join</code>:</p>
    <pre>template&lt;class T>
basic_string&lt;typename T::value_type::value_type, typename T::value_type::traits_type> 
  string_join(const T &InputStringList
            , const basic_string_view&lt;typename T::value_type::value_type,typename T::value_type::traits_type> Separator)
{
	basic_string&lt;typename T::value_type::value_type, typename T::value_type::traits_type> result_string;
	size_t StrLen = 0;	
	if (InputStringList.empty())
		return result_string;
	auto it = InputStringList.begin();
	for (; it != InputStringList.end(); ++it)
		StrLen += it->size() + Separator.size();
	result_string.reserve(StrLen);
	result_string += *InputStringList.begin();
	for (it = ++InputStringList.begin(); it != InputStringList.end(); ++it)
	{
		result_string += Separator;
		result_string += *it;
	}
	return result_string;
}</pre>
    <p>
        However, this requires being able to obtain the length of each string in InputStringList as well as the length of Separator.
        It is quite common for the separator to be a <code>char*</code>, which has no <code>size()</code> member, so as a workaround the separator is specified as a <code>const string_view</code>.<br>
        The problem is the same if <code>InputStringList</code> is a <code>vector&lt;char*></code>, but in this case the problem is bigger, as it seems impossible to me to specify the type of the returned string.
    </p>


    <H4>3.2.5 Parameter order</H4>
    <p>For several functions such as string_join, it was pointed out that the separator could come first, as it is smaller than the container (or the input string).</p>
    <p> In some languages the standard implementation takes the separator first: PHP, C#, Java<br>
In some others, the separator comes last: Go, Rust, boost::algorithm::join, LibC strtok<br>
        Python has a different logic, as the separator is the "caller" object (like the classic join method described earlier)<br>
        Having the separator as the 2nd parameter would allow it to be optional, defaulting to ""<br>
        This has to be analyzed later, but the consensus seems to favor the separator last.
    </p>



    <H3 class="modified">3.3 Discussion on case management function</H3>
    <p>This chapter wasn't in the initial scope of the proposal, but as this proposal is about solutions
    that help handle std::string, I have decided to add it in order to avoid managing a separate proposal.</p>

    <H4>3.3.1 tolower/toupper for string</H4>
    <p>Today the standard library provides solutions to convert a single char to lowercase/uppercase, but applying them to a string is quite unintuitive,
      whereas we might hope that a modern language would handle this easily.</p>
     <pre>std::string result;
std::transform( src.begin(), src.end(), std::back_inserter( result ), ::tolower );</pre>
<p>When it could be <code>std::string result=src.tolower();</code>
  The proposal is to add <code>tolower</code> and <code>toupper</code> member methods to the <code>basic_string</code> and <code>basic_string_view</code> classes, as string shorthands for the existing <code>tolower</code> and <code>toupper</code> functions of the <code>cctype</code> header, which return a converted copy.
     </p>
    <p>As discussed for split &amp; join, these could also be non-member functions such as <code>string_toupper</code>; the pros &amp; cons are much the same.<br>Possible implementation:</p>
    <pre>basic_string&lt;CharT, Traits> tolower() const
{
  basic_string&lt;CharT, Traits> result;
  result.reserve(size()); // Allows reserve the space
  std::transform(begin(), end(), std::back_inserter(result), ::tolower);
  return result;
}</pre>
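    <p>
        The non-member form could look like the sketch below; the names <code>string_tolower_sketch</code> and <code>string_toupper_sketch</code> are hypothetical. One detail worth noting: the <code>cctype</code> functions require an argument representable as <code>unsigned char</code> (or <code>EOF</code>), so this sketch converts each char through <code>unsigned char</code> before calling them, which matters for negative <code>char</code> values.
    </p>

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical non-member variants returning a converted copy.
std::string string_tolower_sketch(std::string_view src) {
    std::string result(src);  // one allocation, then convert in the copy
    std::transform(result.begin(), result.end(), result.begin(),
                   [](unsigned char c) { return static_cast<char>(std::tolower(c)); });
    return result;
}

std::string string_toupper_sketch(std::string_view src) {
    std::string result(src);
    std::transform(result.begin(), result.end(), result.begin(),
                   [](unsigned char c) { return static_cast<char>(std::toupper(c)); });
    return result;
}
```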


    <H4>3.3.2 In place tolower/toupper for string</H4>
    <p>The previous section proposes methods returning a transformed copy of the input string. But in some cases we don't need to keep the original, so it makes sense to reuse its memory instead of allocating a new buffer. We therefore propose in-place versions of the methods, suffixed by <code>_inplace</code> (we found a similar approach in the POCO library; perhaps a native English speaker may suggest a more suitable suffix, e.g. emplace as in emplace_back?).
    </p>
<pre>void toupper_inplace();</pre>
      Example of implementation in <a href="#Ref6">[6]</a>

    <H4>3.3.3 Locale version of tolower/toupper </H4>
    <p>Perhaps it could make sense to have a version parameterized by a locale, which uses <code>toupper</code> from the locale header.<br>
      Perhaps this version could handle 1:n conversions like '&szlig;'=>'SS', which are currently not handled by the existing single char <code>toupper</code>.<br>
      Perhaps it makes sense to have them as non-member functions in the <code>locale</code> header.
    </p>

    <H4>3.3.4 case insensitive compare </H4>
    <p>One may wish to compare 2 strings in a case-insensitive manner, but it is not easy to do so efficiently.<br>
      A naive solution consists of converting both strings to lowercase and then calling <code>compare</code> on the results. However, this option may cause 2 extra memory allocations, when it could perhaps be detected immediately on the first char that 'a'!='b'.<br>
      We propose to add an <code>icompare</code> method, similar to the existing <code>compare</code> but case insensitive, to both the <code>basic_string</code> and <code>basic_string_view</code> classes.<br>
      <code>int icompare( const basic_string& str ) const;</code>
    </p>
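    <p>
        A possible allocation-free comparison is sketched below as a free function; the name <code>icompare_sketch</code> is hypothetical. It compares char by char using the single char <code>tolower</code> of <code>cctype</code>, stops at the first difference, and follows the sign convention of <code>compare</code> (negative, zero, positive).
    </p>

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical free-function stand-in for the proposed icompare member.
int icompare_sketch(std::string_view lhs, std::string_view rhs) {
    const std::size_t common = std::min(lhs.size(), rhs.size());
    for (std::size_t i = 0; i < common; ++i) {
        const int a = std::tolower(static_cast<unsigned char>(lhs[i]));
        const int b = std::tolower(static_cast<unsigned char>(rhs[i]));
        if (a != b) return a < b ? -1 : 1;  // early exit, no allocation
    }
    if (lhs.size() == rhs.size()) return 0;
    return lhs.size() < rhs.size() ? -1 : 1;  // the shorter string orders first
}
```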




    <H2>IV. Proposed Text</H2>
    <H4>Addition to &lt;string_view> header</H4>
    <H5>add in class basic_string_view</H5>
<pre>vector&lt;basic_string&lt;CharT, Traits> > splits(const basic_string_view &Separator) const;
vector&lt;basic_string&lt;CharT, Traits> > splits(const typename basic_string_view::value_type Separator) const;
vector&lt;basic_string&lt;CharT, Traits> > splits(const basic_regex&lt;CharT> &Separator) const;

vector&lt;basic_string_view> splitsv(const basic_string_view &Separator) const;
vector&lt;basic_string_view> splitsv(const typename basic_string_view::value_type Separator) const;
vector&lt;basic_string_view> splitsv(const basic_regex&lt;CharT> &Separator) const;

template &lt;class F>
void splitf(const basic_string_view &Separator,F functor) const;
template &lt;class F>
void splitf(const typename basic_string_view::value_type Separator,F functor) const;
template &lt;class F>
void splitf(const basic_regex&lt;CharT> &Separator,F functor) const;

template &lt;class T>
void splitc(const basic_string_view &Separator,T &Result) const;
template &lt;class T>
void splitc(const typename basic_string_view::value_type Separator,T &Result) const;
template &lt;class T>
void splitc(const basic_regex&lt;CharT> &Separator,T &Result) const;

basic_string&lt;CharT, Traits> toupper() const;
void toupper_inplace() ;
basic_string&lt;CharT, Traits> tolower() const;
void tolower_inplace();
int icompare( const basic_string_view& str ) const;</pre>


<h5>splits method and overloads</h5>
<pre>vector&lt;basic_string&lt;CharT, Traits> > splits(const basic_string_view &Separator) const
vector&lt;basic_string&lt;CharT, Traits> > splits(const typename basic_string_view::value_type Separator) const
vector&lt;basic_string&lt;CharT, Traits> > splits(const basic_regex&lt;CharT> &Separator) const  </pre>
    <p>
      <b>Effects: </b>Splits a string based on the separator and returns the result in a vector of string.<br>
      The separator can be:<br>
      * a string<br>
      * a single char<br>
      * a regexp<br>
      <b>Returns:</b> vector&lt;string> <br>
      <b>Remarks:</b> If the separator string is empty (size()==0) or the single char is 0, the string is split on every char.<br>
      If the input string is an empty string "", the split methods return an empty container or never call the callback function.
    </p>

<h5>splitsv method and overloads</h5>
<pre>vector&lt;basic_string_view> splitsv(const basic_string_view &Separator) const
vector&lt;basic_string_view> splitsv(const typename basic_string_view::value_type Separator) const
vector&lt;basic_string_view> splitsv(const basic_regex&lt;CharT> &Separator) const  </pre>
    <p>
      <b>Effects: </b>Splits the string on the separator and returns the pieces as a vector of string_view.<br>
      The separator can be:<br>
      * a string<br>
      * a single char<br>
      * a regular expression<br>
      <b>Returns:</b> vector&lt;string_view> <br>
      <b>Remarks:</b> If the separator is an empty string (size()==0) or a single char equal to 0, the string is split at every char.<br>
      If the input string is empty (""), the returned vector is empty.
    </p>


<h5>splitf method and overloads</h5>
<pre>template &lt;class F>
void splitf(const basic_string_view &Separator,F functor) const
template &lt;class F>
void splitf(const typename basic_string_view::value_type Separator,F functor) const
template &lt;class F>
void splitf(const basic_regex&lt;CharT> &Separator,F functor) const  </pre>
    <p>
      <b>Effects: </b>Splits the string on the separator and calls a unary function for each piece.<br>
      The separator can be:<br>
      * a string<br>
      * a single char<br>
      * a regular expression<br>
      <b>Returns:</b> void <br>
      <b>Remarks:</b> If the separator is an empty string (size()==0) or a single char equal to 0, the string is split at every char.<br>
      If the input string is empty (""), the callback function is never called.
    </p>
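    <p>
      The callback form can be sketched as follows, for the string and regular-expression separators. The name <code>splitf_sketch</code> is hypothetical, and passing each piece to the functor as a <code>std::string_view</code> is an assumption. The regular-expression overload relies on <code>std::cregex_token_iterator</code> with submatch index -1, which enumerates the text between matches and does not report an empty piece after a trailing match.
    </p>

```cpp
#include <cstddef>
#include <regex>
#include <string_view>

// Hypothetical stand-in for splitf() with a string separator.
template <class F>
void splitf_sketch(std::string_view input, std::string_view separator, F functor) {
    if (input.empty()) {
        return;  // empty input: the callback is never called
    }
    if (separator.empty()) {
        for (std::size_t i = 0; i < input.size(); ++i) {
            functor(input.substr(i, 1));  // one call per char
        }
        return;
    }
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = input.find(separator, start);
        if (pos == std::string_view::npos) {
            functor(input.substr(start));
            return;
        }
        functor(input.substr(start, pos - start));
        start = pos + separator.size();
    }
}

// Hypothetical stand-in for splitf() with a regular-expression separator.
template <class F>
void splitf_sketch(std::string_view input, const std::regex& separator, F functor) {
    if (input.empty()) {
        return;
    }
    const char* first = input.data();
    const char* last = first + input.size();
    // Submatch index -1 selects the text between matches of the separator.
    std::cregex_token_iterator it(first, last, separator, -1);
    const std::cregex_token_iterator end;
    for (; it != end; ++it) {
        functor(std::string_view(it->first,
                                 static_cast<std::size_t>(it->length())));
    }
}
```

    <p>
      Because no container is built, the caller decides what to do with each piece, for example counting them, parsing them, or appending them to an existing structure.
    </p>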


<h5>splitc method and overloads</h5>
<pre>template &lt;class T>
void splitc(const basic_string_view &Separator,T &Result) const
template &lt;class T>
void splitc(const typename basic_string_view::value_type Separator,T &Result) const
template &lt;class T>
void splitc(const basic_regex&lt;CharT> &Separator,T &Result) const  </pre>
    <p>
      <b>Effects: </b>Splits the string on the separator and stores the pieces in the container passed as output parameter.<br>
      The separator can be:<br>
      * a string<br>
      * a single char<br>
      * a regular expression<br>
      <b>Returns:</b> void <br>
      <b>Remarks:</b> If the separator is an empty string (size()==0) or a single char equal to 0, the string is split at every char.<br>
      If the input string is empty (""), no element is added to the container.
    </p>
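    <p>
      A possible shape for the container form is sketched below. The names <code>splitc_sketch</code> and <code>unique_tokens</code> are hypothetical. Two points are assumptions, since the text above does not settle them: pieces are added with <code>insert(end(), value)</code>, which works for sequence and associative containers alike, and the container is not cleared first.
    </p>

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <string_view>

// Hypothetical stand-in for splitc() with a string separator.
// T can be any container with value_type constructible from std::string_view
// and an insert(hint, value) member (vector, list, deque, set, ...).
template <class T>
void splitc_sketch(std::string_view input, std::string_view separator, T& result) {
    using value_type = typename T::value_type;
    if (input.empty()) {
        return;  // empty input: nothing is added
    }
    if (separator.empty()) {
        for (std::size_t i = 0; i < input.size(); ++i) {
            result.insert(result.end(), value_type(input.substr(i, 1)));
        }
        return;
    }
    std::size_t start = 0;
    for (;;) {
        const std::size_t pos = input.find(separator, start);
        if (pos == std::string_view::npos) {
            result.insert(result.end(), value_type(input.substr(start)));
            return;
        }
        result.insert(result.end(), value_type(input.substr(start, pos - start)));
        start = pos + separator.size();
    }
}

// Usage: choosing std::set as output removes duplicates and sorts the tokens.
std::set<std::string> unique_tokens(std::string_view csv) {
    std::set<std::string> tokens;
    splitc_sketch(csv, ",", tokens);
    return tokens;
}
```

    <p>
      Letting the caller choose the container is the point of this form: the same split logic can fill a <code>std::vector</code>, a <code>std::list</code>, or a <code>std::set</code> without an intermediate copy.
    </p>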

<h5 class="modified">case management methods</h5>
<pre>basic_string&lt;CharT, Traits> toupper() const;
basic_string&lt;CharT, Traits> tolower() const;</pre>
    <p>
      <b>Effects: </b>Returns a copy of the string converted to uppercase (toupper) or lowercase (tolower).<br>
      <b>Returns:</b> the transformed copy <br>
    </p>
<pre>void toupper_inplace();
void tolower_inplace();</pre>
    <p>
      <b>Effects: </b>Converts the string to its uppercase (toupper_inplace) or lowercase (tolower_inplace) version.<br>
      <b>Returns:</b> void <br>
      <b>Remarks:</b> The original string is replaced by the converted one.
    </p>
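    <p>
      For <code>char</code> strings, the copying and in-place variants could be sketched as below. The <code>*_sketch</code> names are hypothetical. The sketch assumes a per-character conversion with <code>std::toupper</code> and <code>std::tolower</code> from <code>&lt;cctype></code> under the current C locale, which does not handle multi-byte encodings such as UTF-8. The cast to <code>unsigned char</code> is needed because passing a negative <code>char</code> value to these functions is undefined behavior.
    </p>

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <string_view>

// Hypothetical stand-in for toupper_inplace(): converts s itself.
void toupper_inplace_sketch(std::string& s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::toupper(c));
    });
}

// Hypothetical stand-in for tolower_inplace(): converts s itself.
void tolower_inplace_sketch(std::string& s) {
    std::transform(s.begin(), s.end(), s.begin(), [](unsigned char c) {
        return static_cast<char>(std::tolower(c));
    });
}

// Hypothetical stand-in for toupper(): the input is left untouched.
std::string toupper_sketch(std::string_view s) {
    std::string copy(s);
    toupper_inplace_sketch(copy);
    return copy;
}

// Hypothetical stand-in for tolower(): the input is left untouched.
std::string tolower_sketch(std::string_view s) {
    std::string copy(s);
    tolower_inplace_sketch(copy);
    return copy;
}
```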
<pre>int icompare( const basic_string_view& str ) const;</pre>
    <p>
      <b>Effects: </b>Compares *this with str in the same manner as compare, but ignoring case.<br>
      <b>Returns:</b> negative value if *this appears before the character sequence specified by the arguments, in lexicographical order<br>
zero if both character sequences compare equivalent<br>
positive value if *this appears after the character sequence specified by the arguments, in lexicographical order
    </p>
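    <p>
      The three-way result described above can be sketched for <code>char</code> strings as follows. The free function <code>icompare_sketch</code> is a hypothetical name, and folding each character with <code>std::tolower</code> under the current C locale is an assumption made for this example only.
    </p>

```cpp
#include <algorithm>
#include <cctype>
#include <cstddef>
#include <string_view>

// Hypothetical stand-in for icompare(): negative, zero or positive result,
// like compare(), but characters are compared after lowercase folding.
int icompare_sketch(std::string_view lhs, std::string_view rhs) {
    const std::size_t common = std::min(lhs.size(), rhs.size());
    for (std::size_t i = 0; i < common; ++i) {
        const int a = std::tolower(static_cast<unsigned char>(lhs[i]));
        const int b = std::tolower(static_cast<unsigned char>(rhs[i]));
        if (a != b) {
            return a < b ? -1 : 1;
        }
    }
    // Equal on the common prefix: the shorter sequence orders first.
    if (lhs.size() == rhs.size()) {
        return 0;
    }
    return lhs.size() < rhs.size() ? -1 : 1;
}
```

    <p>
      With this sketch, "Hello" and "hELLO" compare equal, "apple" orders before "Banana", and "abc" orders after "AB".
    </p>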

    <H4>Addition to &lt;string> header</H4>
    <H5>add in class basic_string</H5>
<pre>vector&lt;basic_string> splits(const basic_string_view&lt;CharT, Traits> &Separator) const;
vector&lt;basic_string> splits(const typename basic_string_view&lt;CharT, Traits>::value_type Separator) const;
vector&lt;basic_string> splits(const basic_regex&lt;CharT> &Separator) const;
vector&lt;basic_string_view&lt;CharT, Traits> > splitsv(const basic_string_view&lt;CharT, Traits> &Separator) const;
vector&lt;basic_string_view&lt;CharT, Traits> > splitsv(const typename basic_string_view&lt;CharT, Traits>::value_type Separator) const;
vector&lt;basic_string_view&lt;CharT, Traits> > splitsv(const basic_regex&lt;CharT> &Separator) const;
template &lt;class F>
void splitf(const basic_string_view&lt;CharT, Traits> &Separator,F functor) const;
template &lt;class F>
void splitf(const typename basic_string_view&lt;CharT, Traits>::value_type Separator,F functor) const;
template &lt;class F>
void splitf(const basic_regex&lt;CharT> &Separator,F functor) const;
template &lt;class T>
void splitc(const basic_string_view&lt;CharT, Traits> &Separator,T &Result) const;
template &lt;class T>
void splitc(const typename basic_string_view&lt;CharT, Traits>::value_type Separator,T &Result) const;
template &lt;class T>
void splitc(const basic_regex&lt;CharT> &Separator,T &Result) const;

template&lt;class T,class U>
static basic_string&lt;CharT, Traits, Allocator> join(T &InputStringList, U Separator);

basic_string toupper() const;
void toupper_inplace() ;
basic_string tolower() const;
void tolower_inplace();
int icompare( const basic_string& str ) const;</pre>
<h5>splits, splitsv, splitf, splitc, tolower, toupper, icompare methods and overloads</h5>
    <p>
      These methods behave exactly like their <code>basic_string_view</code> equivalents; they are provided as shortcuts.
    </p>
<h5>join static method</h5>
<pre>template&lt;class T,class U>
static basic_string&lt;CharT, Traits, Allocator> join(T &InputStringList, U Separator)</pre>
    <p>
      <b>Effects: </b>Returns a string that concatenates all the strings contained in InputStringList, with the separator inserted between consecutive strings. For N strings, N-1 separators are inserted.<br>

      <b>Returns:</b> the aggregated string <br>
      <b>Remarks:</b> If InputStringList is empty, the method returns an empty string.
    </p>
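    <p>
      A minimal sketch of the join semantics follows. The names <code>join_sketch</code> and <code>make_path</code> are hypothetical, and the sketch takes the list by const reference where the synopsis above uses <code>T &amp;</code>. It assumes the elements and the separator can be appended to a <code>std::string</code> with <code>operator+=</code>, which covers <code>char</code>, <code>const char*</code>, <code>std::string</code> and <code>std::string_view</code>.
    </p>

```cpp
#include <string>
#include <vector>

// Hypothetical free-function stand-in for the proposed static join().
template <class T, class U>
std::string join_sketch(const T& input_string_list, U separator) {
    std::string result;
    bool first = true;
    for (const auto& item : input_string_list) {
        if (!first) {
            result += separator;  // N items produce N-1 separators
        }
        result += item;
        first = false;
    }
    return result;  // empty list: empty string
}

// Usage: rebuild a path from its components with a single-char separator.
std::string make_path(const std::vector<std::string>& parts) {
    return join_sketch(parts, '/');
}
```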

    <H2>VI. References</H2>

    <ul>
    <li>
    <a name="Ref1">
                        [1] Discussion on Google Groups</a>
    <a href="https://groups.google.com/a/isocpp.org/forum/?utm_medium=email&utm_source=footer#!topic/std-proposals/JTKTThJ-7Ko">https://groups.google.com/a/isocpp.org/forum/?utm_medium=email&utm_source=footer#!topic/std-proposals/JTKTThJ-7Ko</a>                    
                </li>
    <li><a name="Ref2">
      [2] Example of implementation on GitHub with VC17 (also tested with GCC 6.2 MinGW)</a>
      <br>Test : <a href="https://github.com/laurent-n/cpp17_implode_explode/blob/master/testcpp17_1/testongcc.cpp">https://github.com/laurent-n/cpp17_implode_explode/blob/master/testcpp17_1/testongcc.cpp</a>
      <br>Lib : <a href="https://github.com/laurent-n/cpp17_implode_explode/blob/master/testcpp17_1/string17.h">https://github.com/laurent-n/cpp17_implode_explode/blob/master/testcpp17_1/string17.h</a>
    </li>
    <li><a name="Ref3">
      [3a] Implementation prototype done for N3593 Proposal</a>
      <br><a href="https://github.com/laurent-n/cpp17_implode_explode/blob/master/testcpp17_3/StringSplit.h">https://github.com/laurent-n/cpp17_implode_explode/blob/master/testcpp17_3/StringSplit.h</a>
    </li>
    <li>
      [3b] Test code of implementation prototype done for N3593 Proposal
      <br><a href="https://github.com/laurent-n/cpp17_implode_explode/blob/master/testcpp17_3/TestStringSplit.cpp">https://github.com/laurent-n/cpp17_implode_explode/blob/master/testcpp17_3/TestStringSplit.cpp</a>
    </li>
    <li><a name="Ref4">
      [4] Example of implementation by range initiated by Nicol and amended by Laurent on GitHub</a>
      <br><a href="https://github.com/laurent-n/cpp17_implode_explode/blob/master/testcpp17_3/RangeSplitTest.cpp">https://github.com/laurent-n/cpp17_implode_explode/blob/master/testcpp17_3/RangeSplitTest.cpp</a>
    </li>
    <li><a name="Ref5">
      [5] N3593 Proposal</a>
      <br><a href="http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3593.html">http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2013/n3593.html</a>
    </li>
    <li><a name="Ref6">
      [6] Example of implementation of unified split version on GitHub with VC17 (also tested with GCC 6.2 MinGW)</a>
      <br><a href="https://github.com/laurent-n/cpp17_implode_explode/tree/master/testcpp17_4">https://github.com/laurent-n/cpp17_implode_explode/tree/master/testcpp17_4</a>
    </li>

  </ul>

    <H2>VII. Acknowledgements</H2>
Thanks to Nicol Bolas, Thiago Macieira, Ville Voutilainen, Olaf van der Spek, Matthew Woehlke, Alexander Bolz, Marco Arena, and Jakob Riedle
    for assistance on this paper.

</BODY>
</HTML>
