<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8">
<title>Issue 1282: A proposal to add std::split algorithm</title>
<meta property="og:title" content="Issue 1282: A proposal to add std::split algorithm">
<meta property="og:description" content="C++ library issue. Status: NAD">
<meta property="og:url" content="https://cplusplus.github.io/LWG/issue1282.html">
<meta property="og:type" content="website">
<meta property="og:image" content="http://cplusplus.github.io/LWG/images/cpp_logo.png">
<meta property="og:image:alt" content="C++ logo">
<style>
  p {text-align:justify}
  li {text-align:justify}
  pre code.backtick::before { content: "`" }
  pre code.backtick::after { content: "`" }
  blockquote.note
  {
    background-color:#E0E0E0;
    padding-left: 15px;
    padding-right: 15px;
    padding-top: 1px;
    padding-bottom: 1px;
  }
  ins {background-color:#A0FFA0}
  del {background-color:#FFA0A0}
  table.issues-index { border: 1px solid; border-collapse: collapse; }
  table.issues-index th { text-align: center; padding: 4px; border: 1px solid; }
  table.issues-index td { padding: 4px; border: 1px solid; }
  table.issues-index td:nth-child(1) { text-align: right; }
  table.issues-index td:nth-child(2) { text-align: left; }
  table.issues-index td:nth-child(3) { text-align: left; }
  table.issues-index td:nth-child(4) { text-align: left; }
  table.issues-index td:nth-child(5) { text-align: center; }
  table.issues-index td:nth-child(6) { text-align: center; }
  table.issues-index td:nth-child(7) { text-align: left; }
  table.issues-index td:nth-child(5) span.no-pr { color: red; }
  @media (prefers-color-scheme: dark) {
     html {
        color: #ddd;
        background-color: black;
     }
     ins {
        background-color: #225522
     }
     del {
        background-color: #662222
     }
     a {
        color: #6af
     }
     a:visited {
        color: #6af
     }
     blockquote.note
     {
        background-color: rgba(255, 255, 255, .10)
     }
  }
</style>
</head>
<body>
<hr>
<p><em>This page is a snapshot from the LWG issues list, see the <a href="lwg-active.html">Library Active Issues List</a> for more information and the meaning of <a href="lwg-active.html#NAD">NAD</a> status.</em></p>
<h3 id="1282"><a href="lwg-closed.html#1282">1282</a>. A proposal to add <code>std::split</code> algorithm</h3>
<p><b>Section:</b> 26 <a href="https://wg21.link/algorithms">[algorithms]</a> <b>Status:</b> <a href="lwg-active.html#NAD">NAD</a>
 <b>Submitter:</b> Igor Semenov <b>Opened:</b> 2009-12-07 <b>Last modified:</b> 2019-02-26</p>
<p><b>Priority: </b>Not Prioritized
</p>
<p><b>View other</b> <a href="lwg-index-open.html#algorithms">active issues</a> in [algorithms].</p>
<p><b>View all other</b> <a href="lwg-index.html#algorithms">issues</a> in [algorithms].</p>
<p><b>View all issues with</b> <a href="lwg-status.html#NAD">NAD</a> status.</p>
<p><b>Discussion:</b></p>
<ol style="list-style-type:upper-roman">

<li>
<p>
Motivation and Scope
</p>
<p>
Splitting strings into parts by some set of delimiters is an often task, but
there is no simple and generalized solution in C++ Standard. Usually C++
developers use <code>std::basic_stringstream&lt;&gt;</code> to split string into
parts, but there are several inconvenient restrictions:
</p>

<ul>
<li>
we cannot explicitly assign the set of delimiters;
</li>
<li>
this approach is suitable only for strings, but not for other types of
containers;
</li>
<li>
we have (possible) performance leak due to string instantiation.
</li>
</ul>
</li>

<li>
<p>
Impact on the Standard
</p>
<p>
This algorithm doesn't interfere with any of current standard algorithms.
</p>
</li>

<li>
<p>
Design Decisions
</p>
<p>
This algorithm is implemented in terms of input/output iterators. Also, there is
one additional wrapper for <code>const CharType *</code> specified delimiters.
</p>
</li>

<li>
<p>
Example implementation
</p>
<pre>
template&lt; class It, class DelimIt, class OutIt &gt;
void split( It begin, It end, DelimIt d_begin, DelimIt d_end, OutIt out )
{
   while ( begin != end )
   {
       It it = std::find_first_of( begin, end, d_begin, d_end );
       *out++ = std::make_pair( begin, it );
       begin = std::find_first_of( it, end, d_begin, d_end,
           std::not2( std::equal_to&lt; typename It::value_type &gt;() ) );
   }
}

template&lt; class It, class CharType, class OutIt &gt;
void split( It begin, It end, const CharType * delim, OutIt out )
{
   split( begin, end, delim, delim + std::strlen( delim ), out );
}
</pre>
</li>

<li>
<p>
Usage
</p>
<pre>
std::string ss( "word1 word2 word3" );
std::vector&lt; std::pair&lt; std::string::const_iterator, std::string::const_iterator &gt; &gt; v;
split( ss.begin(), ss.end(), " ", std::back_inserter( v ) );

for ( int i = 0; i &lt; v.size(); ++i )
{
   std::cout &lt;&lt; std::string( v[ i ].first, v[ i ].second ) &lt;&lt; std::endl;
}
// word1
// word2
// word3
</pre>
</li>

</ol>

<p><i>[
2010-01-22 Moved to Tentatively NAD Future after 5 positive votes on c++std-lib.
Rationale added below.
]</i></p>


<p><i>[LEWG Kona 2017]</i></p>

<p>Recommend NAD: Paper encouraged. Have papers for this; <a href="https://wg21.link/LEWG259">LEWG259</a>.</p>



<p><b>Rationale:</b></p>
<p>
The LWG is not considering completely new features for standardization at this
time.  We would like to revisit this good suggestion for a future TR and/or
standard.
</p>


<p id="res-1282"><b>Proposed resolution:</b></p>
<p>
Add to the synopsis in 26.1 <a href="https://wg21.link/algorithms.general">[algorithms.general]</a>:
</p>

<blockquote><pre>
template&lt; class ForwardIterator1, class ForwardIterator2, class OutputIterator &gt;
  void split( ForwardIterator1 first, ForwardIterator1 last,
              ForwardIterator2 delimiter_first, ForwardIterator2 delimiter_last,
              OutputIterator result );

template&lt; class ForwardIterator1, class CharType, class OutputIterator &gt;
  void split( ForwardIterator1 first, ForwardIterator1 last,
              const CharType * delimiters, OutputIterator result );
</pre></blockquote>

<p>
Add a new section [alg.split]:
</p>

<blockquote><pre>
template&lt; class ForwardIterator1, class ForwardIterator2, class OutputIterator &gt;
  void split( ForwardIterator1 first, ForwardIterator1 last,
              ForwardIterator2 delimiter_first, ForwardIterator2 delimiter_last,
              OutputIterator result );
</pre>

<blockquote>
<p>
1. <i>Effects:</i> splits the range <code>[first, last)</code> into parts, using any
element of <code>[delimiter_first, delimiter_last)</code> as a delimiter. Results
are pushed to output iterator in the form of <code>std::pair&lt;ForwardIterator1,
ForwardIterator1&gt;</code>. Each of these pairs specifies a maximal subrange of
<code>[first, last)</code> which does not contain a delimiter.
</p>
<p>
2. <i>Returns:</i> nothing.
</p>
<p>
3. <i>Complexity:</i> Exactly <code>last - first</code> assignments.
</p>
</blockquote>

<pre>
template&lt; class ForwardIterator1, class CharType, class OutputIterator &gt;
  void split( ForwardIterator1 first, ForwardIterator1 last,
              const CharType * delimiters, OutputIterator result );
</pre>

<blockquote>
<p>
1. <i>Effects:</i> split the range <code>[first, last)</code> into parts, using any
element of <code>delimiters</code> (interpreted as zero-terminated string) as a
delimiter. Results are pushed to output iterator in the form of
<code>std::pair&lt;ForwardIterator1, ForwardIterator1&gt;</code>. Each of these
pairs specifies a maximal subrange of <code>[first, last)</code> which does not
contain a delimiter.
</p>
<p>
2. <i>Returns:</i> nothing.
</p>
<p>
3. <i>Complexity:</i> Exactly <code>last - first</code> assignments.
</p>
</blockquote>

</blockquote>





</body>
</html>
