<html>

<head>
<meta http-equiv="Content-Language" content="en-us">
<meta name="GENERATOR" content="Microsoft FrontPage 5.0">
<meta name="ProgId" content="FrontPage.Editor.Document">
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<title>Lexical Conversion Library Proposal for TR2</title>
</head>

<body>

<p>Doc. no.&nbsp;&nbsp; WG21/N1973=06-0043<br>
Date:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
2006-04-10<br>
Project:&nbsp;&nbsp;&nbsp;&nbsp; Programming Language C++<br>
Reply to:&nbsp;&nbsp; Kevlin Henney &lt;<a href="mailto:kevlin@curbralan.com">kevlin@curbralan.com</a>&gt;<br>
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; Beman Dawes &lt;<a href="mailto:bdawes@acm.org">bdawes@acm.org</a>&gt;</p>
<h1>Lexical Conversion Library Proposal for TR2</h1>
<p><a href="#Introduction">Introduction</a><br>
<a href="#Motivation">Motivation and Scope</a><br>
<a href="#Impact">Impact on the Standard</a><br>
<a href="#Design">Important Design Decisions</a><br>
<a href="#Text">Proposed Text for TR2</a><br>
&nbsp;&nbsp;&nbsp; <a href="#Synopsis">Synopsis</a><br>
&nbsp;&nbsp;&nbsp; <a href="#lexical_cast">Function template lexical_cast</a><br>
&nbsp;&nbsp;&nbsp; <a href="#bad_lexical_cast">Class bad_lexical_cast</a></p>
<h2><a name="Introduction">Introduction</a></h2>
<p>This paper proposes the addition of a lexical conversion library component to the 
C++ Standard Library Technical Report 2. The proposal is based on the Boost 
Conversion Library's <code>lexical_cast</code> (<a href="http://www.boost.org/libs/conversion/lexical_cast.htm">www.boost.org/libs/conversion/lexical_cast.htm</a>).</p>
<p>The <code>lexical_cast</code> function template offers a convenient and 
consistent form for supporting common conversions to and from arbitrary types 
when they are represented as text. The Boost version of <code>lexical_cast</code> 
is very widely used. It would be a pure addition to the C++ standard.</p>
<p>Boost <code>lexical_cast</code> is particularly popular with end users. Five 
of six <a href="http://www.boost.org/doc/html/who_s_using_boost_/inhouse.html">
<i>Who's using Boost</i> in house</a> users list <code>lexical_cast</code> as 
one of the Boost libraries they use.</p>
<p>For a good discussion of the options and issues involved in string-based 
formatting, including comparison of <code>stringstream</code>, <code>
lexical_cast</code>, and others, see Herb Sutter's article,
<a href="http://www.gotw.ca/publications/mill19.htm"><i>The String Formatters of 
Manor Farm</i></a>.</p>
<p>Also see Bj&ouml;rn Karlsson, <i>Beyond the 
C++ Standard Library</i>, 73-77, Addison Wesley, ISBN 0-321-13354-4,
<a href="http://www.awprofessional.com/title/0321133544">
www.awprofessional.com/title/0321133544</a>.</p>
<h2><a name="Motivation">Motivation</a> and Scope</h2>
<p><b><i>Why is this important? </i></b></p>
<p>Sometimes a value must be converted to a literal text form, such as an <code>
int</code> represented as a <code>string</code>, or vice-versa, when a <code>
string</code> is interpreted as an <code>int</code>. Such examples are common 
when converting between data types internal to a program and representation 
external to a program, such as windows and configuration files. </p>
<p>The standard C and C++ libraries offer a number of facilities for performing 
such conversions. However, they vary in their ease of use, extensibility, and 
safety. </p>
<p>For instance, there are a number of limitations with the family of standard C 
functions typified by <code>atoi</code>: </p>
<ul type="square">
  <li>Conversion is supported in one direction only: from text to internal data 
  type. Converting the other way using the C library requires either the 
  inconvenience and compromised safety of the <code>sprintf</code> function, or 
  the loss of portability associated with non-standard functions such as <code>
  itoa</code>. </li>
  <li>The range of types supported is only a subset of the built-in numeric 
  types, namely <code>int</code>, <code>long</code>, and <code>double</code>.
  </li>
  <li>The range of types cannot be extended in a uniform manner. For instance, 
  there is no uniform way to add conversion from a string representation to 
  <code>complex</code> or <code>rational</code>. </li>
</ul>
<p>The standard C functions typified by <code>strtol</code> have the same basic 
limitations, but offer finer control over the conversion process. However, for 
the common case such control is often either not required or not used. The <code>
scanf</code> family of functions offer even greater control, but also lack 
safety and ease of use. </p>
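<p>To make the trade-off concrete, the following sketch contrasts <code>atoi</code> 
with a checked call to <code>strtol</code>. The helper name <code>parse_long_checked</code> 
is hypothetical and is not part of any library; it only illustrates how much 
code the finer control costs in the common case.</p>

```cpp
#include <cassert>
#include <cerrno>
#include <cstdlib>

// Hypothetical helper: converts text to long with std::strtol and reports
// failure, which std::atoi cannot do.
bool parse_long_checked(const char* text, long& out)
{
    char* end = nullptr;
    errno = 0;
    const long value = std::strtol(text, &end, 10);
    if (end == text)     return false; // no digits were consumed
    if (*end != '\0')    return false; // trailing characters remain
    if (errno == ERANGE) return false; // value does not fit in a long
    out = value;
    return true;
}

// Usage: atoi yields 0 for both "0" and "zero"; the checked form tells
// the two cases apart.
void c_conversion_example()
{
    assert(std::atoi("0") == 0);
    assert(std::atoi("zero") == 0); // failure looks like a genuine 0

    long n = -1;
    assert(parse_long_checked("0", n) && n == 0);
    assert(!parse_long_checked("zero", n));
    assert(!parse_long_checked("12abc", n));
}
```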
<p>The standard C++ library offers <code>stringstream</code> for the kind of 
in-core formatting being discussed. It offers a great deal of control over the 
formatting and conversion of I/O to and from arbitrary types through text. 
However, for simple conversions direct use of <code>stringstream</code> can be 
either clumsy (with the introduction of extra local variables and the loss of 
infix-expression convenience) or obscure (where <code>stringstream</code> 
objects are created as temporary objects in an expression). Facets provide a 
comprehensive concept and facility for controlling textual representation, but 
their perceived complexity and high entry level require an extreme degree of 
involvement for simple conversions, and exclude all but a few programmers. </p>
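<p>As a rough illustration of the clumsy form, direct use of a string stream 
needs a named stream object and separate statements for each step. The helper 
names <code>parse_int</code> and <code>format_int</code> below are hypothetical.</p>

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical helper: text -> int through a named std::istringstream.
bool parse_int(const std::string& text, int& out)
{
    std::istringstream in(text);          // extra local variable
    if (!(in >> out))                     // extraction is its own statement
        return false;
    // Succeed only if the whole text was consumed.
    return in.get() == std::char_traits<char>::eof();
}

// Hypothetical helper: int -> text through a named std::ostringstream.
std::string format_int(int value)
{
    std::ostringstream out;               // extra local variable
    out << value;
    return out.str();
}

void stringstream_example()
{
    int n = 0;
    assert(parse_int("42", n) && n == 42);
    assert(!parse_int("42 apples", n));
    assert(format_int(7) + "!" == "7!");
}
```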
<p>The <code>lexical_cast</code> function template offers a convenient and 
consistent form for supporting common conversions to and from arbitrary types 
when they are represented as text. The simplification it offers is in 
expression-level convenience for such conversions. For more involved 
conversions, such as where precision or formatting need tighter control than is 
offered by the default behavior of <code>lexical_cast</code>, the conventional
<code>stringstream</code> approach is recommended. Where the conversions are 
numeric to numeric, other approaches may offer more reasonable behavior than <code>
lexical_cast</code>. </p>
<p><b><i>What kinds of problems does it address, and what kinds of programmers 
is it intended to support?</i></b></p>
<p>The library addresses everyday needs, for both application programs and 
libraries. It is useful across many application domains. It is useful to all 
levels of programmers, from rank beginners to seasoned experts.</p>
<p><b><i>Is it based on existing practice? Is there a reference implementation?</i></b></p>
<p>Yes, very much so. It has been a mainstay of Boost for many years.</p>
<h2><a name="Impact">Impact</a> on the Standard</h2>
<p><b><i>What does it depend on, and what depends on it?</i></b></p>
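<p>It depends on existing standard library components, notably 
<code>std::basic_stringstream</code>, <code>std::numeric_limits</code>, and 
<code>std::bad_cast</code>. No other proposals depend on it.</p>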
<p><b><i>Is it a pure extension, or does it require changes to standard 
components?</i></b></p>
<p>It is a pure extension.</p>
<p><b><i>Can it be implemented using today's compilers, or does it require 
language features that will only be available as part of C++0x?</i></b></p>
<p>It can be (and has been) implemented with current compilers, and also many 
older compilers.</p>
<h2>Important <a name="Design">Design</a> Decisions</h2>
<h3>FAQ</h3>
<p><b>Why is the &lt;&lt; plus &gt;&gt; analogy broken for the <code>std::string</code> 
special case?</b></p>
<p>The default asymmetric behavior of I/O for strings is often a cause for 
surprise amongst novices and, when wrapped inside <code>
lexical_cast</code>, experts as well. Converting from a string and back again is 
expected to be an identity operation, which is what is now supported. This 
expectation is important, and the response is to make the behavior consistent 
with the intent of the conversion rather than its underlying implementation. 
Over time, <code>
lexical_cast</code> has become more symmetric with respect to its conversions.<br>
<br>
There is also a little bit of handling to ensure that numeric types do not lose 
precision. Again, the I/O stream defaults are not what many people would expect. 
And then there is special support for wchar_t&lt;-&gt;char conversions, because again 
I/O streams don't quite do the right thing. We are not in a position to change 
I/O streams at this late stage, but something like <code>
lexical_cast</code> is not required to repeat those little surprises.</p>
<p>Before these changes, Boost regularly received complaints and bug reports 
about <code>
lexical_cast</code> behavior. Once the changes were made, complaints and bug 
reports stopped.</p>
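<p>The string asymmetry can be reproduced with a plain <code>std::stringstream</code>. 
The function names below are hypothetical, and reading back the whole buffer 
with <code>str()</code> is only one possible way to obtain an identity 
conversion; an implementation may do it differently.</p>

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Writes a string into a stream and reads it back with the default
// operator>>, which stops at the first whitespace character.
std::string round_trip_default(const std::string& text)
{
    std::stringstream stream;
    stream << text;
    std::string result;
    stream >> result;
    return result;
}

// Reads back the whole buffer instead, so the round trip is an identity
// operation for strings.
std::string round_trip_whole(const std::string& text)
{
    std::stringstream stream;
    stream << text;
    return stream.str();
}

void string_symmetry_example()
{
    assert(round_trip_default("Hello, World") == "Hello,"); // content lost
    assert(round_trip_whole("Hello, World") == "Hello, World");
}
```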
<p><b>I don't like the name. Why don't you change it?</b></p>
<p>Suggestions are always welcome. However, until something better comes along, the 
proposal authors don't believe that there is sufficient reason to change from <code>
lexical_cast</code>, which is very well established, used in books and other 
teaching material, and does not seem to cause confusion among real users.</p>
<p><b>Since either the source or the target is usually a string, why not provide 
separate <code>to_string(x)</code> and <code>string_to&lt;T&gt;(x)</code> functions?</b></p>
<p>The source or target isn't always a string. Furthermore, the from/to idea 
cannot be expressed in a simple and consistent form. The illusion is that they 
are easier than <code>
lexical_cast</code> because of the name. This is theory. The practice is that 
the two forms, although similarly and symmetrically named, are not at all 
similar in use: one requires explicit provision of a template parameter and the 
other not. This is a simple usability pitfall that is guaranteed to catch 
experienced and inexperienced users alike -- the only difference being that the 
experienced user will know what to do with the error message.</p>
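<p>The usability difference between the two forms can be sketched as follows. 
The names <code>to_string_sketch</code> and <code>string_to_sketch</code> are 
hypothetical stand-ins for the suggested functions, and error handling is 
omitted for brevity.</p>

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical to_string: the argument type is deduced from the call.
template<typename T>
std::string to_string_sketch(const T& value)
{
    std::ostringstream out;
    out << value;
    return out.str();
}

// Hypothetical string_to: the result type cannot be deduced from the
// argument, so the caller must spell it out.
template<typename T>
T string_to_sketch(const std::string& text)
{
    std::istringstream in(text);
    T value = T();
    in >> value;      // failures are ignored in this sketch
    return value;
}

void naming_example()
{
    std::string s = to_string_sketch(42);   // no template argument
    int n = string_to_sketch<int>(s);       // template argument required
    // int m = string_to_sketch(s);         // would not compile: T is not deducible
    assert(s == "42" && n == 42);
}
```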
<h3>Change history</h3>
<ul type="square">
  <li>Early versions of Boost <code>lexical_cast</code> used the default stream 
  precision for reading and writing floating-point numbers. For numerics that 
  have a corresponding specialization of <code>std::numeric_limits</code>, 
  recent Boost versions and the proposal choose a precision to match. <br>
&nbsp;</li>
  <li>Early versions of Boost <code>lexical_cast</code> did not support 
  conversion to or from any wide-character-based types. Recent Boost versions 
  and the proposal support conversions from <code>wchar_t</code>, <code>wchar_t 
  *</code>, and <code>std::wstring</code> and to <code>wchar_t</code> and <code>
  std::wstring</code>. <br>
&nbsp;</li>
  <li>Early versions of Boost <code>lexical_cast</code> assumed that the 
  conventional stream extractor operators were sufficient for reading values. 
  However, string I/O is asymmetric, with the result that spaces play the role 
  of I/O separators rather than string content. Recent Boost versions and the 
  proposal fix this error for <code>std::string</code> and <code>std::wstring</code>:
  <code>lexical_cast&lt;std::string&gt;(&quot;Hello, World&quot;)</code> succeeds instead of 
  failing with a <code>bad_lexical_cast</code> exception. <br>
&nbsp;</li>
  <li>Early versions of Boost <code>lexical_cast</code> allowed unsafe and 
  meaningless conversions to pointers. Recent Boost versions and the proposal 
  throw <code>bad_lexical_cast</code> for conversions to pointers: <code>
  lexical_cast&lt;char *&gt;(&quot;Goodbye, World&quot;)</code> throws an exception instead of 
  causing undefined behavior. </li>
</ul>
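<p>The effect of the precision change in the first item can be seen with a 
plain output stream. This is a minimal sketch that assumes an IEEE 754 <code>
double</code>, for which <code>digits10</code> is 15; the function name <code>
format_double</code> is hypothetical.</p>

```cpp
#include <cassert>
#include <limits>
#include <sstream>
#include <string>

// Formats a double, optionally raising the stream precision to
// numeric_limits<double>::digits10 + 1 instead of the stream default.
std::string format_double(double value, bool match_type_precision)
{
    std::ostringstream out;
    if (match_type_precision)
        out.precision(std::numeric_limits<double>::digits10 + 1);
    out << value;
    return out.str();
}

void precision_example()
{
    const double pi = 3.141592653589793;
    // The default stream precision is 6 significant digits.
    assert(format_double(pi, false) == "3.14159");
    // With digits10 + 1 (16 for an IEEE 754 double) the digits survive.
    assert(format_double(pi, true) == "3.141592653589793");
}
```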
<hr>
<h2>Proposed <a name="Text">Text</a> for Technical Report 2</h2>
<p><span style="font-style: italic; background-color: #C0C0C0">Text in gray is 
commentary and not part of the proposed text.</span></p>
<hr>
<h2><a name="Synopsis">Synopsis</a></h2>
<p><span style="font-style: italic; background-color: #C0C0C0">Choice of a new 
or existing header is deferred pending outcome of other conversion related 
proposals.</span></p>
<pre>namespace std
{
  namespace tr2
  {
    class <a href="#bad_lexical_cast">bad_lexical_cast</a>;
    template&lt;typename Target, typename Source&gt;
      Target <a href="#lexical_cast">lexical_cast</a>(const Source&amp; arg);
  }
}</pre>
<h2>Function template <code><a name="lexical_cast">lexical_cast</a></code></h2>
<p>The <code>lexical_cast</code> function template supplies common conversions 
to and from arbitrary types represented as text, providing expression-level 
convenience for such conversions.</p>
<p>The requirements on the argument and result types are: </p>
<ul type="square">
  <li><code>Source</code> is <i>OutputStreamable</i>, meaning that an <code>
  operator&lt;&lt;</code> is defined that takes a <code>std::ostream</code> or <code>
  std::wostream</code> object on the left hand side and an instance of the 
  argument type on the right. </li>
  <li><code>Target</code> is <i>InputStreamable</i>, meaning that an <code>
  operator&gt;&gt;</code> is defined that takes a <code>std::istream</code> or <code>
  std::wistream</code> object on the left hand side and an instance of the 
  result type on the right. </li>
  <li>Both <code>Source</code> and <code>Target</code> are <i>CopyConstructible</i> 
  [20.1.3]. </li>
  <li><code>Target</code> is <i>DefaultConstructible</i>, meaning that it is 
  possible to <i>default-initialize</i> an object of that type [8.5, 20.1.4].</li>
</ul>
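<p>A user-defined type that meets these requirements might look like the 
following sketch. The type <code>Percent</code> and its textual form are 
hypothetical and serve only to show the four requirements together.</p>

```cpp
#include <cassert>
#include <istream>
#include <ostream>
#include <sstream>

// Hypothetical user-defined type.
struct Percent
{
    int value;
    Percent() : value(0) {}                 // DefaultConstructible
    explicit Percent(int v) : value(v) {}   // copy constructor is implicit
};

// OutputStreamable: writes text such as "75%".
std::ostream& operator<<(std::ostream& out, const Percent& p)
{
    return out << p.value << '%';
}

// InputStreamable: reads an int followed by '%', and sets failbit otherwise.
std::istream& operator>>(std::istream& in, Percent& p)
{
    int v = 0;
    char sign = '\0';
    if (in >> v >> sign && sign == '%')
        p.value = v;
    else
        in.setstate(std::ios_base::failbit);
    return in;
}

void streamable_type_example()
{
    std::stringstream stream;
    stream << Percent(75);
    Percent p;
    stream >> p;
    assert(!stream.fail() && p.value == 75);
}
```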
<p> <code>lexical_cast</code> behavior is specified in terms of <code>operator&lt;&lt;</code> 
and <code>operator&gt;&gt;</code> on a <code>std::basic_stringstream</code> 
object. Implementations are not required to actually use a <code>std::basic_stringstream</code> 
object to achieve the required behavior. Implementations are permitted to 
provide specializations of the <code>lexical_cast</code> template.</p>
<blockquote>
  <p>[<i>Note: </i>Implementations may use this &quot;as if&quot; leeway to achieve 
  efficiency. <i>-- end note.</i>]</p>
</blockquote>
<pre>template&lt;typename Target, typename Source&gt;
  Target lexical_cast(const Source&amp; arg);</pre>
<blockquote>
  <p><i>Effects:</i> </p>
  <ul>
    <li>Inserts <code>arg </code>into an empty <code>std::basic_stringstream</code> 
    object via&nbsp; <code>operator&lt;&lt;</code>.</li>
    <li>Extracts the result, of type <code>Target</code>, from the <code>std::basic_stringstream</code> 
    object via <code>operator&gt;&gt;</code>.</li>
  </ul>
<p><i>Throws:</i>
<code>
bad_lexical_cast</code> if:</p>
  <ul>
    <li><code>Target</code> is a pointer type.</li>
    <li><code>fail()</code> for the <code>std::basic_stringstream</code> object 
    is true after either <code>operator&lt;&lt;</code> or <code>operator&gt;&gt;</code> is 
    applied.</li>
    <li><code>get()</code> for the <code>std::basic_stringstream</code> object 
    is not <code>std::char_traits&lt;char_type&gt;::eof()</code> after both <code>
    operator&lt;&lt;</code> and <code>operator&gt;&gt;</code> are applied.</li>
  </ul>
<p><i>Returns: </i>The result as created by the effects.</p>
<p><i>Remarks:</i> If <code>
Target</code> is either <code>std::string</code> or <code>std::wstring</code>, 
stream extraction takes the whole content of the string, including spaces, 
rather than relying on the default <code>operator&gt;&gt;</code> behavior.</p>
<p>The character type of the underlying stream is assumed to be <code>char</code> 
unless either the <code>Source</code> or the <code>Target</code> requires 
wide-character streaming, in which case the underlying stream uses <code>wchar_t</code>.
<code>Source</code> types that require wide-character streaming are <code>
wchar_t</code>, <code>wchar_t *</code>, and <code>std::wstring</code>. <code>
Target</code> types that require wide-character streaming are <code>wchar_t</code> 
and <code>std::wstring</code>. </p>
<p>If <code>std::numeric_limits&lt;Target&gt;::is_specialized</code>, the underlying 
stream precision is set according to <code>std::numeric_limits&lt;Target&gt;::digits10
</code>+ 1, otherwise if <code>std::numeric_limits&lt;Source&gt;::is_specialized</code>, 
the underlying stream precision is set according to <code>std::numeric_limits&lt;Source&gt;::digits10
</code>+ 1.</p>
<p>[<i>Note: </i>Where a higher degree of control is required over conversions, <code>
std::stringstream</code> and <code>std::wstringstream</code> offer a more 
appropriate path. Where non-stream-based conversions are required, <code>
lexical_cast</code> is the wrong tool for the job and is not special-cased for 
such scenarios. <i>-- end note.</i>]</p>
</blockquote>
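<p>The specification above can be approximated by the following sketch. It is 
not the Boost implementation and not proposed text. The namespace <code>sketch</code> 
and the helper <code>extract</code> are hypothetical, only narrow-character 
streams are handled, and the rule for pointer types is omitted. The use of <code>
std::decay</code> is an assumption of this sketch, made so that string 
literals (array types) can be passed as the source.</p>

```cpp
#include <cassert>
#include <limits>
#include <sstream>
#include <string>
#include <type_traits>
#include <typeinfo>

namespace sketch {

// Stand-in for the proposed exception class.
class bad_lexical_cast : public std::bad_cast
{
public:
    const char* what() const noexcept override { return "bad lexical cast"; }
};

// General case: extract with operator>> and require that nothing is left.
template<typename Target>
bool extract(std::stringstream& stream, Target& result)
{
    return !(stream >> result).fail()
        && stream.get() == std::char_traits<char>::eof();
}

// std::string target: take the whole content, including spaces.
inline bool extract(std::stringstream& stream, std::string& result)
{
    result = stream.str();
    return true;
}

template<typename Target, typename Source>
Target lexical_cast(const Source& arg)
{
    typedef typename std::decay<Source>::type DecayedSource;

    std::stringstream stream;
    // Precision follows the Target if it is numeric, otherwise the Source.
    if (std::numeric_limits<Target>::is_specialized)
        stream.precision(std::numeric_limits<Target>::digits10 + 1);
    else if (std::numeric_limits<DecayedSource>::is_specialized)
        stream.precision(std::numeric_limits<DecayedSource>::digits10 + 1);

    Target result = Target();
    if ((stream << arg).fail() || !extract(stream, result))
        throw bad_lexical_cast();
    return result;
}

} // namespace sketch

void lexical_cast_example()
{
    using sketch::lexical_cast;
    assert(lexical_cast<int>("42") == 42);
    assert(lexical_cast<std::string>(42) == "42");
    assert(lexical_cast<std::string>("Hello, World") == "Hello, World");

    bool threw = false;
    try { lexical_cast<int>("42 apples"); }        // trailing text remains
    catch (const sketch::bad_lexical_cast&) { threw = true; }
    assert(threw);
}
```

<p>Under this sketch a numeric-to-numeric call such as <code>lexical_cast&lt;int&gt;(3.7)</code> 
throws, because the text <code>3.7</code> is not entirely consumed when an <code>
int</code> is extracted.</p>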
<h3>Class <code><a name="bad_lexical_cast">bad_lexical_cast</a></code></h3>
  <pre>namespace std
{
  namespace tr2
  {
    class bad_lexical_cast : public std::bad_cast
    {
    public:
      bad_lexical_cast () throw ();
      bad_lexical_cast ( const bad_lexical_cast &amp;) throw ();
      bad_lexical_cast &amp; operator =( const bad_lexical_cast &amp;) throw ();
      virtual const char * what () const throw ();
    };
  }
}</pre>
  <p><i><span style="background-color: #C0C0C0">The virtual destructor is not 
  shown, following the practice of 18.5.2 Class bad_cast [lib.bad.cast].</span></i><br>
</p>
<p>The class <code>bad_lexical_cast</code> defines the type of objects thrown as 
exceptions by the implementation to report runtime
<code>lexical_cast</code> failure. </p>
<pre>bad_lexical_cast () throw ();</pre>
<blockquote>
  <p><i>Effects:</i> Constructs an object of class <code>bad_lexical_cast</code>.</p>
  <p><i>Remarks:</i> The result of calling <code>what()</code> on the newly 
  constructed object is implementation-defined.</p>
</blockquote>
<pre>bad_lexical_cast ( const bad_lexical_cast &amp;) throw ();
bad_lexical_cast &amp; operator =( const bad_lexical_cast &amp;) throw ();</pre>
<blockquote>
  <p><i>Effects:</i> Copies an object of class <code>bad_lexical_cast</code>.</p>
</blockquote>
<pre>virtual const char * what () const throw ();</pre>
<blockquote>
  <p><i>Returns:</i> An implementation-defined NTBS.</p>
  <p><i>Remarks:</i> The message may be a null-terminated multibyte string 
  (17.3.2.1.3.2), suitable for conversion and display as a wstring (21.2, 
  22.2.1.4).</p>
</blockquote>
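<p>Because the class derives from <code>std::bad_cast</code>, existing 
handlers for <code>std::bad_cast</code> or <code>std::exception</code> also 
catch it. The sketch below uses a hypothetical stand-in class, <code>
bad_lexical_cast_sketch</code>, and a hypothetical function <code>fail_conversion</code>; 
the message text is an arbitrary choice, since the real one is 
implementation-defined.</p>

```cpp
#include <cassert>
#include <cstring>
#include <exception>
#include <typeinfo>

// Stand-in for the proposed class. noexcept is the current spelling of the
// empty throw() specification used in the synopsis.
class bad_lexical_cast_sketch : public std::bad_cast
{
public:
    bad_lexical_cast_sketch() noexcept {}
    const char* what() const noexcept override
    {
        return "bad lexical cast"; // arbitrary message for this sketch
    }
};

// Hypothetical function that reports a failed conversion.
void fail_conversion()
{
    throw bad_lexical_cast_sketch();
}

void exception_example()
{
    bool caught_as_bad_cast = false;
    try { fail_conversion(); }
    catch (const std::bad_cast& e) {
        caught_as_bad_cast = std::strlen(e.what()) > 0;
    }
    assert(caught_as_bad_cast);

    bool caught_as_exception = false;
    try { fail_conversion(); }
    catch (const std::exception&) { caught_as_exception = true; }
    assert(caught_as_exception);
}
```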
<hr>
<p> Copyright Kevlin Henney 2000-2005<br>
 Copyright Beman Dawes 2006</p>
<p>Last revised:
2006-04-10</p>

</body>

</html>
