<HTML>
<HEAD>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=windows-1252">
<META NAME="Generator" CONTENT="Microsoft Word 97">
<TITLE>Document number N1394=02-0052, for the Evolution working group</TITLE>
</HEAD>
<BODY LINK="#0000ff" VLINK="#800080">

<P ALIGN="CENTER">Document number N1394=02-0052 2002-Sept-10</P>
<P ALIGN="CENTER">Some Proposed Extensions to the C++ Language</P>
<P ALIGN="CENTER">For Evolution Working Group</P>
<P ALIGN="CENTER">David E. Miller</P>
<FONT FACE="Arial"><H3>1. "Finally" block</H3>
</FONT><B><I><FONT FACE="Arial" SIZE=2><P>Problem(s) to be addressed</P>
</B></I></FONT><FONT SIZE=2><P>Factoring block completion code into a single place.</P>
<P>Easy instrumentation (retrofitting) of existing code to track function returns, as well as easier normalization of existing code.</P>
</FONT><B><I><FONT FACE="Arial" SIZE=2><P>Why this is not solved by use of simple destructors associated with the variables declared within the blocks</P>
</B></I><P>It often occurs that some final action depends upon the combined state of multiple variables, so the destructors associated with individual variables are to no avail.</P>
<B><I><P>How the problem is partially solved now</P>
</B></I><P>Factoring common code into a single function (including the special case of a local class destructor)</P>
<P>Repeating loose code for each possible point of exit.</P>
<B><P>For the case of the current try-catch block code, there are several approaches to executing the same code for multiple catch blocks and for the case of no exception being thrown</P>
</B></FONT><FONT SIZE=2><P>Factor the code into a separate function, which is called by each catch block and after the last catch block, taking into account that if a catch block does not throw, the common function must not be called twice. This can be done by using a flag existing outside the try-catch block or by executing a "goto" statement from each such catch block past the non-throw invocation of the common function.</P>
<P>As a variation of this approach, a local class can be declared, an instance of which can be initialized with references to the variables to be handled, with the local class destructor performing the needed actions.</P>
<P>Repeating code in each catch block, as well as after the last catch block, in a similar fashion to using a function.</P>
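<P>The local-class variation can be sketched as follows. This is only an illustration under stated assumptions: the names process, Completion, and g_log are hypothetical and do not come from any existing code base, and the "commit" and "rollback" strings merely stand in for a real completion action that depends on the combined state of two variables.</P>

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical log of completion actions, used only to make the effect visible.
std::vector<std::string> g_log;

// Hypothetical operation: mode 0 completes normally, mode 1 throws an
// exception handled locally, mode 2 throws one that propagates to the caller.
int process(int mode)
{
    bool opened = false;
    bool committed = false;

    // Local class initialized with references to the variables whose
    // combined state determines the completion action.
    struct Completion
    {
        bool& m_opened;
        bool& m_committed;
        ~Completion()
        {
            if (m_opened && m_committed)
                g_log.push_back("commit");
            else if (m_opened)
                g_log.push_back("rollback");
        }
    } completion = { opened, committed };

    int result = 0;
    try
    {
        opened = true;
        if (mode == 1) throw std::runtime_error("handled here");
        if (mode == 2) throw std::logic_error("propagates to caller");
        committed = true;
    }
    catch (const std::runtime_error&)
    {
        result = 1;
    }
    return result;   // the destructor of completion runs on every exit path
}
```

<P>In this sketch the completion code is written once and runs whether the block finishes normally, a catch block handles the exception, or the exception propagates to a caller that catches it. The cost is the extra class declaration and the reference plumbing.</P>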
</FONT><B><I><FONT FACE="Arial" SIZE=2><P>Advantages and disadvantages of each approach</P>
</B></I></FONT><FONT SIZE=2><P>An advantage of putting the code into a function is the guarantee that the multiple uses will not get out of sync if the code needs to be changed.</P>
<P>A disadvantage of putting the code into a function is the inability of the function's code to directly execute a return statement from the point of invocation, thus requiring extra code to provide for this possibility.</P>
<P>For example, loose code could say, "if( condition ) return some_value;" whereas an invoked function could do no more than return a flag indicating that the caller should then return, which brings back the loose code it was intended to avoid.</P>
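<P>The flag-returning workaround can be sketched as follows. The names finish and compute, and the particular conditions tested, are hypothetical and chosen only to make the control flow visible.</P>

```cpp
// Hypothetical common completion code factored into a function. It cannot
// return from its caller, so it reports through a flag and an out-parameter
// whether the caller should return, and with what value.
bool finish(int state, int& value_out)
{
    if (state < 0)
    {
        value_out = -1;   // the value the loose code would have returned directly
        return true;      // tell the caller to return
    }
    return false;         // the caller continues
}

int compute(int state)
{
    int value = 0;
    try
    {
        if (state % 2 == 0) throw state;   // even states take the exception path
    }
    catch (int)
    {
        if (finish(state, value)) return value;   // extra code at this call site
        return 1;
    }
    if (finish(state, value)) return value;       // and again after the last catch block
    return 2;
}
```

<P>Each call site still needs its own "if( flag ) return value;" line, which is the loose code the function was meant to eliminate.</P>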
</FONT><B><I><FONT FACE="Arial" SIZE=2><P>Mechanical instrumentation of existing code</P>
</B></I></FONT><FONT SIZE=2><P>It is convenient for debugging and maintenance purposes, particularly when investigating code written by others who may be long gone, to be able to track exits from functions and to be able to guarantee that a single breakpoint can be placed into each function at the point of return, rather than having to spend extra time tracking down every possible return or throw statement.</P>
</FONT><FONT FACE="Arial" SIZE=2><P>Current approach</P>
<B><P>Insert an instance of a locally defined class at the start of each function, using the destructor to execute some useful code.</P>
<I><P>Proposed approach</P>
</B></I></FONT><FONT SIZE=2><P>Add a new "finally" keyword having effect similar (not necessarily identical) to that of Java or C#.</P>
<P>Appended to try-catch blocks, "finally" would result in a block executed regardless of which (if any) "catch" block associated with the preceding "try" block was executed.</P>
<P>Exiting a try-catch block by use of a "goto" to a point after the "finally" block would bypass the "finally" block. This construction should elicit some warning message from the compiler.</P>
<P>Exiting a try-catch block by use of a return statement would NOT bypass the "finally" block. In the event a "finally" block executed as a result of a return statement itself executes a return statement, the value specified in a return statement executed within the "finally" block would supersede that of the original return statement. In the event of exiting multiple "finally" blocks with more than one executing a value return statement, the last return value would prevail.</P>
<P>As in C#, a "finally" block could be used with a "try" block lacking any "catch" block, in which case its behavior would be equivalent to a try-catch-finally block in which the sole catch block consisted of a rethrown generic exception.</P>
</FONT><B><FONT FACE="Arial" SIZE=2><P>try<BR>
{<BR>
}<BR>
catch(...) // This catch block could be omitted without change of effect</P>
<P>{</P><DIR>
<DIR>

<P>throw ;</P></DIR>
</DIR>

<P>}</P>
<P>finally</P>
<P>{</P>
<P>}</P>
</B></FONT><FONT SIZE=2><P>Multiple "finally" blocks would be allowed for a single "try" block.</P>
<P>Because "catch" blocks could be omitted, it becomes very easy to automate the process of instrumenting existing code by adding "try" to the function start and "finally" to the function end.</P>
<P>Example</P>
<P>void f() try<BR>
{<BR>
}<BR>
<B>finally</B><BR>
{<BR>
         cout &lt;&lt; "returning from f()" &lt;&lt; endl ; // convenient location for breakpoint<BR>
}</P>
</FONT><B><I><FONT FACE="Arial" SIZE=2><P>Impact upon backward compatibility</P>
</B></I><P>Additional keyword</P>
</FONT><FONT SIZE=2><P>The introduction of a new keyword would require that any code already using the new keyword as an identifier be modified by a global search-and-replace operation to change the name to something else. However, since this would not require any structural change, the impact would be less than, for example, the change in "for" variable scope.</P>
<B><P>Impact upon compiler</P>
</B><P>Since the "finally" concept has already been implemented in Java, C#, and at least one C++ compiler (as a non-standard extension), albeit with somewhat different details than those in this proposal, one may expect the impact to be limited.</P>
</FONT><B><FONT FACE="Arial"><P>&nbsp;</P>
</FONT><FONT FACE="Arial" SIZE=4><P>2. New keyword "shared"</P>
</FONT><I><FONT FACE="Arial" SIZE=2><P>Problem(s) to be addressed</P>
</B></I><P>Currently, there is no uniform way to specify that data are, or are not, shared among multiple threads, which often leads to code that inadvertently fails to lock, or doubly locks, data. In the former case, non-deterministic data corruption can occur; in the latter case, deadlocks can occur.</P>
<B><I><P>Proposed approach</P>
</B></I><P>Add a "shared" keyword having several characteristics:</P>
<OL>

<B><LI>Shared primitive data types are not usable, except for address purposes (pointer and reference).</LI>
</B></FONT><FONT SIZE=2><P>Note how this differs from the "volatile" modifier, in which a volatile primitive can be used in an ordinary manner.</P>
</FONT><B><FONT FACE="Arial" SIZE=2><LI>Non-shared object instances can be acted upon only by functions not having the "shared" modifier.</LI>
</B></FONT><FONT SIZE=2><P>Note how this differs from the "volatile" modifier, in which a non-volatile object can be acted upon by a function expecting a volatile.</P>
</FONT><B><FONT FACE="Arial" SIZE=2><LI>Shared object instances can be acted upon only by functions having the "shared" modifier.</LI></OL>

</B></FONT><FONT SIZE=2><P>This is similar to the current "volatile" modifier.</P>
<P>&nbsp;</P>
</FONT><B><I><FONT FACE="Arial" SIZE=2><P>Usage examples</P>
</B></I><P>class C</P>
<P>{</P><DIR>
<DIR>

<P>MutexObject m_lock ;</P>
<P>int m_int ;</P>
<P>void f() shared</P>
<P>{</P><DIR>
<DIR>

<P>Locker locker1( m_lock );</P>
<P>C * pThisUnshared = const_cast&lt; C * &gt;( this );</P>
<P>pThisUnshared-&gt;f();</P></DIR>
</DIR>

<P>}</P>
<P>void f()</P>
<P>{</P><DIR>
<DIR>

<P>this-&gt;m_int = ... ;</P></DIR>
</DIR>

<P>}</P></DIR>
</DIR>

<P>};</P>
<P>&nbsp;</P>
<B><I><P>Similarities to, and differences from, current "volatile" keyword</P>
</B></I><P>The "volatile" keyword does not preclude a volatile primitive type from being used.</P>
<P>There is no way to prevent a volatile function from being inadvertently invoked on a non-volatile object, other than by preemptively overloading each such function with a non-volatile variation, which may not be a desirable approach for obvious reasons.</P>
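<P>Because "shared" does not exist, the usage pattern can only be approximated today. The sketch below substitutes "volatile" for the proposed qualifier. MutexObject and Locker are hypothetical stand-ins for the types named in the usage example (this MutexObject merely counts acquisitions so the effect can be observed), and the accessors value and lockCount are additions made for illustration.</P>

```cpp
// Hypothetical stand-in: counts acquisitions instead of wrapping a real
// platform mutex.
class MutexObject
{
public:
    MutexObject() : m_acquisitions(0) {}
    void lock()   { ++m_acquisitions; }
    void unlock() {}
    int acquisitions() const { return m_acquisitions; }
private:
    int m_acquisitions;
};

// Hypothetical scoped lock: acquires in the constructor, releases in the destructor.
class Locker
{
public:
    explicit Locker(MutexObject& m) : m_mutex(m) { m_mutex.lock(); }
    ~Locker() { m_mutex.unlock(); }
private:
    Locker(const Locker&);              // not copyable
    Locker& operator=(const Locker&);
    MutexObject& m_mutex;
};

class C
{
public:
    C() : m_int(0) {}

    // "volatile" stands in for the proposed "shared" qualifier: this overload
    // takes the lock, then forwards to the unqualified overload.
    void f(int value) volatile
    {
        C* pThisUnshared = const_cast<C*>(this);
        Locker locker1(pThisUnshared->m_lock);
        pThisUnshared->f(value);
    }

    // Unqualified overload: the caller is assumed to have exclusive access.
    void f(int value) { m_int = value; }

    int value() const     { return m_int; }
    int lockCount() const { return m_lock.acquisitions(); }

private:
    MutexObject m_lock;
    int m_int;
};
```

<P>In this sketch the object itself is defined without the qualifier and only the reference through which other code reaches it is qualified, for example "C c; volatile C&amp; sharedView = c; sharedView.f(7);". That detail matters because casting away "volatile" from an object that was itself defined volatile and then accessing it has undefined behavior. The unqualified overload of f is also what keeps "c.f(9)" from taking the lock; without it the volatile version would be selected silently, which is the gap the second proposed rule is meant to close.</P>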
<B><I><P>Impact upon backward compatibility</P>
</B></I><P>Additional keyword</P>
</FONT><FONT SIZE=2><P>The introduction of a new keyword would require that any code already using the new keyword as an identifier be modified by a global search-and-replace operation to change the name to something else. However, since this would not require any structural change, the impact would be less than, for example, the change in "for" variable scope.</P>
</FONT><B><I><FONT FACE="Arial" SIZE=2><P>Impact upon compiler</P>
</B></I></FONT><FONT SIZE=2><P>Since there is already a mechanism for dealing with const and volatile, adding another variation should not involve a great addition to compiler complexity.</P>
</FONT><B><I><FONT FACE="Arial"><P>&nbsp;</P>
</I></FONT><FONT FACE="Arial" SIZE=4><P>3. Generalization of hexadecimal number specification (0X...) to allow specifying the radix (nX..., where n is 1-16)</P>
</FONT><I><FONT FACE="Arial" SIZE=2><P>Problem(s) to be addressed</P>
</B></I></FONT><FONT SIZE=2><P>More flexible initialization of values that might be more readily expressed in some base other than 8, 10, or 16.</P>
<P>Uniform specification of based numbers, rather than the ad hoc decimal, octal, hexadecimal syntaxes presently in use.</P>
<P>Simple way to specify bit counts and, by implication, parity, without invoking a function.</P>
</FONT><B><I><FONT FACE="Arial" SIZE=2><P>Proposed approach</P>
</B></I><P>The current base-16 syntax can be extended with complete backward compatibility by allowing the number preceding the X to be any member of the set {0..16}, where 0 would retain its present hexadecimal meaning and any non-zero value would be construed to be the radix.</P>
<P>The special case of a radix of one would be construed to result in the simple tally of all the 1 bits of a hexadecimal number, which can be convenient for automating the specification of parity values for compile-time binary constants, among other things.</P>
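<P>The intended evaluation rule can be sketched as a small run-time function that takes the literal as a string. This is an illustration only: the names digit_value and parse_radix_literal are hypothetical, overflow is not checked, and a compiler would of course perform the equivalent work during lexical analysis.</P>

```cpp
#include <stdexcept>
#include <string>

// Hypothetical helper: value of one digit character, or 16 if it is not a hex digit.
unsigned digit_value(char c)
{
    if (c >= '0' && c <= '9') return static_cast<unsigned>(c - '0');
    if (c >= 'a' && c <= 'f') return static_cast<unsigned>(c - 'a' + 10);
    if (c >= 'A' && c <= 'F') return static_cast<unsigned>(c - 'A' + 10);
    return 16;
}

// Hypothetical evaluator for literals of the proposed form nX..., n in 0..16.
unsigned long parse_radix_literal(const std::string& text)
{
    const std::string::size_type x = text.find_first_of("xX");
    if (x == std::string::npos || x == 0 || x + 1 >= text.size())
        throw std::invalid_argument("expected a radix, an X, and at least one digit");

    unsigned prefix = 0;                        // decimal number preceding the X
    for (std::string::size_type i = 0; i < x; ++i)
    {
        const unsigned d = digit_value(text[i]);
        if (d > 9) throw std::invalid_argument("bad radix prefix");
        prefix = prefix * 10 + d;
        if (prefix > 16) throw std::invalid_argument("radix out of range");
    }

    const bool tally = (prefix == 1);           // radix 1: count the 1 bits
    const unsigned radix = (prefix == 0 || tally) ? 16 : prefix;

    unsigned long value = 0;                    // overflow is not checked in this sketch
    for (std::string::size_type i = x + 1; i < text.size(); ++i)
    {
        const unsigned d = digit_value(text[i]);
        if (d >= radix) throw std::invalid_argument("digit out of range for radix");
        value = value * radix + d;
    }
    if (!tally) return value;

    unsigned long bits = 0;                     // tally the 1 bits of the hex value
    for (; value != 0; value >>= 1) bits += value & 1UL;
    return bits;
}
```

<P>Under these assumptions a prefix of 0 keeps its present hexadecimal meaning, a prefix of 2 through 16 selects that radix, and a prefix of 1 reads the digits as hexadecimal and returns the number of 1 bits.</P>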
<B><I><P>Impact upon backward compatibility</P>
</B></I></FONT><FONT SIZE=2><P>Since the proposed extension extends the syntax merely by allowing the numbers 1-16 in front of the X of the current hexadecimal notation, no currently valid code should be broken.</P>
</FONT><B><I><FONT FACE="Arial" SIZE=2><P>Impact upon compiler</P>
</B></I></FONT><FONT SIZE=2><P>Based upon informal comments made by some compiler writers at the Spring 2002 meeting, the impact upon compilers should be extremely minor.</P>
</FONT><B><I><FONT FACE="Arial" SIZE=2><P>Examples of use</P>
</B></I><P>10x123 equivalent to 123</P>
<P>8x123 equivalent to 2x001010011, equivalent to 0x53</P>
<P>2x111 equivalent to 0x7</P>
<P>4x123 equivalent to 2x011011, equivalent to 0x1B</P>
<P>12x12 equivalent to 14, equivalent to 0xE</P>
<P>1x001010011 equivalent to 4</P>
<P>1x1234567 equivalent to 12 (1 + 1 + 2 + 1 + 2 + 2 + 3)</P>
<P>1x1234567 &amp; 1 equivalent to 0 (creates parity bit for 0x1234567)</P></FONT></BODY>
</HTML>
