<html>

<head>
<meta http-equiv="Content-Language" content="en-us">
<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">
<title>Filesystem library query
</title>
</head>

<body bgcolor="#FFFFFF">

<p>Doc. no.&nbsp;&nbsp; J16/04-0016=WG21/N1576<br>
Date:&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 6 February 2004<br>
Project:&nbsp;&nbsp;&nbsp;&nbsp; Programming Language C++<br>
Reply to:&nbsp;&nbsp; Beman Dawes &lt;<a href="mailto:bdawes@acm.org">bdawes@acm.org</a>&gt;</p>

<h1>Filesystem library query</h1>

<h2>Introduction</h2>
<p>This paper is a query to determine  interest by the 
Library Working Group in a future proposal for a C++ filesystem component 
based on the Boost Filesystem Library. Such a component would be suitable 
for a future standard or a future TR. This paper is not itself such a proposal.</p>
<p>The Boost Filesystem Library (<a href="http://www.boost.org/libs/filesystem">www.boost.org/libs/filesystem</a>) provides portable facilities to query and 
manipulate paths, files, and directories. The library is widely used. It would 
be a pure addition to the C++ standard, leaving in place existing 
standard library functionality where there is overlap.</p>
<p>The motivation for the library is the desire to perform <i><b>portable, safe, 
script-like filesystem operations</b></i> from within C++ programs. Because the 
C++ Standard Library currently contains no facilities for such filesystem tasks 
as directory iteration or directory creation, programmers currently must rely on 
operating system specific C-style interfaces, making it difficult to write 
portable programs.</p>
<p>The intent is not to compete 
with Python, Perl, or shell scripting languages, but rather to provide 
filesystem operations where C++ is already the language of choice. The design 
encourages, but does not require, safe and portable filesystem usage.</p>
<h2>Sample program using the Boost library</h2>
<blockquote>
  <pre>#include &quot;boost/filesystem/operations.hpp&quot;
#include &lt;iostream&gt;

namespace fs = boost::filesystem;
using std::cout;

int main( int argc, char* argv[] )
{
  fs::path p( argc &lt;= 1 ? &quot;.&quot; : argv[1] );

  if ( !fs::exists( p ) ) // does not exist
    cout &lt;&lt; &quot;Not found: &quot; &lt;&lt; p.string() &lt;&lt; '\n'; // argv[1] is null when no argument is given

  else if ( fs::is_directory( p ) ) // is a directory
  {
    for ( fs::directory_iterator dir_itr( p );
          dir_itr != fs::directory_iterator(); ++dir_itr )
    {
      // display only the rightmost name in the path
      cout &lt;&lt; dir_itr-&gt;leaf() &lt;&lt; '\n';
    }
  }

  else // is a file
    cout &lt;&lt; &quot;Found: &quot; &lt;&lt; p.string() &lt;&lt; '\n';
  return 0;
}</pre>
</blockquote>
<p>Users say they prefer the Filesystem library's interface to native 
operating system or 
POSIX APIs, even in code without portability requirements.</p>
<h2>Important Design Decisions</h2>
<h4>Portable functionality and behavior</h4>
<p>The library provides only functionality and behavior which can be supported uniformly on 
many different operating systems. As a practical matter, this means 
functionality and behavior which can be specified to work uniformly on POSIX 
and Windows. Since modern versions of legacy operating systems such as 
OS/390 and System/z provide POSIX support, the library can be implemented on 
these systems. Examples of behavior that is not supported because of portability 
concerns include manipulation of file and directory attributes. The emphasis on 
portable behavior drove many design choices.</p>
<h4>Portable Paths</h4>
<p>Consider this code:</p>
<blockquote>
  <pre>if ( !exists( &quot;foobar/cheese&quot; ) )
  cout &lt;&lt; &quot;Something is rotten in foobar\n&quot;;</pre>
</blockquote>
<p>The <code>exists()</code> function returns true if the indicated 
file or directory is present in the external file system. The signature is:</p>
<blockquote>
  <pre>bool exists( const path &amp; );</pre>
</blockquote>
<p>The <code>&quot;foobar/cheese&quot;</code> argument is written according to a 
portable generic path grammar and is converted to an object of class <i>path</i>, 
which the implementation translates into the operating system's&nbsp;native 
format for use in operating system calls. For example, if the operating system uses  colons as path element 
separators, the path above would be passed to the operating system as <code>&quot;foobar:cheese&quot;</code>.&nbsp; 
Class <i>path</i> has much useful and interesting functionality for manipulating 
filesystem paths, and for ensuring that names in paths meet application specific 
requirements.&nbsp; Non-portable (native) path grammar is also supported.</p>
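<p>The translation from the generic grammar to a native format can be pictured 
with a minimal sketch. The helper <code>to_native_sketch</code> and its separator 
parameter are hypothetical names introduced here for illustration; they are not 
part of the Boost interface, and a real implementation would also have to deal 
with roots and other native conventions.</p>

```cpp
#include <string>

// Hypothetical helper, not part of the Boost interface: rewrites a path written
// in the generic grammar (elements separated by '/') using the separator that
// an assumed native format expects.
std::string to_native_sketch(const std::string& generic, char native_separator)
{
    std::string native(generic);
    for (std::string::size_type i = 0; i < native.size(); ++i)
    {
        if (native[i] == '/')
            native[i] = native_separator;   // swap the generic separator
    }
    return native;
}
```

<p>Under these assumptions, <code>to_native_sketch(&quot;foobar/cheese&quot;, ':')</code> 
yields <code>&quot;foobar:cheese&quot;</code>, matching the colon example above.</p>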
<h4>Use-driven design</h4>

<p>Because of the desire to support simple &quot;script-like&quot; usage, use cases often 
drove design choices. For example, class <code>path</code> has conversion 
constructors from <code>const char *</code> and <code>const std::string &amp;</code>, 
allowing users to write <code>if (exists( &quot;foo&quot;))</code> rather than <code>if (exists(path(&quot;foo&quot;)))</code>.</p>
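<p>A minimal sketch of how such conversion constructors enable the shorter call 
syntax follows. The namespace <code>conversion_sketch</code>, the toy 
<code>path</code> class, and the <code>name_length</code> function are all 
assumptions made for illustration; only the two constructor parameter types come 
from the text above.</p>

```cpp
#include <string>

namespace conversion_sketch
{
    // Toy stand-in for class path; only the conversion constructors are modeled.
    class path
    {
    public:
        path(const char* s) : text_(s) {}          // deliberately not explicit
        path(const std::string& s) : text_(s) {}   // deliberately not explicit
        const std::string& string() const { return text_; }
    private:
        std::string text_;
    };

    // Hypothetical operation taking const path&, in the style of exists().
    std::string::size_type name_length(const path& p)
    {
        return p.string().size();
    }
}
```

<p>Because neither constructor is declared <code>explicit</code>, both 
<code>conversion_sketch::name_length(&quot;foo&quot;)</code> and 
<code>conversion_sketch::name_length(std::string(&quot;foo&quot;))</code> compile 
without the caller naming <code>path</code>.</p>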

<h4>Errors reported via exceptions</h4>
<p>Like all I/O, filesystem operations often encounter runtime errors both 
expected and unexpected. The library reports runtime errors via C++ exceptions.</p>

<h4>Throws heavy-weight exceptions</h4>
<p>Filesystem operations often encounter errors such as &quot;File not found&quot; which 
must be reported to human users. To ensure that the exceptions thrown for such 
errors contain sufficient information for users to resolve the error, and to 
eliminate the need for programs to include numerous try/catch blocks, the 
library throws relatively heavy-weight exceptions. There is a single <i>filesystem_error</i> type, with two error codes, two paths, and two 
messages. While the details could certainly change a great deal, the overall needs 
to avoid try/catch blocks after every operation and to allow detailed 
user customization based on error details have to be dealt with one way or 
another.</p>
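<p>The idea can be sketched as follows. This is an illustration under stated 
assumptions, not the library's actual class: the namespace <code>error_sketch</code>, 
the member names, the always-failing <code>rename_sketch</code> operation, and the 
placeholder error code values are all hypothetical. The two messages are modeled 
here as the name of the reporting operation plus the <code>what()</code> text.</p>

```cpp
#include <stdexcept>
#include <string>

namespace error_sketch
{
    // Hypothetical exception type; the member names are illustrative only.
    class filesystem_error : public std::runtime_error
    {
    public:
        filesystem_error(const std::string& who, const std::string& message,
                         const std::string& path1, const std::string& path2,
                         int portable_code, int native_code)
          : std::runtime_error(message), who_(who),
            path1_(path1), path2_(path2),
            portable_code_(portable_code), native_code_(native_code) {}

        const std::string& who() const { return who_; }   // reporting operation
        const std::string& path1() const { return path1_; }
        const std::string& path2() const { return path2_; }
        int portable_code() const { return portable_code_; }
        int native_code() const { return native_code_; }

    private:
        std::string who_, path1_, path2_;
        int portable_code_, native_code_;
    };

    // Stand-in for a real operation; it always fails so the sketch is
    // deterministic. The codes 1 and 2 are placeholder values.
    void rename_sketch(const std::string& from, const std::string& to)
    {
        throw filesystem_error("rename_sketch", "File not found", from, to, 1, 2);
    }

    // Several operations share one handler; the exception carries enough
    // detail to build a message for a human user.
    std::string run_script()
    {
        try
        {
            rename_sketch("a.txt", "b.txt");
            rename_sketch("c.txt", "d.txt");   // not reached in this sketch
            return "ok";
        }
        catch (const filesystem_error& e)
        {
            return e.who() + ": " + e.what() + ": " + e.path1() + " -> " + e.path2();
        }
    }
}
```

<p>A single handler around the whole script can then report which operation 
failed and on which paths, without a try/catch block after every call.</p>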

<h4>Automatic testing for relative portability of names in paths</h4>
<p>Because there is no such thing as absolute portability for names of files and 
directories, the design uses a relative portability approach which allows the 
user to specify which name portability rules are desired. Default, global 
user-specified, and per-constructor user-specified portability checking allow an 
application to perform as much or as little portability checking as desired. The 
experience with automatic checking is that it often identifies programmer 
oversights before they become serious problems.</p>
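<p>One way to picture user-selectable name rules is as predicates applied to each 
name in a path. The sketch below is an assumption-laden illustration: the namespace 
<code>name_sketch</code>, the <code>name_check</code> function-pointer signature, 
and the helper <code>all_names_pass</code> are hypothetical. The rule shown accepts 
only the POSIX portable filename character set (letters, digits, period, 
underscore, hyphen) and rejects a leading hyphen.</p>

```cpp
#include <string>

namespace name_sketch
{
    // Assumed shape of a name rule: returns true if the name is acceptable.
    typedef bool (*name_check)(const std::string& name);

    // Rule that accepts every name (as little checking as possible).
    bool no_check(const std::string&) { return true; }

    // Rule based on the POSIX portable filename character set.
    bool portable_posix_name(const std::string& name)
    {
        if (name.empty() || name[0] == '-')
            return false;
        const std::string allowed =
            "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789._-";
        return name.find_first_not_of(allowed) == std::string::npos;
    }

    // Splits a generic path on '/' and applies the chosen rule to each
    // non-empty element.
    bool all_names_pass(const std::string& generic_path, name_check rule)
    {
        std::string::size_type start = 0;
        while (start <= generic_path.size())
        {
            std::string::size_type end = generic_path.find('/', start);
            if (end == std::string::npos)
                end = generic_path.size();
            if (end > start && !rule(generic_path.substr(start, end - start)))
                return false;
            start = end + 1;
        }
        return true;
    }
}
```

<p>Passing a different rule per call mirrors the per-constructor choice described 
above; a default or global rule would simply be a stored <code>name_check</code> 
value.</p>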

<h4>Sub-namespace &quot;filesystem&quot;</h4>
<p>The Filesystem library includes several components which are essentially new 
versions of components already in the current C++ Standard Library. 
Specifically: <i>remove, rename, basic_filebuf, filebuf, wfilebuf, 
basic_ifstream, ifstream, wifstream, basic_ofstream, ofstream, wofstream, 
basic_fstream, fstream,</i> and <i>wfstream</i>. The primary difference for the 
iostream (clause 27) classes is that seven constructors and open functions now 
take arguments of <code>const path &amp;</code>. Specifications and implementation 
simply reference the equivalent components in clause 27 of the current standard. <i>
remove</i> and <i>rename</i> differ in the type of their arguments, their return 
types, and how they handle errors. Note that there is no intent to deprecate any 
components in the current standard; these are in use in millions of lines of 
existing code and must be preserved.</p>
<p>The versioning problem this creates is not unique to the Filesystem library; 
it is simply the first place where the C++ committee must face the problem.</p>
<p>Two choices were considered: to give the components completely different names 
or to place them in a sub-namespace. My thinking for the Boost library was that 
new names would cause serious confusion, and so the new components were placed in 
sub-namespace <code>filesystem</code>. For the standard library, a filesystem 
component should use the same versioning approach used by other standard library 
components.</p>
<h4>Path equality versus equivalence</h4>
<p>Path equality is defined essentially as string equality, and path equivalence 
as a determination (implemented using native filesystem API operations) of whether 
two paths actually point to the same file or directory. Path equivalence isn't crucial to 
the library, and wasn't provided in early versions. It is only mentioned here 
because LWG members have indicated interest in path equality and equivalence 
issues.</p>
<h2>Remaining work: Internationalization</h2>
<p>The Boost Filesystem library is not currently internationalized; that work is 
underway. The approach being prototyped uses a <i>basic_path</i> template, with <i>
path</i> and <i>wpath</i> typedefs, similar to strings and iostreams. Paths will 
need the ability to imbue a locale, to handle the conversion between internal 
and external representations.</p>
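<p>The shape of such a template can be sketched as below. Parameterizing on the 
string type is an assumption made for this illustration, as are the namespace 
<code>i18n_sketch</code> and the member names; the prototype's actual template 
parameters may differ, and locale imbuing is not modeled.</p>

```cpp
#include <string>

namespace i18n_sketch
{
    // Toy template: stores the path text in a narrow or wide string type.
    template <class String>
    class basic_path
    {
    public:
        typedef String string_type;

        basic_path(const String& s) : text_(s) {}
        basic_path(const typename String::value_type* s) : text_(s) {}

        const String& string() const { return text_; }

    private:
        String text_;
    };

    // Typedefs in the style of std::string / std::wstring.
    typedef basic_path<std::string>  path;
    typedef basic_path<std::wstring> wpath;
}
```

<p>With these typedefs, <code>i18n_sketch::path(&quot;foo&quot;)</code> and 
<code>i18n_sketch::wpath(L&quot;foo&quot;)</code> share one implementation, 
much as <code>std::string</code> and <code>std::wstring</code> share 
<code>std::basic_string</code>.</p>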
<hr>
<p> Copyright Beman Dawes, 2004</p>
<p>Revised February 09, 2004</p>

</body>

</html>