<HTML><HEAD><TITLE>Dynamic Libraries in C++</TITLE></HEAD><BODY>

<CENTER>
<H1><A NAME="Dynamic Libraries in C++">Dynamic Libraries in C++</A></H1>
<H2>Notes from the Technical Session in Santa Cruz, Oct. 21, 2002</H2>
</CENTER>

<TABLE ALIGN="RIGHT" CELLSPACING="0" CELLPADDING="0">
<TR>
<TD ALIGN="RIGHT"><B><I>Document number:</I></B></TD>
<TD>&nbsp; N1418&nbsp;=&nbsp;02-0076</TD>
</TR>
<TR>
<TD ALIGN="RIGHT"><B><I>Date:</I></B></TD>
<TD>&nbsp; November 11, 2002</TD>
</TR>
<TR>
<TD ALIGN="RIGHT"><B><I>Project:</I></B></TD>
<TD>&nbsp; Programming Language C++</TD>
</TR>
<TR>
<TD ALIGN="RIGHT"><B><I>Reference:</I></B></TD>
<TD>&nbsp; ISO/IEC IS 14882:1998(E)</TD>
</TR>
<TR>
<TD ALIGN="RIGHT"><B><I>Reply to:</I></B></TD>
<TD>&nbsp; Pete Becker</TD>
</TR>
<TR>
<TD></TD>
<TD>&nbsp; Dinkumware, Ltd.</TD>
</TR>
<TR>
<TD></TD>
<TD>&nbsp; petebecker@acm.org</TD>
</TR>
</TABLE>
<BR CLEAR="ALL">
<HR>

<P><B><A HREF="#Overview">Overview</A>
&#183; <A HREF="#Linking and Libraries">Linking and Libraries</A>
&#183; <A HREF="#Usage Models">Usage Models</A>
&#183; <A HREF="#Semantic Issues">Semantic Issues</A>
</B></P>

<HR>

<H2><A NAME="Overview">Overview</A></H2>

<P>Many operating systems today support applications consisting of an
executable file and one or more dynamic libraries<SUP><A HREF="#Note 1">1</A></SUP>.
Compilers for such operating systems typically provide language extensions that support
fine-grained control over the process of creating such applications. The
C and C++ language standards, however, say nothing about dynamic
libraries, so it is difficult to write portable applications that
use them. This paper provides background material needed to better
understand some of the problems posed by applications that use
dynamic libraries.</P>

<P>Perhaps the most important impediment to discussion of dynamic libraries
is differing notions of what the term &quot;dynamic library&quot; means.
Systems programmers know the details of how dynamic libraries are loaded
and how names defined in dynamic libraries are resolved; application
programmers know what they want to do with dynamic libraries. Designing
a model of dynamic libraries that is suitable for standardization requires
exposure to both domains. The systems programming aspects of dynamic
libraries are discussed in
<A HREF="#Linking and Libraries">Linking and Libraries</A>, and the
application programming aspects are discussed in 
<A HREF="#Usage Models">Usage Models</A>. Finally, there are several decisions
that must be made concerning what the language should say about dynamic libraries.
These are discussed in <A HREF="#Semantic Issues">Semantic Issues</A>.</P>

<H2><A NAME="Linking and Libraries">Linking and Libraries</A></H2>

<H3>static linking</H3>

<P>Formally, a program in C or C++ consists of one or more translation units
which are compiled separately. The resulting object files are then linked together
to produce an executable file:</P>

<PRE>/home/pete$ <B>cc -c test.cpp</B>
/home/pete$ <B>cc -c helper.cpp</B>
/home/pete$ <B>cc test.o helper.o</B></PRE>

<P>In practice the link step is often handled by a script or by the compiler itself,
so an application can be compiled and linked with a single command:</P>

<PRE>/home/pete$ <B>cc test.cpp helper.cpp</B></PRE>

<P>Compilers also support the use of libraries. A library is nothing more than a
set of object files grouped together into one file. A library is linked to an
application by putting its name on the command line:</P>

<PRE>C:\work> <B>cc test.obj helper.obj mylib.lib</B></PRE>

<P>Despite the apparent similarities, though, there is usually an important difference
between linking an object file into an application and linking a library into an
application. With most implementations the linker puts all of the functions and data
objects from each object file into the application. When linking a library into an
application, however, only the parts of the library that are needed by the application
are linked in. The linker scans the library for object files that define names needed
by other parts of the application, and links those object files into the application.
The rest of the library isn't used. For example:</P>

<PRE>#include &lt;stdlib.h&gt;
#include &lt;stdio.h&gt;
int main()
{
    puts(&quot;Hello, world&quot;);
    exit(0);
}</PRE>

<P>When the compiler compiles this translation unit it produces an object file
that defines the symbol <CODE>main</CODE><SUP><A HREF="#Note 2">2</A></SUP> and
has internal notes that tell the linker that the object file needs definitions for
the symbols <CODE>puts</CODE> and <CODE>exit</CODE>. When the linker links this object file
to produce an application it looks through the standard
library<SUP><A HREF="#Note 3">3</A></SUP> for an object file that defines one or both of
those names and links it into the application, adding any names that that object
file needs definitions for into the list of names that it is searching for. The link
step is complete only when the linker has resolved all of the symbols needed by the
object files that constitute the application and all of the symbols needed by the object
files that it linked in from libraries.</P>
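<P>The search described above can be modeled as a small worklist loop. The following is only
a sketch under stated assumptions: <CODE>ObjectFile</CODE> and <CODE>select_members</CODE> are
hypothetical names invented for illustration, the library is a plain list of object files, and
real linkers differ in how often and in what order they scan a library. It assumes C++11 or later.</P>

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Toy model of one object file: the names it defines and the names it needs.
// (Hypothetical type, for illustration only.)
struct ObjectFile {
    std::string name;
    std::set<std::string> defines;
    std::set<std::string> needs;
};

// Returns the names of the library members that get linked in.
// Every object file in `objects` is linked unconditionally; a member of
// `library` is linked only if it defines a name that is needed and not
// yet defined.
std::vector<std::string> select_members(const std::vector<ObjectFile>& objects,
                                        const std::vector<ObjectFile>& library)
{
    std::set<std::string> defined, pending;
    for (const ObjectFile& o : objects) {
        defined.insert(o.defines.begin(), o.defines.end());
        pending.insert(o.needs.begin(), o.needs.end());
    }

    std::vector<std::string> linked;
    std::set<std::string> used;          // library members already linked
    bool progress = true;
    while (progress) {                   // rescan until nothing new is pulled in
        progress = false;
        for (const ObjectFile& m : library) {
            if (used.count(m.name)) continue;
            bool wanted = false;
            for (const std::string& s : m.defines)
                if (pending.count(s) && !defined.count(s)) { wanted = true; break; }
            if (!wanted) continue;
            used.insert(m.name);
            linked.push_back(m.name);
            defined.insert(m.defines.begin(), m.defines.end());
            // Names needed by the new member join the search list.
            pending.insert(m.needs.begin(), m.needs.end());
            progress = true;
        }
    }
    return linked;
}
```

<P>In this model a member that defines only unneeded names is never linked, which matches the
statement above that the rest of the library isn't used.</P>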

<H3>application loading</H3>

<P>To run an application the operating system calls the program loader and gives it
the name of the executable file. The program loader finds memory space for the application
and copies the application's executable code into the memory space. Then it does any
adjustments to the executable code that are needed to make it ready to run. For example,
memory addresses in the executable code might need to be adjusted to reflect the actual
location in memory where the program has been loaded. Once these loader fixups
have been made the loader turns execution over to the application.</P>
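<P>A toy model may make the address adjustment concrete. This is a sketch, not a description of
any real executable format: <CODE>Image</CODE>, <CODE>preferred_base</CODE>, and
<CODE>apply_fixups</CODE> are hypothetical names, and addresses are modeled as 32-bit words.</P>

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy executable image: each word is either plain data or an address that
// was computed as if the image were loaded at `preferred_base`.
// (Hypothetical layout, for illustration only.)
struct Image {
    std::uint32_t preferred_base;
    std::vector<std::uint32_t> words;
    std::vector<std::size_t> fixups;   // indices of the words that hold addresses
};

// Adjusts every recorded address for the base the loader actually chose.
void apply_fixups(Image& image, std::uint32_t actual_base)
{
    // Unsigned arithmetic wraps, so this also works when the actual base
    // is lower than the preferred one.
    const std::uint32_t delta = actual_base - image.preferred_base;
    for (std::size_t index : image.fixups)
        image.words.at(index) += delta;
}
```

<P>Only the words listed in <CODE>fixups</CODE> change; ordinary data is left alone, which is
why the executable file has to record where its addresses are.</P>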

<H3>linking to dynamic libraries</H3>

<P>When an application uses dynamic libraries the picture changes. The code contained
in the dynamic library is not linked into the application; in fact, the code often doesn't
even have to be present on the system when the application is linked. The linker
just makes notes in the executable file about symbols that it thinks will be resolved
by dynamic libraries<SUP><A HREF="#Note 4">4</A></SUP>.</P>

<H3>dynamic loading and loader fixups</H3>

<P>To run an application that uses dynamic libraries the program loader has a great deal
more work to do: in addition to loading the executable file into memory it has to find
all of the dynamic libraries that the executable file depends on, including those needed by
other dynamic libraries that it has loaded. After a dynamic library has been loaded,
each function call from the executable file or from another dynamic library into that
dynamic library has to be fixed up. These fixups can't be done any sooner,
because it is only at load time that the locations in memory of the dynamic libraries
that constitute the application are known. Thus, some of the work that the linker does
when linking to a static library is deferred until load time when linking to dynamic libraries.</P>
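<P>Finding the libraries an executable depends on, including those needed by other libraries,
amounts to walking a dependency graph. The sketch below assumes a hypothetical
<CODE>DependencyTable</CODE> that records each module's direct dependencies; the order it
produces (dependencies first) is one plausible choice, not a claim about what any particular
loader does.</P>

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <vector>

// Maps a module name to the dynamic libraries it depends on directly.
// (Hypothetical representation, for illustration only.)
using DependencyTable = std::map<std::string, std::vector<std::string>>;

// Depth-first walk: a module is appended only after everything it needs.
static void visit(const DependencyTable& table, const std::string& module,
                  std::set<std::string>& seen, std::vector<std::string>& order)
{
    if (!seen.insert(module).second)
        return;                          // already handled; also stops cycles
    auto it = table.find(module);
    if (it != table.end())
        for (const std::string& dependency : it->second)
            visit(table, dependency, seen, order);
    order.push_back(module);
}

// Returns every library the executable needs, directly or indirectly,
// each listed once.
std::vector<std::string> libraries_to_load(const DependencyTable& table,
                                           const std::string& executable)
{
    std::set<std::string> seen;
    std::vector<std::string> order;
    visit(table, executable, seen, order);
    order.pop_back();                    // the executable itself comes last; drop it
    return order;
}
```

<P>A library that is needed by several modules appears only once in the result, reflecting
the fact that it is loaded once and shared.</P>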

<H3>manual loading</H3>

<P>It is also possible to <B><A NAME="manually load">manually load</A></B> a dynamic
library at runtime. This is done by passing the name of the dynamic library to a system
function that loads the library and returns a handle that the application can use to
refer to the code in the library. After successfully loading the library the application
can get the addresses of symbols defined in the dynamic library by calling another system
function and passing it the handle for the library and the name of a symbol.</P>
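<P>On POSIX systems the two system functions are <CODE>dlopen</CODE> and <CODE>dlsym</CODE>;
on Windows the corresponding functions are <CODE>LoadLibrary</CODE> and
<CODE>GetProcAddress</CODE>. The following minimal sketch uses the POSIX interface.
<CODE>load_symbol</CODE> is a hypothetical helper, and some systems require linking with an
additional library (often <CODE>-ldl</CODE>) to use these functions.</P>

```cpp
#include <cassert>
#include <dlfcn.h>   // POSIX: dlopen, dlsym, dlclose

// Loads `library`, looks up `symbol`, and returns its address, or nullptr
// if either step fails. On success the library is deliberately left loaded,
// because the returned address is valid only while it stays loaded.
// (Hypothetical helper, for illustration only.)
void* load_symbol(const char* library, const char* symbol)
{
    void* handle = dlopen(library, RTLD_NOW);   // resolve all symbols immediately
    if (!handle)
        return nullptr;                         // library not found or not loadable
    void* address = dlsym(handle, symbol);
    if (!address)
        dlclose(handle);                        // nothing useful found; unload again
    return address;
}
```

<P>The returned address has type <CODE>void*</CODE>, so the application must convert it to the
correct pointer type before use; nothing checks that the type matches the actual definition.
The lookup is by name, which is why symbols intended for manual loading are usually declared
<CODE>extern &quot;C&quot;</CODE> to avoid name mangling.</P>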

<H2><A NAME="Usage Models">Usage Models</A></H2>

<P>For the designer of an application there are three usage models for dynamic
libraries, reflecting the three forms of linking and loading discussed in the previous section.
</P>

<H3>monolithic applications</H3>

<P>A monolithic application is an application that doesn't explicitly use any dynamic
libraries<SUP><A HREF="#Note 5">5</A></SUP>. All of the application's code must be present
when the application is built, and all of the code is statically linked into the application.
This is the traditional C and C++ program model. It imposes the tightest coupling among an
application's components, which reduces flexibility and increases robustness.</P>

<H3>closed applications</H3>

<P>A closed application uses dynamic libraries but doesn't manually load any dynamic libraries
at runtime. The application designer determines what the application will be able to do and
distributes the application's code among the executable file and the application's dynamic
libraries. This allows for the possibility of upgrading the application by distributing an
updated executable file or updated dynamic libraries while leaving the unaffected code in place.
Such an application is less tightly coupled than a monolithic application, and requires
more care to ensure that new components work correctly with older versions.</P>

<H3>plug-ins</H3>

<P>An application that supports plug-ins <A HREF="#manually load">manually loads</A>
dynamic libraries to supplement
its capabilities. Plug-ins often come from the application implementor, but they can
also come from third-party developers. To support the latter, the application implementor
documents the interface that a plug-in must support, and it provides, documents,
and maintains services needed by plug-ins. Unlike a closed application, an application that
supports plug-ins permits extensions that were not designed into the application. In
this sense such an application is less tightly coupled than a closed application; however,
this flexibility comes at a price: new versions of the application must continue to provide
the old version's support services so that existing plug-ins will continue to work.</P>
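<P>One common shape for such a documented interface is an abstract class plus a pair of
factory functions that each plug-in exports by name. The sketch below is an assumption about
how an application might do this, not a description of any particular product:
<CODE>Plugin</CODE>, <CODE>create_plugin</CODE>, <CODE>destroy_plugin</CODE>, and
<CODE>describe</CODE> are all hypothetical names. To stay in a single file, the plug-in and
the application are shown together; in practice the two function addresses would come from
manual loading.</P>

```cpp
#include <cassert>
#include <string>

// Interface that the (hypothetical) application documents for plug-ins.
class Plugin {
public:
    virtual ~Plugin() {}
    virtual std::string name() const = 0;
};

// Signatures of the two functions every plug-in must export. extern "C"
// keeps the exported names unmangled so they can be looked up by name.
extern "C" typedef Plugin* (*CreatePluginFn)();
extern "C" typedef void (*DestroyPluginFn)(Plugin*);

// --- what a plug-in's source file would contain ---
namespace {
class HelloPlugin : public Plugin {
public:
    std::string name() const override { return "hello"; }
};
}

extern "C" Plugin* create_plugin() { return new HelloPlugin; }
extern "C" void destroy_plugin(Plugin* plugin) { delete plugin; }

// --- what the application does with the two addresses it looked up ---
std::string describe(CreatePluginFn create, DestroyPluginFn destroy)
{
    Plugin* plugin = create();
    std::string result = plugin->name();
    destroy(plugin);     // the plug-in frees what the plug-in allocated
    return result;
}
```

<P>Once plug-ins built against this interface exist, changing <CODE>Plugin</CODE> or either
function signature breaks them, which is the maintenance cost described above.</P>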

<H2><A NAME="Semantic Issues">Semantic Issues</A></H2>

<H3>exporting and importing</H3>

<P>Under Windows, when a dynamic library is built, symbols that are intended to be used
in code that uses the dynamic library must be marked as exported. Further, in code that
uses symbols from a dynamic library each such symbol must be marked as imported. This
marking is done by adding implementation-specific keywords to the declarations of
these symbols. Each such symbol is modified by a macro that expands to the appropriate
keyword for an exported symbol when the dynamic library is being built and to the
appropriate keyword for an imported symbol when the dynamic library is being used:</P>

<PRE>#ifndef MY_HEADER
#define MY_HEADER
#ifdef BUILD_MY_LIBRARY
 #define MY_LIBRARY_DECL __declspec(dllexport)
#else
 #define MY_LIBRARY_DECL __declspec(dllimport)
#endif

MY_LIBRARY_DECL void f(int);
#endif /* MY_HEADER */</PRE>

<P>Under Unix, the default is that when a dynamic library is built all symbols with
external linkage are made available to code that uses the dynamic library. Nothing has
to be done to the source code to make symbols available from a dynamic library or to
use symbols defined in a dynamic library.</P>

<P>Both of these approaches have problems. The Windows approach requires careful maintenance
of the macros that describe the dynamic library. Moving code from one dynamic library to another
requires changing the controlling macros so that the symbols will be marked as exported from
the new library (e.g., the macro <CODE>BUILD_MY_LIBRARY</CODE> in the example above would
have to be changed to a name that was defined when building the new library). The Unix
approach, simply put, does too much. It exposes internal details that the designers of
a dynamic library would prefer to keep private. Unix compilers address this problem from
outside the language through a text file that tells the linker which names to make available
from a dynamic library. This is obviously awkward, and some compilers are moving toward
a keyword-based approach.</P>
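<P>As an illustration of the keyword-based direction, GCC-compatible compilers later added a
<CODE>visibility</CODE> attribute that plays a role similar to the Windows keywords. The
sketch below extends the <CODE>MY_LIBRARY_DECL</CODE> macro from the example above to cover
both families of compilers; <CODE>MY_LIBRARY_LOCAL</CODE>, <CODE>helper</CODE>, and the body
of <CODE>f</CODE> are hypothetical additions, and the file plays the role of the library's
own source.</P>

```cpp
#include <cassert>

// This file stands in for the library's own source, so the build macro
// from the earlier example is defined here rather than on the command line.
#define BUILD_MY_LIBRARY 1

#if defined(_WIN32)
  #ifdef BUILD_MY_LIBRARY
    #define MY_LIBRARY_DECL __declspec(dllexport)
  #else
    #define MY_LIBRARY_DECL __declspec(dllimport)
  #endif
  #define MY_LIBRARY_LOCAL                 /* not exported unless marked */
#elif defined(__GNUC__)
  #define MY_LIBRARY_DECL  __attribute__((visibility("default")))
  #define MY_LIBRARY_LOCAL __attribute__((visibility("hidden")))
#else
  #define MY_LIBRARY_DECL                  /* unknown compiler: no marking */
  #define MY_LIBRARY_LOCAL
#endif

// Internal detail: has external linkage, but is not meant for clients.
MY_LIBRARY_LOCAL int helper(int x) { return x * 2; }

// Public entry point of the library.
MY_LIBRARY_DECL int f(int x) { return helper(x) + 1; }
```

<P>With GCC-compatible compilers the default can also be inverted with the
<CODE>-fvisibility=hidden</CODE> option, so that only symbols explicitly marked as visible
are made available, which brings the Unix model closer to the Windows one.</P>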

<P>Overall, it looks like some form of language support is needed to provide fine-grained
control over which symbols are made available by dynamic libraries. There doesn't appear to
be any technical barrier to simply marking a symbol as exported (with whatever syntax is
deemed appropriate); with that information the compiler can generate whatever information
is needed when it sees the definition of that symbol and when it sees a use of that
symbol<SUP><A HREF="#Note 6">6</A></SUP>.</P>

<H3>language support</H3>

<P>The syntax for declaring exported symbols ought to be simple. One possibility that has
been discussed on the mail reflector is extending the syntax for a
<CODE>linkage-specification</CODE>, so that a symbol that is defined in a dynamic library
could be marked with something like
<CODE>extern &quot;library&quot;</CODE><SUP><A HREF="#Note 7">7</A></SUP>.
The following is not intended to be a proposal, merely a survey of the issues presented.</P>

<P>Ordinary functions and data objects can be marked in the same way as they can be
labeled <CODE>extern &quot;C&quot;</CODE>:</P>

<PRE>extern &quot;library&quot; {
int i;      // i is defined in a dynamic library
void f();   // f is defined in a dynamic library
}

extern &quot;library&quot; double d;
            // d is defined in a dynamic library</PRE>

<P>The symbols defined by a class consist of its member functions and its static
data members. Putting the implementation of a class into a dynamic library requires
being able to make all of those symbols available:</P>

<PRE>extern &quot;library&quot; {
    class C {
    public:
        void f();       // C::f is defined in a dynamic library
        static int i;   // C::i is defined in a dynamic library
    };
}</PRE>

<P>Templates are patterns for creating functions and classes. They are not, in
themselves, code or data. Thus, they do not need any special handling for
dynamic libraries. Rather, it is template instances that must be labeled when their
code and data are in a dynamic library:</P>


<PRE>template &lt;class T&gt; struct C {
    void set(const T&amp; tt) {t = tt; }
    T get() {return t; }
private:
    T t;
    };

extern &quot;library&quot; {
template struct C&lt;int&gt;; // C&lt;int&gt;::set and C&lt;int&gt;::get
                        // are defined in a dynamic library
}</PRE>
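<P>For comparison, C++11 later standardized a related mechanism that says nothing about
dynamic libraries but does let code state that a template instance is provided elsewhere:
the explicit instantiation declaration (<CODE>extern template</CODE>). The sketch below shows
both halves in one file for brevity; normally the declaration would live in a header and the
definition in exactly one source file. Whether the instantiated symbols are exported from a
dynamic library still depends on implementation-specific marking.</P>

```cpp
#include <cassert>

template <class T> struct C {
    void set(const T& tt) { t = tt; }
    T get() { return t; }
private:
    T t;
};

// What a header shared with client code would contain: an explicit
// instantiation declaration. It tells the compiler that the members of
// C<int> are instantiated in some other translation unit.
extern template struct C<int>;

// What one source file of the library would contain: an explicit
// instantiation definition, which generates the code for C<int>::set
// and C<int>::get in this translation unit.
template struct C<int>;
```

<P>Client code uses <CODE>C&lt;int&gt;</CODE> exactly as it would any other class; only the
location of the generated code changes.</P>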

<P>The compiler also generates data that is used by the implementation, such as
the data that supports runtime type information. For applications that support plug-ins
it may be important to control the availability of such data, since writers of
plug-ins may rely on the availability of type information for some of the
application's types. This poses a problem, since the name of the data structure
that holds type information<SUP><A HREF="#Note 8">8</A></SUP> is not usually
known to the user. Some other syntax would be needed to support control of this data.</P>

<H3>semantic complications</H3>

<P>There are several semantic issues that the standard would have to address, mostly
turning on the applicability of the one-definition rule when dynamic libraries are
used. What should an implementation be required to do if two dynamic libraries export the same
symbol? What should an implementation be required to do if two dynamic libraries define the
same symbol as a symbol with external linkage but do not export it? What should an
implementation be required to do if two dynamic libraries define the same type? (For
example, when code in a dynamic library throws an exception, should code which called
that code from another dynamic library be able to catch that exception?)</P>

<HR>

<P><A NAME="Note 1">1</A>. In Windows they're known as DLLs; in
Unix they're shared libraries. Throughout this paper they are
referred to as &quot;dynamic libraries&quot;, in the hope that
the name suggests that the two models are similar and that they
are different.</P>

<P><A NAME="Note 2">2</A>. This discussion ignores name mangling.</P>

<P><A NAME="Note 3">3</A>. Although it generally doesn't appear on the command
line, the standard library is usually no different from a user-defined library
except that the compiler knows its name and passes that name to the linker
even if it isn't mentioned on the command line.</P>

<P><A NAME="Note 4">4</A>. This is deliberately vague, because the details
vary fairly widely from system to system.</P>

<P><A NAME="Note 5">5</A>. An application that doesn't explicitly use any dynamic
libraries will often use dynamic libraries anyway -- the standard library for C
and C++ is often packaged as a dynamic library. However, this is usually not something
that the application designer need be concerned with; the implementor will make it
work.</P>

<P><A NAME="Note 6">6</A>. For Windows programmers this is a simplification; for
Unix programmers it is a complication.</P>

<P><A NAME="Note 7">7</A>. The use of <CODE>&quot;library&quot;</CODE> here is intended only
as an aid to exposition, not as a recommendation.</P>

<P><A NAME="Note 8">8</A>. If there is, in fact, such a name at all. Some implementations
store this information in the vtable.</P>

</BODY></HTML>
