Chapter 7 Linking
Linking is the process of collecting and combining various pieces of code and data into a single file that can be loaded (copied) into memory and executed.
Linkers play a crucial role in software development because they enable
separate compilation. Instead of organizing a large application as one monolithic
source file, we can decompose it into smaller, more manageable modules that can
be modified and compiled separately. When we change one of these modules, we
simply recompile it and relink the application, without having to recompile the
other files.
7.1 Compiler Drivers
7.2 Static Linking
Static linkers such as the Linux ld program take as input a collection of relocatable object files and command-line arguments and generate as output a fully linked executable object file that can be loaded and run. To build the executable, the linker must perform two main tasks:
- Symbol resolution. Object files define and reference symbols, where each symbol corresponds to a function, a global variable, or a static variable (i.e., any C variable declared with the static attribute). The purpose of symbol resolution is to associate each symbol reference with exactly one symbol definition.
- Relocation. Compilers and assemblers generate code and data sections that start at address 0. The linker relocates these sections by associating a memory location with each symbol definition, and then modifying all of the references to those symbols so that they point to this memory location. The linker blindly performs these relocations using detailed instructions, generated by the assembler, called relocation entries.
Object files are merely collections of blocks of bytes. Some of these blocks contain program code, others contain program data, and others contain data structures that guide the linker and loader. A linker concatenates blocks together, decides on run-time locations for the concatenated blocks, and modifies various locations within the code and data blocks. Linkers have minimal understanding of the target machine. The compilers and assemblers that generate the object files have already done most of the work.
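The concatenate, place, and patch steps described above can be illustrated with a toy model. This is a minimal sketch under stated assumptions, not how ld is implemented: the Section and Reloc types and the toy_link function are hypothetical, addresses are plain 32-bit values, and every relocation is treated as an absolute 32-bit reference stored in host byte order.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical toy structures; real object files (e.g., ELF) are far richer. */
typedef struct {
    const uint8_t *bytes;     /* section contents, assembled as if at address 0 */
    uint32_t size;
} Section;

typedef struct {
    uint32_t section;         /* section that contains the reference */
    uint32_t offset;          /* offset of the 32-bit reference in that section */
    uint32_t target_section;  /* section that defines the referenced symbol */
    uint32_t target_offset;   /* symbol's offset within its defining section */
} Reloc;

/* Concatenates the sections into out, assigns each a run-time address
   starting at base (stored in addr[]), then patches every reference.
   Returns the number of bytes written, or 0 on any bounds error. */
uint32_t toy_link(const Section *secs, uint32_t nsecs,
                  const Reloc *relocs, uint32_t nrelocs,
                  uint32_t base, uint8_t *out, uint32_t out_cap,
                  uint32_t *addr)
{
    uint32_t pos = 0;

    /* Step 1: concatenate blocks and decide their run-time locations. */
    for (uint32_t i = 0; i < nsecs; i++) {
        if (secs[i].size > out_cap - pos)
            return 0;
        memcpy(out + pos, secs[i].bytes, secs[i].size);
        addr[i] = base + pos;
        pos += secs[i].size;
    }

    /* Step 2: blindly apply each relocation entry. */
    for (uint32_t i = 0; i < nrelocs; i++) {
        const Reloc *r = &relocs[i];
        uint32_t value;

        if (r->section >= nsecs || r->target_section >= nsecs)
            return 0;
        if (r->offset > secs[r->section].size ||
            secs[r->section].size - r->offset < sizeof value)
            return 0;

        value = addr[r->target_section] + r->target_offset;
        memcpy(out + (addr[r->section] - base) + r->offset,
               &value, sizeof value);
    }
    return pos;
}
```

Note that the patching loop knows nothing about what the bytes mean. It only follows the relocation entries, which mirrors the point that the compiler and assembler have already done most of the work.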
7.3 Object Files
7.4 Relocatable Object Files
7.5 Symbols and Symbol Tables
7.6 Symbol Resolution
The linker resolves symbol references by associating each reference with exactly one symbol definition from the symbol tables of its input relocatable object files.
7.6.1 How Linkers Resolve Duplicate Symbol Names
At compile time, the compiler exports each global symbol to the assembler as either strong or weak, and the assembler encodes this information implicitly in the symbol table of the relocatable object file. Functions and initialized global variables get strong symbols. Uninitialized global variables get weak symbols.
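As a small illustration of that classification, the module below annotates each definition with the kind of symbol it would get under the traditional model described above. The names are hypothetical. One caveat worth stating: GCC 10 and later default to -fno-common, under which an uninitialized global is emitted as an ordinary definition and duplicate definitions across modules produce a link error; -fcommon restores the older behavior.

```c
/* Hypothetical module illustrating the strong/weak classification. */

int initialized_count = 3;  /* initialized global: strong symbol */
int scratch;                /* uninitialized global: weak symbol
                               (traditional model; see the caveat above) */
static int calls;           /* static: local to this module, so it does not
                               take part in cross-module resolution */

int bump(void)              /* function definition: strong symbol */
{
    calls++;
    scratch += initialized_count;
    return scratch;
}
```

Running nm on the resulting object file is one way to check how a given toolchain actually classified each symbol.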
7.6.2 Linking with Static Libraries
In practice, all compilation systems provide a mechanism for packaging related object modules into a single file called a static library, which can then be supplied as input to the linker. When it builds the output executable, the linker copies only the object modules in the library that are referenced by the application program.
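The "copy only what is referenced" behavior can be sketched with a toy selection routine. This is an illustrative simplification, not the algorithm a real linker uses: the Member type and select_members function are hypothetical, each member defines exactly one symbol and references at most one, and command-line ordering is ignored.

```c
#include <stddef.h>
#include <string.h>

#define MAX_PENDING 32

typedef struct {
    const char *name;     /* member file name (hypothetical) */
    const char *defines;  /* the single symbol this toy member defines */
    const char *needs;    /* one symbol it references, or NULL */
} Member;

/* Returns 1 if some already-selected member defines sym. */
static int is_defined(const Member *lib, int n, const int *selected,
                      const char *sym)
{
    for (int i = 0; i < n; i++)
        if (selected[i] && strcmp(lib[i].defines, sym) == 0)
            return 1;
    return 0;
}

/* Sets selected[i] = 1 for every member needed to satisfy refs, directly
   or through another selected member. Returns the number of references
   left unresolved, or -1 if the pending list overflows. */
int select_members(const Member *lib, int n,
                   const char *const *refs, int nrefs, int *selected)
{
    const char *pending[MAX_PENDING];
    int npending = 0, unresolved = 0;

    for (int i = 0; i < n; i++)
        selected[i] = 0;
    for (int i = 0; i < nrefs; i++) {
        if (npending == MAX_PENDING)
            return -1;
        pending[npending++] = refs[i];
    }

    while (npending > 0) {
        const char *sym = pending[--npending];
        int found = 0;

        if (is_defined(lib, n, selected, sym))
            continue;
        for (int i = 0; i < n && !found; i++) {
            if (strcmp(lib[i].defines, sym) == 0) {
                selected[i] = 1;  /* this member gets copied */
                found = 1;
                if (lib[i].needs != NULL) {
                    if (npending == MAX_PENDING)
                        return -1;
                    pending[npending++] = lib[i].needs;
                }
            }
        }
        if (!found)
            unresolved++;
    }
    return unresolved;
}
```

Members that nothing references stay unselected, which is why an executable linked against a large static library can remain small.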
7.6.3 How Linkers Use Static Libraries to Resolve References
7.7 Relocation
7.7.1 Relocation Entries
7.7.2 Relocating Symbol References
7.8 Executable Object Files
7.9 Loading Executable Object Files
7.10 Dynamic Linking with Shared Libraries
Shared libraries are modern innovations that address the disadvantages of static libraries. A shared library is an object module that, at either run time or load time, can be loaded at an arbitrary memory address and linked with a program in memory. This process is known as dynamic linking and is performed by a program called a dynamic linker. Shared libraries are also referred to as shared objects, and on Linux systems they are indicated by the .so suffix. Microsoft operating systems make heavy use of shared libraries, which they refer to as DLLs (dynamic link libraries).
Shared libraries are “shared” in two different ways. First, in any given file system, there is exactly one .so file for a particular library. The code and data in this .so file are shared by all of the executable object files that reference the library, as opposed to the contents of static libraries, which are copied and embedded in the executables that reference them. Second, a single copy of the .text section of a shared library in memory can be shared by different running processes.
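Linking against a shared library at run time can be sketched with the POSIX dlopen interface from <dlfcn.h>. The helper name call_unary_from_library is hypothetical, and the example assumes a Linux system. On older glibc versions the program may also need to be linked with -ldl.

```c
#include <dlfcn.h>
#include <stddef.h>

/* Loads the shared library at path, looks up symbol as a function taking
   and returning a double, calls it with x, and stores the result in *out.
   Returns 0 on success, -1 if the library or symbol cannot be found. */
int call_unary_from_library(const char *path, const char *symbol,
                            double x, double *out)
{
    double (*fn)(double);
    void *handle = dlopen(path, RTLD_LAZY);

    if (handle == NULL)
        return -1;

    dlerror();  /* clear any stale error state before dlsym */
    *(void **)(&fn) = dlsym(handle, symbol);
    if (dlerror() != NULL || fn == NULL) {
        dlclose(handle);
        return -1;
    }

    *out = fn(x);
    dlclose(handle);
    return 0;
}
```

For example, on a glibc-based system a call such as call_unary_from_library("libm.so.6", "cos", 0.0, &result) would be expected to load the math library and invoke cos. The library name is an assumption about the host system, which is why the helper reports failure instead of aborting.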