Chapter 7 Linking

Linking is the process of collecting and combining various pieces of code and data into a single file that can be loaded (copied) into memory and executed.
Linkers play a crucial role in software development because they enable separate compilation. Instead of organizing a large application as one monolithic source file, we can decompose it into smaller, more manageable modules that can be modified and compiled separately. When we change one of these modules, we simply recompile it and relink the application, without having to recompile the other files.

7.1 Compiler Drivers


7.2 Static Linking

Static linkers such as the Linux ld program take as input a collection of relocatable object files and command-line arguments and generate as output a fully linked executable object file that can be loaded and run. To build the executable, the linker must perform two main tasks:

  1. Symbol resolution. Object files define and reference symbols, where each symbol corresponds to a function, a global variable, or a static variable (i.e., any C variable declared with the static attribute). The purpose of symbol resolution is to associate each symbol reference with exactly one symbol definition.
  2. Relocation. Compilers and assemblers generate code and data sections that start at address 0. The linker relocates these sections by associating a memory location with each symbol definition, and then modifying all of the references to those symbols so that they point to this memory location. The linker blindly performs these relocations using detailed instructions, generated by the assembler, called relocation entries.

Object files are merely collections of blocks of bytes. Some of these blocks contain program code, others contain program data, and others contain data structures that guide the linker and loader. A linker concatenates blocks together, decides on run-time locations for the concatenated blocks, and modifies various locations within the code and data blocks. Linkers have minimal understanding of the target machine. The compilers and assemblers that generate the object files have already done most of the work.
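The concatenate, place, and patch steps described above can be sketched in a few lines. This is a toy model, not how ld is implemented: Python is used only for brevity, the `Module` and `Reloc` types are hypothetical, the base address `0x400000` is an assumed load address, and every reference is treated as a 4-byte little-endian absolute address (real linkers also handle PC-relative references).

```python
from dataclasses import dataclass, field


@dataclass
class Reloc:
    offset: int  # byte offset inside the module's section that must be patched
    symbol: str  # name of the symbol the patched bytes should point to


@dataclass
class Module:
    name: str
    code: bytearray                             # section contents, addressed from 0
    defs: dict = field(default_factory=dict)    # symbol name -> offset within code
    relocs: list = field(default_factory=list)  # relocation entries for this section


def link(modules, base=0x400000):
    """Concatenate sections, assign run-time addresses, and patch references."""
    # Pass 1: decide where each module lands and compute each symbol's address.
    addr_of = {}
    cursor = base
    for m in modules:
        for sym, off in m.defs.items():
            addr_of[sym] = cursor + off
        cursor += len(m.code)

    # Pass 2: copy each section and apply its relocation entries blindly.
    image = bytearray()
    for m in modules:
        patched = bytearray(m.code)
        for r in m.relocs:
            patched[r.offset:r.offset + 4] = addr_of[r.symbol].to_bytes(4, "little")
        image += patched
    return image, addr_of


# Hypothetical inputs: main.o references sum at offset 4; sum.o defines it.
main_o = Module("main.o", bytearray(8), {"main": 0}, [Reloc(4, "sum")])
sum_o = Module("sum.o", bytearray(4), {"sum": 0})
image, addr_of = link([main_o, sum_o])
print({name: hex(addr) for name, addr in addr_of.items()})
```

Note that `link` never inspects what the bytes mean. It only follows the relocation entries, which mirrors the point that the compiler and assembler have already done most of the work.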

7.3 Object Files


7.4 Relocatable Object Files


7.5 Symbols and Symbol Tables


7.6 Symbol Resolution

The linker resolves symbol references by associating each reference with exactly one symbol definition from the symbol tables of its input relocatable object files.
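As a rough illustration of that matching step, the sketch below pairs each reference with the module that defines it. The dictionary layout (`defs` and `refs` lists per module) and the file names are assumptions made for the example, not the actual ELF symbol table format.

```python
def resolve(symbol_tables):
    """Match every symbol reference to the one module that defines it.

    symbol_tables maps a module name to {"defs": [...], "refs": [...]}.
    Returns (resolved, undefined, ambiguous).
    """
    # Collect every module that defines each name.
    definers = {}
    for module, table in symbol_tables.items():
        for name in table["defs"]:
            definers.setdefault(name, []).append(module)

    resolved, undefined, ambiguous = {}, [], []
    for module, table in symbol_tables.items():
        for name in table["refs"]:
            owners = definers.get(name, [])
            if len(owners) == 1:
                resolved[(module, name)] = owners[0]
            elif not owners:
                undefined.append((module, name))  # would be a link error
            else:
                ambiguous.append((module, name))  # needs duplicate-name rules
    return resolved, undefined, ambiguous


# Hypothetical symbol tables for two input modules.
tables = {
    "main.o": {"defs": ["main", "array"], "refs": ["sum"]},
    "sum.o": {"defs": ["sum"], "refs": ["missing"]},
}
resolved, undefined, ambiguous = resolve(tables)
print(resolved, undefined, ambiguous)
```

A reference that ends up in `undefined` corresponds to the familiar "undefined reference" error at link time.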

7.6.1 How Linkers Resolve Duplicate Symbol Names

At compile time, the compiler exports each global symbol to the assembler as either strong or weak, and the assembler encodes this information implicitly in the symbol table of the relocatable object file. Functions and initialized global variables get strong symbols. Uninitialized global variables get weak symbols.
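Given strong and weak symbols, Linux linkers conventionally apply three rules: multiple strong definitions of the same name are an error; one strong definition beats any number of weak ones; and among only weak definitions, any one may be chosen. The function below is a minimal sketch of those rules. The `(module, strength)` tuple format is an assumption for the example, and taking the first weak definition is just one allowed choice.

```python
def pick_definition(defs):
    """Choose one definition for a symbol name.

    defs is a list of (module, strength) pairs, where strength is
    "strong" or "weak".
    """
    if not defs:
        raise LookupError("no definition for symbol")
    strong = [module for module, strength in defs if strength == "strong"]
    if len(strong) > 1:
        # Rule 1: multiple strong symbols with the same name are not allowed.
        raise ValueError(f"multiple strong definitions: {strong}")
    if strong:
        # Rule 2: a strong symbol wins over any weak symbols.
        return strong[0]
    # Rule 3: only weak symbols; any is acceptable, this sketch takes the first.
    return defs[0][0]


print(pick_definition([("foo.o", "weak"), ("bar.o", "strong")]))
```

One caveat when trying this with real C files: GCC 10 and later default to `-fno-common`, so uninitialized globals are no longer emitted as weak (common) symbols, and duplicate definitions are reported as errors unless `-fcommon` is passed.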


7.6.2 Linking with Static Libraries

In practice, all compilation systems provide a mechanism for packaging related object modules into a single file called a static library, which can then be supplied as input to the linker. When it builds the output executable, the linker copies only the object modules in the library that are referenced by the application program.
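The "copies only what is referenced" behavior can be modeled with three sets: the modules to include (E), the symbols still undefined (U), and the symbols defined so far (D). The sketch below scans inputs left to right in that style. The tuple formats and the file names (`main2.o`, `libvector.a`, and its members) are illustrative assumptions, and real linkers track far more detail.

```python
def scan(inputs):
    """Scan object files and archives left to right.

    inputs holds ("obj", name, defs, refs) or ("lib", name, members),
    where members is a list of (member_name, defs, refs).
    Returns (E, U): modules included and symbols left undefined.
    """
    E, U, D = [], set(), set()

    def add(name, defs, refs):
        E.append(name)
        D.update(defs)
        U.difference_update(defs)                # these names are now defined
        U.update(r for r in refs if r not in D)  # new unresolved references

    for item in inputs:
        if item[0] == "obj":
            _, name, defs, refs = item
            add(name, defs, refs)  # object files are always included
        else:
            _, _, members = item
            changed = True
            while changed:  # repeat until no member satisfies a pending reference
                changed = False
                for member, defs, refs in members:
                    if member not in E and U & set(defs):
                        add(member, defs, refs)
                        changed = True
    return E, U


# Hypothetical archive with two members; only addvec.o is referenced.
libvector = ("lib", "libvector.a", [
    ("addvec.o", ["addvec"], []),
    ("multvec.o", ["multvec"], []),
])
main2 = ("obj", "main2.o", ["main"], ["addvec"])

print(scan([main2, libvector]))  # archive listed after the object file
print(scan([libvector, main2]))  # archive listed first
```

In this model the second call leaves `addvec` unresolved, because the archive is scanned before any reference to it exists. That is the usual reason libraries are placed at the end of the command line.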


7.6.3 How Linkers Use Static Libraries to Resolve References


7.7 Relocation


7.7.1 Relocation Entries


7.7.2 Relocating Symbol References


7.8 Executable Object Files


7.9 Loading Executable Object Files


7.10 Dynamic Linking with Shared Libraries

Shared libraries are modern innovations that address the disadvantages of static libraries. A shared library is an object module that, at either run time or load time, can be loaded at an arbitrary memory address and linked with a program in memory. This process is known as dynamic linking and is performed by a program called a dynamic linker. Shared libraries are also referred to as shared objects, and on Linux systems they are indicated by the .so suffix. Microsoft operating systems make heavy use of shared libraries, which they refer to as DLLs (dynamic link libraries).
Shared libraries are “shared” in two different ways. First, in any given file system, there is exactly one .so file for a particular library. The code and data in this .so file are shared by all of the executable object files that reference the library, as opposed to the contents of static libraries, which are copied and embedded in the executables that reference them. Second, a single copy of the .text section of a shared library in memory can be shared by different running processes.
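Run-time loading of a shared library can be observed from a scripting language. The sketch below uses Python's standard `ctypes` module, which on POSIX systems wraps the platform's `dlopen` interface, to load the C math library and look up `cos`. The helper name `load_cos` is hypothetical, and the library may not be found on every platform, in which case the function returns `None`.

```python
import ctypes
import ctypes.util


def load_cos():
    """Load the math library at run time and return its cos function, or None."""
    path = ctypes.util.find_library("m")  # e.g. "libm.so.6" on many Linux systems
    if path is None:
        return None
    try:
        libm = ctypes.CDLL(path)  # the shared object is mapped into this process
        cos = libm.cos            # symbol lookup by name
    except (OSError, AttributeError):
        return None
    # Declare the C signature so arguments and results convert correctly.
    cos.argtypes = [ctypes.c_double]
    cos.restype = ctypes.c_double
    return cos


cos = load_cos()
if cos is not None:
    print(cos(0.0))
```

Nothing from the math library was copied into the program ahead of time; the code is located and linked only when `ctypes.CDLL` is called.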
