Friday, September 26, 2008

Global Offset Table, why is it needed?

In ELF land, GOT stands for Global Offset Table. It is a very important component of Position Independent Code (PIC), which is used in shared libraries. In this article, we try to understand why it is needed.

Preface



Understanding linking and loading is difficult. At least it was for me, because I mainly read documentation without writing code, which usually helps things stick. If you are really interested in linking and loading, I can only advise you to read the following documentation:

  • John R Levine's "Linkers and Loaders" book (ISBN 1-55860-496-0). This is the best documentation you can find on this topic. You can find the first chapter here, which is a very good read to understand the history of linkers.

  • Ian Lance Taylor's "Linkers" blog entries contain an overall explanation of the linking process.

  • Also, have a look at the bookmarks on my website: there are a couple of interesting pointers there too, under "Code"/"Linkers/Loaders".



So, why GOT?



Quote from Wikipedia in the Linker article:
In computer science, a linker or link editor is a program that takes one or more objects generated by a compiler and assembles them into a single executable program.

As you may already know, when linking binary objects together into a program, the linker has to handle relocations. Relocations are basically hints left by the assembler so the linker knows which addresses must be fixed once everything has been put together. Let's take a simple example to illustrate the need for relocations. Imagine that i is a global variable defined in one source file and that you assign it a value in another source file: i = 1. On i386, the compiler translates this into movl $1, i. When the assembler processes this instruction, it doesn't know the actual address of i since it sees only one file at a time, so it just sets the address to zero. The linker, on the other hand, sees all files at once and can figure out the final value of a symbol (that is, the address of a variable). So in this case, the assembler generates a relocation telling the linker to put the value of symbol "i" at a given offset in the code section, once all binary objects have been combined and the final address of every symbol is known. I won't go further into the details of relocations; this deserves another blog article, I think. Just keep in mind that most types of relocation are used to "fix" the code itself. And this is not a problem when linking multiple object files (including static libraries) together into a program.
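To make the relocation idea a bit more concrete, here is a minimal sketch in C of what applying such a relocation amounts to. It is a toy model, not a real linker: the names toy_reloc, toy_code, toy_reloc_for_i and toy_apply_abs32 are made up for illustration, and the byte sequence assumes the usual i386 encoding of movl $1, i with the 32-bit address field left at zero by the assembler.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy relocation record (hypothetical layout, not the real ELF one):
 * "store the address of this symbol at this offset in the code section". */
struct toy_reloc {
    size_t offset;       /* where the address field starts in the code */
    const char *symbol;  /* name of the symbol whose address is needed */
};

/* i386 encoding of "movl $1, i" as the assembler leaves it:
 * opcode bytes C7 05, a 32-bit address still at zero, then the immediate 1. */
unsigned char toy_code[] = {
    0xC7, 0x05, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00
};

/* The assembler's hint for the linker: the address of "i" goes at offset 2. */
const struct toy_reloc toy_reloc_for_i = { 2, "i" };

/* What the linker does once the final address of the symbol is known:
 * add it to the value already stored in place (the addend, zero here)
 * and write the result back in little-endian byte order. */
void toy_apply_abs32(unsigned char *text, size_t text_len,
                     const struct toy_reloc *r, uint32_t symbol_addr)
{
    assert(text != NULL && r != NULL);
    assert(r->offset + 4 <= text_len);

    unsigned char *p = text + r->offset;
    uint32_t addend = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                      ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    uint32_t value = symbol_addr + addend;

    p[0] = (unsigned char)(value & 0xFF);
    p[1] = (unsigned char)((value >> 8) & 0xFF);
    p[2] = (unsigned char)((value >> 16) & 0xFF);
    p[3] = (unsigned char)((value >> 24) & 0xFF);
}
```

Calling toy_apply_abs32(toy_code, sizeof toy_code, &toy_reloc_for_i, 0x0804A010) with that arbitrary example address turns bytes 2 to 5 of the instruction into 10 A0 04 08. Note that the code itself gets modified, which is exactly what becomes a problem for shared libraries.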

Unlike static libraries, shared libraries are dynamically linked, which means that the linker's job is deferred to runtime and is handled by the so-called dynamic linker. By definition, shared library code is shared among multiple processes: thanks to virtual memory, the same physical memory pages holding the library code are mapped into the address space of every process that needs it. The benefit is a saving of physical memory. Thus the shared code must stay the same and should not be patched to suit the memory layout of one particular process or another.

The difficulty comes from the fact that we don't know where the dynamic linker will decide to load the library in the process memory, as it depends on a number of factors such as the executable size and the number and size of the libraries loaded previously, not to mention that the actual load address might be randomized for security reasons on recent systems. Still, when the library code refers to a global variable, its address has to be stashed somewhere so the code can access it.

As we saw in the relocation example, this is normally done directly by fixing the code. In order to avoid modifying shared library code at runtime and losing memory sharing, we have to make the runtime relocations point to process-private data. This is done by asking the compiler to generate Position-Independent Code (PIC). The idea is simple: instead of having the final absolute address of a symbol in the code, the code refers to a particular entry of a private array containing this address. At runtime, the dynamic linker fills this array with the actual symbol values, so the same code accesses the addresses proper to each process. The sharp-minded reader may have noticed that this only moves the problem one step further, because the address of the private array itself is still unknown. This is resolved by a small calculation. Given that the library is loaded as a single unit in memory, the distance between the code and the private array is fixed at link time. The code just has to get the address of the current instruction, and the address of the private array can then be derived from it.
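The "private array" idea can be sketched in plain C. This is only a model under stated assumptions: toy_got, toy_bind and toy_set_i are hypothetical names, the table holds a single slot, and the table is passed as a parameter, whereas real PIC code derives its address from the current instruction address plus a link-time constant.

```c
#include <assert.h>
#include <stddef.h>

/* Index of each symbol's slot in the table; fixed once at link time. */
enum { TOY_SLOT_I = 0, TOY_SLOT_COUNT };

/* Per-process private data: one address per symbol. */
struct toy_got {
    int *slot[TOY_SLOT_COUNT];
};

/* Stand-in for the dynamic linker: stores the address that "i"
 * happens to have in this particular process. */
void toy_bind(struct toy_got *got, int *addr_of_i)
{
    assert(got != NULL && addr_of_i != NULL);
    got->slot[TOY_SLOT_I] = addr_of_i;
}

/* Stand-in for shared library code performing "i = 1".
 * It embeds no address, only a slot index, so the very same
 * instructions work unchanged in every process. */
void toy_set_i(const struct toy_got *got)
{
    assert(got != NULL && got->slot[TOY_SLOT_I] != NULL);
    *got->slot[TOY_SLOT_I] = 1;
}
```

Binding two tables to two different int variables and calling toy_set_i on each mimics two processes running the same library code: each call writes only to the variable its own table points to.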

Ladies and gentlemen, this private array of addresses is what we call the Global Offset Table, or GOT.

It is actually perfectly valid to create a non-PIC shared library. It will work. But under the hood, the dynamic linker will probably patch most of the code pages of the library, which will then be duplicated by the copy-on-write mechanism of virtual memory. Besides wasting memory, the program may take more time to start: if a symbol is used thousands of times throughout the code, the dynamic linker will have thousands of relocations pointing into the code section to process at startup. With PIC, on the other hand, there is one GOT entry per symbol, so there is only one relocation to perform. But to be honest, this is more a tradeoff than a pure win, because using the GOT implies a more expensive function prologue to compute the GOT address and an additional indirection for each access through the GOT.

Shared libraries all have their own GOT, and the program would have its own if it were compiled as PIC as well (this is called a Position Independent Executable, by the way). Actually, in any shared library, the GOT comes into play for every piece of data that would otherwise be accessed with an absolute address in the code, which includes static variables and global variables defined in the library itself.

For static variables there is no need for a GOT entry: getting the address is straightforward because the distance between the GOT and the variable itself is known, both being in the same loadable unit, so the code can access the variable directly, relative to the GOT address.
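Here is a rough C model of that constant-distance access. The layout numbers and the names (TOY_GOT_OFFSET, TOY_STATIC_OFFSET, toy_store_static, toy_load_static) are invented for the sketch; the only point is that the delta between the GOT and the static variable is a constant, wherever the image ends up in memory.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical layout of a library image; all offsets fixed at link time. */
enum {
    TOY_GOT_OFFSET    = 64,   /* where the GOT starts in the image   */
    TOY_STATIC_OFFSET = 96,   /* where the static variable is stored */
    TOY_STATIC_DELTA  = TOY_STATIC_OFFSET - TOY_GOT_OFFSET,
    TOY_IMAGE_SIZE    = 128
};

/* Library code writing the static variable: it starts from the GOT
 * address and adds a constant; no GOT slot is read. */
void toy_store_static(unsigned char *got_addr, int value)
{
    assert(got_addr != NULL);
    memcpy(got_addr + TOY_STATIC_DELTA, &value, sizeof value);
}

/* Library code reading the static variable the same way. */
int toy_load_static(const unsigned char *got_addr)
{
    int value;
    assert(got_addr != NULL);
    memcpy(&value, got_addr + TOY_STATIC_DELTA, sizeof value);
    return value;
}
```

Two byte buffers of TOY_IMAGE_SIZE bytes can play the role of the same library loaded at two different addresses: passing buffer + TOY_GOT_OFFSET to these functions reaches the right variable in each of them without any per-process fixup.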

One might think the same holds for global variables defined within the library itself. But that overlooks the fact that ELF allows the executable or a previously loaded shared library to redefine a global variable of a shared library. For example, you can perfectly well define a stdout variable in your program even though it is already defined in libc, but keep in mind that your variable will then also be used by the libc functions referring to stdout, such as printf(3), so make sure it points to a valid FILE object. In short, all global variables used by the shared library unconditionally have a GOT entry.
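The redefinition rule can be pictured as a "first definition wins" search. The sketch below is a simplified model I am assuming for illustration, not the dynamic linker's actual algorithm or data structures: toy_def, toy_object and toy_lookup are hypothetical names, and objects are simply searched in load order with the executable first.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One definition of a global symbol inside a loaded object (toy model). */
struct toy_def {
    const char *name;
    void *addr;
};

/* A loaded object: the executable or a shared library. */
struct toy_object {
    const char *soname;
    const struct toy_def *defs;
    size_t ndefs;
};

/* Search the objects in load order (executable first) and return the
 * address of the first definition found, or NULL if there is none.
 * That address is what would end up in the library's GOT entry. */
void *toy_lookup(const struct toy_object *objs, size_t nobjs, const char *name)
{
    assert(objs != NULL && name != NULL);
    for (size_t o = 0; o < nobjs; o++) {
        for (size_t d = 0; d < objs[o].ndefs; d++) {
            if (strcmp(objs[o].defs[d].name, name) == 0)
                return objs[o].defs[d].addr;
        }
    }
    return NULL;
}
```

With a program defining its own stdout listed before libc, looking up "stdout" yields the program's variable, and that is the address libc's own code would then use through its GOT.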

I hope you now better understand what the GOT is :).