How does the linker allocate memory?

Thread Starter

gogo00

Joined Oct 28, 2023
37
I'm currently trying to understand how the linker assigns memory locations in a C project with two source files. One file contains static global, local global, and extern variables, while the other attempts to access an extern variable from the first file.

In terms of linkage, there are three types: external linkage (e.g., extern variables), internal linkage (e.g., static global variables), and variables with no linkage (e.g., local variables).

I want to know whether it's the compiler or the linker that allocates memory. My current understanding is that the compiler initially allocates memory because source files are compiled separately, and then the linker updates the final memory locations. I would appreciate clarification on this process.
 

ApacheKid

Joined Jan 12, 2015
1,608
It varies across platforms and languages. The compiler generates an object file, and that file contains units called "sections". There are naming conventions for these sections, which let the linker gather sections with the same name from different object files when it generates the final executable.

The compiler doesn't really "allocate" memory. What it does is describe what kind of memory is to be allocated when the code runs. It determines things like the offset of each variable relative to either some static base address or a stack frame base address, and it also knows how big each variable is.

So from the compiler's perspective every variable has an offset and a length. It calculates these itself, and the programmer rarely (if ever) needs to worry about them.

The offset it assigns to a variable is based on the offset and length of the preceding variable, plus the alignment requirement of the variable's own type.

All of this descriptive info is stored in a data structure called the "symbol table", which gets built as the compiler analyzes the source code.

A lot goes on when you compile and link code, and it can be daunting to understand the details at first.

You can use tools like this one to get some insight into the details of generated object files:

http://www.sunshine2k.de/coding/javascript/onlineelfviewer/onlineelfviewer.html

Also grab a coffee sometime and read this:

https://www.caichinger.com/elf.html

Most programming languages do not define things like how to lay out variables in memory or how to determine their addresses. For that reason, studying just the language (say C) isn't enough to understand all this. The language doesn't care, and it is up to the compiler designers to work out these details and how to represent them in an ELF or COFF file.
 