Question about C compilation process with respect to include files

Thread Starter

integral

Joined Aug 19, 2010
22
Hello,
I have a question about the C compilation process.
Please see the image at the link below for a visual depiction of the situation:
http://imgur.com/3l0yOys

From what I understand:
hello.c has #include <stdio.h>
stdio.c has #include <stdio.h>

Will the final binary file have two copies of the zeros and ones that make up “stdio.h”?
 


ErnieM

Joined Apr 24, 2011
8,377
Nope.

First of all, a properly written .h file has no code in it. All it does is declare the interface to other code, such as a standard library or a block of code you have written yourself.

The code itself resides in a library or another .c file.

There is another program that runs after the compiler called the linker. The linker puts together all the pieces of code into one piece that can be run by organizing them into memory and filling in all the blanks (such as jump or call addresses).

Next, if the library is written in small single-function pieces, the linker will pull in only the functions that are actually used and leave the rest aside to make the smallest package. It is typical for a linker to remove any functions that are not actually used, though this may be disabled, or turned off in free versions.
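To make the declaration versus definition point concrete, here is a minimal single-file sketch. The function names add and add_demo, and the header name "mymath.h" mentioned in the comments, are made up for illustration; in a real project the declarations would sit in the .h file and the definition in a .c file or library.

```c
#include <assert.h>
#include <stdio.h>
#include <stdio.h>   /* included twice on purpose: harmless */

/* What a header such as a hypothetical "mymath.h" would contribute:
   declarations only. They emit no machine code. */
int add(int a, int b);
int add(int a, int b);   /* repeating a declaration is legal */

/* What the matching .c file or library would contribute:
   the single definition, which is what ends up in the binary. */
int add(int a, int b)
{
    return a + b;
}

/* Small self-check so the sketch can be exercised from any caller. */
int add_demo(void)
{
    int sum = add(2, 3);
    assert(sum == 5);
    printf("2 + 3 = %d\n", sum);
    return sum;
}
```

Calling add_demo() from main exercises the one definition of add; the repeated declarations and the repeated #include add nothing to the generated code.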
 

Papabravo

Joined Feb 24, 2006
21,225
Including a header file twice is not harmful unless the two header files have the same name and different contents. In this case you'll probably get a warning about symbols being redefined with different values. In many header files you'll see some statements that check for the definition of a symbol. If it is defined the rest of the header file is ignored. If it is not defined then the symbol is defined and the remainder of the header file is included. Thus in any complex collection of header files, each one is included once and only once.
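As a rough sketch of that guard pattern, the text of a hypothetical header "point.h" is pasted into one file twice below, which is what two #include lines would amount to. POINT_H, struct point, point_sum and guard_demo are names invented for this example.

```c
#include <assert.h>

/* First "#include" of a hypothetical point.h: POINT_H is not yet defined,
   so the body is compiled and the guard symbol gets defined. */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
#endif

/* Second "#include" of the same text: POINT_H is now defined, so the
   preprocessor skips everything up to #endif. Without the guard, the
   repeated struct definition would be a compile error. */
#ifndef POINT_H
#define POINT_H
struct point { int x; int y; };
#endif

int point_sum(struct point p)
{
    return p.x + p.y;
}

/* Small self-check using the struct that was defined exactly once. */
int guard_demo(void)
{
    struct point p = { 3, 4 };
    int s = point_sum(p);
    assert(s == 7);
    return s;
}
```

Removing the two guard lines around the second copy should make the compiler complain about a redefinition of struct point, which is the kind of diagnostic the guard exists to prevent.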
 

Thread Starter

integral

Joined Aug 19, 2010
22
Thanks for the response, but I am still confused, so I will rephrase my question.

I agree with what you said, but I don't understand why the answer is no. Please refer to the diagram I included in my first post.

Consider hello.c to contain:
Code:
#include <stdio.h>
int main(void)
{
  printf("hello world\n");
}
and stdio.c to contain something like that shown here: http://www.jbox.dk/sanos/source/lib/stdio.c.html

Both hello.c and stdio.c contain the "#include <stdio.h>" line.
If my understanding is correct, the following are true (please correct me if I am wrong):
Item 1:
stdio.c is already compiled, so when I write a program (like hello.c) that uses functions defined in stdio.c, I don't have to compile stdio.c every time.

Item 2:
If my understanding of the preprocessor step is correct, the #include directive just copies the contents of the include file and pastes them into the file that contains the directive. This is what I observed when I ran a command like "gcc -E hello.c". When I look at the output, I can see that the contents of "stdio.h" have been copied into hello.c.

Question
So my question is: if hello.c and stdio.c both contain the contents of "stdio.h", then after the preprocessing, compilation and assembly steps, will the final binary have two "copies" of what was in "stdio.h", or will the "linker" step remove one of the "copies" of stdio.h?
 

Papabravo

Joined Feb 24, 2006
21,225
The answer is still no. You should not think of the compiler output as containing copies of files. What the compiler output contains is a translation of all the information seen during compilation. In simpler terms, a header file such as stdio.h is translated by the compiler into a symbol table. This symbol table is used while generating code for stdio.c and for hello.c, each of which is compiled separately. Once the code has been generated, the symbol table is discarded, since the information needed from it has been captured in the code generated for stdio.c and hello.c.
 

ErnieM

Joined Apr 24, 2011
8,377
Actually, the linker doesn't remove functions at all. It only adds a function the first time it is needed: the compiler makes a list of unresolved symbols for the linker to find. The linker is quite happy when it finds a symbol just once, no matter how many times it gets used. Otherwise you would have N copies of the code for a function that is called N times.

Nothing bad happens. Each include file uses an include guard, an #ifndef that surrounds the rest of the file. If the file has already been included elsewhere, the second include results in a blank file being included. See: https://en.wikipedia.org/wiki/Include_guard
Include guards only guard within a single C file. When the next object file gets compiled, the .h files get included fresh from the start.
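The "one copy, many call sites" idea can be sketched in a single file. This is only an illustration within one C file, not a demonstration of the linker itself, and the names calls, bump and call_sites_demo are invented for the example.

```c
#include <assert.h>

static int calls;   /* counts how often the single copy of bump() runs */

int bump(void)
{
    return ++calls;
}

int call_sites_demo(void)
{
    /* Two references to the same function resolve to the same address. */
    int (*first)(void) = bump;
    int (*second)(void) = bump;

    /* Three call sites, but still only one body of code for bump(). */
    bump();
    bump();
    bump();

    assert(first == second);
    return calls;
}
```

Each call site just holds a reference to the same address; the body of bump exists once no matter how many times it is called.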
 

vpoko

Joined Jan 5, 2012
267
Right, include guards work at the level of the source code, but that's really the only place they're relevant. Usually, for each C file (along with the relevant function prototypes in the header files), one object file is generated, and it doesn't include any information on the prototypes (except in the debugging symbols, if included). Everything normally declared in a header file is only relevant to the compiler and is "thrown away" after the compilation step. When the linker sees the object file, it fills in addresses for all of the external calls that are actually used (as opposed to merely declared via prototypes).
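A small sketch of the "used versus merely declared" distinction, with made-up names (used_function, never_called, prototypes_demo). The prototype for never_called has no definition anywhere, yet the program should still link, because nothing ever references that symbol.

```c
#include <assert.h>

/* Prototypes, as a header might declare them. */
int used_function(int x);
int never_called(int x);   /* declared, never defined, never called */

/* Only the function that is actually used needs a definition. */
int used_function(int x)
{
    return x * 2;
}

int prototypes_demo(void)
{
    int r = used_function(21);
    assert(r == 42);
    return r;
}
```

If a call to never_called were added, the compiler would still accept the file, but the linker would then report an undefined reference, since it is the call, not the prototype, that creates something for the linker to resolve.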
 

Papabravo

Joined Feb 24, 2006
21,225
There is one dangerous situation. If you have multiple header files on your system with the same name and different contents, it is possible to create subtle bugs that are devilishly hard to find. ALWAYS make sure your compiler picks its header files from the CORRECT folder.
 