
Compilation Process in C: Easy Introduction

June 1, 2022
in Reverse Engineering

Table of Contents

  • Definition
  • Preprocessing
  • Compilation
  • Assembly
  • Linking

Definition

Compilation in C (and in compiled languages more generally) is the process that turns human-readable source code into an executable binary.

The full compilation process in C consists of four phases:

  • Preprocessing
  • Compiling
  • Assembly
  • Linking

Many compilers merge some of these phases, so they do not always run as separate, visible steps.

[Figure: the compilation process in C on Linux]

Preprocessing

Except for very simple programs, compiling C code means handling many source files. A source file usually contains macros (#define) and includes (#include).
The preprocessor is in charge of expanding #define and #include directives. In other words, it replaces each directive with the corresponding text: the contents of the included file or the body of the macro.

To see this in practice, let's take a simple Hello World program in C, split into a header (helloworld.h) and a source file (helloworld.c), and preprocess it with GCC.

/* helloworld.h */
int helloworld();


/* helloworld.c */
#include <stdio.h>
#include "helloworld.h"


#define HELLOWORLD "Hello World!\n"

int helloworld()
{
    printf("%s", HELLOWORLD);
    return 0;
}

int main(){

    helloworld();
    return 0;
}

Once the code is written, we can ask GCC to stop after preprocessing (-E) and to omit line markers from the output (-P):

gcc -E -P helloworld.c

It will print out something like this:

typedef long unsigned int size_t;
typedef __builtin_va_list __gnuc_va_list;
typedef unsigned char __u_char;
typedef unsigned short int __u_short;
typedef unsigned int __u_int;
typedef unsigned long int __u_long;
typedef signed char __int8_t;
typedef unsigned char __uint8_t;
typedef signed short int __int16_t;
typedef unsigned short int __uint16_t;
typedef signed int __int32_t;
....
int helloworld();
int helloworld()
{
    printf("%s", "Hello World!\n");
    return 0;
}
int main(){
    helloworld();
    return 0;
}

The first part, cut short for brevity, is the expansion of the stdio.h header. The second part is the expansion of our own header and macro: the HELLOWORLD macro has been replaced by its string literal.

Compilation

The next step is compilation proper: in a nutshell, the conversion from preprocessed source code to assembly.
At first glance this may seem pointless: why not output the binary directly?
The intermediate step makes the next stage easier, because a single assembler can serve every programming language that compiles to assembly.

During compilation, the compiler can apply optimizations, and it keeps symbol names (they disappear only if the binary is later stripped).

To see it in action, we can get the assembly output by passing the -S option to GCC.
For readability we won't enable any optimization. Even so, GCC replaces the printf call with a call to puts, as the output shows.

gcc -S helloworld.c 

The output is a file named helloworld.s; .s is the default extension for assembly files.

.LC0:
	.string	"Hello World!"
	.text
	.globl	helloworld
	.type	helloworld, @function
helloworld:
.LFB0:
	.cfi_startproc
	endbr64
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	leaq	.LC0(%rip), %rdi
	call	puts@PLT
	movl	$0, %eax
	popq	%rbp
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc
.LFE0:
	.size	helloworld, .-helloworld
	.globl	main
	.type	main, @function
main:
.LFB1:
	.cfi_startproc
	endbr64
	pushq	%rbp
	.cfi_def_cfa_offset 16
	.cfi_offset 6, -16
	movq	%rsp, %rbp
	.cfi_def_cfa_register 6
	movl	$0, %eax
	call	helloworld
	movl	$0, %eax
	popq	%rbp
	.cfi_def_cfa 7, 8
	ret
	.cfi_endproc

If you prefer Intel syntax instead of AT&T, the right command is:

gcc -S -masm=intel helloworld.c

Assembly

The assembler's job is to convert the assembly file into an object file, also called a “module”.
The output is machine code, and each assembly file produces exactly one object file.

Let’s try that in practice:

gcc -c helloworld.c -o helloworld.o

It will output the object file, and we can learn more about it with the file command.

file helloworld.o

#OUTPUT
helloworld.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped

It says that the file is an ELF file (Executable and Linkable Format) and that it is relocatable.

This means that its code is not yet tied to a fixed memory address, and that each object is compiled independently of the others.
It's also a clear indicator that we are looking at an object file and not an executable.
An object can reference functions defined in other objects. Before linking, those references are just placeholders recorded as relocation entries, so an object file cannot run until it has been linked with the ones it depends on.

But now it’s time to move to the linking phase.

Linking

The linker is in charge of the final step: merging all the objects into a single executable.

In short, it merges all the objects and resolves symbolic references, including references to libraries.
Libraries on Linux come in two types:

  • Static (a copy is embedded in every executable)
  • Shared/Dynamic (a single instance shared by all processes)

The linker merges static libraries into the executable, but it doesn't know the addresses of shared/dynamic libraries, so in that case it leaves symbolic references to be resolved by the dynamic loader at run time.

Usually the linker and the compiler are separate programs, but GCC calls the linker automatically at the end of the compilation process.

We can run all the phases, linking included, with a single command:

gcc helloworld.c -o helloworld.out

In this case, the output of the file command will be:

file helloworld.out


#OUTPUT
helloworld.out: ELF 64-bit LSB executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=c5cc275bf56a938597e5a05fc410eaa9e519f422, for GNU/Linux 3.2.0, not stripped

So we finally have our executable, and hopefully a better understanding of the whole process.

I hope you liked this article. If so, I invite you to read the others, and feel free to ask questions or suggest improvements.

Tags: assembler, c, compilation, compiler, gcc, libraries, linker