Functions in Assembly Language
In x86-64 assembly, a function is a named block of code that performs a specific task. A function can be called any number of times from anywhere in a program, so the code for a repeated task only has to be written once.
Functions in assembly have names, which allows developers to enhance readability and ease debugging by choosing a good, descriptive name. As with other programming languages, functions are used to organize code, improve code reusability, and facilitate modular programming.
Understanding the mechanism of function calling and returning is an essential aspect of learning assembly programming.
More specifically, the process of executing a function involves:
(1) pushing a return address onto the stack,
(2) setting the instruction pointer (RIP) to the function address,
(3) executing the function code,
(4) popping the return address off the stack, and
(5) setting the instruction pointer (RIP) to the return address, which is the address of the instruction that follows the CALL.
A helpful way to think of this is in terms of the CALL (call) and RET (return) instructions. The first two steps are executed by the CALL instruction, and the last two are executed by RET:
CALL => Function Execution => RET
In this article, we’ll take a deep look at functions in x86-64 assembly, but this lesson is applicable to other flavors of assembly as well.
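To make the five steps concrete, here is a minimal sketch that performs them by hand instead of using CALL and RET. It assumes a Linux x86-64 system and NASM syntax; the labels my_func and after_call are hypothetical names chosen for illustration, and real code should use CALL and RET directly.

```nasm
; Assumed build commands: nasm -f elf64 demo.asm && ld demo.o -o demo
section .text
global _start

_start:
    ; Manual equivalent of "call my_func"
    lea  rax, [rel after_call]  ; compute the return address
    push rax                    ; step 1: push the return address onto the stack
    jmp  my_func                ; step 2: set RIP to the function address
after_call:
    ; Execution resumes here once my_func has returned
    mov  edi, eax               ; use the function's result as the exit status
    mov  eax, 60                ; syscall number for exit on Linux
    syscall

my_func:
    mov  eax, 7                 ; step 3: the function body
    ; Manual equivalent of "ret"
    pop  rcx                    ; step 4: pop the return address off the stack
    jmp  rcx                    ; step 5: set RIP to the return address
```

Running the program and then checking the exit status (for example with echo $? in a shell) should show 7, the value set inside my_func.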
Function Declaration in x86-64 Assembly
In x86-64 assembly, a function is declared using a label: a name followed by a colon.
my_func:
; The function body goes here
For example, my_func: declares a function named my_func. The function body contains the code that defines its behavior. Functions can also have local variables, which are typically allocated on the stack.
Calling a Function in x86-64 Assembly
To call a function, we can use the CALL instruction, followed by the name of the function or its address.
CALL my_func
The CALL instruction does two key things. First, it pushes the return address (the address of the next instruction) onto the stack. Second, it jumps to the function by setting the instruction pointer (RIP) to the address of the function in memory.
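One way to observe the pushed return address is to read the top of the stack as soon as the function is entered. The sketch below assumes Linux x86-64 and NASM syntax; peek_return_address, after_call, and done are hypothetical labels used only for this illustration.

```nasm
section .text
global _start

_start:
    call peek_return_address    ; pushes the address of the next instruction
after_call:
    lea  rdx, [rel after_call]  ; the address we expect CALL to have pushed
    xor  edi, edi               ; exit status 0 if the two addresses match
    cmp  rax, rdx
    je   done
    mov  edi, 1                 ; exit status 1 otherwise
done:
    mov  eax, 60                ; syscall number for exit on Linux
    syscall

peek_return_address:
    mov  rax, [rsp]             ; on entry, the top of the stack holds the return address
    ret
```

The program exits with status 0 because the value found at [rsp] on entry is exactly the address of the instruction after the CALL.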
Returning From a Function
After the function finishes executing, we need a way to return to ordinary program execution. This is called ‘returning’, and is done using the RET instruction.
my_func:
<snip>
RET
Like CALL, RET does two key things and essentially reverses the steps taken by CALL. First, it pops the return address off the stack. Second, it re-assigns the instruction pointer to the return address, resuming the normal control flow of the program outside of the function.
- To return from a function, you use the RET instruction.
- The RET instruction pops the return address from the stack and jumps to that address, returning control to the caller.
Function Stack Frames in Assembly
When a function gets called, a stack frame is set up in order to support the function. A stack frame is a region of memory on the stack that is used to store information related to a function call. It typically includes the following components:
- Return address: The address of the next instruction to be executed after the function call. The return address is pushed onto the stack by the CALL instruction and popped off the stack by the RET instruction.
- Saved base pointer (RBP): The base pointer (RBP) is used as a reference point for accessing function arguments and local variables. The caller's value is saved at the beginning of the function and restored before the function returns.
- Local variables: Space on the stack reserved for variables that are local to the function. These variables are accessed at negative offsets from the base pointer (RBP).
- Function arguments: Values passed by the caller. Under the System V AMD64 calling convention used on Linux, the first six integer or pointer arguments are passed in registers (RDI, RSI, RDX, RCX, R8, R9); any additional arguments are passed on the stack and are accessed at positive offsets from the base pointer (RBP).
- Temporary values: Space on the stack used for temporary storage during the execution of the function.
The layout of the function stack frame is usually determined by the compiler and the calling convention being used.
The base pointer (RBP) is used to access the different components of the stack frame, providing a structured way to access function arguments and local variables.
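The layout described above can be sketched with a small function that reads one stack-passed argument and keeps one local variable. This assumes Linux x86-64 and NASM syntax. Passing a single argument on the stack is done here purely to show the offsets and is not how the System V convention passes a first argument; add_ten is a hypothetical function name.

```nasm
section .text
global _start

_start:
    push 5                  ; pass one argument on the stack (illustrative only)
    call add_ten
    add  rsp, 8             ; the caller removes the argument again
    mov  edi, eax           ; exit status = result of add_ten
    mov  eax, 60            ; syscall number for exit on Linux
    syscall

; Stack as seen inside add_ten after the prologue (higher addresses first):
;   [rbp+16]  argument pushed by the caller
;   [rbp+8]   return address pushed by CALL
;   [rbp]     saved RBP of the caller
;   [rbp-8]   local variable
add_ten:
    push rbp
    mov  rbp, rsp
    sub  rsp, 16            ; reserve room for local variables
    mov  rax, [rbp+16]      ; load the stack argument
    mov  [rbp-8], rax       ; copy it into the local variable
    add  qword [rbp-8], 10  ; work on the local variable
    mov  rax, [rbp-8]       ; return value goes in RAX
    mov  rsp, rbp
    pop  rbp
    ret
```

The program exits with status 15: the argument 5 is read from [rbp+16], stored in the local at [rbp-8], and increased by 10.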
Assembly Function Prologue and Epilogue
A function stack frame is set up by a function prologue, and the process is reversed by a function epilogue.
While the concepts of function prologue and epilogue are based on convention (and not part of the assembly language itself), they are helpful in understanding the underlying process.
Function Prologue
The function prologue is responsible for preparing the stack and registers for use by the function. In order to set up the stack frame, a function prologue will often look like the following:
push rbp
mov rbp, rsp
sub rsp, N
The first instruction ‘push rbp’ saves the previous base pointer onto the stack so it can be retrieved later.
The second instruction ‘mov rbp, rsp’ sets the base pointer to the current stack pointer.
The third instruction ‘sub rsp, N’ decrements the stack pointer by some value ‘N’ to make space for the function to use.
Function Epilogue
The function epilogue reverses the process performed by the prologue and prepares the stack and registers for control to be handed back to the caller function.
A typical function epilogue looks like the following:
mov rsp, rbp
pop rbp
ret
Notice that the first two instructions undo the prologue in reverse order. First, ‘mov rsp, rbp’ sets the stack pointer back to the function's base pointer, which releases the space that was reserved for local variables. Then ‘pop rbp’ pops the caller's base pointer off the stack and into rbp. Recall that this value was originally pushed onto the stack at the beginning of the function prologue. With the stack pointer now pointing at the return address, ‘ret’ can hand control back to the caller.
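x86-64 also provides the LEAVE instruction, which has the same effect as the ‘mov rsp, rbp’ and ‘pop rbp’ pair. The sketch below assumes Linux x86-64 and NASM syntax, and the function name answer is hypothetical.

```nasm
section .text
global _start

_start:
    call answer
    mov  edi, eax           ; exit status = value returned by answer
    mov  eax, 60            ; syscall number for exit on Linux
    syscall

answer:
    ; Prologue
    push rbp
    mov  rbp, rsp
    sub  rsp, 16            ; reserve space for local variables
    ; Body
    mov  dword [rbp-4], 42  ; store a value in a local variable
    mov  eax, [rbp-4]       ; load it as the return value
    ; Epilogue
    leave                   ; same effect as: mov rsp, rbp / pop rbp
    ret
```

The program exits with status 42. Whether to write the two instructions explicitly or use LEAVE is largely a matter of style; both restore the stack pointer and the caller's base pointer.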
Example Function Call in x86-64 Assembly
This example demonstrates all of the concepts discussed so far:
section .text
global _start
_start:
    ; Call the foo function
    CALL foo
    ; Exit the program
    mov rax, 60             ; syscall number for exit
    xor rdi, rdi            ; exit code 0
    syscall
foo:
    ; Function prologue
    push rbp                ; Save the previous base pointer on the stack
    mov rbp, rsp            ; Set the base pointer to the current stack pointer
    ; Allocate space for local variables
    sub rsp, 8              ; Reserve 8 bytes for two 4-byte local variables
    ; Access local variables
    mov dword [rbp-4], 42   ; Store a value in the first local variable
    mov eax, dword [rbp-4]  ; Load the value into eax
    ; Function epilogue
    mov rsp, rbp            ; Restore the stack pointer
    pop rbp                 ; Restore the base pointer
    RET                     ; Return from the function
In this example, the foo function demonstrates a typical function prologue and epilogue, where the base pointer (RBP) is saved and restored, and space is allocated for local variables on the stack.