Functions in Assembly Language

In x86-64 assembly, a function is a named block of code that performs a specific task. Because a function can be called from many places and as many times as needed, it avoids duplicating code whenever a task has to be repeated.

Functions in assembly have names, which allows developers to enhance readability and ease debugging by choosing a good, descriptive name. As with other programming languages, functions are used to organize code, improve code reusability, and facilitate modular programming.

Understanding the mechanism of function calling and returning is an essential aspect of learning assembly programming.

More specifically, the process of executing a function involves:

(1) pushing a return address onto the stack,
(2) setting the instruction pointer (RIP) to the function address,
(3) executing the function code,
(4) popping the return address off the stack, and
(5) setting the instruction pointer (RIP) to that return address, which is the address of the instruction immediately following the call.

A helpful way to think of this is in terms of the CALL (call) and RET (return) instructions. The first two steps are executed by the CALL instruction, and the last two are executed by RET:

CALL => Function Execution => RET
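
To make the five steps concrete, here is a minimal sketch that spells out CALL and RET by hand using PUSH, POP, and JMP. It assumes NASM syntax on 64-bit Linux; the label names (my_func, .back) are hypothetical, and real code should simply use CALL and RET.

```nasm
section .text
global _start

_start:
    ; Hand-written equivalent of "call my_func"
    lea rax, [rel .back]    ; compute the return address
    push rax                ; step 1: push the return address
    jmp my_func             ; step 2: set RIP to the function address
.back:
    mov eax, 60             ; syscall number for exit
    xor edi, edi            ; exit code 0
    syscall

my_func:
    ; step 3: the function body would run here

    ; Hand-written equivalent of "ret"
    pop rcx                 ; step 4: pop the return address
    jmp rcx                 ; step 5: set RIP to the return address
```

Assuming NASM and ld are installed, something like `nasm -f elf64 demo.asm -o demo.o && ld demo.o -o demo` should build it (demo.asm is a placeholder file name).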

In this article, we’ll take a close look at functions in x86-64 assembly. The same ideas carry over to other architectures, although the instructions and registers differ.

Function Declaration in x86-64 Assembly

In x86-64 assembly, a function is declared using a label followed by a colon. The label simply names the address of the function’s first instruction.

my_func:
    ; The function body goes here

For example, my_func: declares a function named my_func. The function body contains the code that defines its behavior. Functions can also have local variables, which are typically allocated on the stack.
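
As a slightly fuller sketch (NASM syntax assumed), the fragment below shows a function label alongside two things that often accompany it: a global directive that exports the symbol, and a NASM local label scoped to the function. The body and the .done label are purely illustrative.

```nasm
section .text
global my_func          ; export the symbol so other object files can call it

my_func:                ; the label is the address of the next instruction
    mov eax, 1          ; illustrative body: put 1 in eax
.done:                  ; NASM local label, known in full as my_func.done
    ret
```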

Calling a Function in x86-64 Assembly

To call a function, we can use the CALL instruction, followed by the name of the function or its address.

CALL my_func

The CALL instruction does two key things. First, it pushes the return address (the address of the next instruction) onto the stack. Second, it jumps to the function by setting the instruction pointer (RIP) to the address of the function in memory.
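
CALL can name its target directly or take the address from a register or from memory. The sketch below shows all three forms, assuming NASM syntax and a statically linked 64-bit Linux executable; func_ptr is a hypothetical variable introduced only for this example.

```nasm
section .data
func_ptr: dq my_func            ; hypothetical function pointer stored in memory

section .text
global _start

_start:
    call my_func                ; direct call: the target is the label itself

    lea rax, [rel my_func]      ; load the function's address into a register
    call rax                    ; indirect call through a register

    call qword [rel func_ptr]   ; indirect call through a memory location

    mov eax, 60                 ; syscall number for exit
    xor edi, edi                ; exit code 0
    syscall

my_func:
    ret
```

In all three cases CALL pushes the same kind of return address: the address of the instruction that follows it.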

Returning From a Function

After the function finishes executing, we need a way to return to ordinary program execution. This is called ‘returning’, and is done using the RET instruction.

my_func:
    ; <snip>

    RET

Like CALL, RET does two key things and essentially reverses the steps taken by CALL. First, it pops the return address off the stack. Second, it re-assigns the instruction pointer to the return address, resuming the normal control flow of the program outside of the function.

Because RET simply pops whatever value is on top of the stack, the stack pointer (RSP) must point at the return address when RET executes.
Anything the function pushed must be popped, or its space otherwise released, before returning.
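
The following sketch illustrates why the stack has to be balanced before RET. It assumes NASM syntax on 64-bit Linux, and the use of rbx as scratch space is arbitrary.

```nasm
section .text
global _start

_start:
    call my_func
    mov eax, 60         ; syscall number for exit
    xor edi, edi        ; exit code 0
    syscall

my_func:
    mov rax, [rsp]      ; on entry, [rsp] holds the return address pushed by call
    push rbx            ; rsp now points at the saved rbx instead
    mov rbx, rax        ; use rbx as scratch space
    pop rbx             ; undo the push: rsp points at the return address again
    ret                 ; pops that address into RIP
```

If the pop were omitted, RET would pop the saved rbx value into RIP and the program would most likely crash.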

Function Stack Frames in Assembly

When a function gets called, a stack frame is set up in order to support the function. A stack frame is a region of memory on the stack that is used to store information related to a function call. It typically includes the following components:

Return address: The address of the next instruction to be executed after the function call. The return address is pushed onto the stack by the CALL instruction and popped off the stack by the RET instruction.
Saved base pointer (RBP): The caller’s base pointer value. The function saves it on the stack at entry so that RBP can serve as a fixed reference point for its own frame, and restores it before returning.
Local variables: Space on the stack reserved for variables that are local to the function. These variables are accessed at negative offsets from the base pointer (RBP).
Function arguments: Arguments passed by the caller. Under the common 64-bit calling conventions, the first several arguments are passed in registers (six integer arguments under System V, four under Windows x64). Any further arguments are placed on the stack by the caller and are typically accessed at positive offsets from the base pointer (RBP).
Temporary values: Space on the stack used for temporary storage during the execution of the function.

The layout of the function stack frame is usually determined by the compiler and the calling convention being used.

With the base pointer (RBP) in place, each component of the frame sits at a fixed offset from RBP for the lifetime of the call, even while RSP moves. Note that optimizing compilers often omit the frame pointer and address the frame relative to RSP instead.
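
The layout described above can be sketched as follows. This example assumes NASM syntax on 64-bit Linux and deliberately passes one argument on the stack to show the positive offsets; show_frame is a hypothetical name, and under System V a single integer argument would normally travel in a register instead.

```nasm
section .text
global _start

_start:
    push 7                  ; illustrative stack-passed argument
    call show_frame
    add rsp, 8              ; caller removes the argument
    mov edi, eax            ; exit status = returned value
    mov eax, 60             ; syscall number for exit
    syscall

show_frame:
    push rbp
    mov rbp, rsp
    sub rsp, 16
    ; Frame layout relative to rbp:
    ;   [rbp+16]  stack argument pushed by the caller
    ;   [rbp+8]   return address pushed by call
    ;   [rbp]     saved rbp of the caller
    ;   [rbp-8]   local variable
    mov rax, [rbp+16]       ; load the argument (7)
    mov [rbp-8], rax        ; store it in the local variable
    add qword [rbp-8], 1    ; local = 8
    mov rax, [rbp-8]        ; return value
    mov rsp, rbp
    pop rbp
    ret
```

Run under the stated assumptions, the program exits with status 8, which `echo $?` shows in a shell.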

Assembly Function Prologue and Epilogue

A function stack frame is set up by a function prologue, and the process is reversed by a function epilogue.

While the concepts of function prologue and epilogue are based on convention (and not part of the assembly language itself), they are helpful in understanding the underlying process.

Function Prologue

The function prologue is responsible for preparing the stack and registers for use by the function. In order to set up the stack frame, a function prologue will often look like the following:

push rbp
mov rbp, rsp
sub rsp, N

The first instruction ‘push rbp’ saves the caller’s base pointer onto the stack so it can be retrieved later.
The second instruction ‘mov rbp, rsp’ sets the base pointer to the current stack pointer.
The third instruction ‘sub rsp, N’ decrements the stack pointer by some value ‘N’ to make space for the function to use. Because the stack grows toward lower addresses, subtracting from RSP allocates space.
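
As a worked trace, the fragment below annotates each prologue instruction with the resulting register values. The starting address 0x1008 and N = 32 are made-up values chosen only to keep the arithmetic easy to follow.

```nasm
; Assume rsp = 0x1008 on entry (hypothetical) and N = 32
push rbp            ; rsp = 0x1000, [0x1000] = caller's rbp
mov rbp, rsp        ; rbp = 0x1000
sub rsp, 32         ; rsp = 0x0FE0, locals live in 0x0FE0..0x0FFF
```

After these three instructions the return address sits at [rbp+8], which is 0x1008 in this trace, and the locals sit below rbp.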

Function Epilogue

The function epilogue reverses the process performed by the prologue and prepares the stack and registers for control to be handed back to the caller function.

A typical function epilogue looks like the following:

mov rsp, rbp
pop rbp
ret

Notice that the first two instructions undo the prologue in reverse order. First, ‘mov rsp, rbp’ sets the stack pointer back to the current base pointer, which releases the space reserved for local variables and leaves RSP pointing at the saved base pointer. Then ‘pop rbp’ pops that saved value off the stack and into rbp, restoring the caller’s base pointer. Recall that this value was originally pushed onto the stack at the beginning of the function prologue. With RSP now pointing at the return address, ‘ret’ can return to the caller.
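
x86-64 also provides the LEAVE instruction, which has the same effect as ‘mov rsp, rbp’ followed by ‘pop rbp’. A minimal sketch, assuming NASM syntax on 64-bit Linux, with an arbitrary local value of 5:

```nasm
section .text
global _start

_start:
    call foo
    mov edi, eax            ; exit status = value returned by foo
    mov eax, 60             ; syscall number for exit
    syscall

foo:
    push rbp                ; prologue
    mov rbp, rsp
    sub rsp, 16
    mov dword [rbp-4], 5    ; store a local variable
    mov eax, [rbp-4]        ; return it in eax
    leave                   ; same effect as: mov rsp, rbp / pop rbp
    ret
```

Under these assumptions the program exits with status 5.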

Example Function Call in x86-64 Assembly

The following example brings together the call, prologue, local variable access, epilogue, and return discussed so far:

section .text
global _start

_start:
    ; Call the foo function
    call foo

    ; Exit the program
    mov rax, 60         ; syscall number for exit
    xor rdi, rdi        ; exit code 0
    syscall

foo:
    ; Function prologue
    push rbp            ; Save the previous base pointer on the stack
    mov rbp, rsp        ; Set the base pointer to the current stack pointer

    ; Allocate space for local variables
    sub rsp, 16         ; Reserve 16 bytes: room for the locals, and RSP stays 16-byte aligned

    ; Access local variables
    mov dword [rbp-4], 42     ; Store a value in the first local variable
    mov eax, dword [rbp-4]    ; Load the value into eax

    ; Function epilogue
    mov rsp, rbp        ; Restore the stack pointer
    pop rbp             ; Restore the base pointer
    ret                 ; Return from the function

In this example, the foo function demonstrates a typical function prologue and epilogue, where the base pointer (RBP) is saved and restored, and space is allocated for local variables on the stack.
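
As a follow-up sketch, the variant below passes arguments and returns a value. It assumes the System V AMD64 convention used on Linux, where the first two integer arguments arrive in edi and esi and the result is returned in eax. The function name add_two and the values 40 and 2 are hypothetical.

```nasm
section .text
global _start

_start:
    mov edi, 40             ; first argument
    mov esi, 2              ; second argument
    call add_two
    mov edi, eax            ; exit status = returned sum
    mov eax, 60             ; syscall number for exit
    syscall

add_two:
    push rbp                ; prologue
    mov rbp, rsp
    sub rsp, 16
    mov [rbp-4], edi        ; spill the arguments into local slots
    mov [rbp-8], esi
    mov eax, [rbp-4]
    add eax, [rbp-8]        ; eax = first + second
    mov rsp, rbp            ; epilogue
    pop rbp
    ret
```

Built and run under these assumptions, the program exits with status 42. Spilling the arguments to the stack is not required here; it is done only to show register arguments and stack locals side by side.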