The Stack
CS 301 Lecture, Dr. Lawlor
There's one super-important pointer that you have to use all the time
in assembly language: the "stack pointer", stored in a register called "rsp"
(Register: Stack Pointer). "The stack" is a frequently-used area
of memory that functions as temporary storage--say, as space for local
variables when a function runs out of registers, or to pass parameters to
the next function.
Conceptually, the stack is divided into two areas: on top is the space
that's in use (that you can't change!), and then below it the space
that isn't in use (free space). The stack pointer points to the
last in-use byte of the stack. The standard convention is that
when your function starts up, you can claim some of the stack by moving
the stack pointer down--this indicates to any functions you might call
that you're using those bytes of the stack. You can then use that
memory for anything you want, as long as you move the stack pointer
back up before your function returns.
It's a little weird that the stack starts at high addresses and grows
downward, like a stalactite; whereas everything else on the machine
(arrays, malloc space, strings, even integers) starts at low addresses
and grows upward. The reason is historical: on ancient machines
with only a little memory space to work with, they'd put their data at
one end of memory (near address zero), and the stack as far away as it
could get, near high memory. Then the program's data or stack
space could grow as far as possible without overwriting the
other. Of course, on a 64-bit machine you've got billions of
gigabytes of address space, so you're unlikely to run out no matter
which way the stack grows, but we're stuck with the convention that
"the stack grows down".
Sadly, if you screw up the stack, such as by forgetting to move the
stack pointer back, or overwriting part of the stack that isn't yours,
then the function that called you (such as main) will normally crash
horribly. So be careful with the stack!
Here's how we allocate one integer on the stack, then read and write it:
sub rsp,16 ; I claim the next sixteen bytes in the name of... me!
mov QWORD [rsp],1492 ; store a long integer into our stack space
mov rax,QWORD [rsp] ; read our integer from where we stored it
add rsp,16 ; Hand back the stack space
ret
(Try this in NetRun now!)
Here's how we'd allocate one hundred long integers on the stack, then use just one of them:
sub rsp,800 ; I claim the next eight hundred bytes
mov rdi,rsp ; points to the start of our 100-integer array
add rdi,320 ; skip ahead 40*8 bytes, to integer 40 in the array
mov QWORD [rdi],1492 ; store an integer into our stack space
mov rax,QWORD [rdi] ; read our integer from where we stored it
add rsp,800 ; Hand back the stack space
ret
(Try this in NetRun now!)
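You don't have to copy rsp into another register to index the array. As a minimal sketch (assuming NASM syntax, as in the examples above), the index can sit in a scratch register like rcx and be scaled by 8 bytes per long integer right inside the brackets:

```nasm
sub rsp,800 ; claim space for 100 long integers
mov rcx,40 ; array index we want
mov QWORD [rsp+8*rcx],1492 ; store into integer 40 (byte offset 320)
mov rax,QWORD [rsp+8*rcx] ; read it back: returns 1492
add rsp,800 ; hand back the stack space
ret
```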
There are even special instructions for putting stuff onto the stack and taking it back off, called "push" and "pop":
- "push thing" makes space for thing on the stack, and copies the value of thing into memory there. It's the same as "sub rsp,8" and then "mov QWORD [rsp],thing".
- "pop thing" copies whatever is on top of the stack into thing, then removes that space from the stack. It's the same as "mov thing,QWORD [rsp]" followed by "add rsp,8".
These are handy if you've only got one integer to stick on or pull off
the stack. In 32-bit mode, push and pop are really useful for function
arguments, which by convention are stored on top of the stack when you
call the function:
push 19
extern print_int
call print_int
pop eax ; MUST clean up the stack
ret
(Try this in NetRun now!)
This prints the "19" that's stored on top of the stack. In 32-bit
mode, all function arguments are stored on the stack, unlike 64-bit code,
which passes the first few arguments in registers.
This means the stack is a rather funny mix of function arguments, local
and temporary variables, totally unused space for alignment, etc.
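For comparison, here's a sketch of the same call in 64-bit mode. It assumes the Linux (System V) convention, where the first integer argument travels in rdi, and that NetRun's print_int takes its argument that way. The extra 8 bytes aren't storage: that convention wants rsp to be a multiple of 16 at each call, and the return address pushed by our caller left it 8 bytes off.

```nasm
sub rsp,8 ; realign the stack to a multiple of 16 before calling
mov rdi,19 ; first integer argument goes in rdi, not on the stack
extern print_int
call print_int
add rsp,8 ; hand back the alignment space
ret
```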
Saved Registers
As we've seen, the x64 calling conventions (page 21)
say that eax or rax holds the return value from a function.
You have the right to use rcx, rdx, rsi, rdi, and r8-r11 for anything you like, no questions asked. However, you MUST put
rsp back where you found it, as well as rbp, rbx, and r12-r15.
rsp, rbp, rbx, and r12-r15 are called "saved" or "preserved" registers,
since you *can* use them, but you *must* put them back to where they
were before you return. The standard place to save registers is
on the stack, for example by pushing their old value at the start of
your function, then popping their old value back at the end of your
function. For example:
push r15 ; save old value in r15
mov r15,3 ; use r15 for some computation
add r15,r15
mov rax,r15 ; read our value back out of r15
pop r15 ; restore old value of r15
ret
(Try this in NetRun now!)
If you have several registers to save, be sure to restore them all in
the correct order! Because "pop" pulls its data off the top of
the stack, the first thing you "pop" should be the last thing you
"push"ed:
push r15 ; save registers
push r14
mov r15,3
mov r14,5
add r15,r14
mov rax,r15 ; read our value back out of r15
pop r14 ; restore registers
pop r15
ret
(Try this in NetRun now!)
Annoyingly, you can sometimes get away with using a preserved register
without saving it ("trashing" the register). However, this is
very dangerous, because someday some piece of code that calls your code
might change to store an important value there, and that code will now
crash! NetRun's main function now contains some code to try to
detect modifications to preserved registers.
One big advantage to saved registers: you can call other functions, and
know that they won't change their values. All the scratch
registers, by contrast, are likely to get overwritten.
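Here's a small sketch of that advantage, again assuming the 64-bit Linux convention and NetRun's print_int: the value parked in r15 survives the call, while anything left in rax, rcx, rdx, rsi, rdi, or r8-r11 might not.

```nasm
push r15 ; save the caller's r15 (this also leaves rsp a multiple of 16 for the call)
mov r15,7 ; value we need after the call
mov rdi,19 ; argument for print_int
extern print_int
call print_int ; free to overwrite any scratch register
mov rax,r15 ; r15 is still 7, so we return 7
pop r15 ; restore the caller's r15
ret
```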
Stack Frames: rbp
There's one fairly handy saved register called rbp, the
"base pointer" (the 64-bit version of ebp, the 32-bit "extended base pointer"). Here's the standard use of rbp: to stash
the value of the stack pointer at the start of the function. This
is sometimes a little easier than indexing from rsp directly, since rsp
changes every time you push or pop--rbp, by contrast, can stay the same
through your entire function.
push rbp ; stash old value of rbp on the stack
mov rbp,rsp ; rbp == stack pointer at start of function
sub rsp,1000 ; make some room on the stack
mov QWORD [rbp-8],7 ; local variables are at negative offsets from the base pointer
mov rax,QWORD [rbp-8] ; same local variable
mov rsp,rbp ; restore stack pointer (easier than figuring the correct "add"!)
pop rbp ; restore rbp
ret
(Try this in NetRun now!)
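To see why this is convenient, here's a sketch (the pushed value 99 is just a placeholder) where rsp moves in the middle of the function but the local variable keeps the same rbp-relative address:

```nasm
push rbp ; stash old value of rbp
mov rbp,rsp ; rbp stays fixed from here on
sub rsp,16 ; room for local variables
mov QWORD [rbp-8],7 ; local variable at rbp-8
push 99 ; rsp moves down 8 bytes; the local is now at [rsp+16]
mov rax,QWORD [rbp-8] ; but it is still at [rbp-8]: returns 7
mov rsp,rbp ; discards the locals and the pushed 99 in one step
pop rbp ; restore rbp
ret
```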