push 17
push 23
pop rax
pop rcx
ret
You commonly use push and pop to save registers at the start and end of your function. You are free to use rcx, rdx, rsi, rdi, and r8-r11 for anything you like, no questions asked. However, you MUST put rsp back where you found it, as well as rbp, rbx, and r12-r15. For example, "rbp" is a preserved register, so you need to save its value before you can use it:
push rbp ; save old copy of this register
mov rbp,23
mov rax,rbp
pop rbp ; restore main's copy from the stack
ret
If you have multiple registers to save and restore, be sure to pop them in the *opposite* order they were pushed:
push rbp ; save old copy of this register
push r15
mov rbp,23
mov rax,rbp
pop r15 ; restore main's copy from the stack
pop rbp
ret
One big advantage to saved registers: you can call other functions, and know that the saved registers will still hold your values when those functions return. All the scratch registers, by contrast, are likely to get overwritten.
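
To see why the pop order matters, here's a small C++ sketch that models the save and restore sequence with a vector standing in for the stack. The ToyCpu struct and both function names are made up for illustration; this is a toy model, not how the real hardware is built.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical toy model: a few registers plus a stack of 64-bit values.
struct ToyCpu {
    std::uint64_t rbp = 0, r15 = 0, rax = 0;
    std::vector<std::uint64_t> stack;

    void push(std::uint64_t value) { stack.push_back(value); }

    std::uint64_t pop() {
        assert(!stack.empty()); // popping an empty stack is a bug in the caller
        std::uint64_t value = stack.back();
        stack.pop_back();
        return value;
    }
};

// Mirrors the listing: save rbp then r15, use rbp, restore in reverse order.
std::uint64_t use_saved_registers(ToyCpu &cpu) {
    cpu.push(cpu.rbp);
    cpu.push(cpu.r15);
    cpu.rbp = 23;
    cpu.rax = cpu.rbp;
    cpu.r15 = cpu.pop(); // last pushed, first popped
    cpu.rbp = cpu.pop();
    return cpu.rax;
}

// Same thing with the pops in the wrong order: the two registers get swapped.
std::uint64_t use_saved_registers_wrong_order(ToyCpu &cpu) {
    cpu.push(cpu.rbp);
    cpu.push(cpu.r15);
    cpu.rbp = 23;
    cpu.rax = cpu.rbp;
    cpu.rbp = cpu.pop(); // BUG: this is r15's old value
    cpu.r15 = cpu.pop(); // BUG: this is rbp's old value
    return cpu.rax;
}
```

Both versions return 23 in rax, but only the first one hands the caller back its own rbp and r15. The second quietly swaps them, which is exactly the kind of bug that shows up far away from where it was caused.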
Address | Contents
0x000...000 | "low memory"
 | unused stack area (you can claim this space)
rsp-> | end of reserved data, "top of the stack"
 | reserved stack data (main's variables)
0xfff...fff | "high memory"
sub rsp,16 ; I claim the next sixteen bytes in the name of... me!
mov QWORD [rsp],1492 ; store a long integer into our stack space
mov rax,QWORD [rsp] ; read our long from where we stored it
add rsp,16 ; Hand back the stack space
ret
Here's how we'd allocate one hundred long integers on the stack, then use just one of them:
sub rsp,800 ; I claim the next eight hundred bytes
mov rdi,rsp ; points to the start of our 100-integer array
add rdi,320 ; skip ahead to integer 40 in the array (40*8 bytes)
mov QWORD [rdi],1492 ; store an integer into our stack space
mov rax,QWORD [rdi] ; read our integer from where we stored it
add rsp,800 ; Hand back the stack space
ret
The push and pop instructions are handy if you've only got one integer to stick on or pull off the stack. In 32-bit mode, push and pop are really useful for function arguments, which by convention are stored on top of the stack when you call the function:
push 19
extern print_int
call print_int
pop eax ; MUST clean up the stack
ret
This prints the "19" that's stored on top of the stack. In 32-bit mode, all function arguments are stored on the stack (unlike 64-bit code, which passes the first few arguments in registers). This means the stack is a rather funny mix of function arguments, local and temporary variables, totally unused space for alignment, etc.
A function can also use rbp as a "base pointer" that stays put while rsp moves around:
push rbp ; stash old value of rbp on the stack
mov rbp,rsp ; rbp == stack pointer at start of function
sub rsp,1000 ; make some room on the stack
mov QWORD [rbp-8],7 ; local variables are at negative offsets from the base pointer
mov rax,QWORD [rbp-8] ; same local variable
mov rsp,rbp ; restore stack pointer (easier than figuring the correct "add"!)
pop rbp ; restore rbp
ret
rbp isn't used very often in 64-bit mode, but in 32-bit mode it's almost standard. The piece of the stack around the base pointer is often called the function's "stack frame": negative offsets get to the function's local variables, positive offsets get to the caller's parameters, and directly at rbp is the saved copy of the old rbp. This effectively makes a chain of rbp pointers (assuming every function uses the frame pointer correctly); on some machines you can "unwind the stack" or print a "stack trace" by following this chain of pointers.
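
The "chain of rbp pointers" idea can be sketched in C++ as a linked list, where each frame stores a pointer to its caller's frame. The Frame struct, the function names in it, and stack_trace are all hypothetical stand-ins; a real unwinder reads raw memory and has to cope with functions that don't keep a frame pointer at all.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of one stack frame: what sits at [rbp] is the caller's
// saved rbp, so each frame points at the frame of the function that called it.
struct Frame {
    const Frame *saved_rbp; // nullptr marks the outermost frame
    std::string function;   // stand-in for "which function owns this frame"
};

// Follow the chain of saved base pointers, innermost frame first.
std::vector<std::string> stack_trace(const Frame *rbp) {
    std::vector<std::string> trace;
    for (const Frame *f = rbp; f != nullptr; f = f->saved_rbp) {
        // A corrupted chain could loop forever, so give up on absurd depths.
        assert(trace.size() < 10000 && "frame chain looks corrupted");
        trace.push_back(f->function);
    }
    return trace;
}
```

If main calls foo and foo calls bar, then starting from bar's frame the walk visits bar, then foo, then main, which is the order a stack trace normally prints.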
Instruction | Equivalent Instruction Sequence
call bar | push next_instruction ; jmp bar ; next_instruction:
ret | pop rdx ; jmp rdx (rdx or some other scratch register; the real ret doesn't modify any registers)
 | Assembly | C/C++
Call | call make_beef | int make_beef(void);
Jump | jmp make_beef | int foo(void) {
Again, "make_beef" never comes back, so we get 0xBEEF.
jmp make_beef
come_back:
mov eax,0xC0FFEE
ret
make_beef:
mov eax,0xBEEF
jmp come_back
But the "call" instruction allows "ret" to jump back to the right place automatically, by pushing the return address on the stack. "ret" then pops the return address and goes there:
push come_back ; - simulated "call" -
jmp make_beef ; - continued -
come_back: ; - end of simulated "call" -
mov eax,0xC0FFEE
ret
make_beef:
mov eax,0xBEEF
pop rcx ; - simulated "ret" -
jmp rcx ; - end of simulated "ret" -
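
The same call and ret bookkeeping can be modeled in C++ with a tiny interpreter. This is only a sketch: the four-instruction machine, the Op and Instr types, and both function names are invented for illustration, and instruction indexes stand in for real addresses.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical four-instruction machine, just enough to show call and ret.
enum class Op { MovEax, Call, Ret, Halt };

struct Instr {
    Op op;
    std::uint32_t arg; // immediate for MovEax, target index for Call
};

// Run the program from instruction 0 and return the final value of eax.
std::uint32_t run(const std::vector<Instr> &program) {
    std::vector<std::size_t> stack; // holds return addresses only
    std::uint32_t eax = 0;
    std::size_t pc = 0;
    for (;;) {
        assert(pc < program.size()); // running off the end is a program bug
        const Instr &in = program[pc];
        switch (in.op) {
        case Op::MovEax:
            eax = in.arg;
            ++pc;
            break;
        case Op::Call:
            stack.push_back(pc + 1); // "push next_instruction"
            pc = in.arg;             // "jmp target"
            break;
        case Op::Ret:
            assert(!stack.empty());
            pc = stack.back();       // "pop scratch ; jmp scratch"
            stack.pop_back();
            break;
        case Op::Halt:
            return eax;
        }
    }
}

// The make_beef example: index 0 calls index 3, which returns to index 1.
std::uint32_t run_make_beef_example() {
    const std::vector<Instr> program = {
        {Op::Call, 3},          // 0: call make_beef
        {Op::MovEax, 0xC0FFEE}, // 1: come_back: mov eax,0xC0FFEE
        {Op::Halt, 0},          // 2: stand-in for foo's final ret
        {Op::MovEax, 0xBEEF},   // 3: make_beef: mov eax,0xBEEF
        {Op::Ret, 0},           // 4: ret
    };
    return run(program);
}
```

Because Ret pops whatever Call pushed, make_beef lands back at index 1 and the final eax is 0xC0FFEE, just like the assembly version.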
There's a very weird hacky way to figure out what address your code
is running from: call the next instruction, and then pop the return
address that "call" pushed!
call nextline
nextline:
pop rax ; rax will store the location in memory of nextline
ret
This is only useful if your code doesn't know where in memory it
will get loaded. This is true for some shared libraries, where
you see exactly the above instruction sequence!
Because calling functions takes some overhead (push return address,
call function, do work, pop return address, return there), recursion is
slower than iteration. For example:
C++ Plain Recursion | Assembly Plain Recursion
int sum(int i) { | mov rdi,10000000
Folks who love recursion have found an interesting optimization
called "tail recursion", where you arrange for there to be *nothing*
for the function to do after recursing, so there's no point in your
children returning to you--you just "jmp" to them, not "call", because
you don't want them to ever come back to you. The base case is the only "ret". Here's an example:
C++ Tail Recursion | Assembly Tail Recursion
int sum(int i,int partial) { | mov edi,1000000000 ; sum first argument
Tail recursion eliminates both the memory used on the stack for the
return addresses, and the time taken for the call and return. It
can make recursion exactly as fast as iteration. (Curiously, you
can always transform an iteration into a recursion, and vice versa,
although the code may get nasty.)
Sadly, my version of the gcc compiler doesn't seem to do the tail
recursion optimization anymore on 64-bit machines, although it used to
do it on 32-bit machines.
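
Here's a small C++ sketch of the difference between plain recursion, tail recursion, and the loop a tail recursion boils down to. The function names are my own and this is not the original listing; whether a given compiler really turns the tail call into a jump depends on the compiler and its optimization flags, so treat that part as an assumption.

```cpp
#include <cassert>
#include <cstdint>

// Plain recursion: the addition happens after the recursive call returns,
// so every level needs its own return address on the stack.
std::uint64_t sum_plain(std::uint64_t i) {
    if (i == 0) return 0;
    return i + sum_plain(i - 1);
}

// Tail recursion: the running total rides along in "partial", so the
// recursive call is the very last thing the function does.
std::uint64_t sum_tail(std::uint64_t i, std::uint64_t partial) {
    if (i == 0) return partial; // base case: the only real return
    return sum_tail(i - 1, partial + i);
}

// The loop that sum_tail is equivalent to.
std::uint64_t sum_loop(std::uint64_t i) {
    std::uint64_t partial = 0;
    while (i != 0) {
        partial += i;
        --i;
    }
    return partial;
}

// Sanity check: all three formulations agree on small inputs.
void check_agreement() {
    for (std::uint64_t n = 0; n <= 100; ++n) {
        assert(sum_plain(n) == sum_loop(n));
        assert(sum_tail(n, 0) == sum_loop(n));
    }
}
```

Call sum_tail with a starting partial of 0. Notice that sum_plain still has an addition left to do after its recursive call comes back, which is why it can't be turned into a simple jump.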
Recursion also uses up stack space: every call pushes another return address, so a deep enough recursion runs out of stack and crashes:
int silly_recursive(int i) {
    if (i==0) return 0;
    else return i+silly_recursive(i-1);
}
int foo(void) {
    std::cout<<"Returns: "<<silly_recursive(read_input());
    return 2;
}
The same computation works fine (aside from integer overflow) when written as an iteration, not a recursion, because iteration doesn't touch the stack:
int silly_iterative(int n) {
    int sum=0;
    for (int i=0;i<=n;i++) sum+=i;
    return sum;
}
int foo(void) {
    std::cout<<"Returns: "<<silly_iterative(read_input());
    return 2;
}
int happy_innocent_code(void) {
    char str[8];
    cin>>str;
    cout<<"I just read a string: "<<str<<"! I'm a big boy!\n";
    return 0;
}
void evil_bad_code(void) {
    cout<<"Mwa ha ha ha...\n";
    cout<<"...er, I can't return. Crashing.\n";
}
int foo(void) {
    //void *p=(void *)evil_bad_code; /* address of the bad code */
    //printf("evil code is at: '%4s'\n",(char *)&p);
    happy_innocent_code();
    cout<<"How nice!\n";
    return 0;
}
The "cin>>str" line in happy_innocent_code can overwrite that function's stack space with whatever's in the read-in string, if the read-in string is longer than 7 bytes (the eighth byte of str is needed for the terminating zero). So you can get a horrific crash if you just enter any long string, because the correct return address is overwritten with string data.
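
For contrast, here are two common ways to read a word without running off the end of a buffer. This is a sketch of one possible fix, not part of the original example; read_word and read_word_fixed are made-up names, and both take the stream as a parameter so they can be tried on a string instead of the keyboard.

```cpp
#include <cassert>
#include <iomanip>
#include <istream>
#include <sstream>
#include <string>

// Option 1: std::string grows as needed, so there's no fixed buffer to overrun.
std::string read_word(std::istream &in) {
    std::string word;
    in >> word;
    return word;
}

// Option 2: keep the fixed 8-byte buffer, but tell the stream how big it is.
// With the width set to 8, operator>> stores at most 7 characters plus the
// terminating zero, and leaves the rest of the input unread.
std::string read_word_fixed(std::istream &in) {
    char str[8] = {0}; // zeroed so it's a valid string even if the read fails
    in >> std::setw(static_cast<int>(sizeof(str))) >> str;
    std::string result(str);
    assert(result.size() < sizeof(str)); // never more than 7 characters
    return result;
}
```

Feeding read_word_fixed a long word just truncates it to the first 7 characters, so the return address sitting above the buffer is never touched.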